Computing is at an inflection point. Moore's Law, which predicts that the number of transistors on a microchip will double roughly every two years, is slowing down because of the physical limits of packing more transistors onto affordable chips. These gains in computing power are tapering off just as demand grows for high-performance computers that can support increasingly complex artificial intelligence models. That squeeze has pushed engineers to explore new ways of expanding their machines' computational capabilities, but a clear solution has yet to emerge.
Photonic computing is one potential remedy to the growing computational demands of machine learning models. Instead of transistors and wires, these systems use photons (microscopic particles of light) to perform computation in the analog domain. Lasers produce these tiny bundles of energy, which move at the speed of light, like a spaceship flying at warp speed in a science fiction movie. When photonic computing cores are added to programmable accelerators such as a network interface card (NIC, or its augmented counterpart, a SmartNIC), the resulting hardware can be plugged in to turbocharge a standard computer.
MIT researchers have now harnessed the potential of photonics to accelerate modern computing by demonstrating its machine learning capabilities. Dubbed "Lightning," their photonic-electronic reconfigurable SmartNIC helps deep neural networks, machine learning models that mimic how the brain processes information, perform inference tasks such as image recognition and language generation in chatbots like ChatGPT. The prototype's novel design achieves impressive speeds, making it the first photonic computing system able to serve machine learning inference requests in real time.
Despite their promise, a major challenge in building photonic computing devices is that they are passive: unlike their electronic counterparts, they lack the memory and instructions needed to control data flows. Previous photonic computing systems ran into this bottleneck, but Lightning removes it, ensuring the smooth movement of data between electronic and photonic components.
“Photonic computing has shown significant benefits in accelerating bulky linear computation tasks like matrix multiplication, while it needs electronics to take care of the rest: memory access, nonlinear computations, and conditional logic. This creates a significant amount of data to be exchanged between photonics and electronics in order to complete real-world computing tasks, like a machine learning inference request,” says Zhizhen Zhong, a postdoc in the group of MIT Associate Professor Manya Ghobadi at the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL). “Controlling this dataflow between photonics and electronics was the Achilles’ heel of past state-of-the-art photonic computing work. Even if you have a super-fast photonic computer, you need enough data to power it nonstop. Otherwise, you’ve got a supercomputer just running idle without doing any reasonable computation.”
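To make the division of labor described above concrete, here is a minimal sketch (not from the paper) of how a single inference step might split work: the matrix multiplication is assumed to be offloaded to a photonic core, while the nonlinearity, memory access, and control flow stay electronic. The class PhotonicMatMul and its multiply method are hypothetical placeholders, modeled here as an ordinary matrix multiply.

```python
import numpy as np

class PhotonicMatMul:
    """Stand-in for an analog photonic core that only performs matrix multiplication."""
    def __init__(self, weights: np.ndarray):
        self.weights = weights  # in real hardware, weights would be encoded optically

    def multiply(self, x: np.ndarray) -> np.ndarray:
        # Electrons -> photons, analog multiply, photons -> electrons.
        # Modeled here as a plain matmul.
        return x @ self.weights

def inference_step(x: np.ndarray, photonic_layer: PhotonicMatMul) -> np.ndarray:
    y = photonic_layer.multiply(x)   # linear algebra: photonic domain
    y = np.maximum(y, 0.0)           # nonlinearity (ReLU): electronic domain
    return y                         # memory access and control flow: electronic domain

layer = PhotonicMatMul(np.random.randn(784, 256))
out = inference_step(np.random.randn(1, 784), layer)
```

Every call to multiply implies a round trip between the electronic and photonic domains, which is exactly the data exchange the quote identifies as the bottleneck.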
Ghobadi, an associate professor in MIT’s Department of Electrical Engineering and Computer Science (EECS) and a CSAIL member, and her group colleagues are the first to identify and solve this problem. To do so, they combined the speed of photonics with the dataflow-control capabilities of electronic computers.
Before Lightning, photonic and electronic computing systems operated independently, speaking different languages. The team’s hybrid system tracks the required computation operations on the datapath using a reconfigurable count-action abstraction, which connects photonics to the electronic components of a computer. This programming abstraction functions as a unified language between the two, controlling access to the dataflows passing through it. Information carried by electrons is translated into light in the form of photons, which work at light speed to help complete an inference task. Then the photons are converted back to electrons to relay the information to the computer.
By seamlessly connecting photonics to electronics, the novel count-action abstraction makes Lightning’s fast, real-time computing frequency possible. Previous attempts used a stop-and-go approach, meaning data would be impeded by much slower control software that made all the decisions about its movements. “Building a photonic computing system without a count-action programming abstraction is like trying to steer a Lamborghini without knowing how to drive,” says Ghobadi, the paper’s senior author. “What would you do? You probably have a driving manual in one hand, then press the clutch, then check the manual, then let go of the brake, then check the manual, and so on. This is a stop-and-go operation because, for every decision, you have to consult some higher-level entity to tell you what to do. But that’s not how we drive; we learn how to drive and then use muscle memory without checking the manual or the driving rules behind the wheel. Our count-action programming abstraction acts as the muscle memory in Lightning, seamlessly driving the electrons and photons in the system at runtime.”
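The following is a minimal, hypothetical sketch of the count-action idea as described above: the datapath counts values arriving and fires a pre-installed action the moment a threshold is reached, with no per-decision trip to control software. The names (CountActionRule, on_value, the threshold of 4) are illustrative assumptions, not the paper's interface.

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class CountActionRule:
    threshold: int                   # how many values to count before acting
    action: Callable[[List[float]], None]  # pre-installed action, e.g. "start photonic matmul"
    _buffer: List[float] = field(default_factory=list)

    def on_value(self, v: float) -> None:
        # Count each arriving value; when the count hits the threshold,
        # trigger the action immediately instead of asking a software controller.
        self._buffer.append(v)
        if len(self._buffer) == self.threshold:
            self.action(self._buffer)
            self._buffer = []

# Example: once a full 4-element input vector has arrived, hand it straight
# to a (stand-in) photonic core.
rule = CountActionRule(threshold=4, action=lambda vec: print("photonic matmul on", vec))
for v in [0.1, 0.2, 0.3, 0.4]:
    rule.on_value(v)
```

The contrast with the stop-and-go approach is that the rule is installed once ahead of time, so the runtime decision reduces to a counter comparison rather than a round trip to control software.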
An environmentally friendly solution
Machine learning services that perform inference-based tasks, like ChatGPT and BERT, currently require heavy computing resources. Not only are they expensive (some estimates put ChatGPT’s operating costs at upwards of $3 million per month), but they are also environmentally damaging, potentially emitting more than double the average person’s carbon dioxide output. Lightning uses photons, which move faster than electrons do in wires while generating less heat, enabling it to compute at a faster frequency while being more energy efficient.
To measure this, the Ghobadi group compared their device to standard graphics processing units, data processing units, SmartNICs, and other accelerators by synthesizing a Lightning chip. The team observed that Lightning was more energy efficient when completing inference requests. “Our synthesis and simulation studies show that Lightning reduces machine learning inference power consumption by orders of magnitude compared to state-of-the-art accelerators,” says Mingran Yang, a graduate student in Ghobadi’s lab and co-author of the paper. As a more cost-effective and faster option, Lightning presents a potential upgrade for data centers to reduce their machine learning models’ carbon footprint while accelerating inference response times for users.
Other authors on the paper are MIT CSAIL postdoc Homa Esfahanizadeh and undergraduate student Liam Kronman, as well as MIT EECS Associate Professor Dirk Englund and three recent graduates of the department: Jay Lang ’22, MEng ’23; Christian Williams ’22, MEng ’23; and Alexander Sludds ’18, MEng ’19, PhD ’23. Their research was supported, in part, by the DARPA FastNICs program, the ARPA-E ENLITENED program, the DAF-MIT AI Accelerator, the U.S. Army Research Office through the Institute for Soldier Nanotechnologies, National Science Foundation (NSF) grants, the NSF Center for Quantum Networks, and a Sloan Fellowship.
The group will present its findings this month at the Association for Computing Machinery’s Special Interest Group on Data Communication (SIGCOMM) conference.