A single photograph offers a glimpse into the creator's world: their interests and feelings about a subject or space. But what about the creators behind the technologies that help make these images possible?
Jonathan Ragan-Kelley, an associate professor in MIT's Department of Electrical Engineering and Computer Science, is one such creator. He has designed everything from tools for visual effects in films to the Halide programming language, which is widely used in industry for photo editing and processing. As a researcher with the MIT-IBM Watson AI Lab and the Computer Science and Artificial Intelligence Laboratory, Ragan-Kelley specializes in high-performance, domain-specific programming languages and machine learning systems that enable 2D and 3D graphics, visual effects, and computational photography.
“The main goal of our research is to develop new programming languages that make it easier to write programs that run really efficiently on the increasingly complex hardware in your computer today,” says Ragan-Kelley. “If we want to keep increasing the computing power we can actually harness for real-world applications, from graphics and visual computing to AI, we need to change the way we program.”
Finding a happy medium
Over the past two decades, chip designers and programming engineers have seen a slowdown in Moore's law and a marked shift from general-purpose computing on CPUs toward more varied and specialized computing and processing units such as GPUs and accelerators. This transition comes with a trade-off: giving up the ability to run general-purpose code somewhat slowly on CPUs in exchange for faster, more efficient hardware that requires code to be heavily adapted and mapped to it with custom programs and compilers. Newer hardware with improved programming can better support applications such as high-bandwidth cellular radio interfaces, decoding of highly compressed video for streaming, and graphics and video processing on power-constrained cell phone cameras, to name just a few.
“Our work is largely about unleashing the power of the best hardware we can build to deliver as much performance and computational efficiency as possible for these kinds of applications, in a way that traditional programming languages cannot.”
To achieve this, Ragan-Kelley divides his work along two directions. First, he sacrifices generality to capture the structure of particular, important computational problems, and exploits that structure for better computing efficiency. This can be seen in the image-processing language Halide, which he co-developed and which has helped transform the image editing industry in programs like Photoshop. Because Halide is specifically designed to quickly handle dense, regular arrays of numbers (tensors), it also works well for neural network computations. The second direction targets automation, specifically how compilers map programs to hardware. One such project with the MIT-IBM Watson AI Lab leverages Exo, a language developed in Ragan-Kelley's group.
Over the years, researchers have worked hard to automate coding with compilers, which can be a black box; however, there is still a great need for explicit control and hand-tuning by performance engineers. Ragan-Kelley and his group are developing methods that straddle the two approaches, balancing the trade-offs to achieve effective, resource-efficient programming. At the heart of many high-performance programs, such as video game engines or cell phone camera processing, are state-of-the-art systems that are largely hand-optimized by human experts in detailed, low-level languages like C, C++, and assembly. Here, engineers make specific choices about how the program will run on the hardware.
Ragan-Kelley notes that programmers can opt for “very painstaking, very unproductive, very unsafe low-level code,” which can introduce bugs, or for “safer, more productive, higher-level programming interfaces,” which lack the ability to finely adjust how a compiler executes the program and usually deliver lower performance. His team is therefore trying to find a middle ground. “We're trying to figure out how to provide control over the key issues that human performance engineers want to be able to control,” says Ragan-Kelley, “so we're trying to build a new class of languages that we call user-schedulable languages, which give safer, higher-level handles to control what the compiler does or how the program is optimized.”
Unlocking hardware: high-level and low-level methods
Ragan-Kelley and his research group are tackling this problem through two streams of work. One applies modern machine learning and AI techniques to automatically generate optimized schedules, an interface to the compiler, to achieve better compiler performance. The other uses “exocompilation,” which he is working on with the lab. He describes this method as a way to “turn the compiler inside out,” with a skeleton of a compiler with controls for human guidance and customization. In addition, his team can insert custom schedulers on top, which can help target specialized hardware like machine learning accelerators from IBM Research. Applications of this work run the gamut: computer vision, object recognition, speech synthesis, image synthesis, speech recognition, text generation (large language models), and more.
One of his major projects with the lab takes this even further, approaching the work from a systems perspective. In work led by his advisee and lab intern William Brandon, in collaboration with lab researcher Rameswar Panda, Ragan-Kelley's team is rethinking large language models (LLMs), finding ways to slightly change the computation and the model's programming architecture so that transformer-based AI models can run more efficiently on AI hardware without sacrificing accuracy. Their work, Ragan-Kelley says, deviates from standard ways of thinking in significant ways, with potentially large payoffs: cutting costs, improving capabilities, and/or shrinking LLMs to require less memory and run on smaller computers.
It is this kind of forward thinking about computational and hardware efficiency that Ragan-Kelley excels at and sees value in, particularly for the long term. “I think there are areas of research that need to be pursued but are well-established, or obvious, or conventional enough that lots of people are either already pursuing them or will,” he says. “We try to find ideas that have both high leverage to practically impact the world, and at the same time are things that wouldn't necessarily happen, or that I think are underleveraged relative to their potential, by the rest of the community.”
The course he now teaches, 6.106 (Software Performance Engineering), is an example. About 15 years ago, the shift from single processors to multiple processors led many university programs to begin teaching parallelism. But, as Ragan-Kelley explains, MIT realized how important it is for students to understand not only parallelism, but also optimizing memory and using specialized hardware to achieve the best performance possible.
“By changing the way we program, we can unlock the computing potential of new machines and enable people to continue to rapidly develop new applications and new ideas that can exploit this increasingly complicated and challenging hardware.”