Talking to retail executives in 2010, Rama Ramakrishnan came to two conclusions. First, although retail systems that provide personalized recommendations to customers get a lot of attention, these systems often deliver little in the way of results for retailers. Second, for many businesses, most customers only shopped once or twice a year, so businesses didn’t know much about them.
“But by very carefully noting a customer’s interactions with a retailer or e-commerce site, we can create a very nice and detailed composite picture of what that person does and what they’re interested in,” says Ramakrishnan, professor of the practice at the MIT Sloan School of Management. “Once you have that, you can apply proven algorithms from machine learning.”
These insights led Ramakrishnan to found CQuotient, a startup whose software has now become the basis of Salesforce’s widely adopted AI e-commerce platform. “On Black Friday alone, CQuotient technology likely sees and interacts with over a billion shoppers in a single day,” he says.
After a highly successful entrepreneurial career, Ramakrishnan returned in 2019 to MIT Sloan, where he had earned a master’s and doctorate in operations research in the 1990s. He teaches students “not only how these amazing technologies work, but also how to use these technologies and use them pragmatically in the real world,” he says.
Additionally, Ramakrishnan enjoys participating in MIT executive education. “It’s a great opportunity for me to pass on the things I’ve learned, but also, just as importantly, to find out what these senior executives are thinking, to guide them and push them in the right direction,” he says.
For example, executives are rightly concerned about the need for massive amounts of data to train machine learning systems. He can now guide them to a multitude of pre-trained models for specific tasks. “The ability to take these pre-trained AI models and adapt them very quickly to your particular business problem is an incredible advancement,” says Ramakrishnan.
Rama Ramakrishnan – Using AI in Real-World Applications for Smart Working
Video: MIT Industrial Liaison Program
Understanding AI Categories
“AI aims to give computers the ability to perform cognitive tasks that typically only humans can accomplish,” he says. Understanding the history of this complex, supercharged landscape makes it easier to put the technologies to use.
The traditional approach to AI, which essentially solved problems by applying if/then rules learned from humans, has proven useful for relatively few tasks. “One of the reasons is that we can do many things effortlessly, but if we are asked to explain how we do them, we cannot actually explain how we do them,” comments Ramakrishnan. Additionally, these systems can be confused by new situations that do not match the rules written into the software.
Machine learning takes a radically different approach, with software fundamentally learning by example. “You give it lots of examples of inputs and outputs, questions and answers, tasks and answers, and you make the computer automatically learn how to go from input to output,” he explains. Credit scoring, lending decision making, disease prediction, and demand forecasting are among the many tasks conquered by machine learning.
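To make "learning by example" concrete, here is a minimal sketch, not anything taken from Ramakrishnan's own work. It assumes a made-up set of (input, output) pairs and fits a straight line to them with ordinary least squares. The function names and the numbers are hypothetical and chosen only for illustration.

```python
def fit_line(examples):
    """Learn slope and intercept from (input, output) example pairs."""
    n = len(examples)
    mean_x = sum(x for x, _ in examples) / n
    mean_y = sum(y for _, y in examples) / n
    # Closed-form least-squares estimates for a single input variable.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in examples)
    variance = sum((x - mean_x) ** 2 for x, _ in examples)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the learned input-to-output mapping to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical training examples: (weeks of promotion, units sold).
examples = [(1, 3), (2, 5), (3, 7), (4, 9)]
model = fit_line(examples)
forecast = predict(model, 5)
```

No rule such as "multiply by two and add one" is written anywhere in the program. The mapping is recovered from the examples alone, which is the shift away from hand-written if/then rules that the paragraph describes.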
But machine learning only worked well when the input data was structured, such as in a spreadsheet. “If the input data was unstructured, such as images, videos, audio files, ECGs or x-rays, it was not very efficient to move from that data to an intended output,” explains Ramakrishnan. That meant humans had to manually structure unstructured data to train the system.
Around 2010, deep learning began to overcome this limitation, providing the ability to work directly with unstructured input data, he explains. Based on a long-standing AI strategy known as neural networks, deep learning became practical thanks to the global flood of data, the availability of extraordinarily powerful parallel processing hardware called graphics processing units (originally invented for video games), and advances in algorithms and mathematics.
Finally, within deep learning, the generative AI software that emerged last year can create unstructured outputs, such as human-sounding text, images of dogs, and three-dimensional models. Large language models (LLMs) such as OpenAI’s ChatGPT go from text inputs to text outputs, while text-to-image models such as OpenAI’s DALL-E can produce realistic-looking images.
Rama Ramakrishnan – Taking note of little data to improve customer service
Video: MIT Industrial Liaison Program
What Generative AI Can (and Can’t) Do
Trained on the incredibly vast textual resources of the Internet, “the fundamental ability of an LLM is to predict the most likely and most plausible next word,” says Ramakrishnan. “Then it attaches the word to the original sentence, predicts the next word again, and continues to do so.”
“To the surprise of many, including many researchers, an LLM can do very complicated things,” he says. “It can compose beautifully coherent poetry, write episodes of Seinfeld, and solve certain types of reasoning problems. It’s truly remarkable how predicting the next word can lead to these amazing abilities.”
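The predict, append, and repeat loop can be sketched with a toy stand-in for an LLM. The example below is an illustration under loud assumptions: it uses simple word-pair counts from a tiny made-up corpus rather than a neural network, and it always picks the single most frequent next word. Real LLMs are vastly larger and work quite differently inside; only the outer loop is the point here.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.split()
    table = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        table[current][following] += 1
    return table


def generate(table, prompt, max_new_words=5):
    """Repeatedly predict the most likely next word and append it."""
    words = prompt.split()
    for _ in range(max_new_words):
        candidates = table.get(words[-1])
        if not candidates:
            break  # No known continuation for the last word.
        next_word, _count = candidates.most_common(1)[0]
        words.append(next_word)
    return " ".join(words)


# Hypothetical toy corpus standing in for "the textual resources of the Internet".
corpus = "the cat sat on the mat the cat sat on the sofa"
table = train_bigrams(corpus)
completion = generate(table, "the", max_new_words=4)
```

The output is whatever continuation was most common in the training text. Nothing in the loop checks it for truth, which is why a plausible answer and a correct one can differ.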
“But you always have to keep in mind that what it’s doing is not so much finding the right answer to your question as finding a plausible answer to your question,” Ramakrishnan emphasizes. Its content may be factually inaccurate, irrelevant, toxic, biased or offensive.
This places responsibility on users to ensure that the output is correct, relevant and useful for the task at hand. “You need to make sure there is a way to check for errors in the result and correct them before it is published,” he says.
Intense research is underway to find techniques to address these shortcomings, adds Ramakrishnan, who expects many innovative tools to do just that.
Finding the Right Corporate Roles for LLMs
Given the astonishing advancements in LLMs, how should the industry consider applying the software to tasks such as content generation?
First, Ramakrishnan advises, consider the costs: “Is it a much less expensive effort to proofread a draft than to create the whole thing?” Second, if the LLM makes a mistake that goes unnoticed and the erroneous content reaches the outside world, can you live with the consequences?
“If you have an application that meets both of these criteria, then it makes sense to do a pilot project to see if these technologies can actually help you with this particular task,” says Ramakrishnan. He emphasizes the need to view the pilot project as an experiment rather than a normal IT project.
Currently, software development is the most mature enterprise LLM application. “ChatGPT and other LLMs are text input and output, and software is just text output,” he says. “Programmers can switch from incoming text in English to outgoing text in Python, as well as from English to English or from English to German. There are many tools that help you write code using these technologies.”
Of course, programmers must ensure that the result does its job correctly. Fortunately, software development already provides an infrastructure for testing and verifying code. “It’s a great place,” he says, “where it’s much cheaper to let technology write code for you, because you can check it very quickly.”
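As a rough illustration of why code is cheap to verify, the sketch below treats a small function as if an LLM had drafted it and checks it against a handful of expected results. The function `slugify`, the test cases, and the helper `check_slugify` are all hypothetical, invented here for illustration.

```python
import re


def slugify(title):
    """Hypothetical stand-in for a function drafted by an LLM."""
    # Replace every run of characters other than a-z and 0-9 with one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    return slug.strip("-")


def check_slugify():
    """Return (input, actual, expected) for every case the draft gets wrong."""
    cases = {
        "Hello, World!": "hello-world",
        "  AI in Retail  ": "ai-in-retail",
        "": "",
    }
    return [
        (title, slugify(title), expected)
        for title, expected in cases.items()
        if slugify(title) != expected
    ]


failures = check_slugify()
```

An empty `failures` list means the draft passed every case. A non-empty one shows exactly what needs correcting before the code is used.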
Another major use of LLMs is content generation, such as writing marketing copy or e-commerce product descriptions. “Again, it can be much cheaper to proofread ChatGPT’s draft than to write the whole thing,” says Ramakrishnan. “However, companies need to be very careful that there is a human in the loop.”
LLMs are also spreading rapidly as internal tools for searching corporate documents. Unlike conventional search algorithms, an LLM chatbot can provide a conversational search experience because it remembers every question you ask. “But then again, it sometimes makes things up,” he says. “When it comes to chatbots for external customers, we are still in the early stages, due to the risk of saying the wrong thing to the customer.”
Overall, Ramakrishnan notes, we live in a remarkable time to grapple with the rapidly evolving potentials and pitfalls of AI. “I help companies understand how to take these very transformative technologies and implement them, to make products and services much smarter, employees much more productive and processes much more efficient,” he says.