For today's 'Five Minutes With' show, we sat down with Gemma Jennings, Product Manager on the Applied team, who led a session on Vision Language Models at AI Summit – one of the world's largest AI events for business.
At DeepMind…
I'm part of the Applied team, which helps bring DeepMind technology to the outside world through Alphabet and Google products and solutions, such as with WaveNet and Google Assistant, Maps and Search. As a product manager, I act as a bridge between the two organizations, working closely with both teams to understand research and how people can use it. Ultimately, we want to be able to answer the question: how can we use this technology to improve the lives of people around the world?
I am particularly excited about our portfolio of sustainability work. We've already helped reduce the amount of energy needed to cool Google's data centers, but we can do much more to make a greater transformative impact on sustainability.
Before DeepMind…
I worked at John Lewis Partnership, a British retailer with social purpose deep in its DNA. I've always loved being part of a company with societal meaning, so DeepMind's mission of solving intelligence to advance science and benefit humanity really resonated with me. I was intrigued to learn how this philosophy would play out within a research-driven organization – and within Google, one of the largest companies in the world. Combined with my academic background in experimental psychology, neuroscience, and statistics, DeepMind checked all the boxes.
The AI Summit…
This is my first in-person conference in almost three years, so I'm really looking forward to meeting people in the same industry as me and hearing what other organizations are working on.
I'm also looking forward to attending some sessions on quantum computing to learn more. It has the potential to drive the next big paradigm shift in computing power, opening up new use cases for AI around the world and enabling us to tackle larger, more complex problems.
My work involves many deep learning methods, and it's always exciting to hear about the different ways people are using this technology. Currently, these types of models must be trained on large amounts of data, which makes them expensive, time-consuming, and resource-intensive to build given the computation required. So where do we go from here? And what does the future of deep learning look like? These are the kinds of questions I'm keen to explore.
I presented…
Our recently published research on Vision Language Models (VLMs), which use deep neural networks for image recognition. In my session, I discussed recent advances in combining large language models (LLMs) with powerful visual representations to push the state of the art in image recognition.
This fascinating research has many potential real-world uses. It could one day serve as an assistant to support classroom and informal learning in schools, or help people who are blind or visually impaired better understand the world around them, transforming their daily lives.
I want people to leave the session…
With a better understanding of what happens after a research breakthrough is announced. There is so much amazing research going on, but we need to think about what comes next. For example, what global problems could we help solve? And how can we use our research to create products and services that have purpose?
The future is bright, and I look forward to discovering new ways to apply our groundbreaking research to benefit millions of people around the world.