1.5 Flash excels in summarization, chat applications, image and video captioning, extracting data from long documents and tables, and much more. This is because it was trained by 1.5 Pro through a process called “distillation”, in which the most essential knowledge and skills from a larger model are transferred to a smaller, more efficient model.
Learn more about 1.5 Flash on the Gemini Technology Pageand discover 1.5 Flash availability and pricing. We will share more details soon in an updated Gemini 1.5 technical report.
Significant improvement in version 1.5 Pro
Over the past few months, we've significantly improved version 1.5 Pro, our best model in terms of overall performance across a wide range of tasks.
Beyond expanding his pop-up to 2 million tokens, we improved his code generation, logical reasoning and planning, multi-turn conversation, and audio and image understanding with to data and algorithmic advances. We see strong improvements against public and internal benchmarks for each of these tasks.
1.5 Pro can now follow increasingly complex and nuanced instructions, including those that specify product-level behavior involving role, format, and style. We've improved control over template responses for specific use cases, like creating a chat agent's persona and response style or automating workflows through multiple function calls. And we allowed users to drive the behavior of the model by setting system instructions.
We added audio understanding in the Gemini API And Google AI Studio, so 1.5 Pro can now reason about image and audio for videos uploaded to Google AI Studio. And we are now integrating version 1.5 Pro into Google products, including Advanced Gemini and in Workspace applications.
Learn more about version 1.5 Pro on the site Gemini Technology Page. More details will be available soon in our updated Gemini 1.5 technical report.
Gemini Nano includes multimodal inputs
Gemini Nano expands beyond text-only entries to include images as well. Starting with Pixel, applications using Gemini Nano with multimodality will be able to understand the world the way people do, not only through text, but also through sight, hearing and spoken language.
Learn more about Gemini 1.0 Nano on Android.