In July 2022, we released AlphaFold protein structure predictions for almost every cataloged protein known to science. Read the latest blog here.
Today, I am incredibly proud and excited to announce that DeepMind is making a significant contribution to humanity's understanding of biology.
When we announced AlphaFold 2 last December, it was hailed as a solution to the 50-year-old protein folding problem. Last week we published the scientific paper and source code explaining how we created this highly innovative system, and today we are sharing high-quality predictions for the shape of every protein in the human body, as well as for the proteins of 20 additional organisms that scientists rely on for their research.
As researchers seek cures for diseases and pursue solutions to other major problems facing humanity, including antibiotic resistance, microplastic pollution, and climate change, they will benefit from new insights into protein structure. Proteins are like tiny, exquisite biological machines. Just as the structure of a machine tells you what it does, the structure of a protein helps us understand its function. Today we are sharing a wealth of information that doubles humanity's understanding of the human proteome and reveals the protein structures found in 20 other biologically important organisms, from E. coli to yeast, and from fruit flies to mice.
This will be one of the largest datasets since the human genome was mapped.
Ewan Birney, Deputy Director General of EMBL and Director of EMBL-EBI
We believe this is the most significant contribution AI has made to the advancement of scientific knowledge to date, and an excellent example of the benefits AI can bring to humanity as a powerful tool that supports the efforts of researchers. This knowledge will underpin many exciting future advances in our understanding of biology and medicine. Thanks to five years of tireless work and a lot of ingenuity on the part of the AlphaFold team, and close collaboration in recent months with our partners at EMBL's European Bioinformatics Institute (EMBL-EBI), we are able to share this enormous and valuable resource with the world.
Proteins are exquisite biological machines, and their three-dimensional structures are often as beautiful as they are functional as the building blocks of life.
This latest work builds on the announcement we made last December at the CASP14 conference, where DeepMind unveiled a radically new version of our AlphaFold system, which the assessment's organizers recognized as a solution to the 50-year-old grand challenge of understanding the 3D structure of proteins. Determining protein structures experimentally is a time-consuming and painstaking task, but AlphaFold demonstrated that AI can accurately predict the shape of a protein, at scale and in minutes, down to atomic accuracy. At CASP, we committed to sharing our methods and providing broad access to this body of knowledge.
Improvements in median prediction accuracy in the Free Modeling category for the top team in each CASP, measured as best-of-5 GDT.
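For readers unfamiliar with the metric in the caption above, here is a rough sketch of how a GDT-style score can be computed. It is a simplification rather than the official CASP procedure: it assumes the predicted and experimental C-alpha coordinates are already superimposed, whereas the full metric searches over many superpositions. The function names `gdt_ts` and `best_of_five` are hypothetical and chosen for illustration only, and "best-of-5" is read here as taking the highest score among a team's submitted models for a target.

```python
import math

def gdt_ts(predicted, reference, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS for two already-superimposed C-alpha traces.

    For each distance cutoff (in angstroms), take the fraction of residues
    whose predicted position lies within that cutoff of the reference
    position, then average the fractions and scale to 0-100.
    Assumption: no superposition search is performed here.
    """
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("traces must be non-empty and of equal length")
    distances = [math.dist(p, r) for p, r in zip(predicted, reference)]
    fractions = [
        sum(d <= cutoff for d in distances) / len(distances)
        for cutoff in cutoffs
    ]
    return 100.0 * sum(fractions) / len(cutoffs)

def best_of_five(models, reference):
    """Score each submitted model (at most five) and keep the highest."""
    if not 1 <= len(models) <= 5:
        raise ValueError("expected between one and five models")
    return max(gdt_ts(model, reference) for model in models)

# Toy four-residue example with made-up coordinates.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
predicted = [(0.5, 0.0, 0.0), (2.5, 0.0, 0.0), (5.0, 0.0, 0.0), (12.0, 0.0, 0.0)]
print(gdt_ts(predicted, reference))  # 56.25
```

In the toy example the per-residue errors are 0.5, 1.5, 3.0 and 9.0 angstroms, so one, two, three and three of the four residues fall within the 1, 2, 4 and 8 angstrom cutoffs respectively, which averages to 56.25. A perfect prediction scores 100.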
This month we completed the enormous amount of hard work needed to deliver on this commitment. We have published two peer-reviewed papers in Nature (1, 2) and the open-source code for AlphaFold. Today, in partnership with EMBL-EBI, we are incredibly proud to launch the AlphaFold Protein Structure Database, which provides the most comprehensive and accurate picture of the human proteome to date, more than doubling humanity's accumulated knowledge of high-accuracy human protein structures.
In addition to the human proteome (all ~20,000 proteins expressed by the human genome), we offer open access to the proteomes of 20 other biologically important organisms, totaling more than 350,000 protein structures. Research on these organisms has been the subject of countless research papers and many major breakthroughs, and has resulted in a deeper understanding of life itself. In the coming months, we plan to significantly expand coverage to almost every sequenced protein known to science: more than 100 million structures covering most of the UniProt reference database. It is a veritable protein almanac of the world. The system and database will be updated periodically as we continue to invest in future improvements to AlphaFold.
Most excitingly, in the hands of scientists around the world, this new protein almanac will enable and accelerate research that advances our understanding of these building blocks of life. Already, through our early collaborations, we have seen promising signals from researchers using AlphaFold in their own work. For example, the Drugs for Neglected Diseases Initiative (DNDi) has advanced its research into life-saving cures for diseases that disproportionately affect the poorest regions of the world, and the Center for Enzyme Innovation (CEI) at the University of Portsmouth is using AlphaFold to help engineer faster enzymes for recycling some of our most polluting single-use plastics. For scientists who rely on experimental determination of protein structure, AlphaFold's predictions have helped accelerate their research. As another example, a team at the University of Colorado Boulder is finding promise in using AlphaFold's predictions to study antibiotic resistance, while a group at the University of California San Francisco has used them to deepen its understanding of the biology of SARS-CoV-2. And this is just the beginning of what we hope will be a revolution in structural bioinformatics. With AlphaFold out in the world, there is now a treasure trove of data waiting to be turned into future advances.
AlphaFold opens new research horizons and it is inspiring to see powerful cutting-edge AI enabling work on diseases that are concentrated almost exclusively in poor populations.
Ben Perry, Head of Open Innovation Discovery, Drugs for Neglected Diseases Initiative (DNDi)
For DeepMind's AlphaFold team, this work represents the culmination of five years of enormous effort, including creatively overcoming numerous setbacks, and it resulted in a host of sophisticated new algorithmic innovations that were all necessary to finally solve the problem. It builds on the discoveries of generations of scientists, from the early pioneers of protein imaging and crystallography to the thousands of prediction specialists and structural biologists who have since spent years experimenting with proteins. Our dream is that AlphaFold, by providing this fundamental understanding, will help countless other scientists in their work and open up entirely new avenues of scientific discovery.
What took us months and years, AlphaFold was able to do in a weekend.
Professor John McGeehan, Professor of Structural Biology and Center Director, Center for Enzyme Innovation (CEI) at the University of Portsmouth
At DeepMind, our thesis has always been that artificial intelligence can dramatically accelerate breakthroughs in many scientific fields, and thus advance humanity. We built AlphaFold and the AlphaFold Protein Structure Database to support and elevate the efforts of scientists around the world in the important work they do. We believe AI has the potential to revolutionize the way science is done in the 21st century, and we look forward to the discoveries AlphaFold could help the scientific community unlock next.
To learn more, visit Nature to read our peer-reviewed papers describing our complete method and the human proteome. You can learn more about them in our technical blog. If you want to explore our system, we have released the open-source code for AlphaFold and a Colab notebook for running individual sequences. To explore our structures, EMBL-EBI, the world leader in biological data, hosts them in a searchable database that is open and free to all.
We would love to hear your feedback and understand how AlphaFold has been helpful in your research. Share your stories at alphafold@deepmind.com.