A team of researchers from the University of Washington has developed LigandMPNN, a deep learning-based method for protein sequence design. The model targets the design of enzymes and of small-molecule binders and sensors. Existing physics-based approaches like Rosetta and deep learning-based models like ProteinMPNN cannot explicitly model non-protein atoms and molecules, which hinders the precise design of protein sequences that interact with small molecules, nucleotides, and metals.
Explicitly accounting for non-protein atoms and molecules is crucial for designing enzymes, protein-DNA/RNA interactions, and protein-small molecule and protein-metal binders. The proposed solution, LigandMPNN, builds on the ProteinMPNN architecture but explicitly integrates the full non-protein atomic context. It introduces protein-ligand graphs, using neural networks to model interactions and encode the geometry of ligand atoms. This modification allows LigandMPNN to generate sequences and side-chain conformations tailored to a specific non-protein context.
LigandMPNN uses a graph-based approach, treating protein residues as nodes and adding nearest-neighbor edges based on Cα-Cα distances. The model introduces protein-ligand graphs to capture interactions, with protein residues and ligand atoms as nodes and edges representing geometric relationships. The ligand graph passes information to the protein through ligand-protein edges.
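The graph construction described above can be illustrated with a minimal sketch. This is not the LigandMPNN implementation: the function names, the toy coordinates, and the neighbor counts `k` and `m` are assumptions chosen for illustration, and the real model attaches learned geometric features to each edge rather than a single distance.

```python
import math

def knn_edges(ca_coords, k):
    """Directed residue-residue edges (i, j), where j is one of the
    k residues closest to residue i by C-alpha to C-alpha distance."""
    edges = []
    for i, ci in enumerate(ca_coords):
        by_distance = sorted(
            (math.dist(ci, cj), j) for j, cj in enumerate(ca_coords) if j != i
        )
        edges.extend((i, j) for _, j in by_distance[:k])
    return edges

def ligand_edges(ca_coords, ligand_coords, m):
    """Residue-ligand edges (i, atom_index, distance) linking each residue
    to its m nearest ligand atoms (hypothetical selection rule)."""
    edges = []
    for i, ci in enumerate(ca_coords):
        by_distance = sorted(
            (math.dist(ci, atom), j) for j, atom in enumerate(ligand_coords)
        )
        edges.extend((i, j, d) for d, j in by_distance[:m])
    return edges

# Toy input: four residues on a line (about 3.8 Å apart) and two ligand atoms.
ca = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
lig = [(3.8, 2.0, 0.0), (20.0, 0.0, 0.0)]

protein_edges = knn_edges(ca, k=2)          # k is an illustrative value
contact_edges = ligand_edges(ca, lig, m=1)  # m is an illustrative value
```

In this toy setup every residue connects to its two closest residues and to the single closest ligand atom, so a message-passing network would have a path for ligand geometry to reach each residue.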
The experiments showed that LigandMPNN and its side-chain packing outperform Rosetta and ProteinMPNN, with sequence recovery for residues interacting with small molecules, nucleotides, and metals reported to be roughly 20-30% higher, which demonstrates its effectiveness in detailed structural design. LigandMPNN also beats existing methods in speed and efficiency: it is approximately 250 times faster than Rosetta.
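Sequence recovery, the metric behind this comparison, is the fraction of positions where the designed residue matches the native one. The sketch below computes it over all positions and over a subset of ligand-contacting positions. The sequences and the `near_ligand` mask are made-up toy data, and how contacting residues are selected is an assumption here, since the article does not state the criterion.

```python
def sequence_recovery(native, designed, mask=None):
    """Fraction of positions where designed matches native.
    If mask is given, only positions with a truthy mask value are counted."""
    if len(native) != len(designed):
        raise ValueError("sequences must have the same length")
    if mask is None:
        positions = list(range(len(native)))
    else:
        positions = [i for i, keep in enumerate(mask) if keep]
    if not positions:
        raise ValueError("no positions selected")
    matches = sum(native[i] == designed[i] for i in positions)
    return matches / len(positions)

# Toy data (hypothetical): two mismatches, both inside the "pocket".
native = "MKTAYIAKQR"
designed = "MKSAYLAKQR"
near_ligand = [False, False, True, True, True, True, False, False, False, False]

overall = sequence_recovery(native, designed)              # 8 of 10 match
pocket = sequence_recovery(native, designed, near_ligand)  # 2 of 4 match
```

Restricting the metric to ligand-contacting positions matters because a design can score well overall while still getting the binding site wrong, as the toy example shows.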
In conclusion, LigandMPNN fills a critical gap in existing protein sequence design methods by explicitly including non-protein atoms and molecules. Its graph-based approach yields a notable performance improvement, with higher sequence recovery and better side-chain packing accuracy around small molecules, nucleotides, and metals. LigandMPNN has also performed well in designing small-molecule-binding and DNA-binding proteins with high affinity and specificity, which should greatly facilitate protein engineering.
Check out the Paper. All credit for this research goes to the researchers of this project.
Pragati Jhunjhunwala is a consulting intern at MarktechPost. She is currently pursuing her B.Tech at the Indian Institute of Technology (IIT), Kharagpur. She is passionate about technology and has a keen interest in software applications and data science. She regularly reads about developments in different areas of AI and ML.