According to Nature News, the ESMFold model developed by Meta recently predicted 617 million previously unknown protein structures from bacteria, viruses, and other microorganisms.
These protein structures are different from any known structures and do not appear in experimentally determined protein structure databases or the AlphaFold database.
The related paper has been posted on the bioRxiv preprint server.
Compared with DeepMind’s AlphaFold, the ESMFold model greatly improves prediction speed:
in just 2 weeks, it predicted over 600 million protein structures (for comparison, AlphaFold takes several minutes on average to generate a single prediction).
However, the accuracy of ESMFold’s predictions has been controversial.
In July this year, DeepMind released AlphaFold2 predictions of the 3D structures of 214 million known proteins.
About 35% of these predictions were highly accurate (comparable to laboratory-determined structures), and 80% were reliable enough to be used in subsequent studies.
By contrast, according to the Meta team’s paper, only about one-third of the 617 million predictions from ESMFold were considered “high confidence” predictions.
The researchers believe that these high-confidence predictions are consistent with the actual structures and can sometimes provide detailed features at the atomic scale.
Some scientists have also suggested that the hundreds of millions of low-confidence predictions provided by ESMFold are highly problematic.
Some of these predicted proteins may lack a well-defined structure, and others may come from non-coding DNA that was misclassified as protein-coding.