Protein structure-informed bacteriophage genome annotation with Phold
Files
(Published version)
Date
2026
Authors
Bouras, G.
Grigson, S.R.
Mirdita, M.
Heinzinger, M.
Papudeshi, B.
Mallawaarachchi, V.
Green, R.
Kim, R.S.
Mihalia, V.
Psaltis, A.J.
Type
Journal article
Citation
Nucleic Acids Research (NAR), 2026; 54(1):gkaf1448-1-gkaf1448-19
Statement of Responsibility
George Bouras, Susanna R Grigson, Milot Mirdita, Michael Heinzinger, Bhavya Papudeshi, Vijini Mallawaarachchi, Renee Green, Rachel Seongeun Kim, Victor Mihalia, Alkis James Psaltis, Peter-John Wormald, Sarah Vreugde, Martin Steinegger, Robert A Edwards
Abstract
Bacteriophage (phage) genome annotation is essential for understanding their functional potential and suitability for use as therapeutic agents. Here, we introduce Phold, an annotation framework utilizing protein structural information that combines the ProstT5 protein language model and structural alignment tool Foldseek. Phold assigns annotations using a database of over 1.36 million predicted phage protein structures with high-quality functional labels. Benchmarking reveals that Phold outperforms existing sequence-based homology approaches in functional annotation sensitivity whilst maintaining speed, consistency, and scalability. Applying Phold to diverse cultured and metagenomic phage genomes shows it consistently annotates over 50% of genes on an average phage and 40% on an average archaeal virus. Comparisons of phage protein structures to other protein structures across the tree of life reveal that phage proteins commonly have structural homology to proteins shared across the tree of life, particularly those that have nucleic acid metabolism and enzymatic functions. Phold is available as free and open-source software at https://github.com/gbouras13/phold.
Rights
© The Author(s) 2026. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.