Development of a comprehensive cheese metagenome catalogue reveals potential markers of origin and quality

Date:

Abstract: Cheese microbiomes carry distinctive signatures that reflect manufacturing practices, geographical origin, and potential health benefits. Yet their diversity and their functional links to terroir and bioactive potential remain underexplored. Raw reads from high-throughput shotgun metagenomic sequencing were merged and analysed from three sources: curatedFoodMetagenomicData (n=1,044), publicly available data (n=233), and newly sequenced Protected Designation of Origin and Protected Geographical Indication cheeses (n=316). Metadata were collected and standardised (N_metadata=18) to harmonise the heterogeneous information available about the cheeses. The result is a comprehensive, metadata-curated catalogue of cheese microbiomes comprising 1,593 shotgun metagenomic samples. α- and β-diversity varied with the metadata category considered, as did the distribution of microbes and of key functional genes associated with flavour, quality, and health benefits. Explainable machine learning models classified cheese microbiomes by metadata category and suggested potential biomarkers. Phylogenetic analyses were used to investigate strain-level microbial diversity, while functional phylogenomic approaches linked microbial taxa to functional cheese patterns. In summary, our analysis yielded >4,000 metagenome-assembled genomes, including previously undescribed taxa, expanding existing cheese microbiome repositories by approximately 60%. This curated catalogue advances our understanding of cheese microbial ecology, offers new insights into microbial contributions to cheese terroir and safety, and supports future applications in precision fermentation and health-oriented food innovation.