Polygenic Score Catalog increases diversity and usability of genetic data
A paper describing changes to the Polygenic Score (PGS) Catalog that support genetic disease risk prediction for individuals of diverse genetic backgrounds has been published in the leading journal Nature Genetics.
In the work, which was supported by NIHR Cambridge BRC and other funders, researchers added data from multi-ancestry and non-European populations and introduced a new software tool to make calculating PGS easier and more reproducible.
This work was undertaken as part of the Cambridge Baker Systems Genomics Initiative, a research partnership between the Baker Heart and Diabetes Institute and Cambridge University to significantly expand access to big data and corresponding expertise to target approaches in disease prediction and personalised medicine.
Other collaborators involved in the work include the GWAS Catalog from EMBL’s European Bioinformatics Institute (EMBL-EBI).
Data Science and Population Health Theme Lead Professor Mike Inouye – who is also Head of the Cambridge Baker Systems Genomics Initiative and Munz Chair of Cardiovascular Prediction and Prevention – said: “The PGS Catalog is the largest open database for polygenic scores with around 27,000 users from over 140 countries in the past year alone.
“These scores estimate an individual’s genetic predisposition to a specific trait or disease by summarising the effect of many different genetic variants across the genome.
“Polygenic scores are particularly useful for predicting complex health conditions such as heart disease, diabetes and certain cancers, where multiple genetic variants contribute to the overall risk.
“Integrating these scores into clinical practice could help scientists and clinicians understand the genetic influences on health, potentially leading to better prevention strategies and tailored treatments.”
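The idea of "summarising the effect of many different genetic variants" can be illustrated with a small weighted sum. The sketch below is illustrative only and is not code from the PGS Catalog: the function name `polygenic_score`, the variant identifiers and the effect weights are all hypothetical, and it assumes each person's genotype is expressed as an allele dosage (0, 1 or 2 copies of the effect allele) per variant.

```python
def polygenic_score(weights, dosages):
    """Sum effect weight x allele dosage over variants found in both inputs.

    weights: dict mapping variant ID -> effect weight (hypothetical values)
    dosages: dict mapping variant ID -> copies of the effect allele (0, 1 or 2)
    Returns the score and the number of variants that could be matched.
    """
    score = 0.0
    matched = 0
    for variant_id, weight in weights.items():
        dosage = dosages.get(variant_id)
        if dosage is None:
            # Variant in the score file but absent from the genotype data: skip it.
            continue
        score += weight * dosage
        matched += 1
    return score, matched


# Made-up example: three scored variants, two of which appear in the genotypes.
weights = {"variant_a": 0.5, "variant_b": -0.25, "variant_c": 1.0}
dosages = {"variant_a": 2, "variant_b": 1, "variant_d": 0}

score, matched = polygenic_score(weights, dosages)
print(score, matched)
```

Returning the number of matched variants alongside the score is a deliberate choice in this sketch: a score built from only a fraction of its intended variants may not be comparable with one built from all of them.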
About the PGS Catalog
The Polygenic Score (PGS) Catalog is an open-source resource created to make this genetic prediction method more accessible to the research community. It includes over 4,500 scores predicting over 600 traits (including common diseases and lab measurements) and is heavily used in research and in industry, with applications such as disease prediction. Launched in 2019, it has attracted around 27,000 users from over 140 countries in the past year alone.
Owing to a lack of genetic data from populations of non-European ancestry, early releases of the PGS Catalog mostly consisted of scores developed using data from individuals of European ancestry. More PGS have now been added from studies that used African, Asian and, often, multi-ancestry data to develop and evaluate the scores.
About the PGS Catalog Calculator
The PGS Catalog Calculator is a new addition to the PGS Catalog. This open-source software tool automates the process of calculating PGS, so that users can apply polygenic scores to new genomic data, and it simplifies tasks such as genotype data formatting and variant matching.
The Calculator also implements methods for genetic similarity analysis and ancestry adjustment, an important step towards ensuring that calculated polygenic scores are more interpretable across populations. This could help to streamline the use of PGSs in research and clinical studies.
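To give a sense of why adjustment matters, the sketch below shows one simple way a raw score could be placed on a comparable scale: expressing it as a number of standard deviations from the mean of a reference group with similar genetic ancestry. This is an assumption made for illustration and should not be read as the Calculator's actual method; the function name `standardise_against_reference` and all numbers are hypothetical.

```python
from statistics import mean, stdev


def standardise_against_reference(raw_score, reference_scores):
    """Express a raw score in standard deviations from a reference group's mean.

    reference_scores: raw scores from a (hypothetical) reference group that is
    genetically similar to the individual being scored; needs at least two values.
    """
    mu = mean(reference_scores)
    sigma = stdev(reference_scores)  # sample standard deviation
    if sigma == 0:
        raise ValueError("reference scores have no spread; cannot standardise")
    return (raw_score - mu) / sigma


# Made-up numbers: the same raw score reads differently against two reference groups.
group_one = [1.0, 2.0, 3.0]
group_two = [3.0, 4.0, 5.0]

z_one = standardise_against_reference(3.0, group_one)
z_two = standardise_against_reference(3.0, group_two)
print(z_one, z_two)
```

In this toy example the same raw score sits above the mean of one group and below the mean of the other, which is why a raw value on its own can be hard to interpret across populations.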