Tag Archive for: Data Science and Population Health

Polygenic Score Catalog increases diversity and usability of genetic data

A paper highlighting changes to the Polygenic Score (PGS) Catalog for genetic disease risk predictions for individuals of diverse genetic backgrounds has been published in leading journal Nature Genetics.

In the work, which was supported by NIHR Cambridge BRC and other funders, researchers added data from multi-ancestry and non-European populations and introduced a new software tool to make calculating PGS easier and more reproducible.

This work was undertaken as part of the Cambridge Baker Systems Genomics Initiative, a research partnership between the Baker Heart and Diabetes Institute and Cambridge University to significantly expand access to big data and corresponding expertise to target approaches in disease prediction and personalised medicine.

Other collaborators involved in the work include the GWAS Catalog from the EMBL’s European Bioinformatics Institute (EMBL-EBI).

Data Science and Population Health Theme Lead Professor Mike Inouye – who is also Head of the Cambridge Baker Systems Genomics Initiative and Munz Chair of Cardiovascular Prediction and Prevention – said: “The PGS Catalog is the largest open database for polygenic scores with around 27,000 users from over 140 countries in the past year alone.

“These scores estimate an individual’s genetic predisposition to a specific trait or disease by summarising the effect of many different genetic variants across the genome.

“Polygenic scores are particularly useful for predicting complex health conditions such as heart disease, diabetes and certain cancers, where multiple genetic variants contribute to the overall risk.

“Integrating these scores into clinical practice could help scientists and clinicians understand the genetic influences on health, potentially leading to better prevention strategies and tailored treatments.”

About the PGS Catalog

The Polygenic Score (PGS) Catalog is an open-source resource created to make this genetic prediction method more accessible to the research community. It includes over 4500 scores predicting over 600 traits (including common diseases and lab measurements) and is heavily used in research and in industry with applications such as disease prediction. Launched in 2019, it has attracted 27,000 users from over 140 countries in the past year alone.

Because genetic data from populations of non-European ancestry were scarce, early releases of the PGS Catalog consisted mostly of scores developed using data from individuals of European ancestry. More PGSs have now been added from studies that used African, Asian and, often, multi-ancestry data to develop and evaluate them.

About the PGS Catalog Calculator

The PGS Catalog Calculator is a new addition to the PGS Catalog. This open-source software tool automates the process of calculating PGS, allowing users to apply polygenic scores to new genomic data, simplifying tasks such as genotype data formatting and variant matching.

The Calculator also implements methods for genetic similarity analysis and ancestry adjustment, an important step towards ensuring that calculated polygenic scores are more interpretable across populations. This could help to streamline the use of PGSs in research and clinical studies.
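
As a rough illustration of what calculating a polygenic score involves, the sketch below computes a score as a weighted sum of effect-allele counts after matching variants between a scoring file and one person's genotype. It is not the PGS Catalog Calculator's own code: the function name, variant identifiers, weights and genotype values are all hypothetical, and a real tool also has to deal with steps such as genotype formatting, allele and strand checks, and ancestry adjustment.

```python
# Minimal sketch of a polygenic score as a weighted sum (illustrative only).
# All variant IDs, weights and allele counts below are made-up example values.

def polygenic_score(weights, dosages):
    """Sum (effect weight x effect-allele count) over variants present in both inputs.

    weights: dict mapping variant ID -> effect weight from a scoring file
    dosages: dict mapping variant ID -> effect-allele count (0, 1 or 2) for one person
    Returns (score, matched_ids) so the caller can see which variants were used.
    """
    matched = sorted(set(weights) & set(dosages))  # naive variant matching by ID
    score = sum(weights[v] * dosages[v] for v in matched)
    return score, matched


scoring_file = {"rs0001": 0.5, "rs0002": -0.25, "rs0003": 0.125}  # hypothetical
genotype = {"rs0001": 2, "rs0002": 1, "rs0004": 0}                # hypothetical

score, used = polygenic_score(scoring_file, genotype)
print(score, used)
```

Returning the list of matched variants alongside the score makes it easy to check how much of a scoring file was actually covered by the genotype data, which matters when comparing scores between datasets.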

‘Gene misbehaviour’ widespread in healthy population

NIHR Cambridge BRC researchers have been involved in a major study published this week in the American Journal of Human Genetics that shows that gene misbehaviour – where genes are active when they should be switched off – is a common phenomenon in the healthy human population.

Data Science and Population Health Theme Lead Professor Michael Inouye, fellow researchers Adam Butterworth, Emanuele Di Angelantonio and John Danesh, and former colleague Dirk Paul at AstraZeneca took part in the research, which was led by the Wellcome Sanger Institute and was the first to study the activity of inactive genes in a large, healthy population.

While gene misexpression has previously been linked to several rare diseases, it was not known how often or why it happens in the general population. This study showed that misexpression is widespread across samples and involves more than half of the genes that should be inactive.

The surprising finding sheds new light on how our genetic code operates – and the approach could be used in future research to investigate, diagnose and develop treatments for various complex diseases caused by misexpression.

Study author Dr Katie Burnham at the Wellcome Sanger Institute said:

“The work of this pioneering large-scale study is testament to the incredible ‘genomics ecosystem’ in Cambridge that brought together experts from the Sanger Institute, the University of Cambridge and AstraZeneca.

“The findings open avenues for research into gene misexpression across different tissues, to understand its role in various diseases and potential treatments.”

In this study, researchers analysed blood samples from 4,568 healthy individuals from the INTERVAL study. They used advanced RNA sequencing techniques to measure gene activity and whole genome sequencing to identify genetic changes behind irregular gene activity.

Dr. Anne Forde, Patient and Public Involvement and Engagement Manager at the Cardiovascular Epidemiology Unit, University of Cambridge, said: “This research was based on the 50,000 population cohort recruited for INTERVAL on blood donation frequency, and we are grateful to the NIHR and the Cambridge BRC for your ongoing support and collaboration.”

  • ‘Misexpression of inactive genes in whole blood is associated with nearby rare structural variants’, T. Vanderstichele et al. (2024), American Journal of Human Genetics.

 

Prestigious NIHR Research Professorship awarded to Cambridge researcher

Congratulations to co-theme lead for the NIHR Cambridge BRC Data Science and Population Health theme, Professor Angela Wood, who has been awarded an NIHR Research Professorship.

Professor Angela Wood, Professor of Health Data Science, whose research focuses on the use of population health data for the prevention of numerous chronic diseases, is one of six leading researchers in England who have been given this prestigious award.

The NIHR Research Professorships scheme funds and supports research leaders of the future. It aims to strengthen and benefit health, public health and care research leadership.

The researchers will receive five-year awards of up to £2 million as well as a package of extensive support, including three support posts and access to a leadership and development programme.

Professor Wood said: “I’m really delighted and honoured to receive the NIHR Research Professorship Award. My goal is to advance the prevention of multiple chronic diseases by harnessing population-wide health data. I will develop new methods to identify who will benefit the most from specific interventions and when is the best time to start them.” 

Since 2011, 66 people have been successful in gaining the competitive award. Many have gone on to become senior research leaders. This includes Professor Lucy Chappell, Chief Scientific Adviser to the Department of Health and Social Care and CEO of the NIHR.

Professor Waljit Dhillo, Dean of the NIHR Academy and Scientific Director for Research Capacity and Capabilities, said: “I am delighted and honoured to welcome the latest group of outstanding researchers to the NIHR Research Professorship scheme. I look forward to seeing the difference their research will make to the lives of people and communities across the UK.

“The NIHR Research Professorship is one of the most prestigious awards we offer. Their expertise in health and care research will help improve people’s health and wellbeing.”

NIHR Research Professors

The five other researchers to receive a 2023 NIHR research professorship are:

  • Professor Samuele Cortese, University of Southampton. Their research topic is: Personalising the pharmacological treatment of Attention Deficit Hyperactivity Disorder (ADHD) in children
  • Professor Daniela Ferreira, Professor of Vaccinology, University of Oxford. Their research topic is: Understanding nasal immunity to improve vaccine protection against respiratory infections
  • Professor Shonit Punwani, University College London. Their research topic is: Smarter identification and management of Early prostate cancer: improving Lives and outcomEs through Clinical Translation of novel magnetic resonance imaging (SELeCT)
  • Professor Matthew Ridd, University of Bristol. Their research topic is: Transforming Outcomes for Paediatric allergy in primary care (TOPIC)
  • Professor Reecha Sofat, University of Liverpool. Their research topic is: CAUsal Inference Methods to Inform MedicineS ReguLation and Guidance: CAUSAL

Cambridge researchers listed among world’s most influential researchers

Congratulations to our NIHR Cambridge BRC theme leads who have been named in the Clarivate listings of the ‘world’s most influential researchers’.

The researchers were selected for their exceptional research influence and highly cited research papers that rank in the top 1% by citations on the global database, Web of Science, over the last decade.

The full list of Cambridge researchers can be found on the Clarivate website.

Name | Title | Associated theme
Professor Ed Bullmore | Honorary Consultant Psychiatrist and Head of the Department of Psychiatry | Mental Health
Professor John Danesh | Professor of Epidemiology | Population and quantitative sciences
Professor Ravindra K. Gupta | Professor of Clinical Microbiology | Antimicrobial resistance
Professor David Rowitch | Developmental neuroscientist and Head of Department of Paediatrics | Women’s health and paediatrics

World first for AI and machine learning to treat Covid patients worldwide

In a groundbreaking study, Addenbrooke’s Hospital in Cambridge along with 20 other hospitals from across the world and healthcare technology leader, NVIDIA, have used artificial intelligence (AI) to predict Covid patients’ oxygen needs on a global scale.

The research was sparked by the pandemic and set out to build an AI tool to predict how much extra oxygen a Covid-19 patient may need in the first days of hospital care, using data from across four continents. 

The technique, known as federated learning, used an algorithm to analyse chest x-rays and electronic health data from hospital patients with Covid symptoms. 

To maintain strict patient confidentiality, the patient data was fully anonymised and an algorithm was sent to each hospital so no data was shared or left its location.  

Once the algorithm had ‘learned’ from the data, the analysis was brought together to build an AI tool which could predict the oxygen needs of hospital Covid patients anywhere in the world.
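
The general idea of federated learning described above can be sketched in a few lines. The example below uses federated averaging, a common approach in which each site trains on its own data and only the resulting model weights are combined centrally. This is an assumption for illustration: the article does not say how EXAM combined results from each hospital, and the one-parameter model, function names and "hospital" datasets here are synthetic and hypothetical.

```python
# Minimal sketch of federated averaging with a one-parameter linear model y = w * x.
# The site datasets below are synthetic; no real patient data is involved.

def local_update(w, data, lr=0.01, epochs=20):
    """Run a few gradient-descent steps on one site's private data; return the new weight."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_round(w_global, sites):
    """Each site trains locally; only the weights (never the data) are averaged centrally."""
    updates = [(local_update(w_global, data), len(data)) for data in sites]
    total = sum(n for _, n in updates)
    return sum(w * n for w, n in updates) / total  # average weighted by site size


# Three hypothetical sites whose data all follow y = 3x.
sites = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(0.5, 1.5), (1.5, 4.5), (2.5, 7.5)],
    [(3.0, 9.0)],
]

w = 0.0
for _ in range(50):
    w = federated_round(w, sites)
print(round(w, 3))
```

The raw `(x, y)` pairs never leave `local_update`; the coordinating step in `federated_round` sees only one number per site plus the site's sample count, which mirrors the point that no data was shared or left its location.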

Published today (15 September) in Nature Medicine, the study, dubbed EXAM (for EMR CXR AI Model), is one of the largest, most diverse clinical federated learning studies to date.

To check the accuracy of EXAM, it was tested out in a number of hospitals across five continents, including Addenbrooke’s Hospital.  The results showed it predicted the oxygen needed within 24 hours of a patient’s arrival in the emergency department, with a sensitivity of 95 per cent and a specificity of over 88 per cent. 
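
For readers unfamiliar with the two accuracy measures quoted above, the sketch below shows how sensitivity and specificity are calculated from counts of correct and incorrect predictions. The counts are invented purely to demonstrate the arithmetic and are not taken from the EXAM study; the function name is also hypothetical.

```python
# Sensitivity and specificity from confusion-matrix counts (illustrative only).
# The counts passed in below are made up; they are NOT results from the study.

def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity).

    tp: patients who needed oxygen and were predicted to need it
    fn: patients who needed oxygen but were predicted not to
    tn: patients who did not need oxygen and were predicted not to
    fp: patients who did not need oxygen but were predicted to
    """
    sensitivity = tp / (tp + fn)  # share of true cases the model catches
    specificity = tn / (tn + fp)  # share of non-cases the model correctly rules out
    return sensitivity, specificity


sens, spec = sensitivity_specificity(tp=19, fn=1, tn=44, fp=6)  # hypothetical counts
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

A high sensitivity means few patients who go on to need oxygen are missed, while a high specificity means few patients are flagged unnecessarily.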

“Federated learning has transformative power to bring AI innovation to the clinical workflow,” said Professor Fiona Gilbert, NIHR Cambridge BRC Imaging theme lead, who led the study in Cambridge and is honorary consultant radiologist at Addenbrooke’s Hospital and chair of radiology at the University of Cambridge School of Clinical Medicine.

“Our continued work with EXAM demonstrates that these kinds of global collaborations are repeatable and more efficient, so that we can meet clinicians’ needs to tackle complex health challenges and future epidemics.”

First author on the study, Dr Ittai Dayan, from Mass General Brigham in the US, where the EXAM algorithm was developed, said: “Usually in AI development, when you create an algorithm on one hospital’s data, it doesn’t work well at any other hospital. By developing the EXAM model using federated learning and objective, multimodal data from different continents, we were able to build a generalizable model that can help frontline physicians worldwide.”

Bringing together collaborators across North and South America, Europe and Asia, the EXAM study took just two weeks of AI ‘learning’  to achieve high-quality predictions.

“Federated Learning allowed researchers to collaborate and set a new standard for what we can do globally, using the power of AI,” said Dr. Mona G. Flores, Global Head for Medical AI at NVIDIA. “This will advance AI not just for healthcare but across all industries looking to build robust models without sacrificing privacy.”

The outcomes of around 10,000 Covid patients from across the world were analysed in the study, including 250 who came to Addenbrooke’s Hospital in the first wave of the pandemic in March/April 2020. 

The research was supported by the National Institute for Health Research (NIHR) Cambridge Biomedical Research Centre (BRC). 

Work on the EXAM model has continued. Mass General Brigham and the NIHR Cambridge BRC are working with NVIDIA Inception startup Rhino Health, cofounded by Dr. Dayan, to run prospective studies using EXAM. 

Professor Gilbert added: “Creating software to match the performance of our best radiologists is complex, but a truly transformative aspiration. The more we can securely integrate data from different sources using federated learning and collaboration, and have the space needed to innovate, the faster academics can make those transformative goals a reality.”

© Copyright - NIHR Cambridge Biomedical Research Centre 2025