Clinical potential of whole-genome data linked to mortality statistics in patients with breast cancer in the UK: a retrospective analysis

October 8, 2025

Publication: The Lancet Oncology

07 October 2025

Daniella Black, Helen Ruth Davies, Gene Ching Chiek Koh, Lucia Chmelova, Marko Cubric, Georgia Chalivelaki Chan, Andrea Degasperi, Jan Czarnecki, Ping Jing Toong, Yasin Memari, James Whitworth, Salome Jingchen Zhao, Yogesh Kumar, Shadi Basyuni, Giuseppe Rinaldi, Scott Shooter, Vladyslav Dembrovskyi, Rosie Davies, Maria Chatzou Dunford, Ellen Copson, Carlo Palmieri, Åke Borg, John Ambrose, Catey Bunce, Alona Sosinsky, Prabhu Arumugam, Matthew Arthur Brown, Johan Staaf, Nicholas Turner, Serena Nik-Zainal

Background

Breast cancer is the most frequently diagnosed cancer in women. Survival is generally considered favourable, yet some patients remain at risk of early death. We aimed to assess whether comprehensive whole-genome sequencing (WGS) linked to mortality data could add prognostic value to existing clinical measures and identify patients who might respond to targeted therapeutics.

Methods

In this integrative, retrospective analysis, 2445 breast cancer tumours (any stage and molecular subtype) were analysed, collected from 2403 patients recruited through 13 National Health Service Genomic Medicine Centres or hospitals in England affiliated to the 100 000 Genomes Project (100kGP) between 2012 and 2018. 2208 (90%) cases were linked with clinical data; mortality data were obtained for 1188 patients. Following high-depth WGS of tumour and matched normal DNA, comprehensive WGS profiling was performed, seeking driver mutations, mutational signatures, and compound algorithmic scores for homologous recombination repair deficiency (HRD), mismatch repair deficiency, and tumour mutational burden. Data from 1803 additional patients with breast cancer from three independent cohorts were used to validate various findings. To evaluate the prognostic value of WGS features, univariable and multivariable Cox regression was performed on data from patients with stage I–III, ER-positive, HER2-negative breast cancer, with a cancer-specific mortality endpoint (around 5-year follow-up).

Findings

Among 2445 tumours in the 100kGP breast cancer cohort, genomic characteristics with immediate personalised medicine potential were observed in 656 (26·8%), including features reporting HRD (298 [12·2%] total cases and 76 [6·3%] ER-positive, HER2-negative cases), highly individualised driver events, mutations underpinning resistance to endocrine therapy, and mutational signatures indicating therapeutic vulnerabilities. 373 (15·2%) cases had WGS features with potential for translational research, including compromised base excision repair and non-homologous end-joining dependency. Structural variation burden (hazard ratio 3·9 [95% CI 2·4–6·2]; p<0·0001), high levels of APOBEC signatures (2·5 [1·6–4·1]; p<0·0001), and TP53 drivers (3·9 [2·4–6·2]; p<0·0001) were prognostic independently of customary clinical measures (age at diagnosis, stage, and grade) in patients with ER-positive, HER2-negative breast cancer. A prognosticator was developed for ER-positive, HER2-negative breast cancer capable of identifying patients who require either increased intervention or therapy de-escalation, and the framework was validated in the independent Swedish Cancerome Analysis Network-Breast (SCAN-B) dataset.
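As a reading aid for the hazard ratios quoted above, the following minimal sketch shows how a hazard ratio and its 95% confidence interval relate to a Cox regression coefficient: the hazard ratio is the exponential of the coefficient, and the interval is computed on the log scale. It is not the authors' analysis code. The function names are hypothetical, and the standard error is back-calculated from the rounded published interval, so the recovered bounds only approximately match the reported ones.

```python
import math

def hazard_ratio_with_ci(beta, se, z=1.96):
    """Return (HR, lower, upper) for a Cox coefficient beta with standard error se."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

def se_from_ci(lower, upper, z=1.96):
    """Approximate the standard error of log(HR) from a reported 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * z)

# Reported for structural variation burden: HR 3.9 (95% CI 2.4 to 6.2).
# The values below are reconstructed from those rounded figures (an assumption).
beta = math.log(3.9)
se = se_from_ci(2.4, 6.2)
hr, lower, upper = hazard_ratio_with_ci(beta, se)
```

Because the interval is symmetric on the log scale rather than the linear scale, the upper bound lies further from the point estimate than the lower bound does.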

Interpretation

Breast cancer genomes are rich in predictive and prognostic value. A two-step model is proposed for effective clinical application. First, the identification of candidates for targeted therapies or clinical trials using highly individualised genomic markers. Second, for patients without such features, the implementation of enhanced prognostication using genomic features alongside existing clinical decision-making factors.

View Publication

The long-term effects of chemotherapy on normal blood cells

July 11, 2025

Publication: Nature Genetics

10 July 2025

Emily Mitchell, My H. Pham, Anna Clay, Rashesh Sanghvi, Nicholas Williams, Sandra Pietsch, Joanne I. Hsu, Hyunchul Jung, Aditi Vedi, Sarah Moody, Jingwei Wang, Daniel Leonganmornlert, Michael Spencer Chapman, Ellie Dunstone, Anna Santarsieri, Alex Cagan, Heather E. Machado, E. Joanna Baxter, George Follows, Daniel J. Hodson, Ultan McDermott, Gary J. Doherty, Inigo Martincorena, Laura Humphreys, Krishnaa Mahbubani, Kourosh Saeb Parsy, Koichi Takahashi, Margaret A. Goodell, David Kent, Elisa Laurenti, Peter J. Campbell, Raheleh Rahbari, Jyoti Nangalia & Michael R. Stratton.

Abstract

Several chemotherapeutic agents act by increasing DNA damage in cancer cells, triggering cell death. However, there is limited understanding of the extent and long-term consequences of collateral DNA damage in normal tissues. To investigate the impact of chemotherapy on mutation burdens and the cell population structure of normal tissue, we sequenced blood cell genomes from 23 individuals aged 3–80 years who were treated with a range of chemotherapy regimens. Substantial additional somatic mutation loads with characteristic mutational signatures were imposed by some chemotherapeutic agents, but the effects were dependent on the drug and blood cell types. Chemotherapy induced premature changes in the cell population structure of normal blood, similar to those caused by normal aging. The results show the long-term biological consequences of cytotoxic agents to which a substantial fraction of the population is exposed as part of disease management, raising mechanistic questions and highlighting opportunities for the mitigation of adverse effects.

View Publication

A redefined indel taxonomy reveals insights into mutational signatures

April 16, 2025

Publication: Nature Genetics

Gene Ching Chiek Koh, Arjun Scott Nanda, Giuseppe Rinaldi, Soraya Boushaki, Andrea Degasperi, Cherif Badja, Andrew Marcel Pregnall, Salome Jingchen Zhao, Lucia Chmelova, Daniella Black, Laura Heskin, João Dias, Jamie Young, Yasin Memari, Scott Shooter, Jan Czarnecki, Matthew Arthur Brown, Helen Ruth Davies, Xueqing Zou & Serena Nik-Zainal

10 April 2025

In cancer genetics, small insertions and deletions (InDels) have been studied far less than substitutions, although both types of mutation contribute to cancer. Researchers created genetically matched ‘CRISPR-edited’ human cell models in which the repair of DNA replication errors was impaired, by editing mismatch repair genes and replicative enzymes. These models left distinctive InDel patterns, but the existing classification scheme could not tell these patterns apart from more general background mutations.

To address this, a new InDel classification system was developed that takes account of the surrounding genetic sequence and informative features such as very long runs of repeated bases, allowing InDels to be classified into 89 subtypes. Using data from the 100,000 Genomes Project, 37 InDel signatures were found, of which 27 were new. In addition, a classifier called PRRDetect was developed to identify tumours with this kind of repair defect, which may have implications for immunotherapy, a way of treating cancerous tumours.

View Publication

Identification of plasma proteomic markers underlying polygenic risk of type 2 diabetes and related comorbidities

March 4, 2025

Publication: Nature

Douglas P. Loesch, Manik Garg, Dorota Matelska, Dimitrios Vitsios, Xiao Jiang, Scott C. Ritchie, Benjamin B Sun, Heiko Runz, Christopher D. Whelan, Ruey R. Holman, Robert J. Mentz, Filipe A. Moura, Stephen D. Wiviott, Marc S Sabatine, Miriam S Udler, Ingrid A. Gause-Nilsson, Slavé Petrovski, Jan Oscarsson, Abhishek Nag, Dirk S. Paul & Michael Inouye.

03 March 2025

Genomics can provide insight into the etiology of type 2 diabetes and its comorbidities, but assigning functionality to non-coding variants remains challenging. Polygenic scores, which aggregate variant effects, can uncover mechanisms when paired with molecular data. Here, we test polygenic scores for type 2 diabetes and cardiometabolic comorbidities for associations with 2,922 circulating proteins in the UK Biobank. The genome-wide type 2 diabetes polygenic score associates with 617 proteins, of which 75% also associate with another cardiometabolic score. Partitioned type 2 diabetes scores, which capture distinct disease biology, associate with 342 proteins (20% unique). In this work, we identify key pathways (e.g., complement cascade), potential therapeutic targets (e.g., FAM3D in type 2 diabetes), and biomarkers of diabetic comorbidities (e.g., EFEMP1 and IGFBP2) through causal inference, pathway enrichment, and Cox regression of clinical trial outcomes. Our results are available via an interactive portal (https://public.cgr.astrazeneca.com/t2d-pgs/v1/).
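To illustrate what "aggregating variant effects" means, a polygenic score can be sketched as a weighted sum of effect-allele dosages, and a partitioned score as the same sum restricted to a subset of variants. This is a minimal sketch only: the variant IDs, weights, dosages, and cluster names are hypothetical and are not taken from the study.

```python
from typing import Mapping

def polygenic_score(dosages: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Weighted sum of effect-allele dosages over variants present in both maps."""
    return sum(weights[v] * dosages[v] for v in weights if v in dosages)

# Hypothetical per-variant effect sizes (for example, log odds ratios).
weights = {"rs_a": 0.12, "rs_b": -0.05, "rs_c": 0.30}

# Hypothetical effect-allele dosages (0, 1, or 2 copies) for one person.
person = {"rs_a": 2, "rs_b": 1, "rs_c": 0}

# Genome-wide score: uses every weighted variant.
genome_wide = polygenic_score(person, weights)

# Partitioned scores: the same calculation restricted to variant clusters
# (the cluster assignment here is invented for illustration).
clusters = {"cluster_1": ["rs_a", "rs_c"], "cluster_2": ["rs_b"]}
partitioned = {
    name: polygenic_score(person, {v: weights[v] for v in variants})
    for name, variants in clusters.items()
}
```

In practice such scores are usually standardised across a cohort before being tested for association with protein levels.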

View Publication

Genome-wide characterization of circulating metabolic biomarkers

March 7, 2024

Publication: Nature

Minna K. Karjalainen, Savita Karthikeyan, Clare Oliver-Williams, Eeva Sliz, Elias Allara, Wing Tung Fung, Praveen Surendran, Weihua Zhang, Pekka Jousilahti, Kati Kristiansson, Veikko Salomaa, Matt Goodwin, David A. Hughes, Michael Boehnke, Lilian Fernandes Silva, Xianyong Yin, Anubha Mahajan, Matt J. Neville, Natalie R. van Zuydam, Renée de Mutsert, Ruifang Li-Gao, Dennis O. Mook-Kanamori, Ayse Demirkan, Jun Liu, China Kadoorie Biobank Collaborative Group, Estonian Biobank Research Team, FinnGen, …Johannes Kettunen

6 March 2024

Summary

Genome-wide association analyses using high-throughput metabolomics platforms have led to novel insights into the biology of human metabolism. This detailed knowledge of the genetic determinants of systemic metabolism has been pivotal for uncovering how genetic pathways influence biological mechanisms and complex diseases. Researchers present a genome-wide association study for 233 circulating metabolic traits quantified by nuclear magnetic resonance spectroscopy in up to 136,016 participants from 33 cohorts.

View publication

Genetics of circulating inflammatory proteins identifies drivers of immune-mediated disease risk and therapeutic targets

July 18, 2023

Publication: Nature Immunology

Jing Hua Zhao, David Stacey, Niclas Eriksson, Erin Macdonald-Dunlop, Asa K Hedman et al

18 July 2023

Aberrant inflammatory responses play a role in pathogenesis of many diseases, including autoimmune conditions, cardiovascular diseases and cancers. In this study of genetic influences on inflammation-related proteins, an international team conducted a genome-wide association study of 91 plasma proteins in ~15,000 participants within the SCALLOP Consortium.

Having identified 180 gene-protein associations, they integrated these with gene expression and disease genetics data to provide insights into disease aetiology, implicating FGF5 in hypertension and cardiovascular disease, and lymphotoxin-α in multiple sclerosis.

The team identified both shared and distinct effects of specific proteins across immune-mediated diseases, including directionally discordant functions for CD40 in rheumatoid arthritis versus multiple sclerosis and inflammatory bowel disease, and a role for CXCL5 in the aetiology of ulcerative colitis (UC) but not Crohn's disease.

These results provide a powerful resource to understand the role of chronic inflammation in a wide range of diseases and facilitate future drug target prioritisation.

View publication

Substantial somatic genomic variation and selection for BCOR mutations in human induced pluripotent stem cells

August 12, 2022

Publication: Nature Genetics

Foad J. Rouhani, Xueqing Zou, Petr Danecek, Cherif Badja, Tauanne Dias Amarante, Gene Koh, Qianxin Wu, Yasin Memari, Richard Durbin, Inigo Martincorena, Andrew R. Bassett, Daniel Gaffney & Serena Nik-Zainal

11 August 2022

Summary

DNA damage caused by factors such as ultraviolet radiation affects nearly three-quarters of all stem cell lines derived from human skin cells, say Cambridge researchers, who argue that whole genome sequencing is essential for confirming whether cell lines are usable. Read the full news story.

View publication

Refinements and considerations for trio whole-genome sequence analysis when investigating Mendelian diseases presenting in early childhood

May 26, 2022

Publication: HGG Advances

Courtney E. French, Helen Dolling, Karyn Mégy, Alba Sanchis-Juan, Ajay Kumar, Isabelle Delon, Matthew Wakeling, Lucy Mallin, Shruti Agrawal, Topun Austin, Florence Walston, Soo-Mi Park, Alasdair Parker, Chinthika Piyasena, Kimberley Bradbury, Sian Ellard, David H. Rowitch, Lucy Raymond

24 May 2022

Summary

More than a third of severely sick babies referred for rapid whole genome sequencing (WGS) received a vital genetic diagnosis. Results from the latest Cambridge genomic study, supported by NIHR Cambridge BRC and NIHR BioResource, confirm rapid WGS as an effective early test to aid diagnosis in severely ill children. Read the full story.

View publication

The NHS England 100,000 Genomes Project – Feasibility and utility of centralised genome sequencing for children with cancer

April 26, 2022

Publication: British Journal of Cancer

Jamie Trotman, Ruth Armstrong, Helen Firth, Claire Trayers, James Watkins, Kieren Allinson, James C. Nicholson, G. A. Amos Burke, Sam Behjati, Matthew J. Murray, Catherine E. Hook, Patrick Tarpey

22 April 2022

Summary

As part of the national 100,000 Genomes Project, researchers recruited 36 children with 23 different solid tumour types. Whole genome sequencing (WGS) data from paired tumour (fresh-frozen tissue) and matched normal (blood) samples were analysed. The results for each case were clinically reviewed at the Cambridge paediatric oncology Genomic Tumour Advisory Board (GTAB), and a formal report of the results was written.

View publication

A systematic CRISPR screen defines mutational mechanisms underpinning signatures caused by replication errors and endogenous DNA damage

May 27, 2021

Publication: Nature Cancer

Xueqing Zou, Gene Ching Chiek Koh, Arjun Scott Nanda, Andrea Degasperi, Katie Urgo, Theodoros I. Roumeliotis, Chukwuma A. Agu, Cherif Badja, Sophie Momen, Jamie Young, Tauanne Dias Amarante, Lucy Side, Glen Brice, Vanesa Perez-Alonso, Daniel Rueda, Celine Gomez, Wendy Bushell, Rebecca Harris, Jyoti S. Choudhary, Genomics England Research Consortium, Josef Jiricny, William C. Skarnes & Serena Nik-Zainal

26 April 2021

Summary

A new way to identify tumours that could be sensitive to particular immunotherapies has been developed using data from thousands of NHS cancer patient samples sequenced through the 100,000 Genomes Project. The MMRDetect clinical algorithm makes it possible to identify tumours that have ‘mismatch repair deficiencies’ and then improve the personalisation of cancer therapies to exploit those weaknesses.

View publication

The NIHR Cambridge BRC is part of the NIHR and hosted by Cambridge University Hospitals NHS Foundation Trust in partnership with the University of Cambridge. We are at the heart of the Cambridge Biomedical Campus, Europe’s largest health research area.

Contact Us

Cambridge Biomedical Research Centre
Box 277
Hills Road
Cambridge
CB2 0QQ

Tel: 01223 348490
