Plasma proteomic associations with genetics and health in the UK Biobank

Benjamin B Sun, Joshua Chiou, Matthew Traylor, Christian Benner, Yi-Hsiang Hsu, Tom G Richardson, Praveen Surendran, Anubha Mahajan, Chloe Robins, Steven G Vasquez-Grinnell, Liping Hou, Erika M Kvikstad, Oliver S Burren, Jonathan Davitte, Kyle L Ferber, Christopher E Gillies, Åsa K Hedman, Sile Hu, Tinchi Lin, Rajesh Mikkilineni, Rion K Pendergrass, Corran Pickering, Bram Prins, Denis Baird, Chia-Yen Chen, Lucas D Ward, Aimee M Deaton, Samantha Welsh, Carissa M Willis, Nick Lehner, Matthias Arnold, Maria A Wörheide, Karsten Suhre, Gabi Kastenmüller, Anurag Sethi, Madeleine Cule, Anil Raj; Alnylam Human Genetics; AstraZeneca Genomics Initiative; Biogen Biobank Team; Bristol Myers Squibb; Genentech Human Genetics; GlaxoSmithKline Genomic Sciences; Pfizer Integrative Biology; Population Analytics of Janssen Data Sciences; Regeneron Genetics Center; Lucy Burkitt-Gray, Eugene Melamud, Mary Helen Black, Eric B Fauman, Joanna M M Howson, Hyun Min Kang, Mark I McCarthy, Paul Nioi, Slavé Petrovski, Robert A Scott, Erin N Smith, Sándor Szalma, Dawn M Waterworth, Lyndon J Mitnaul, Joseph D Szustakowski, Bradford W Gibson, Melissa R Miller, Christopher D Whelan.
Nature. 2023-10-04;622(7982):329-338.
The Pharma Proteomics Project is a precompetitive biopharmaceutical consortium characterizing the plasma proteomic profiles of 54,219 UK Biobank participants. Here we provide a detailed summary of this initiative, including technical and biological validations, insights into proteomic disease signatures, and prediction modelling for various demographic and health indicators. We present comprehensive protein quantitative trait locus (pQTL) mapping of 2,923 proteins that identifies 14,287 primary genetic associations, of which 81% are previously undescribed, alongside ancestry-specific pQTL mapping in non-European individuals. The study provides an updated characterization of the genetic architecture of the plasma proteome, contextualized with projected pQTL discovery rates as sample sizes and proteomic assay coverages increase over time. We offer extensive insights into trans pQTLs across multiple biological domains, highlight genetic influences on ligand-receptor interactions and pathway perturbations across a diverse collection of cytokines and complement networks, and illustrate long-range epistatic effects of ABO blood group and FUT2 secretor status on proteins with gastrointestinal tissue-enriched expression. We demonstrate the utility of these data for drug discovery by extending the genetic proxied effects of protein targets, such as PCSK9, on additional endpoints, and disentangle specific genes and proteins perturbed at loci associated with COVID-19 susceptibility. This public-private partnership provides the scientific community with an open-access proteomics resource of considerable breadth and depth to help to elucidate the biological mechanisms underlying proteo-genomic discoveries and accelerate the development of biomarkers, predictive models and therapeutics.

Related data

Data summary
Proteo-genomic results and summary association data are available through an interactive portal.
Underlying NPX measures are available through the UK Biobank Research Analysis Portal.
UKB has catalogued the dataset in Category 1839, under ‘Field 30900’, described in greater detail online.