Systematic Mendelian randomization and colocalization analyses of the plasma proteome and blood transcriptome to prioritize drug targets for complex disease

Jie Zheng[1], Ben M. Brumpton[2], Paola G. Bronson[3], Yi Liu[1], Philip Haycock[1], Benjamin Elsworth[1], Valeriia Haberland[1], Denis Baird[1], Venexia Walker[1], Jamie W. Robinson[1], Sally John[4], Bram Prins[5], Heiko Runz[3], Matthew R. Nelson[6], Mark Hurle[6], Gibran Hemani[1], Bjørn Olav Åsvold[2], Adam Butterworth[7], George Davey Smith[1][8], Robert A. Scott[9], Tom R. Gaunt[1][8]

1 MRC Integrative Epidemiology Unit (IEU), Bristol Medical School, University of Bristol, Oakfield House, Oakfield Grove, Bristol, BS8 2BN, UK
2 Department of Public Health and Nursing, K.G. Jebsen Center for Genetic Epidemiology, Norwegian University of Science and Technology, Trondheim, Norway
3 Human Target Validation Core, Translational Biology, Biogen, 250 Binney Street, Cambridge, MA 02142, USA
4 Translational Biology, Biogen, 250 Binney Street, Cambridge, MA 02142, USA
5 Human Genetics, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, United Kingdom
6 Genetics, GlaxoSmithKline, Collegeville, PA, USA
7 MRC/BHF Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge CB1 8RN, UK
8 NIHR Bristol Biomedical Research Centre, Bristol, UK
9 GlaxoSmithKline, Gunnels Wood Road, Stevenage, Hertfordshire, SG1 2NY, UK

Keywords: Mendelian randomization, phenome-wide association, proteome, transcriptome, pharmacologic therapy.

Abstract for the conference (limit: 2,500 characters)

Background: The plasma proteome and blood transcriptome are potential sources of therapeutic targets. Genetic studies of these molecular traits enable systematic comparison of the genetic architecture of protein and gene expression, and estimation of the causal effects of these traits on human diseases. Here, we estimated the effects of 1,740 plasma proteins and 16,058 blood transcripts on 576 phenotypes in Europeans using two-sample Mendelian randomization (MR) followed by single- and multi-trait colocalization (coloc/moloc). We report the findings for 9.46 million gene-phenotype associations in an accessible database, EpiGraphDB.
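As an illustration of the two-sample MR step, the following minimal sketch computes a single-instrument Wald ratio. It is not the pipeline used in this study: the function name `wald_ratio`, the first-order standard error approximation and the input values are all assumptions made for the example.

```python
import math


def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Wald ratio estimate for a single-instrument two-sample MR.

    beta_exposure: SNP effect on the molecular trait (e.g. protein level).
    beta_outcome:  SNP effect on the phenotype, from a separate sample.
    se_outcome:    standard error of beta_outcome.
    """
    if beta_exposure == 0:
        raise ValueError("instrument has no effect on the exposure")
    estimate = beta_outcome / beta_exposure
    # First-order delta-method approximation (an assumption of this sketch):
    # it ignores the uncertainty in beta_exposure.
    se = se_outcome / abs(beta_exposure)
    z = estimate / se
    # Two-sided p-value under a standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return estimate, se, p_value


# Hypothetical summary statistics, not taken from the study.
est, se, p = wald_ratio(beta_exposure=0.5, beta_outcome=0.1, se_outcome=0.02)
print(f"estimate={est:.3f}, se={se:.3f}, p={p:.2e}")
```

With several independent instruments, the per-SNP ratios would typically be combined (for example by inverse-variance weighting) rather than reported one by one.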

Methods: We compared 4,094 cis/trans pQTLs and eQTLs associated with the same gene (1,194 genes). Only 317 (7.7%) of the pQTLs and eQTLs were in LD (r² > 0.1), of which 314 were in the cis region. Although the pQTLs and eQTLs differed, their estimated causal effects on disease were highly correlated (r = 0.98) for the 29 associations with the strongest MR and colocalization evidence from both pQTLs and eQTLs (P_MR < 5.3×10⁻⁹, coloc probability > 0.8). We found 6 associations with multi-trait coloc evidence, including an approved drug target (PLAU) with a novel indication (Crohn’s disease). Mediation analyses of these 6 findings suggest that PLAU gene expression influenced PLAU protein expression, ultimately affecting susceptibility to Crohn’s disease.
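The LD criterion (r² > 0.1) can be illustrated with the standard definition of r² between two biallelic variants. This is a minimal sketch with hypothetical haplotype and allele frequencies; the function name `ld_r2` is an assumption, and the study's own LD estimates would have come from a reference panel rather than hand-entered frequencies.

```python
def ld_r2(p_ab, p_a, p_b):
    """Squared correlation (r^2) between two biallelic variants.

    p_ab: frequency of the haplotype carrying allele A and allele B.
    p_a, p_b: frequencies of allele A and allele B.
    """
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom <= 0:
        raise ValueError("allele frequencies must lie strictly between 0 and 1")
    return d * d / denom


# Hypothetical frequencies for one pQTL/eQTL pair.
r2 = ld_r2(p_ab=0.3, p_a=0.4, p_b=0.5)
in_ld = r2 > 0.1  # threshold quoted in the abstract
print(f"r2={r2:.3f}, in LD: {in_ld}")

# Share of the compared QTLs that were in LD, from the counts in the abstract.
share_in_ld = 100 * 317 / 4094
print(f"{share_in_ld:.1f}% in LD")
```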

Results: Of 1,718 protein-phenotype associations with MR evidence, 1,215 (70.7%) had coloc evidence. Of 19,775 gene expression-phenotype associations with MR evidence, 12,787 (64.7%) had coloc evidence. This work improves on previous omics studies by making careful use of multiple cis and trans instruments and by identifying additional gene-phenotype associations. Retrospective evaluation of 268 developed drugs (Pharmaprojects) showed that target-indication pairs with proteomic MR and coloc support were more likely to succeed (OR = 26.4; 95% CI = 4 to 580). Further, we validated 8 marketed drugs, prioritised 8 targets in current drug trials, and identified 53 repurposing opportunities.
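The proportions quoted above follow directly from the reported counts, and an odds ratio of this kind is computed from a 2x2 table of genetic support against drug success. The sketch below reproduces the percentages and shows one common way to obtain an odds ratio with a Wald-type confidence interval. The 2x2 counts are hypothetical, the function name `odds_ratio_ci` is an assumption, and the abstract does not state which interval method was used, so the reported interval may come from a different approach (for example an exact test).

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Wald-type confidence interval on the log scale.

    a: supported and succeeded      b: supported and failed
    c: unsupported and succeeded    d: unsupported and failed
    """
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Percentages recomputed from the counts given in the abstract.
protein_pct = 100 * 1215 / 1718
expression_pct = 100 * 12787 / 19775
print(f"protein: {protein_pct:.1f}%, expression: {expression_pct:.1f}%")

# Hypothetical 2x2 counts, NOT the study data.
or_est, ci_low, ci_high = odds_ratio_ci(a=10, b=5, c=20, d=40)
print(f"OR={or_est:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

Small cell counts make a Wald interval very wide and unreliable, which is one reason an exact method is often preferred for sparse tables.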

Conclusions: This is the first systematic MR and coloc analysis of the plasma proteome and blood transcriptome. Our results suggest that genetic confounding due to LD may be widespread in phenome-wide association studies of molecular traits. We identified novel gene-phenotype associations and provide evidence that proteomic MR/coloc support for drug targets increases the probability of success in drug discovery.