Prioritization of autoimmune disease-associated genetic variants that perturb regulatory element activity in T cells

Kousuke Mouri, Michael H. Guo, Carl G. de Boer, Michelle M. Lissner, Ingrid A. Harten, Gregory A. Newby, Hannah A. DeBerg, Winona F. Platt, Matteo Gentili, David R. Liu, Daniel J. Campbell, Nir Hacohen, Ryan Tewhey & John P. Ray.
Nat Genet. 2022-05-01;54(5):603-612.
Abstract
Genome-wide association studies (GWASs) have uncovered hundreds of autoimmune disease-associated loci; however, the causal genetic variants within each locus are mostly unknown. Here, we perform high-throughput allele-specific reporter assays to prioritize disease-associated variants for five autoimmune diseases. By examining variants that both promote allele-specific reporter expression and are located in accessible chromatin, we identify 60 putatively causal variants that enrich for statistically fine-mapped variants by up to 57.8-fold. We introduced the risk allele of a prioritized variant (rs72928038) into a human T cell line and deleted the orthologous sequence in mice, both resulting in reduced BACH2 expression. Naive CD8 T cells from mice containing the deletion had reduced expression of genes that suppress activation and maintain stemness and, upon acute viral infection, displayed greater propensity to become effector T cells. Our results represent an example of an effective approach for prioritizing variants and studying their physiologically relevant effects.
Consortium data used in this publication
Data generated in this study from all manuscript figures are available in NCBI GEO (GSE197539). The 1000 Genomes Phase 3 reference panel was obtained from ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/. DHS data across 733 samples were obtained from https://zenodo.org/record/3838751#.X_IA7-lKg6U. Histone chromatin immunoprecipitation sequencing data were downloaded from ENCODE (encodeproject.org); the specific files utilized are listed in Supplementary Table 20. CAGE-based enhancer annotations were downloaded from https://fantom.gsc.riken.jp/5/datafiles/latest/extra/Enhancers/. ChromHMM was obtained from https://egg2.wustl.edu/roadmap/data/byFileType/chromhmmSegmentations/ChmmModels/core_K27ac/jointModel/final/. HOCOMOCO TF position-weighted matrices were obtained from https://hocomoco11.autosome.ru/downloads_v10. ATAC-seq allelic skew data were obtained from Calderon et al. (https://www.nature.com/articles/s41588-019-0505-9; Supplementary Table 1, “significant_ASCs” tab). Chromatin accessibility QTLs were downloaded from Gate et al. (https://www.nature.com/articles/s41588-018-0156-2; Supplementary Table 6). DeltaSVM precomputed weights for naive CD4 T cells and Jurkat cells were obtained from http://www.beerlab.org/deltasvm_models/downloads/deltasvm_models_e2e.tar.gz. The EMBL GWAS catalog (https://www.ebi.ac.uk/gwas/) was accessed on 10 August 2020. T1D GWAS fine-mapping results were obtained from Onengut-Gumuscu et al. (https://www.nature.com/articles/ng.3245; Supplementary Table 1). pcHiC data were obtained from Javierre et al. (https://osf.io/u8tzp/). The ImmunoSigDB immunologic signatures database (v7.2) was downloaded from http://www.gsea-msigdb.org/gsea/msigdb/. Tscm Bach2 guide RNA-perturbed mouse RNA-seq data were obtained from NCBI GEO (GSE152379). The GRCm38 mouse transcriptome index for Kallisto RNA-seq alignments was obtained from https://github.com/pachterlab/kallisto-transcriptome-indices/releases.
The Bach2^18del (stock 35028) mouse strain is available at The Jackson Laboratory.
Datasets
DSR308TXA