(Reference retrieved automatically from Web of Science through information on FAPESP grant and its corresponding number as mentioned in the publication by the authors.)

Expression estimation and eQTL mapping for HLA genes with a personalized pipeline

Aguiar, Vitor R. C. [1] ; Cesar, Jonatas [1] ; Delaneau, Olivier [2, 3] ; Dermitzakis, Emmanouil T. [2] ; Meyer, Diogo [1]
Total Authors: 5
[1] Univ Sao Paulo, Inst Biosci, Dept Genet & Evolutionary Biol, Sao Paulo - Brazil
[2] Univ Geneva, Dept Genet Med & Dev, Med Sch, Geneva - Switzerland
[3] Univ Lausanne, Dept Computat Biol, Lausanne - Switzerland
Total Affiliations: 3
Document type: Journal article
Source: PLOS GENETICS; v. 15, n. 4, APR 2019.
Web of Science Citations: 2

The HLA (Human Leukocyte Antigen) genes are well-documented targets of balancing selection, and variation at these loci is associated with many disease phenotypes. Variation in expression levels also influences disease susceptibility and resistance, but little information exists about the regulation and population-level patterns of expression. This results from the difficulty of mapping short reads originating from these highly polymorphic loci, and of accounting for the existence of several paralogues. We developed a computational pipeline to accurately estimate expression for HLA genes based on RNA-seq, improving both locus-level and allele-level estimates. First, reads are aligned to all known HLA sequences in order to infer HLA genotypes; then, quantification of expression is carried out using a personalized index. We use simulations to show that expression estimates obtained in this way are not biased by divergence from the reference genome. We applied our pipeline to the GEUVADIS dataset and compared the quantifications to those obtained with a reference transcriptome. Although the personalized pipeline recovers more reads, we found that using the reference transcriptome produces estimates similar to the personalized pipeline (r ≥ 0.87), with the exception of HLA-DQA1. We describe the impact of the HLA-personalized approach on downstream analyses for nine classical HLA loci (HLA-A, HLA-C, HLA-B, HLA-DRA, HLA-DRB1, HLA-DQA1, HLA-DQB1, HLA-DPA1, HLA-DPB1). Although the influence of the HLA-personalized approach is modest for eQTL mapping, the p-values and the causality of the eQTLs obtained are better than when the reference transcriptome is used. We investigate how the eQTLs we identified explain variation in expression among lineages of HLA alleles. Finally, we discuss possible causes underlying differences between expression estimates obtained using RNA-seq, antibody-based approaches and qPCR.
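The two-step logic described in the abstract (infer HLA genotypes from reads aligned to all known HLA sequences, then quantify against a personalized index) can be illustrated with a toy Python sketch. This is not the authors' pipeline: the function names, the `min_ratio` heterozygosity threshold, the read counts and the short sequences are all hypothetical, chosen only to show how reference HLA entries might be swapped for an individual's own alleles.

```python
from collections import defaultdict


def infer_genotypes(allele_counts, min_ratio=0.25):
    """Pick up to two alleles per locus from per-allele read counts.

    Toy heuristic (an assumption, not the published method): the best-supported
    allele is always called; the runner-up is called only if its read count is
    at least `min_ratio` times the top count, otherwise the locus is treated
    as homozygous.
    """
    by_locus = defaultdict(list)
    for allele, count in allele_counts.items():
        locus = allele.split("*")[0]  # "HLA-A*01:01" -> "HLA-A"
        by_locus[locus].append((count, allele))

    genotypes = {}
    for locus, entries in by_locus.items():
        entries.sort(reverse=True)  # highest read count first
        top_count, top_allele = entries[0]
        second_allele = top_allele  # default: homozygous call
        if len(entries) > 1 and entries[1][0] >= min_ratio * top_count:
            second_allele = entries[1][1]
        genotypes[locus] = (top_allele, second_allele)
    return genotypes


def build_personalized_index(reference, allele_seqs, genotypes):
    """Replace reference entries of typed loci with the individual's alleles."""
    # Keep every reference transcript whose locus was not genotyped.
    index = {name: seq for name, seq in reference.items() if name not in genotypes}
    # Add each distinct called allele once (a homozygote contributes one entry).
    for alleles in genotypes.values():
        for allele in set(alleles):
            index[allele] = allele_seqs[allele]
    return index


# Hypothetical read counts from step 1 (alignment to all known HLA sequences).
counts = {
    "HLA-A*01:01": 120, "HLA-A*02:01": 95, "HLA-A*03:01": 4,
    "HLA-B*07:02": 200, "HLA-B*08:01": 3,
}
# Hypothetical, drastically shortened sequences.
reference = {"HLA-A": "ACGT", "HLA-B": "ACGA", "GAPDH": "TTTT"}
allele_seqs = {"HLA-A*01:01": "ACGG", "HLA-A*02:01": "ACCT", "HLA-B*07:02": "AGGA"}

genotypes = infer_genotypes(counts)
index = build_personalized_index(reference, allele_seqs, genotypes)
print(genotypes)
print(sorted(index))
```

In a real analysis the resulting sequence set would be handed to an RNA-seq quantifier, so that reads from a divergent allele are scored against that allele rather than against the single reference sequence.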
Author summary
The level at which a gene is expressed can have an important influence on the phenotype of an organism, including its predisposition to develop diseases. One way to estimate gene expression is by quantifying the abundance of RNA. RNA-seq has become the method of choice to provide such estimates at the genome-wide scale. However, the application of RNA-seq to HLA genes, key players in the adaptive immune response, has remained a rarely explored approach. This is due to the problem of mapping bias, which causes deficient read alignment at genes that are very polymorphic and different from the reference genome. This has motivated approaches that replace the single reference genome with personalized sequences comprising the individual's specific HLA genotype. Here we explore the use of computational frameworks to obtain reliable expression levels for HLA genes from RNA-seq datasets. We present a pipeline in which the quantification of HLA expression is carried out using methods that account for HLA diversity, avoiding the biases of standard approaches. We then evaluate the impact of this form of quantifying HLA expression on downstream analyses. The pipeline also allows us to integrate information on eQTLs with expression levels at the HLA allele level, which can help disentangle different contributions to disease phenotypes and help understand the regulatory architecture of the HLA region. (AU)

FAPESP's process: 15/19990-6 - Uncovering rare variants in HLA genes: evolutionary analysis and impact on genetic load
Grantee: Jônatas Eduardo da Silva César
Support type: Scholarships in Brazil - Post-Doctorate
FAPESP's process: 13/22007-7 - Analysis of HLA gene expression and investigation of the genetic architecture of regulatory control
Grantee: Diogo Meyer
Support type: Scholarships abroad - Research
FAPESP's process: 16/24734-1 - A bioinformatic tool to reliably estimate expression and map eQTLs of HLA genes in multiple datasets
Grantee: Vitor Rezende da Costa Aguiar
Support type: Scholarships abroad - Research Internship - Post-doctor
FAPESP's process: 12/18010-0 - Balancing selection in the human genome: detection, causes and consequences
Grantee: Diogo Meyer
Support type: Regular Research Grants
FAPESP's process: 14/12123-2 - Expression and eQTL mapping of HLA genes: analyses based on large-scale RNAseq assays
Grantee: Vitor Rezende da Costa Aguiar
Support type: Scholarships in Brazil - Post-Doctorate