Preprints

Nonsense-mediated mRNA decay (NMD) is a quality-control pathway that degrades mRNAs bearing premature termination codons (PTCs) resulting from mutation or mis-splicing, and that additionally participates in the regulation of unmutated transcripts. We analyzed ∼10,000 exomes and ∼27,000 transcriptomes from human tumors and healthy tissues to quantify individual-level NMD efficiency and to assess its variability between tissues and between individuals. NMD efficiency was estimated by monitoring allele-specific expression of germline PTCs, and these estimates were independently supported by mRNA levels of endogenous NMD target transcripts. Nervous system and reproductive system tissues have lower NMD efficiency than other tissues such as the digestive tract. There is also considerable systematic inter-individual variability in NMD efficiency, for which we identify two underlying mechanisms. First, in cancers, somatic copy number alterations robustly associate with NMD efficiency, most prominently the commonly occurring gain at chromosome 1q, which encompasses two core NMD genes, SMG5 and SMG7, as well as additional functionally interacting genes such as PMF1 and GON4L. Second, loss-of-function germline variants in various genes, such as the KDM6B chromatin modifier, can associate with higher or lower NMD efficiency in individuals, affecting different tissues. Variable NMD efficiency should have clinical implications, as it modulates positive selection upon somatic nonsense mutations in tumor suppressor genes and is associated with survival of cancer patients, with relevance to predicting immunotherapy responses across cancer types.
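
To make the allele-specific expression readout in the abstract above concrete, here is a minimal sketch of how a per-individual NMD efficiency score could be derived from read counts at heterozygous germline PTCs. The scoring formula (negative log2 ratio of PTC-allele to reference-allele reads), the pseudocount, and the function names are assumptions chosen for illustration; the abstract does not specify the estimator used in the study.

```python
import math
from statistics import median


def nmd_efficiency(ptc_reads, ref_reads, pseudocount=1.0):
    """Score one heterozygous PTC (hypothetical estimator, not from the study).

    Returns -log2 of the PTC-allele to reference-allele read ratio.
    Positive values mean the PTC-bearing allele is depleted in the mRNA,
    as expected when NMD degrades it; 0 means both alleles are equally
    abundant. The pseudocount avoids division by zero and log of zero.
    """
    ratio = (ptc_reads + pseudocount) / (ref_reads + pseudocount)
    return -math.log2(ratio)


def sample_nmd_efficiency(variants, pseudocount=1.0):
    """Summarize one individual as the median score across its PTCs.

    `variants` is a list of (ptc_reads, ref_reads) tuples, one per
    heterozygous germline PTC observed in that individual's RNA-seq data.
    """
    scores = [nmd_efficiency(p, r, pseudocount) for p, r in variants]
    return median(scores)


# Example: three hypothetical PTCs in one sample.
example_sample = [(24, 99), (10, 10), (49, 99)]
example_score = sample_nmd_efficiency(example_sample)
```

Taking the median across variants is one simple way to limit the influence of a single PTC that escapes NMD; a real analysis would also need to account for sequencing depth and for PTC positions known to evade the pathway.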

Tumors often show an initial response to chemotherapy but then develop resistance, leading to relapse and poor prognosis. We hypothesized that a genomic comparison of mutations in pre-treated versus treatment-naive tumors would serve to identify genes that confer resistance. A challenge in such an analysis is that therapy alters mutation burdens and signatures, confounding association studies and complicating the identification of causal, selected mutations. We developed DiffInvex, a framework for identifying changes in selection acting on individual genes in somatic genomes. Crucially, DiffInvex draws on a mutation rate baseline that accounts for these shifts in neutral mutagenesis during cancer evolution. We applied DiffInvex to 9,953 cancer whole genomes spanning 29 cancer types from 8 studies, which include both treatment-naive tumors and tumors pre-treated with various drugs. This identified genes in which point mutations are under conditional positive or negative selection for a given chemotherapeutic, suggesting resistance mechanisms that act via point mutation. DiffInvex confirmed well-known chemoresistance-driver mutations in the EGFR, ESR1, KIT and AR genes as being under conditional positive selection, with additional cancer types identified for EGFR and KIT. Additionally, DiffInvex identified 11 genes with treatment-associated selection for different classes of therapeutics. In most cases, these were common cancer genes, including PIK3CA, APC, MAP2K4 and MAP3K1. This suggests that tumor resistance to therapy via mutation often arises through selective advantages conferred by known driver genes, rather than through mutations in specialized resistance genes. Various gene-chemotherapy associations were further supported by tests for the functional impact of mutations, again implemented in a conditional selection setting, and were replicated in independent panel or exome sequencing data.
In addition to nominating drug resistance genes that could be targeted by future therapeutics, DiffInvex can also be applied to diverse analyses in cancer evolution, such as comparing normal and tumor tissues or analyzing subclonal evolution, to identify changes in selection over time.
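
The idea of conditional selection measured against a neutral baseline can be pictured with a simple 2x2 comparison: mutation counts in a gene's tested region versus a presumed-neutral baseline region, in pre-treated versus treatment-naive tumors. The sketch below uses a one-sided Fisher's exact test for this purpose. It is a deliberately simplified stand-in and is not DiffInvex's statistical model, which the abstract does not describe; the function names and the choice of test are assumptions.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Odds ratio for the table [[a, b], [c, d]].

    a: target-region mutations in pre-treated tumors
    b: baseline-region mutations in pre-treated tumors
    c: target-region mutations in treatment-naive tumors
    d: baseline-region mutations in treatment-naive tumors
    Values above 1 indicate relative enrichment of target-region
    mutations in pre-treated tumors.
    """
    if b * c == 0:
        # Undefined when numerator is also zero, otherwise infinite.
        return float("inf") if a * d > 0 else float("nan")
    return (a * d) / (b * c)


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact p-value for enrichment (a larger than expected).

    Sums hypergeometric probabilities of all tables with the same margins
    whose top-left cell is at least as large as the observed `a`.
    """
    row1 = a + b          # all mutations in pre-treated tumors
    col1 = a + c          # all target-region mutations
    total = a + b + c + d
    denom = comb(total, row1)
    upper = min(row1, col1)
    numer = sum(
        comb(col1, k) * comb(total - col1, row1 - k)
        for k in range(a, upper + 1)
    )
    return numer / denom


# Example with hypothetical counts for one gene and one drug class.
example_or = odds_ratio(3, 1, 1, 3)
example_p = fisher_exact_greater(3, 1, 1, 3)
```

Normalizing by a baseline region in the same tumors is what keeps a therapy-induced rise in overall mutation burden from being mistaken for selection: if treatment inflates mutations everywhere, both cells in the pre-treated row grow together and the odds ratio stays near 1.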

Allele-specific expression (ASE) is the differential abundance of mRNAs originating from the paternal and maternal copies of a gene. Such allelic imbalances can contribute to phenotypic variation and influence disease traits, including cancer. In tumors, ASE commonly results from somatic copy-number alterations (CNAs) at the DNA level, but other causes also exist: cis-acting genetic or epigenetic variation can lead to differential expression between the two alleles. However, these non-CNA mechanisms of ASE remain understudied in cancer, as do their role in tumor evolution and their impact on clinical outcomes. By integrating a wide variety of genomic and transcriptomic pan-cancer data from the TCGA project, we show that ASE favoring expression of the mutant allele in some driver genes is subject to positive selection, and that these events are associated with worse overall survival across all cancer types. We found that the impact of ASE triggered by non-CNA causes is substantial, and we propose that some instances of cis-ASE are explained by epigenetic changes that affect the two alleles differently. Furthermore, as a second mechanism, we find that splicing-altering mutations are selected in various cancer genes and result in ASE. We anticipate that studying mutant allele imbalances at the mRNA level will help to understand epigenetic changes during cancer evolution, as well as to identify new prognostic markers and therapeutic approaches that target altered allelic expression in tumors.
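
As an illustration of how ASE beyond what CNAs explain might be detected at a single site, the sketch below applies an exact two-sided binomial test to mutant and wild-type RNA read counts. The expected mutant fraction is a parameter, so it can be set to 0.5 for a copy-neutral heterozygous site or to the DNA-level allele fraction where a CNA is present. The function name and the choice of test are assumptions for illustration and are not taken from the study.

```python
from math import comb


def ase_binomial_pvalue(mut_reads, wt_reads, expected_mut_fraction=0.5):
    """Exact two-sided binomial test for allelic imbalance at one site.

    Null hypothesis: each RNA read comes from the mutant allele with
    probability `expected_mut_fraction`. The p-value is the total
    probability of all outcomes that are no more likely than the
    observed mutant read count.
    """
    n = mut_reads + wt_reads
    p = expected_mut_fraction
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    observed = pmf[mut_reads]
    # Small relative tolerance so floating-point ties count as equal.
    tail = sum(x for x in pmf if x <= observed * (1 + 1e-7))
    return min(1.0, tail)


# Example: hypothetical site with 8 mutant and 2 wild-type RNA reads.
example_p = ase_binomial_pvalue(8, 2)
```

Testing RNA counts against the DNA-level allele fraction, rather than always against 0.5, is one way to separate imbalance caused by copy-number changes from the cis-acting, non-CNA imbalance the abstract focuses on.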