Publications
Selected publications in reverse chronological order.
2024
- Mapping enhancer-gene regulatory interactions from single-cell data. Maya U Sheth, Wei-Lin Qiu, X Rosa Ma, and 15 more authors. bioRxiv, Nov 2024
Mapping enhancers and their target genes in specific cell types is crucial for understanding gene regulation and human disease genetics. However, accurately predicting enhancer-gene regulatory interactions from single-cell datasets has been challenging. Here, we introduce a new family of classification models, scE2G, to predict enhancer-gene regulation. These models use features from single-cell ATAC-seq or multiomic RNA and ATAC-seq data and are trained on a CRISPR perturbation dataset including >10,000 evaluated element-gene pairs. We benchmark scE2G models against CRISPR perturbations, fine-mapped eQTLs, and GWAS variant-gene associations and demonstrate state-of-the-art performance at prediction tasks across multiple cell types and categories of perturbations. We apply scE2G to build maps of enhancer-gene regulatory interactions in heterogeneous tissues and interpret noncoding variants associated with complex traits, nominating regulatory interactions linking INPP4B and IL15 to lymphocyte counts. The scE2G models will enable accurate mapping of enhancer-gene regulatory interactions across thousands of diverse human cell types.
- Molecular convergence of risk variants for congenital heart defects leveraging a regulatory map of the human fetal heart. X Rosa Ma, Stephanie D Conley, Michael Kosicki, and 34 more authors. medRxiv, Nov 2024
Congenital heart defects (CHD) arise in part due to inherited genetic variants that alter genes and noncoding regulatory elements in the human genome. These variants are thought to act during fetal development to influence the formation of different heart structures. However, identifying the genes, pathways, and cell types that mediate these effects has been challenging due to the immense diversity of cell types involved in heart development as well as the superimposed complexities of interpreting noncoding sequences. As such, understanding the molecular functions of both noncoding and coding variants remains paramount to our fundamental understanding of cardiac development and CHD. Here, we created a gene regulation map of the healthy human fetal heart across developmental time, and applied it to interpret the functions of variants associated with CHD and quantitative cardiac traits. We collected single-cell multiomic data from 734,000 single cells sampled from 41 fetal hearts spanning post-conception weeks 6 to 22, enabling the construction of gene regulation maps in 90 cardiac cell types and states, including rare populations of cardiac conduction cells. Through an unbiased analysis of all 90 cell types, we find that both rare coding variants associated with CHD and common noncoding variants associated with valve traits converge to affect valvular interstitial cells (VICs). VICs are enriched for high expression of known CHD genes previously identified through mapping of rare coding variants. Eight CHD genes, as well as other genes in similar molecular pathways, are linked to common noncoding variants associated with other valve diseases or traits via enhancers in VICs. In addition, certain common noncoding variants impact enhancers with activities highly specific to particular subanatomic structures in the heart, illuminating how such variants can impact specific aspects of heart structure and function. 
Together, these results implicate new enhancers, genes, and cell types in the genetic etiology of CHD, identify molecular convergence of common noncoding and rare coding variants on VICs, and suggest a more expansive view of the cell types instrumental in genetic risk for CHD, beyond the working cardiomyocyte. This regulatory map of the human fetal heart will provide a foundational resource for understanding cardiac development, interpreting genetic variants associated with heart disease, and discovering targets for cell-type specific therapies.
- Single cell variant to enhancer to gene map for coronary artery disease. Junedh M Amrute, Paul C Lee, Ittai Eres, and 31 more authors. medRxiv, Nov 2024
Although genome-wide association studies (GWAS) in large populations have identified hundreds of variants associated with common diseases such as coronary artery disease (CAD), most disease-associated variants lie within non-coding regions of the genome, rendering it difficult to determine the downstream causal gene and cell type. Here, we performed paired single nucleus gene expression and chromatin accessibility profiling from 44 human coronary arteries. To link disease variants to molecular traits, we developed a meta-map of 88 samples and discovered 11,182 single-cell chromatin accessibility quantitative trait loci (caQTLs). Heritability enrichment analysis and disease variant mapping demonstrated that smooth muscle cells (SMCs) harbor the greatest genetic risk for CAD. To capture the continuum of SMC cell states in disease, we used dynamic single cell caQTL modeling for the first time in tissue to uncover QTLs whose effects are modified by cell state and expand our insight into genetic regulation of heterogeneous cell populations. Notably, we identified a variant in the COL4A1/COL4A2 CAD GWAS locus which becomes a caQTL as SMCs de-differentiate by changing a transcription factor binding site for EGR1/2. To unbiasedly prioritize functional candidate genes, we built a genome-wide single cell variant to enhancer to gene (scV2E2G) map for human CAD to link disease variants to causal genes in cell types. Using this approach, we found several hundred genes predicted to be linked to disease variants in different cell types. Next, we performed genome-wide Hi-C in 16 human coronary arteries to build tissue specific maps of chromatin conformation and link disease variants to integrated chromatin hubs and distal target genes. Using this approach, we show that rs4887091 within the ADAMTS7 CAD GWAS locus modulates function of a super chromatin interactome through a change in a CTCF binding site.
Finally, we used CRISPR interference to validate a distal gene, AMOTL2, linked to a CAD GWAS locus. Collectively, we provide a disease-agnostic framework to translate human genetic findings to identify pathologic cell states and genes driving disease, producing a comprehensive scV2E2G map with genetic and tissue level convergence for future mechanistic and therapeutic studies.
- Publisher Correction: MYC activity at enhancers drives prognostic transcriptional programs through an epigenetic switch. Simon T Jakobsen, Rikke A M Jensen, Maria S Madsen, and 7 more authors. Nat. Genet., Aug 2024
- Adaptation to an acid microenvironment promotes pancreatic cancer organoid growth and drug resistance. Arnaud Stigliani, Renata Ialchina, Jiayi Yao, and 7 more authors. Cell Rep., Jul 2024
Harsh environments in poorly perfused tumor regions may select for traits driving cancer aggressiveness. Here, we investigated whether tumor acidosis interacts with driver mutations to exacerbate cancer hallmarks. We adapted mouse organoids from normal pancreatic duct (mN10) and early pancreatic cancer (mP4, KRAS-G12D mutation, ± p53 knockout) from extracellular pH 7.4 to 6.7, representing acidic niches. Viability was increased by acid adaptation, a pattern most apparent in wild-type (WT) p53 organoids, and exacerbated upon return to pH 7.4. This led to increased survival of acid-adapted organoids treated with gemcitabine and/or erlotinib, and, in WT p53 organoids, acid-induced attenuation of drug effects. New genetic variants became dominant during adaptation, yet they were unlikely to be its main drivers. Transcriptional changes induced by acid and drug adaptation differed overall, but acid adaptation increased the expression of gemcitabine resistance genes. Thus, adaptation to acidosis increases cancer cell viability after chemotherapy.
- MYC activity at enhancers drives prognostic transcriptional programs through an epigenetic switch. Simon T Jakobsen, Rikke A M Jensen, Maria S Madsen, and 7 more authors. Nat. Genet., Mar 2024
The transcription factor MYC is overexpressed in most cancers, where it drives multiple hallmarks of cancer progression. MYC is known to promote oncogenic transcription by binding to active promoters. In addition, MYC has also been shown to invade distal enhancers when expressed at oncogenic levels, but this enhancer binding has been proposed to have low gene-regulatory potential. Here, we demonstrate that MYC directly regulates enhancer activity to promote cancer type-specific gene programs predictive of poor patient prognosis. MYC induces transcription of enhancer RNA through recruitment of RNA polymerase II (RNAPII), rather than regulating RNAPII pause-release, as is the case at promoters. This process is mediated by MYC-induced H3K9 demethylation and acetylation by GCN5, leading to enhancer-specific BRD4 recruitment through its bromodomains, which facilitates RNAPII recruitment. We propose that MYC drives prognostic cancer type-specific gene programs through induction of an enhancer-specific epigenetic switch, which can be targeted by BET and GCN5 inhibitors.
2023
- Symmetric inheritance of parental histones governs epigenome maintenance and embryonic stem cell identity. Alice Wenger, Alva Biran, Nicolas Alcaraz, and 11 more authors. Nat. Genet., Sep 2023
Modified parental histones are segregated symmetrically to daughter DNA strands during replication and can be inherited through mitosis. How this may sustain the epigenome and cell identity remains unknown. Here we show that transmission of histone-based information during DNA replication maintains epigenome fidelity and embryonic stem cell plasticity. Asymmetric segregation of parental histones H3-H4 in MCM2-2A mutants compromised mitotic inheritance of histone modifications and globally altered the epigenome. This included widespread spurious deposition of repressive modifications, suggesting elevated epigenetic noise. Moreover, H3K9me3 loss at repeats caused derepression and H3K27me3 redistribution across bivalent promoters correlated with misexpression of developmental genes. MCM2-2A mutation challenged dynamic transitions in cellular states across the cell cycle, enhancing naïve pluripotency and reducing lineage priming in G1. Furthermore, developmental competence was diminished, correlating with impaired exit from pluripotency. Collectively, this argues that epigenetic inheritance of histone modifications maintains a correctly balanced and dynamic chromatin landscape able to support mammalian cell differentiation.
- Transcription factor expression is the main determinant of variability in gene co-activity. Lucas Duin, Robert Krautz, Sarah Rennie, and 1 more author. Mol. Syst. Biol., Jul 2023
Many genes are co-expressed and form genomic domains of coordinated gene activity. However, the regulatory determinants of domain co-activity remain unclear. Here, we leverage human individual variation in gene expression to characterize the co-regulatory processes underlying domain co-activity and systematically quantify their effect sizes. We employ transcriptional decomposition to extract from RNA expression data an expression component related to co-activity revealed by genomic positioning. This strategy reveals close to 1,500 co-activity domains, covering most expressed genes, of which the large majority are invariable across individuals. Focusing specifically on domains with high variability in co-activity reveals that contained genes have a higher sharing of eQTLs, a higher variability in enhancer interactions, and an enrichment of binding by variably expressed transcription factors, compared to genes within non-variable domains. Through careful quantification of the relative contributions of regulatory processes underlying co-activity, we find transcription factor expression levels to be the main determinant of gene co-activity. Our results indicate that distal trans effects contribute more than local genetic variation to individual variation in co-activity domains.
- Pooled analysis of frontal lobe transcriptomic data identifies key mitophagy gene changes in Alzheimer’s disease brain. Taoyu Mei, Yuan Li, Anna Orduña Dolado, and 4 more authors. Front. Aging Neurosci., Jun 2023
Background: The growing prevalence of Alzheimer’s disease (AD) is becoming a global health challenge without effective treatments. Defective mitochondrial function and mitophagy have recently been suggested as etiological factors in AD, in association with abnormalities in components of the autophagic machinery like lysosomes and phagosomes. Several large transcriptomic studies have been performed on different brain regions from AD and healthy patients, and their data represent a vast source of important information that can be utilized to understand this condition. However, large integration analyses of these publicly available data, such as AD RNA-Seq data, are still missing. In addition, large-scale focused analysis on mitophagy, which seems to be relevant for the aetiology of the disease, has not yet been performed. Methods: In this study, publicly available raw RNA-Seq data generated from healthy control and sporadic AD post-mortem human samples of the brain frontal lobe were collected and integrated. Sex-specific differential expression analysis was performed on the combined data set after batch effect correction. From the resulting set of differentially expressed genes, candidate mitophagy-related genes were identified based on their known functional roles in mitophagy, the lysosome, or the phagosome, followed by Protein-Protein Interaction (PPI) and microRNA-mRNA network analysis. The expression changes of candidate genes were further validated in human skin fibroblast and induced pluripotent stem cells (iPSCs)-derived cortical neurons from AD patients and matching healthy controls. Results: From a large dataset (AD: 589; control: 246) based on three different datasets (i.e., ROSMAP, MSBB, & GSE110731), we identified 299 candidate mitophagy-related differentially expressed genes (DEG) in sporadic AD patients (male: 195, female: 188). 
Among these, the AAA ATPase VCP, the GTPase ARF1, the autophagic vesicle forming protein GABARAPL1 and the cytoskeleton protein actin beta (ACTB) were selected based on network degrees and existing literature. Changes in their expression were further validated in AD-relevant human in vitro models, which confirmed their down-regulation in AD conditions. Conclusion: Through the joint analysis of multiple publicly available data sets, we identify four differentially expressed key mitophagy-related genes potentially relevant for the pathogenesis of sporadic AD. Changes in expression of these four genes were validated using two AD-relevant human in vitro models, primary human fibroblasts and iPSC-derived neurons. Our results provide a foundation for further investigation of these genes as potential biomarkers or disease-modifying pharmacological targets.
- Transfer learning identifies sequence determinants of cell-type specific regulatory element accessibility. Marco Salvatore, Marc Horlacher, Annalisa Marsico, and 2 more authors. NAR Genom. Bioinform., Jun 2023
Dysfunction of regulatory elements through genetic variants is a central mechanism in the pathogenesis of disease. To better understand disease etiology, there is consequently a need to understand how DNA encodes regulatory activity. Deep learning methods show great promise for modeling of biomolecular data from DNA sequence but are limited to large input data for training. Here, we develop ChromTransfer, a transfer learning method that uses a pre-trained, cell-type agnostic model of open chromatin regions as a basis for fine-tuning on regulatory sequences. We demonstrate superior performances with ChromTransfer for learning cell-type specific chromatin accessibility from sequence compared to models not informed by a pre-trained model. Importantly, ChromTransfer enables fine-tuning on small input data with minimal decrease in accuracy. We show that ChromTransfer uses sequence features matching binding site sequences of key transcription factors for prediction. Together, these results demonstrate ChromTransfer as a promising tool for learning the regulatory code.
2022
- Current challenges in understanding the role of enhancers in disease. Judith Barbara Zaugg, Pelin Sahlén, Robin Andersson, and 9 more authors. Nat. Struct. Mol. Biol., Dec 2022
Enhancers play a central role in the spatiotemporal control of gene expression and tend to work in a cell-type-specific manner. In addition, they are suggested to be major contributors to phenotypic variation, evolution and disease. There is growing evidence that enhancer dysfunction due to genetic, structural or epigenetic mechanisms contributes to a broad range of human diseases referred to as enhanceropathies. Such mechanisms often underlie the susceptibility to common diseases, but can also play a direct causal role in cancer or Mendelian diseases. Despite the recent gain of insights into enhancer biology and function, we still have a limited ability to predict how enhancer dysfunction impacts gene expression. Here we discuss the major challenges that need to be overcome when studying the role of enhancers in disease etiology and highlight opportunities and directions for future studies, aiming to disentangle the molecular basis of enhanceropathies.
- Promoter sequence and architecture determine expression variability and confer robustness to genetic variants. Hjorleifur Einarsson, Marco Salvatore, Christian Vaagenso, and 4 more authors. Elife, Nov 2022
Genetic and environmental exposures cause variability in gene expression. Although most genes are affected in a population, their effect sizes vary greatly, indicating the existence of regulatory mechanisms that could amplify or attenuate expression variability. Here, we investigate the relationship between the sequence and transcription start site architectures of promoters and their expression variability across human individuals. We find that expression variability can be largely explained by a promoter’s DNA sequence and its binding sites for specific transcription factors. We show that promoter expression variability reflects the biological process of a gene, demonstrating a selective trade-off between stability for metabolic genes and plasticity for responsive genes and those involved in signaling. Promoters with a rigid transcription start site architecture are more prone to have variable expression and to be associated with genetic variants with large effect sizes, while a flexible usage of transcription start sites within a promoter attenuates expression variability and limits genotypic effects. Our work provides insights into the variable nature of responsive genes and reveals a novel mechanism for supplying transcriptional and mutational robustness to essential genes through multiple transcription start site regions within a promoter.
- Endogenous retroviruses co-opted as divergently transcribed regulatory elements shape the regulatory landscape of embryonic stem cells. Stylianos Bakoulis, Robert Krautz, Nicolas Alcaraz, and 2 more authors. Nucleic Acids Res., Feb 2022
Transposable elements are an abundant source of transcription factor binding sites, and favorable genomic integration may lead to their recruitment by the host genome for gene regulatory functions. However, it is unclear how frequent co-option of transposable elements as regulatory elements is, to which regulatory programs they contribute and how they compare to regulatory elements devoid of transposable elements. Here, we report a transcription initiation-centric, in-depth characterization of the transposon-derived regulatory landscape of mouse embryonic stem cells. We demonstrate that a substantial number of transposable element insertions, in particular endogenous retroviral elements, are associated with open chromatin regions that are divergently transcribed into unstable RNAs in a cell-type specific manner, and that these elements contribute to a sizable proportion of active enhancers and gene promoters. We further show that transposon subfamilies contribute differently and distinctly to the pluripotency regulatory program through their repertoires of transcription factor binding site sequences, shedding light on the formation of regulatory programs and the origins of regulatory elements.
2021
- hyperTRIBER: a flexible R package for the analysis of differential RNA editing. Sarah Rennie, Daniel Heidar Magnusson, and Robin Andersson. Oct 2021
RNA editing by ADAR (adenosine deaminase acting on RNA) is gaining an increased interest in the field of post-transcriptional regulation. Fused to an RNA-binding protein (RBP) of interest, the catalytic activity of ADAR results in A-to-I RNA edits, whose identification will determine RBP-bound RNA transcripts. However, the computational tools available for their identification and differential RNA editing statistical analysis are limited or too specialised for general-purpose usage. Here we present hyperTRIBER, a flexible suite of tools, wrapped into a convenient R package, for the detection of differential RNA editing. hyperTRIBER is applicable to complex scenarios and experimental designs, and provides a robust statistical framework allowing for the control for coverage of reads at a given base, the total expression level and other co-variates. We demonstrate the capabilities of our approach on HyperTRIBE RNA-seq data for the detection of bound RNAs by the N6-methyladenosine (m6A) reader protein ECT2 in Arabidopsis roots. We show that hyperTRIBER finds edits with a high statistical power, even where editing proportions and RNA transcript expression levels are low, together demonstrating its usability and versatility for analysing differential RNA editing.
- Principles of mRNA targeting via the Arabidopsis m6A-binding protein ECT2. Laura Arribas-Hernández, Sarah Rennie, Tino Köster, and 5 more authors. Elife, Sep 2021
Specific recognition of N6-methyladenosine (m6A) in mRNA by RNA-binding proteins containing a YT521-B homology (YTH) domain is important in eukaryotic gene regulation. The Arabidopsis YTH-domain protein ECT2 is thought to bind to mRNA at URU(m6A)Y sites, yet RR(m6A)CH is the canonical m6A consensus site in all eukaryotes and ECT2 functions require m6A binding activity. Here, we apply iCLIP (individual-nucleotide resolution cross-linking and immunoprecipitation) and HyperTRIBE (targets of RNA-binding proteins identified by editing) to define high-quality target sets of ECT2, and analyze the patterns of enriched sequence motifs around ECT2 crosslink sites. Our analyses show that ECT2 does in fact bind to RR(m6A)CH. Pyrimidine-rich motifs are enriched around, but not at m6A-sites, reflecting a preference for N6-adenosine methylation of RRACH/GGAU islands in pyrimidine-rich regions. Such motifs, particularly oligo-U and UNUNU upstream of m6A sites, are also implicated in ECT2 binding via its intrinsically disordered region (IDR). Finally, URUAY-type motifs are enriched at ECT2 crosslink sites, but their distinct properties suggest function as sites of competition between binding of ECT2 and as yet unidentified RNA-binding proteins. Our study provides coherence between genetic and molecular studies of m6A-YTH function in plants, and reveals new insight into the mode of RNA recognition by YTH-domain-containing proteins.
- The YTHDF proteins ECT2 and ECT3 bind largely overlapping target sets and influence target mRNA abundance, not alternative polyadenylation. Laura Arribas-Hernández, Sarah Rennie, Michael Schon, and 5 more authors. Elife, Sep 2021
Gene regulation via N6-methyladenosine (m6A) in mRNA involves RNA-binding proteins that recognize m6A via a YT521-B homology (YTH) domain. The plant YTH domain proteins ECT2 and ECT3 act genetically redundantly in stimulating cell proliferation during organogenesis, but several fundamental questions regarding their mode of action remain unclear. Here, we use HyperTRIBE (targets of RNA-binding proteins identified by editing) to show that most ECT2 and ECT3 targets overlap, with only few examples of preferential targeting by either of the two proteins. HyperTRIBE in different mutant backgrounds also provides direct views of redundant and specific target interactions of the two proteins. We also show that contrary to conclusions of previous reports, ECT2 does not accumulate in the nucleus. Accordingly, inactivation of ECT2, ECT3 and their surrogate ECT4 does not change patterns of polyadenylation site choice in ECT2/3 target mRNAs, but does lead to lower steady state accumulation of target mRNAs. In addition, mRNA and microRNA expression profiles show indications of stress response activation in ect2/ect3/ect4 mutants, likely via indirect effects. Thus, previous suggestions of control of alternative polyadenylation by ECT2 are not supported by evidence, and ECT2 and ECT3 act largely redundantly to regulate target mRNA, including its abundance, in the cytoplasm.
- Genome-wide and sister chromatid-resolved profiling of protein occupancy in replicated chromatin with ChOR-seq and SCAR-seq. Nataliya Petryk, Nazaret Reverón-Gómez, Cristina González-Aguilera, and 3 more authors. Nature Protocols, Sep 2021
Elucidating the mechanisms underlying chromatin maintenance upon genome replication is critical for the understanding of how gene expression programs and cell identity are preserved across cell divisions. Here, we describe two recently developed techniques, chromatin occupancy after replication (ChOR)-seq and sister chromatids after replication (SCAR)-seq, that profile chromatin occupancy on newly replicated DNA in mammalian cells in 5 d of bench work. Both techniques share a common strategy that includes pulse labeling of newly synthesized DNA and chromatin immunoprecipitation (ChIP), followed by purification and high-throughput sequencing. Whereas ChOR-seq quantitatively profiles the post-replicative abundance of histone modifications and chromatin-associated proteins, SCAR-seq distinguishes chromatin occupancy between nascent sister chromatids. Together, these two complementary techniques have unraveled key mechanisms controlling the inheritance of modified histones during replication and revealed locus-specific dynamics of histone modifications across the cell cycle. Here, we provide the experimental protocols and bioinformatic pipelines for these methods.
2020
- Comparative transcriptomics of primary cells in vertebrates. Tanvir Alam, Saumya Agrawal, Jessica Severin, and 22 more authors. Genome Res., Jul 2020
Gene expression profiles in homologous tissues have been observed to be different between species, which may be due to differences between species in the gene expression program in each cell type, but may also reflect differences in cell type composition of each tissue in different species. Here, we compare expression profiles in matching primary cells in human, mouse, rat, dog, and chicken using Cap Analysis Gene Expression (CAGE) and short RNA (sRNA) sequencing data from FANTOM5. While we find that expression profiles of orthologous genes in different species are highly correlated across cell types, in each cell type many genes were differentially expressed between species. Expression of genes with products involved in transcription, RNA processing, and transcriptional regulation was more likely to be conserved, while expression of genes encoding proteins involved in intercellular communication was more likely to have diverged during evolution. Conservation of expression correlated positively with the evolutionary age of genes, suggesting that divergence in expression levels of genes critical for cell function was restricted during evolution. Motif activity analysis showed that both promoters and enhancers are activated by the same transcription factors in different species. An analysis of expression levels of mature miRNAs and of primary miRNAs identified by CAGE revealed that evolutionary old miRNAs are more likely to have conserved expression patterns than young miRNAs. We conclude that key aspects of the regulatory network are conserved, while differential expression of genes involved in cell-to-cell communication may contribute greatly to phenotypic differences between species.
- Determinants of enhancer and promoter activities of regulatory elements. Robin Andersson and Albin Sandelin. Nat. Rev. Genet., Feb 2020
The proper activities of enhancers and gene promoters are essential for coordinated transcription within a cell. Although diverse methodologies have been developed to identify enhancers and promoters, most have tacitly assumed that these elements are distinct. However, studies have unexpectedly shown that regulatory elements may have both enhancer and promoter functions. Here we review these results, focusing on the factors that determine the promoter and/or enhancer activity of regulatory elements. We discuss emerging models that define regulatory elements by accessible DNA and their non-mutually-exclusive abilities to drive transcription initiation (promoter activity) and/or to enhance transcription at other such regions (enhancer activity).
- The Transcriptional Network That Controls Growth Arrest and Macrophage Differentiation in the Human Myeloid Leukemia Cell Line THP-1. Iveta Gažová, Lucas Lefevre, Stephen J Bush, and 9 more authors. Frontiers in Cell and Developmental Biology, Feb 2020
The response of the human acute myeloid leukemia cell line THP-1 to phorbol esters has been widely-studied to test candidate leukemia therapies and as a model of cell cycle arrest and monocyte-macrophage differentiation. Here we have employed Cap Analysis of Gene Expression (CAGE) to analyse a dense time course of transcriptional regulation in THP-1 cells treated with phorbol myristate acetate (PMA) over 96 hours. PMA treatment greatly reduced the numbers of cells entering S phase and also blocked cells exiting G2/M. The PMA-treated cells became adherent and expression of mature macrophage-specific genes increased progressively over the duration of the time course. Within 1-2 hours PMA induced known targets of tumour protein p53 (TP53), notably CDKN1A, followed by gradual down-regulation of cell-cycle associated genes. Also within the first 2 hours, PMA induced immediate early genes including transcription factor genes encoding proteins implicated in macrophage differentiation (EGR2, JUN, MAFB) and down-regulated genes for transcription factors involved in immature myeloid cell proliferation (MYB, IRF8, GFI1). The dense time course revealed that the response to PMA was not linear and progressive. Rather, network-based clustering of the time course data highlighted a sequential cascade of transient up- and down-regulated expression of genes encoding feedback regulators, as well as transcription factors associated with macrophage differentiation and their inferred target genes. CAGE also identified known and candidate novel enhancers expressed in THP-1 cells and many novel inducible genes that currently lack functional annotation and/or had no previously known function in macrophages. The time course is available on the ZENBU platform allowing comparison to FANTOM4 and FANTOM5 data.
2019
- CAGEfightR: analysis of 5’-end data using R/Bioconductor. Malte Thodberg, Axel Thieffry, Kristoffer Vitting-Seerup, and 2 more authors. BMC Bioinformatics, Oct 2019
BACKGROUND: 5’-end sequencing assays, and Cap Analysis of Gene Expression (CAGE) in particular, have been instrumental in studying transcriptional regulation. 5’-end methods provide genome-wide maps of transcription start sites (TSSs) with base pair resolution. Because active enhancers often feature bidirectional TSSs, such data can also be used to predict enhancer candidates. The current availability of mature and comprehensive computational tools for the analysis of 5’-end data is limited, preventing efficient analysis of new and existing 5’-end data. RESULTS: We present CAGEfightR, a framework for analysis of CAGE and other 5’-end data implemented as an R/Bioconductor-package. CAGEfightR can import data from BigWig files and allows for fast and memory efficient prediction and analysis of TSSs and enhancers. Downstream analyses include quantification, normalization, annotation with transcript and gene models, TSS shape statistics, linking TSSs to enhancers via co-expression, identification of enhancer clusters, and genome-browser style visualization. While built to analyze CAGE data, we demonstrate the utility of CAGEfightR in analyzing nascent RNA 5’-data (PRO-Cap). CAGEfightR is implemented using standard Bioconductor classes, making it easy to learn, use and combine with other Bioconductor packages, for example popular differential expression tools such as limma, DESeq2 and edgeR. CONCLUSIONS: CAGEfightR provides a single, scalable and easy-to-use framework for comprehensive downstream analysis of 5’-end data. CAGEfightR is designed to be interoperable with other Bioconductor packages, thereby unlocking hundreds of mature transcriptomic analysis tools for 5’-end data. CAGEfightR is freely available via Bioconductor: bioconductor.org/packages/CAGEfightR.
- PLZF targets developmental enhancers for activation during osteogenic differentiation of human mesenchymal stem cells. Shuchi Agrawal Singh, Mads Lerdrup, Ana-Luisa R Gomes, and 6 more authors. Elife, Jan 2019
A key transcription-factor for osteogenic differentiation, PLZF, acts as a transcriptional activator by binding to active developmental enhancers and facilitates mediator recruitment, but is not involved in enhancer looping.
2018
- The GBAF chromatin remodeling complex binds H3K27ac and mediates enhancer transcription. Kirill Jefimov, Nicolas Alcaraz, Susan Kloet, and 6 more authors. bioRxiv, Oct 2018
H3K27ac is associated with regulatory active enhancers, but its exact role in enhancer function remains elusive. Using mass spectrometry-based interaction proteomics, we identified the Super Elongation Complex (SEC) and GBAF, a non-canonical GLTSCR1L- and BRD9-containing SWI/SNF chromatin remodeling complex, to be major interactors of H3K27ac. We systematically characterized the composition of GBAF and the conserved GLTSCR1/1L "GiBAF"-domain, which we found to be responsible for GBAF complex formation and GLTSCR1L nuclear localization. Inhibition of the bromodomain of BRD9 revealed interaction between GLTSCR1L and H3K27ac to be BRD9-dependent and led to GLTSCR1L dislocation from its preferred binding sites at H3K27ac-associated enhancers. GLTSCR1L disassociation from chromatin resulted in genome-wide downregulation of enhancer transcription while leaving most mRNA expression levels unchanged, except for reduced mRNA levels from loci topologically linked to affected enhancers. Our results indicate that GBAF is an enhancer-associated chromatin remodeler important for transcriptional and regulatory activity of enhancers.
- MCM2 promotes symmetric inheritance of modified histones during DNA replication. Nataliya Petryk, Maria Dalby, Alice Wenger, and 4 more authors. Science, Sep 2018
Parental histones with modifications are recycled to newly replicated DNA strands during genome replication, but do the two sister chromatids inherit modified histones equally? Yu et al. and Petryk et al. found in yeast and mouse, respectively, that modified histones are segregated to both DNA daughter strands in a largely symmetric manner (see the Perspective by Ahmad and Henikoff). However, the mechanisms ensuring this symmetric inheritance in yeast and mouse were different. Yeasts use subunits of DNA polymerase to prevent the lagging-strand bias of parental histones, whereas in mouse cells, the replicative helicase MCM2 counters the leading-strand bias. Science, this issue p. 1386, p. 1389; see also p. 1311. During genome replication, parental histones are recycled to newly replicated DNA with their posttranslational modifications (PTMs). Whether sister chromatids inherit modified histones evenly remains unknown. We measured histone PTM partition to sister chromatids in embryonic stem cells. We found that parental histones H3-H4 segregate to both daughter DNA strands with a weak leading-strand bias, skewing partition at topologically associating domain (TAD) borders and enhancers proximal to replication initiation zones. Segregation of parental histones to the leading strand increased markedly in cells with histone-binding mutations in MCM2, part of the replicative helicase, exacerbating histone PTM sister chromatid asymmetry. This work reveals how histones are inherited to sister chromatids and identifies a mechanism by which the replication machinery ensures symmetric cell division.
- The RNA exosome contributes to gene expression regulation during stem cell differentiation. Marta Lloret-Llinares, Evdoxia Karadoulama, Yun Chen, and 7 more authors. Nucleic Acids Res., Sep 2018
Gene expression programs change during cellular transitions. It is well established that a network of transcription factors and chromatin modifiers r
- FANTOM5 transcribed enhancers in mm10. Maria Dalby, Sarah Rennie, and Robin Andersson. Sep 2018
Overview: Transcribed enhancers were identified and their expression was quantified across all mouse FANTOM5 libraries, using FANTOM5 CAGE data re-aligned to mm10 (GRCm38) (obtained from http://fantom.gsc.riken.jp/5/datafiles/reprocessed/mm10_v1/basic/) and decomposition-based peak identification (obtained from https://zenodo.org/record/545682#.WPuNy1Pyv2Q) by Kawaji, Hideya.
Description: Transcribed enhancers were called based on bidirectional balanced RNA signatures as per Andersson et al (2014). Enhancers were only identified distal to known exons (+/-100bp region from boundaries) and transcription start sites (+/-300bp), defined by GENCODE vM7 annotation. In total, 44,138 enhancers were identified across 1,068 libraries and the expression was quantified. For details regarding the identification of transcribed enhancers from CAGE data, please see Andersson et al (2014) and blog post. Due to varying noise levels across FANTOM5 libraries and the intrinsic low expression levels of transcribed enhancers, library-specific noise levels were estimated to define a robust set of enhancers in each sample. In summary, for each library, expression was quantified in randomly sampled genomic regions distal to assembly gaps, DNase hypersensitive sites (ENCODE), known exons and gene TSSs (GENCODE vM7) to create a genomic background expression distribution. For each library, we called an enhancer active (used) if its expression was above the 99.9th quantile of the library’s genomic background expression distribution. The robust set of enhancers consists of those significantly expressed in at least one library. While this approach ensures less permissive enhancer calling in noisy libraries, for some libraries the noise threshold is zero, meaning that a single CAGE tag is sufficient for calling an enhancer active. Furthermore, the possibility of detecting enhancer transcription is affected by sequencing depth, so the number of active enhancers per library might not be biologically meaningful to compare when sequencing depths differ.
Data files: Each predicted enhancer is described in BED12 format with two blocks denoting the merged regions of transcription initiation on the minus and plus strands. The thickStart and thickEnd columns denote the inferred mid position between blocks of transcription initiation events. Expression and usage matrices are tab delimited; the first row gives the FANTOM5 CNhs IDs and the first column the enhancer ID (same as column 4 in the BED file). Usage matrices contain zeroes and ones (0: not used, 1: used). Files: enhancers (BED12 format); enhancer expression matrix (tab delimited, first row: CNhs IDs, first column: enhancer ID (coordinate)); binary enhancer usage matrix (0: not used, 1: used, tab delimited, first row: CNhs IDs, first column: enhancer ID (coordinate)).
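The library-specific usage calling described in this entry can be illustrated with a small sketch. It is not the authors' pipeline: the function names, the nearest-rank quantile definition, and the toy enhancer and CNhs identifiers are all assumptions made for illustration. It only shows the idea of comparing each enhancer's expression to the 99.9th quantile of that library's background distribution and keeping enhancers used in at least one library.

```python
import math


def background_threshold(background, q=0.999):
    """Nearest-rank empirical quantile of one library's background values.

    The nearest-rank definition is an assumption; the source only says
    "99.9th quantile".
    """
    if not background:
        raise ValueError("background must not be empty")
    ordered = sorted(background)
    rank = max(1, math.ceil(q * len(ordered)))  # 1-based rank
    return ordered[rank - 1]


def call_usage(expression, backgrounds, q=0.999):
    """Return (usage, thresholds).

    expression:  {enhancer_id: {library_id: expression value}}
    backgrounds: {library_id: [background expression values]}
    usage:       {enhancer_id: {library_id: 0 or 1}}, 1 meaning "used"
    """
    thresholds = {lib: background_threshold(vals, q)
                  for lib, vals in backgrounds.items()}
    usage = {
        enh: {lib: int(val > thresholds[lib]) for lib, val in per_lib.items()}
        for enh, per_lib in expression.items()
    }
    return usage, thresholds


def robust_set(usage):
    """Enhancers called as used in at least one library."""
    return sorted(enh for enh, per_lib in usage.items() if any(per_lib.values()))


# Toy data with hypothetical IDs: one noisy library and one with zero background.
backgrounds = {
    "CNhs00001": [0] * 990 + [3] * 10,
    "CNhs00002": [0] * 1000,
}
expression = {
    "chr1:100-500": {"CNhs00001": 4, "CNhs00002": 0},
    "chr2:900-1300": {"CNhs00001": 2, "CNhs00002": 1},
    "chr3:50-400": {"CNhs00001": 3, "CNhs00002": 0},
}
usage, thresholds = call_usage(expression, backgrounds)
print(thresholds)
print(robust_set(usage))
```

In the toy data the second library has a threshold of zero, so a single tag is enough to call an enhancer used there, which mirrors the caveat stated in the description.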
- Transcription start site analysis reveals widespread divergent transcription in D. melanogaster and core promoter-encoded enhancer activities. Sarah Rennie, Maria Dalby, Marta Lloret-Llinares, and 4 more authors. Nucleic Acids Res., Apr 2018
Mammalian gene promoters and enhancers share many properties. They are composed of a unified promoter architecture of divergent transcription initiation and gene promoters may exhibit enhancer function. However, it is currently unclear how expression strength of a regulatory element relates to its enhancer strength and if the unifying architecture is conserved across Metazoa. Here we investigate the transcription initiation landscape and its associated RNA decay in Drosophila melanogaster. We find that the majority of active gene-distal enhancers and a considerable fraction of gene promoters are divergently transcribed. We observe quantitative relationships between enhancer potential, expression level and core promoter strength, providing an explanation for indirectly related histone modifications that are reflecting expression levels. Lowly abundant unstable RNAs initiated from weak core promoters are key characteristics of gene-distal developmental enhancers, while the housekeeping enhancer strengths of gene promoters reflect their expression strengths. The seemingly separable layer of regulation by gene promoters with housekeeping enhancer potential is also indicated by chromatin interaction data. Our results suggest a unified promoter architecture of many D. melanogaster regulatory elements, that is universal across Metazoa, whose regulatory functions seem to be related to their core promoter elements.
- Shared activity patterns arising at genetic susceptibility loci reveal underlying genomic and cellular architecture of human disease. J Kenneth Baillie, Andrew Bretherick, Christopher S Haley, and 32 more authors. PLoS Comput. Biol., Mar 2018
Genetic variants underlying complex traits, including disease susceptibility, are enriched within the transcriptional regulatory elements, promoters and enhancers. There is emerging evidence that regulatory elements associated with particular traits or diseases share similar patterns of transcriptional activity. Accordingly, shared transcriptional activity (coexpression) may help prioritise loci associated with a given trait, and help to identify underlying biological processes. Using cap analysis of gene expression (CAGE) profiles of promoter- and enhancer-derived RNAs across 1824 human samples, we have analysed coexpression of RNAs originating from trait-associated regulatory regions using a novel quantitative method (network density analysis; NDA). For most traits studied, phenotype-associated variants in regulatory regions were linked to tightly-coexpressed networks that are likely to share important functional characteristics. Coexpression provides a new signal, independent of phenotype association, to enable fine mapping of causative variants. The NDA coexpression approach identifies new genetic variants associated with specific traits, including an association between the regulation of the OCT1 cation transporter and genetic variants underlying circulating cholesterol levels. NDA strongly implicates particular cell types and tissues in disease pathogenesis. For example, distinct groupings of disease-associated regulatory regions implicate two distinct biological processes in the pathogenesis of ulcerative colitis; a further two separate processes are implicated in Crohn’s disease. Thus, our functional analysis of genetic predisposition to disease defines new distinct disease endotypes. We predict that patients with a preponderance of susceptibility variants in each group are likely to respond differently to pharmacological therapy. Together, these findings enable a deeper biological understanding of the causal basis of complex traits.
- Transcriptional decomposition reveals active chromatin architectures and cell specific regulatory interactions. Sarah Rennie, Maria Dalby, Lucas Duin, and 1 more author. Nat. Commun., Feb 2018
Transcriptional regulation is tightly coupled with chromosomal positioning and three-dimensional chromatin architecture. However, it is unclear what proportion of transcriptional activity is reflecting such organisation, how much can be informed by RNA expression alone and how this impacts disease. Here, we develop a computational transcriptional decomposition approach separating the proportion of expression associated with genome organisation from independent effects not directly related to genomic positioning. We show that positionally attributable expression accounts for a considerable proportion of total levels and is highly informative of topological associating domain activities and organisation, revealing boundaries and chromatin compartments. Furthermore, expression data alone accurately predict individual enhancer-promoter interactions, drawing features from expression strength, stabilities, insulation and distance. We characterise predictions in 76 human cell types, observing extensive sharing of domains, yet highly cell-type-specific enhancer-promoter interactions and strong enrichments in relevant trait-associated variants. Overall, our work demonstrates a close relationship between transcription and chromatin architecture.
- Loss-of-function variants in ADCY3 increase risk of obesity and type 2 diabetes. Niels Grarup, Ida Moltke, Mette K Andersen, and 17 more authors. Nat. Genet., Feb 2018
We have identified a variant in ADCY3 (encoding adenylate cyclase 3) associated with markedly increased risk of obesity and type 2 diabetes in the Greenlandic population. The variant disrupts a splice acceptor site, and carriers have decreased ADCY3 RNA expression. Additionally, we observe an enrichment of rare ADCY3 loss-of-function variants among individuals with type 2 diabetes in trans-ancestry cohorts. These findings provide new information on disease etiology relevant for future treatment strategies.
- Characterization of the enhancer and promoter landscape of inflammatory bowel disease from human colon biopsies. Mette Boyd, Malte Thodberg, Morana Vitezic, and 24 more authors. Nat. Commun., Jan 2018
Inflammatory bowel disease (IBD) is a chronic intestinal disorder, with two main types: Crohn’s disease (CD) and ulcerative colitis (UC), whose molecular pathology is not well understood. The majority of IBD-associated SNPs are located in non-coding regions and are hard to characterize since regulatory regions in IBD are not known. Here we profile transcription start sites (TSSs) and enhancers in the descending colon of 94 IBD patients and controls. IBD-upregulated promoters and enhancers are highly enriched for IBD-associated SNPs and are bound by the same transcription factors. IBD-specific TSSs are associated to genes with roles in both inflammatory cascades and gut epithelia while TSSs distinguishing UC and CD are associated to gut epithelia functions. We find that as few as 35 TSSs can distinguish active CD, UC, and controls with 85% accuracy in an independent cohort. Our data constitute a foundation for understanding the molecular pathology, gene regulation, and genetics of IBD.
2017
- Identification of Gene Transcription Start Sites and Enhancers Responding to Pulmonary Carbon Nanotube Exposure in Vivo. Jette Bornholdt, Anne Thoustrup Saber, Berit Lilje, and 16 more authors. ACS Nano, Apr 2017
- Update of the FANTOM web resource: high resolution transcriptome of diverse cell types in mammals. Marina Lizio, Jayson Harshbarger, Imad Abugessaisa, and 22 more authors. Nucleic Acids Res., Jan 2017
Upon the first publication of the fifth iteration of the Functional Annotation of Mammalian Genomes collaborative project, FANTOM5, we gathered a series of primary data and database systems into the FANTOM web resource (http://fantom.gsc.riken.jp) to help researchers explore transcriptional regulation and cellular states. In the course of the collaboration, primary data and analysis results have been expanded, and functionalities of the database systems enhanced. We believe that our data and web systems are invaluable resources, and we think the scientific community will benefit from this recent update to deepen their understanding of mammalian cellular organization. We introduce the contents of FANTOM5 here, report recent updates in the web resource and provide future perspectives.
- Transcriptional Dynamics During Human Adipogenesis and Its Link to Adipose Morphology and Distribution. Anna Ehrlund, Niklas Mejhert, Christel Björk, and 17 more authors. Diabetes, Jan 2017
White adipose tissue (WAT) can develop into several phenotypes with different pathophysiological impact on type 2 diabetes. To better understand the adipogenic process, the transcriptional events that occur during in vitro differentiation of human adipocytes were investigated and the findings linked to WAT phenotypes. Single-molecule transcriptional profiling provided a detailed map of the expressional changes of genes, enhancers, and long noncoding RNAs, where different types of transcripts share common dynamics during differentiation. Common signatures include early downregulated, transient, and late induced transcripts, all of which are linked to distinct developmental processes during adipogenesis. Enhancers expressed during adipogenesis overlap significantly with genetic variants associated with WAT distribution. Transiently expressed and late induced genes are associated with hypertrophic WAT (few but large fat cells), a phenotype closely linked to insulin resistance and type 2 diabetes. Transcription factors that are expressed early or transiently affect differentiation and adipocyte function and are controlled by several well-known upstream regulators such as glucocorticosteroids, insulin, cAMP, and thyroid hormones. Taken together, our results suggest a complex but highly coordinated regulation of adipogenesis.
2016
- Principles for RNA metabolism and alternative transcription initiation within closely spaced promoters. Yun Chen, Athma A Pai, Jan Herudek, and 8 more authors. Nat. Genet., Sep 2016
Mammalian transcriptomes are complex and formed by extensive promoter activity. In addition, gene promoters are largely divergent and initiate transcription of reverse-oriented promoter upstream transcripts (PROMPTs). Although PROMPTs are commonly terminated early, influenced by polyadenylation sites, promoters often cluster so that the divergent activity of one might impact another. Here we found that the distance between promoters strongly correlates with the expression, stability and length of their associated PROMPTs. Adjacent promoters driving divergent mRNA transcription support PROMPT formation, but owing to polyadenylation site constraints, these transcripts tend to spread into the neighboring mRNA on the same strand. This mechanism to derive new alternative mRNA transcription start sites (TSSs) is also evident at closely spaced promoters supporting convergent mRNA transcription. We suggest that basic building blocks of divergently transcribed core promoter pairs, in combination with the wealth of TSSs in mammalian genomes, provide a framework with which evolution shapes transcriptomes.
- Regulating retrotransposon activity through the use of alternative transcription start sites. Jenna Persson, Babett Steglich, Agata Smialowska, and 8 more authors. EMBO Rep., Feb 2016
Retrotransposons, the ancestors of retroviruses, have the potential for gene disruption and genomic takeover if not kept in check. Paradoxically, although host cells repress these elements by multiple mechanisms, they are transcribed and are even activated under stress conditions. Here, we describe a new mechanism of retrotransposon regulation through transcription start site (TSS) selection by altered nucleosome occupancy. We show that Fun30 chromatin remodelers cooperate to maintain a high level of nucleosome occupancy at retrotransposon-flanking long terminal repeat (LTR) elements. This enforces the use of a downstream TSS and the production of a truncated RNA incapable of reverse transcription and retrotransposition. However, in stressed cells, nucleosome occupancy at LTR elements is reduced, and the TSS shifts to allow for productive transcription. We propose that controlled retrotransposon transcription from a nonproductive TSS allows for rapid stress-induced activation, while preventing uncontrolled transposon activity in the genome.
- Transcriptome Analysis of Recurrently Deregulated Genes across Multiple Cancers Identifies New Pan-Cancer Biomarkers. Bogumil Kaczkowski, Yuji Tanaka, Hideya Kawaji, and 8 more authors. Cancer Res., Jan 2016
Genes that are commonly deregulated in cancer are clinically attractive as candidate pan-diagnostic markers and therapeutic targets. To globally identify such targets, we compared Cap Analysis of Gene Expression profiles from 225 different cancer cell lines and 339 corresponding primary cell samples to identify transcripts that are deregulated recurrently in a broad range of cancer types. Comparing RNA-seq data from 4,055 tumors and 563 normal tissues profiled in The Cancer Genome Atlas and FANTOM5 datasets, we identified a core transcript set with theranostic potential. Our analyses also revealed enhancer RNAs, which are upregulated in cancer, defining promoters that overlap with repetitive elements (especially SINE/Alu and LTR/ERV1 elements) that are often upregulated in cancer. Lastly, we documented for the first time upregulation of multiple copies of the REP522 interspersed repeat in cancer. Overall, our genome-wide expression profiling approach identified a comprehensive set of candidate biomarkers with pan-cancer potential, and extended the perspective and pathogenic significance of repetitive elements that are frequently activated during cancer progression. Cancer Res; 76(2); 216-26. ©2015 AACR.
- On-the-fly selection of cell-specific enhancers, genes, miRNAs and proteins across the human body using SlideBase. Hans Ienasescu, Kang Li, Robin Andersson, and 15 more authors. Database, Jan 2016
Genomics consortia have produced large datasets profiling the expression of genes, micro-RNAs, enhancers and more across human tissues or cells. There is a need for intuitive tools to select the subsets of such data that are most relevant for specific studies. To this end, we present SlideBase, a web tool which offers a new way of selecting genes, promoters, enhancers and microRNAs that are preferentially expressed/used in a specified set of cells/tissues, based on the use of interactive sliders. With the help of sliders, SlideBase enables users to define custom expression thresholds for individual cell types/tissues, producing sets of genes, enhancers etc. which satisfy these constraints. Changes in slider settings result in simultaneous changes in the selected sets, updated in real time. SlideBase is linked to major databases from genomics consortia, including FANTOM, GTEx, The Human Protein Atlas and BioGPS. Database URL: http://slidebase.binf.ku.dk.
2015
- Human Gene Promoters Are Intrinsically Bidirectional. Robin Andersson, Yun Chen, Leighton Core, and 3 more authors. Mol. Cell, Nov 2015
- The frequent evolutionary birth and death of functional promoters in mouse and human. Robert S Young, Yoshihide Hayashizaki, Robin Andersson, and 9 more authors. Genome Res., Oct 2015
- A unified architecture of transcriptional regulatory elements. Robin Andersson, Albin Sandelin, and Charles G Danko. Trends Genet., Aug 2015
- Promoter or enhancer, what’s the difference? Deconstruction of established distinctions and presentation of a unifying model. Robin Andersson. Bioessays, Mar 2015
Gene transcription is strictly controlled by the interplay of regulatory events at gene promoters and gene-distal regulatory elements called enhancers. Despite extensive studies of enhancers, we still have a very limited understanding of their mechanisms of action and their restricted spatio-temporal activities. A better understanding would ultimately lead to fundamental insights into the control of gene transcription and the action of regulatory genetic variants involved in disease. Here, I review and discuss pros and cons of state-of-the-art genomics methods to localize and infer the activity of enhancers. Among the different approaches, profiling of enhancer RNAs yields the highest specificity and may be superior in detecting in vivo activity. I discuss their apparent similarities to promoters, which challenge the established view of enhancers and promoters as distinct entities, and present a unifying model of regulatory elements in transcriptional regulation, in which activity, transcriptional output and regulatory function is context specific.
- Transcribed enhancers lead waves of coordinated transcription in transitioning mammalian cells. Erik Arner, Carsten O Daub, Kristoffer Vitting-Seerup, and 107 more authors. Science, Feb 2015
Although it is generally accepted that cellular differentiation requires changes to transcriptional networks, dynamic regulation of promoters and enhancers at specific sets of genes has not been previously studied en masse. Exploiting the fact that active promoters and enhancers are transcribed, we simultaneously measured their activity in 19 human and 14 mouse time courses covering a wide range of cell types and biological stimuli. Enhancer RNAs, then messenger RNAs encoding transcription factors, dominated the earliest responses. Binding sites for key lineage transcription factors were simultaneously overrepresented in enhancers and promoters active in each cellular system. Our data support a highly generalizable model in which enhancer transcription is the earliest event in successive waves of transcriptional change during cellular differentiation or activation.
- Remodeling of retrotransposon elements during epigenetic induction of adult visual cortical plasticity by HDAC inhibitors. Andreas Lennartsson, Erik Arner, Michela Fagiolini, and 8 more authors. Epigenetics Chromatin, Jan 2015
BACKGROUND: The capacity for plasticity in the adult brain is limited by the anatomical traces laid down during early postnatal life. Removing certain molecular brakes, such as histone deacetylases (HDACs), has proven to be effective in recapitulating juvenile plasticity in the mature visual cortex (V1). We investigated the chromatin structure and transcriptional control by genome-wide sequencing of DNase I hypersensitive sites (DHSS) and cap analysis of gene expression (CAGE) libraries after HDAC inhibition by valproic acid (VPA) in adult V1. RESULTS: We found that VPA reliably reactivates the critical period plasticity and induces a dramatic change of chromatin organization in V1 yielding significantly greater accessibility distant from promoters, including at enhancer regions. VPA also induces nucleosome eviction specifically from retrotransposon (in particular SINE) elements. The transiently accessible SINE elements overlap with transcription factor-binding sites of the Fox family. Mapping of transcription start site activity using CAGE revealed transcription of epigenetic and neural plasticity-regulating genes following VPA treatment, which may help to re-program the genomic landscape and reactivate plasticity in the adult cortex. CONCLUSIONS: Treatment with HDAC inhibitors increases accessibility to enhancers and repetitive elements underlying brain-specific gene expression and reactivation of visual cortical plasticity.
2014
- Identification of TNF-α-responsive promoters and enhancers in the intestinal epithelial cell model Caco-2. Mette Boyd, Mehmet Coskun, Berit Lilje, and 11 more authors. DNA Res., Dec 2014
The Caco-2 cell line is one of the most important in vitro models for enterocytes, and is used to study drug absorption and disease, including inflammatory bowel disease and cancer. In order to use the model optimally, it is necessary to map its functional entities. In this study, we have generated genome-wide maps of active transcription start sites (TSSs), and active enhancers in Caco-2 cells with or without tumour necrosis factor (TNF)-α stimulation to mimic an inflammatory state. We found 520 promoters that significantly changed their usage level upon TNF-α stimulation; of these, 52% are not annotated. A subset of these has the potential to confer change in protein function due to protein domain exclusion. Moreover, we locate 890 transcribed enhancer candidates, where ∼50% are changing in usage after TNF-α stimulation. These enhancers share motif enrichments with similarly responding gene promoters. As a case example, we characterize an enhancer regulating the laminin-5 γ2-chain (LAMC2) gene by nuclear factor (NF)-κB binding. This report is the first to present comprehensive TSS and enhancer maps over Caco-2 cells, and highlights many novel inflammation-specific promoters and enhancers.
- Nuclear stability and transcriptional directionality separate functionally distinct RNA species. Robin Andersson, Peter Refsing Andersen, Eivind Valen, and 5 more authors. Nat. Commun., Jul 2014
- Nucleosome regulatory dynamics in response to TGFβ. Stefan Enroth, Robin Andersson, Madhusudhan Bysani, and 8 more authors. Nucleic Acids Res., Jun 2014
Nucleosomes play important roles in a cell beyond their basal functionality in chromatin compaction. Their placement affects all steps in transcriptional regulation, from transcription factor (TF) binding to messenger ribonucleic acid (mRNA) synthesis. Careful profiling of their locations and dynamics in response to stimuli is important to further our understanding of transcriptional regulation by the state of chromatin. We measured nucleosome occupancy in human hepatic cells before and after treatment with transforming growth factor beta 1 (TGFβ1), using massively parallel sequencing. With a newly developed method, SuMMIt, for precise positioning of nucleosomes we inferred dynamics of the nucleosomal landscape. Distinct nucleosome positioning has previously been described at transcription start site and flanking TF binding sites. We found that the average pattern is present at very few sites and, in case of TF binding, the double peak surrounding the sites is just an artifact of averaging over many loci. We systematically searched for depleted nucleosomes in stimulated cells compared to unstimulated cells and identified 24 318 loci. Depending on genomic annotation, 44-78% of them were over-represented in binding motifs for TFs. Changes in binding affinity were verified for HNF4α by qPCR. Strikingly many of these loci were associated with expression changes, as measured by RNA sequencing.
- Transcriptional profiling of the human fibrillin/LTBP gene family, key regulators of mesenchymal cell functions. Margaret R Davis, Robin Andersson, Jessica Severin, and 7 more authors. Mol. Genet. Metab., May 2014
- Deep transcriptome profiling of mammalian stem cells supports a regulatory role for retrotransposons in pluripotency maintenance. Alexandre Fort, Kosuke Hashimoto, Daisuke Yamada, and 18 more authors. Nat. Genet., Apr 2014
- Differential roles of epigenetic changes and Foxp3 expression in regulatory T cell-specific transcriptional regulation. Hiromasa Morikawa, Naganari Ohkura, Alexis Vandenbon, and 270 more authors. Proceedings of the National Academy of Sciences, Apr 2014
Naturally occurring regulatory T (Treg) cells, which specifically express the transcription factor forkhead box P3 (Foxp3), are engaged in the maintenance of immunological self-tolerance and homeostasis. By transcriptional start site cluster analysis, we assessed here how genome-wide patterns of DNA methylation or Foxp3 binding sites were associated with Treg-specific gene expression. We found that Treg-specific DNA hypomethylated regions were closely associated with Treg up-regulated transcriptional start site clusters, whereas Foxp3 binding regions had no significant correlation with either up- or down-regulated clusters in nonactivated Treg cells. However, in activated Treg cells, Foxp3 binding regions showed a strong correlation with down-regulated clusters. In accordance with these findings, the above two features of activation-dependent gene regulation in Treg cells tend to occur at different locations in the genome. The results collectively indicate that Treg-specific DNA hypomethylation is instrumental in gene up-regulation in steady state Treg cells, whereas Foxp3 down-regulates the expression of its target genes in activated Treg cells. Thus, the two events seem to play distinct but complementary roles in Treg-specific gene expression.
- A promoter-level mammalian expression atlas. FANTOM Consortium and the RIKEN PMI and CLST (DGT), Alistair R R Forrest, Hideya Kawaji, and 259 more authors. Nature, Mar 2014
Regulated transcription controls the diversity, developmental pathways and spatial organization of the hundreds of cell types that make up a mammal. Using single-molecule cDNA sequencing, we mapped transcription start sites (TSSs) and their usage in human and mouse primary cells, cell lines and tissues to produce a comprehensive overview of mammalian gene expression across the human body. We find that few genes are truly ’housekeeping’, whereas many mammalian promoters are composite entities composed of several closely separated TSSs, with independent cell-type-specific expression profiles. TSSs specific to different cell types evolve at different rates, whereas promoters of broadly expressed genes are the most conserved. Promoter-based expression analysis reveals key transcription factors defining cell states and links them to binding-site motifs. The functions of identified novel transcripts can be predicted by coexpression and sample ontology enrichment analyses. The functional annotation of the mammalian genome 5 (FANTOM5) project provides comprehensive expression profiles and functional annotation of mammalian cell-type-specific transcriptomes with wide applications in biomedical research.
- An atlas of active enhancers across human cell types and tissues. Robin Andersson, Claudia Gebhard, Irene Miguel-Escalada, and 48 more authors. Nature, Mar 2014
- CAGE-defined promoter regions of the genes implicated in Rett Syndrome. Morana Vitezic, Nicolas Bertin, Robin Andersson, and 14 more authors. BMC Genomics, Jan 2014
2013
- Polyadenylation site–induced decay of upstream transcripts enforces promoter directionality. Evgenia Ntini, Aino I Järvelin, Jette Bornholdt, and 15 more authors. Nat. Struct. Mol. Biol., Jul 2013
2012
- A strand specific high resolution normalization method for chip-sequencing data employing multiple experimental control measurements. Stefan Enroth, Claes R Andersson, Robin Andersson, and 3 more authors. Algorithms Mol. Biol., Jan 2012
2011
- Cancer associated epigenetic transitions identified by genome-wide histone methylation binding profiles in human colorectal cancer samples and paired normal mucosa. Stefan Enroth, Alvaro Rada-Iglesias, Robin Andersson, and 5 more authors. BMC Cancer, Jan 2011
2010
- Frequent genetic differences between matched primary and metastatic breast cancer provide an approach to identification of biomarkers for disease progression. Andrzej B Popławski, Michał Jankowski, Stephen W Erickson, and 18 more authors. Eur. J. Hum. Genet., Jan 2010
- SICTIN: Rapid footprinting of massively parallel sequencing data. Stefan Enroth, Robin Andersson, Claes Wadelius, and 1 more author. BioData Min., Jan 2010
- Integrative epigenomic and genomic analysis of malignant pheochromocytoma. Johanna Sandgren, Robin Andersson, Alvaro Rada-Iglesias, and 6 more authors. Exp. Mol. Med., Jan 2010
2009
- Genome-wide high-resolution analysis of DNA copy number alterations in NF1-associated malignant peripheral nerve sheath tumors using 32K BAC array. Kiran K Mantripragada, Teresita Ståhl, Chris Patridge, and 8 more authors. Genes Chromosomes Cancer, Oct 2009
Neurofibromatosis Type I (NF1) is an autosomal dominant disorder characterized by the development of both benign and malignant tumors. The lifetime risk for developing a malignant peripheral nerve sheath tumor (MPNST) in NF1 patients is approximately 10% with poor survival rates. To date, the molecular basis of MPNST development remains unclear. Here, we report the first genome-wide and high-resolution analysis of DNA copy number alterations in MPNST using the 32K bacterial artificial chromosome microarray on a series of 24 MPNSTs and three neurofibroma samples. In the benign neurofibromas, apart from loss of one copy of the NF1 gene and copy number polymorphisms, no other changes were found. The profiles of malignant samples, however, revealed specific loss of chromosomal regions including 1p35-33, 1p21, 9p21.3, 10q25, 11q22-23, 17q11, and 20p12.2 as well as gain of 1q25, 3p26, 3q13, 5p12, 5q11.2-q14, 5q21-23, 5q31-33, 6p23-p21, 6p12, 6q15, 6q23-q24, 7p22, 7p14-p13, 7q21, 7q36, 8q22-q24, 14q22, and 17q21-q25. Copy number gains were more frequent than deletions in the MPNST samples (62% vs. 38%). The genes resident within common regions of gain were NEDL1 (7p14), AP3B1 (5q14.1), and CUL1 (7q36.1) and these were identified in >63% MPNSTs. The most frequently deleted locus encompassed CDKN2A, CDKN2B, and MTAP genes on 9p21.3 (33% cases). These genes have previously been implicated in other cancer conditions and therefore, should be considered for their therapeutic, prognostic, and diagnostic relevance in NF1 tumorigenesis.
- Using multi-data hidden Markov models trained on local neighborhoods of protein structure to predict residue-residue contacts. Patrik Björkholm, Pawel Daniluk, Andriy Kryshtafovych, and 3 more authors. Bioinformatics, May 2009
MOTIVATION: Correct prediction of residue-residue contacts in proteins that lack good templates with known structure would take ab initio protein structure prediction a large step forward. The lack of correct contacts, and in particular long-range contacts, is considered the main reason why these methods often fail. RESULTS: We propose a novel hidden Markov model (HMM)-based method for predicting residue-residue contacts from protein sequences using as training data homologous sequences, predicted secondary structure and a library of local neighborhoods (local descriptors of protein structure). The library consists of recurring structural entities incorporating short-, medium- and long-range interactions and is general enough to reassemble the cores of nearly all proteins in the PDB. The method is tested on an external test set of 606 domains with no significant sequence similarity to the training set as well as 151 domains with SCOP folds not present in the training set. Considering the top 0.2 x L predictions (L = sequence length), our HMMs obtained an accuracy of 22.8% for long-range interactions in new fold targets, and an average accuracy of 28.6% for long-, medium- and short-range contacts. This is a significant performance increase over currently available methods when comparing against results published in the literature. AVAILABILITY: http://predictioncenter.org/Services/FragHMMent/.
- Histone H3 lysine 27 trimethylation in adult differentiated colon associated to cancer DNA hypermethylation. Alvaro Rada-Iglesias, Stefan Enroth, Robin Andersson, and 4 more authors. Epigenetics, Feb 2009
DNA hypermethylation of gene promoters is a common epigenetic alteration occurring in cancer cells. However, little is known about the mechanisms instructing these cancer-specific DNA hypermethylation events. Recent reports have suggested that genes bound by polycomb/Histone H3 lysine 27 trimethylation (H3K27me3) in embryonic stem (ES) cells are frequent targets for cancer-specific DNA hypermethylation. This polycomb-premarking is assumed to be restrained to ES cells, even though almost no polycomb/H3K27me3 binding profiles are available for differentiated tissues. We generated H3K27me3 profiles in human normal colon and they significantly overlapped with those of ES cells and genes hypermethylated in colorectal cancer (CRC). Moreover, colon H3K27me3 was more restricted to genes hypermethylated in CRC, while ES H3K27me3 was also common in genes hypermethylated in other tumors. Therefore, the suggested polycomb pre-marking of genes for cancer DNA hypermethylation is not necessarily limited to ES or early precursor cells but can occur later in differentiated tissues.
2008
- Somatic mosaicism for copy number variation in differentiated human tissues. Arkadiusz Piotrowski, Carl E G Bruder, Robin Andersson, and 14 more authors. Hum. Mutat., Sep 2008
- Profiling of copy number variations (CNVs) in healthy individuals from three ethnic groups using a human genome 32 K BAC-clone-based array. Teresita Ståhl, Johanna Sandgren, Arkadiusz Piotrowski, and 16 more authors. Hum. Mutat., Mar 2008
To further explore the extent of structural large-scale variation in the human genome, we assessed copy number variations (CNVs) in a series of 71 healthy subjects from three ethnic groups. CNVs were analyzed using comparative genomic hybridization (CGH) to a BAC array covering the human genome, using DNA extracted from peripheral blood, thus avoiding any culture-induced rearrangements. By applying a newly developed computational algorithm based on Hidden Markov modeling, we identified 1,078 autosomal CNVs, including at least two neighboring/overlapping BACs, which represent 315 distinct regions. The average size of the sequence polymorphisms was approximately 350 kb and involved in total approximately 117 Mb or approximately 3.5% of the genome. Gains were about four times more common than deletions, and segmental duplications (SDs) were overrepresented, especially in larger deletion variants. This strengthens the notion that SDs often define hotspots of chromosomal rearrangements. Over 60% of the identified autosomal rearrangements match previously reported CNVs, recognized with various platforms. However, results from chromosome X do not agree well with the previously annotated CNVs. Furthermore, data from single BACs deviating in copy number suggest that our above estimate of total variation is conservative. This report contributes to the establishment of the common baseline for CNV, which is an important resource in human genetics.
- Phenotypically Concordant and Discordant Monozygotic Twins Display Different DNA Copy-Number-Variation Profiles. Carl E G Bruder, Arkadiusz Piotrowski, Antoinet A C J Gijsbers, and 19 more authors. Am. J. Hum. Genet., Mar 2008
2007
- A previously unrecognized microdeletion syndrome on chromosome 22 band q11.2 encompassing the BCR gene. Fady M Mikhail, Maria Descartes, Arkadiusz Piotrowski, and 6 more authors. Am. J. Med. Genet. A, Sep 2007
Susceptibility of the chromosome 22q11.2 region to rearrangements has been recognized on the basis of common clinical disorders such as the DiGeorge/velocardiofacial syndrome (DG/VCFs). Recent evidence has implicated low-copy repeats (LCRs), also known as segmental duplications, on 22q as mediators of nonallelic homologous recombination (NAHR) that result in rearrangements of 22q11.2. It has been shown that both deletion and duplication events can occur as a result of NAHR caused by unequal crossover of LCRs. Here we report on the clinical, cytogenetic and array CGH studies of a 15-year-old Hispanic boy with history of learning and behavior problems. We suggest that he represents a previously unrecognized microdeletion syndrome on chromosome 22 band q11.2 just telomeric to the DG/VCFs typically deleted region and encompassing the BCR gene. Using a 32K BAC array CGH chip we were able to refine and precisely narrow the breakpoints of this microdeletion, which was estimated to be 1.55-1.92 Mb in size and to span approximately 20 genes. This microdeletion region is flanked by LCR clusters containing several modules with a very high degree of sequence homology (>95%), and therefore could play a causal role in its origin.
- Overlapping phenotype of Wolf-Hirschhorn and Beckwith-Wiedemann syndromes in a girl with der(4)t(4;11)(pter;pter). Fady M Mikhail, Achara Sathienkijkanchai, Nathaniel H Robin, and 9 more authors. Am. J. Med. Genet. A, Aug 2007
We report on an 8-month-old girl with a novel unbalanced chromosomal rearrangement, consisting of a terminal deletion of 4p and a paternal duplication of terminal 11p. Each of these is associated with the well-known clinical phenotypes of Wolf-Hirschhorn syndrome (WHS) and Beckwith-Wiedemann syndrome (BWS), respectively. She presented for clinical evaluation of dysmorphic facial features, developmental delay, atrial septal defect (ASD), and left hydronephrosis. High-resolution cytogenetic analysis revealed a normal female karyotype, but subtelomeric fluorescence in situ hybridization (FISH) analysis revealed a der(4)t(4;11)(pter;pter). Both FISH and microarray CGH studies clearly demonstrated that the WHS critical regions 1 and 2 were deleted, and that the BWS imprinted domains (ID) 1 and 2 were duplicated on the der(4). Parental chromosome analysis revealed that the father carried a cryptic balanced t(4;11)(pter;pter). As expected, our patient manifests findings of both WHS (a growth retardation syndrome) and BWS (an overgrowth syndrome). We compare her unique phenotypic features with those that have been reported for both syndromes.