Investigating the coupling between RNA expression and three dimensional chromatin organisation

This post refers to our latest pre-print [1], in which we comprehensively compare RNA expression data with 3D chromatin capture data and suggest that the former can be highly predictive of the latter. Here I introduce our work and attempt to explain the rationale behind our 'transcriptional decomposition' approach.

The ordering and placement of regulatory elements (such as genes and enhancers) across chromosomes is far from random and is instead highly dependent on neighbouring elements. This comes from observations that genes situated together within a given neighbourhood are more likely to be regulated together than those separated by large distances or located on different chromosomes [2].

Datasets informing on the three dimensional folding of DNA within cell nuclei have shown that the genome can be broken up into distinct compartments which tend to be either actively expressed or silenced [3]. Finer-scale domains of variable size have also been identified, referred to as topologically associating domains (TADs) [4-5]. The elements within TADs are enriched in proximity interactions, whereby the intervening DNA between two regulatory elements is looped out, allowing the elements to sit close together in space and share the machinery which binds to their sequences and regulates RNA output. The best-studied cases of these are enhancer-promoter interactions, whereby an interaction with an enhancer boosts or represses the output of the associated gene.

There are various clues that three dimensional structures are closely coupled with expression output. Deletions or inversions of boundary regions demarcating domains have led to ectopic expression at regions which would otherwise be silenced, often with disease-associated implications [6]. On the other hand, expression seems to be of modest importance in the maintenance of 3D structures [7-8], and RNA expression observed at both enhancer and promoter has also been seen to be predictive of individual interactions [9]. These observations and others suggest the importance of correct 3D chromatin conformation in the regulation of expression and vice versa, but the extent of their interdependency is unclear.

In our study, we reasoned that whilst the chromosomal positioning of regulatory elements within their physical context appears crucial for their correct output, there may be other, independent factors involved in regulating their resulting expression profiles. A way to separate out these two aspects from expression datasets could lead to new insights into cell-type specific expression regulation within 3D topological contexts. With this in mind, we developed 'transcriptional decomposition' (TD), a random effects model which takes expression output on a chromosome by chromosome basis and models it as the sum of two main sub-components: a component assuming an underlying, linear dependency between adjacent 10 kb regions, and a component assuming that the expression from a given region is independent of the expression from any other region. The result is to split, or decompose, RNA expression datasets into two pieces, one exploiting the relative positions of expressed units and the other serving as a measure of expression independent of position (see Figure 1 below for a graphical overview). Importantly, the two separated pieces generally do not correlate, suggesting that they might represent two different modes of regulation which we are able to separate from one another (something we attempt to investigate over the course of the paper).

Figure 1: Overview of the transcriptional decomposition (TD) approach. (A) Expression levels from positionally organised units within chromatin neighbourhoods are expressed as two components: positionally dependent (PD) and positionally independent (PI). (B) We model binned tag counts from replicated samples at 10 kb resolution to separate out expression levels into an intercept, PI and PD. Figure taken from Figure 1 of the pre-print; see [1] for further details.
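
To make the idea of splitting a signal into an intercept, a PD part and a PI part more concrete, here is a minimal toy sketch in Python. It is not the model from the pre-print [1], which is fitted to replicated binned tag counts. The sketch instead assumes a single, already transformed expression value per bin, Gaussian noise, a first-order random walk between adjacent bins for the PD part, and a smoothness weight chosen by hand. The function names, parameter values and the simulated data are all hypothetical and are there purely for illustration.

```python
import numpy as np


def simulate_expression(n_bins=200, intercept=2.0, pd_step_sd=0.3, pi_sd=1.0, seed=0):
    """Simulate toy per-bin expression as intercept + PD + PI (assumed Gaussian)."""
    rng = np.random.default_rng(seed)
    # PD: each bin equals its neighbour plus a small step (first-order random walk).
    pos_dep = np.cumsum(rng.normal(0.0, pd_step_sd, n_bins))
    # PI: every bin is drawn independently of all other bins.
    pos_indep = rng.normal(0.0, pi_sd, n_bins)
    return intercept + pos_dep + pos_indep, pos_dep, pos_indep


def decompose(y, smoothness=10.0):
    """Split y into intercept + PD + PI by penalised least squares.

    PD minimises ||r - pd||^2 + smoothness * ||D pd||^2, where r is the
    mean-centred signal and D takes differences between adjacent bins.
    PI is whatever is left over in each bin.
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    intercept = y.mean()
    centred = y - intercept
    # (n-1) x n first-difference matrix: row i gives pd[i+1] - pd[i].
    diff = np.diff(np.eye(n), axis=0)
    pos_dep = np.linalg.solve(np.eye(n) + smoothness * diff.T @ diff, centred)
    pos_indep = centred - pos_dep
    return intercept, pos_dep, pos_indep


if __name__ == "__main__":
    y, _, _ = simulate_expression()
    intercept, pos_dep, pos_indep = decompose(y)
    print("intercept:", round(float(intercept), 3))
    print("variance of PD part:", round(float(pos_dep.var()), 3))
    print("variance of PI part:", round(float(pos_indep.var()), 3))
```

Under these Gaussian assumptions the smoothness weight plays the role of the ratio between the PI variance and the variance of the PD steps, so a larger value pushes more of the signal into the PI part. The three returned pieces always add back up to the input, which is the property the decomposition relies on.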

We hypothesised that the first component, referred to as positionally-dependent (PD), should reflect co-regulated neighbourhoods of regulatory elements according to 3D proximities. This is what we see in general, and we go on to show that the PD component is highly useful in predicting compartments and TAD boundaries. The second component, referred to as positionally-independent (PI), on the other hand appears to represent gene-specific local regulatory programs. Intriguingly, we find that whilst the PD and PI components each explain a large proportion of expression datasets (roughly 50-50), there is large variance from locus to locus, suggesting gene-specific differences in the impact of chromatin context. We also leverage our components, and other information from expression, to show that expression data alone is useful in determining which pairs of enhancers and promoters are likely to form interactions.

We furthermore apply our models to 76 replicated samples from the FANTOM5 Consortium [10] (see also separate blog posts) and utilise the information to predict 3D architectures and enhancer-promoter interactions across a multitude of cell types without available chromatin data. These data, which we also link to in the pre-print, are available at https://zenodo.org/record/556727.

References

[1] Rennie, S., Dalby, M., van Duin, L., and Andersson, R. (2017). Transcriptional decomposition reveals active chromatin architectures and cell specific regulatory interactions. bioRxiv, 130070.

[2] Cohen, B. A., Mitra, R. D., Hughes, J. D., and Church, G. M. (2000). A computational analysis of whole-genome expression data reveals chromosomal domains of gene expression. Nature genetics, 26(2):183–186.

[3] Lieberman-Aiden, E., van Berkum, N. L., Williams, L., Imakaev, M., Ragoczy, T., Telling, A., Amit, I., Lajoie, B. R., Sabo, P. J., Dorschner, M. O., et al. (2009). Comprehensive Mapping of Long-Range Interactions Reveals Folding Principles of the Human Genome. Science (New York, N.Y.), 326(5950):289–293.

[4] Dixon, J. R., Selvaraj, S., Yue, F., Kim, A., Li, Y., Shen, Y., Hu, M., Liu, J. S., and Ren, B. (2012). Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature, 485(7398):376–380.

[5] Nora, E. P., Lajoie, B. R., Schulz, E. G., Giorgetti, L., Okamoto, I., Servant, N., Piolot, T., van Berkum, N. L., Meisig, J., Sedat, J., et al. (2012). Spatial partitioning of the regulatory landscape of the X inactivation centre. Nature, 485(7398):381–385.

[6] Lupiáñez, D. G., Kraft, K., Heinrich, V., Krawitz, P., Brancati, F., Klopocki, E., et al. (2015). Disruptions of topological chromatin domains cause pathogenic rewiring of gene-enhancer interactions. Cell, 161(5):1012–1025.

[7] Li, L., Lyu, X., Hou, C., Takenaka, N., Nguyen, H. Q., Ong, C.-T., Cubeñas-Potts, C., Hu, M., Lei, E. P., Bosco, G., et al. (2015). Widespread rearrangement of 3D chromatin organization underlies polycomb-mediated stress-induced silencing. Molecular cell, 58(2):216–231.

[8] Branco, M. R. and Pombo, A. (2006). Intermingling of chromosome territories in interphase suggests role in translocations and transcription-dependent associations. PLoS Biol, 4(5):e138.

[9] Whalen, S., Truty, R. M., and Pollard, K. S. (2016). Enhancer–promoter interactions are encoded by complex genomic signatures on looping chromatin. Nature genetics, 48(5):488–496.

[10] FANTOM Consortium and the RIKEN PMI and CLST (DGT), Forrest, A. R. R., Kawaji, H., Rehli, M., Baillie, J. K., de Hoon, M. J. L., Haberle, V., Lassmann, T., et al. (2014). A promoter-level mammalian expression atlas. Nature, 507(7493):462–470.
