Chapter III: Transcriptional topology

III.1: Introduction: What we knew about transcriptional topology at the beginning of this

Transcriptional topology, the portion of chromatin topology involved in transcriptional regulation, has been conceptually differentiated from chromatin topology since the “looping model” of enhancement came to prominence (Muller, Sogo et al. 1989), but genes, their location, and their activity have also been studied with respect to chromatin outside of enhancement alone. The role of MARs (DNA regions associating with the nuclear matrix) in marking the boundaries of active chromatin domains was a popular field in the 1980s (see Mirkovitch, Mirault et al. 1984), and it was known that active genes associate with MARs in a variety of organisms (Robinson, Small et al. 1983; Ciejek, Tsai et al. 1983; Small, Nelkin et al. 1985; Gasser and Laemmli 1986), and even that MARs in some cases overlapped enhancers. In the 2000s, researchers began to study chromatin conformation and its effects on genes more closely, noting that some genes are able to “loop out” of place upon activation (Chambeyron and Bickmore 2004), and that gene-poor and gene-rich regions separate (Shopland, Lynch et al. 2006), or that only certain classes of genes do this (Simonis, Klous et al. 2006). Along with this line of thinking came the notion of the “transcription factory,” previously noted through microscopy as rare foci of pol2 (Jackson, Hassan et al. 1993; Iborra, Pombo et al. 1996) actively transcribing genes (Verschure, van Der Kraan et al. 1999), as a method for groups of related genes to be expressed, and perhaps a primary mode of transcription.

Transcription factories are, according to some definitions, architecturally unchanging elements, since genes can move to and from transcription factories (Osborne, Chakalova et al. 2004) and their existence does not depend on transcription itself (Mitchell and Fraser 2008; Palstra, Simonis et al. 2008).

The contemporary genome-wide chromatin literature talks much about a potentially related concept called a “topologically associating domain” (TAD), a region of chromatin containing active genes capable of interacting with each other and bounded by CTCF and cohesin (Dixon, Selvaraj et al. 2012; Li, Huang et al. 2013). It is unknown how much the modern TAD has in common with the earlier understanding of separate gene-rich and gene-poor areas, and whether it explains all of that understanding or only some of it. In addition to containing active genes, TADs seem to have a role in the timing of DNA replication (Pope, Ryba et al. 2014) and are also associated with lamina-associated domains (Peric-Hupkes, Meuleman et al. 2010). The TAD is by no means the smallest unit of chromatin within which genes preferentially interact, which may be understandable since the TAD was defined by the 1 Mb-resolution Hi-C method, while other domains, sometimes confusingly called “smaller TADs,” are found with more sensitive measures like 5C and with more computational processing (Phillips-Cremins, Sauria et al. 2013; Filippova, Patro et al. 2014).

All of the above now appears quite relevant to this study, since most genes that connect to far-distal elements connect to elements within their neighborhood of about 150 kb. A gene’s neighborhood must be important with respect to what most genes connect to, since all of those connections are within that neighborhood, and the above bulk-scale or microscopy assays showing active elements connecting to other active elements are likely related to a subset of the larger CIGs I report. Others have reported that there are some active-gene-poor and some active-gene-rich areas (“ridges”), and it has even been claimed that certain meta-classes of genes such as transcription factors are more likely to be in gene-poor areas, whereas lineage-specific genes are more likely to be in gene-rich areas (Lercher, Urrutia et al. 2002). However, a similar paper found that developmentally related genes could actually be in either type of area (Versteeg, van Schaik et al. 2003).

Then there are the cases of known, validated E:P “looping” interactions.

Considering this background, pol2 ChIA-PET data are expected to identify physical interactions of several different functional classes. I am most likely seeing many classic E:P “looping” interactions, but I am also surely seeing interactions that are primarily involved in the nuclear architecture. Such interactions can be mediated by known DNA site-specific chromatin factors such as CTCF (Splinter, Heath et al. 2006; Kim, Abdullaev et al. 2007; Handoko, Xu et al. 2011; Ong and Corces 2014), ZNF143 (Bailey, Zhang et al. 2015), or YY1 (Harr, Luperchio et al. 2015; Zeng 2015), as well as by less sequence-specific factors (Galande, Purbey et al. 2007) and likely by less-known factors and RNA components (Magistri, Faghihi et al. 2012). In chapter III, I focused on the general aspects of topology that are consistent across ChIA-PET experiments. In this chapter, I will focus on the topological interactions that are different between genes of different expression classes.

III.1.1: State-to-state changes

Globally, chromatin interactions are not thought to change much (Simonis, Klous et al. 2006; Hakim, Sung et al. 2011). Nevertheless, some E:P interactions have been shown to be transient and dependent on transcription (Cheutin, O'Donohue et al. 2003; Kosak and Groudine 2004; Meshorer and Misteli 2006). One experimentally driven hypothesis is that rearrangement of CRMs can only occur within certain native active chromatin domains (Noordermeer, Branco et al. 2008). I will report in this chapter on the changes in detectable interactions between two developmental states, but a caveat is that these interactions may well be invisible to us before genes are active, since the ChIA-PET experiments I undertook only detect interactions that co-occur with ChIPpable factors associated with transcription. Therefore, ChIA-PET cannot tell us where connectivity changes, only where the active use of connected elements may change from state to state.

III.1.2: Housekeeping genes

The notion of the housekeeping gene has been prevalent since the discovery of genes themselves. Since certain enzymes, structural elements, and other core parts of the universal cellular machinery must be expressed at roughly similar levels in every cell, the reasoning goes, these genes don’t need to be regulated. Housekeeping genes are often used as a foil or control for developmental genes, which are regulated and differentially expressed in different cell types.

The promoters of some housekeeping genes were investigated during the course of researching promoter and transcription biochemistry, but because of the technological limits of the day, highly expressed genes had to be studied, and genes involved in disease and development were investigated first. It was probably the fact that TATA is prevalent in the promoters of developmental genes that led to it being the first promoter motif discovered in mammals (Goldberg 1989), a bias that was noticed by the researchers of the time (Breathnach and Chambon 1981). In contrast, the TATA-less genes were generally regarded as housekeeping genes with low expression levels and multiple 5’ ends (Dynan 1986).

Perhaps it was because the promoters of housekeeping genes seemed more complicated to study than those of developmental genes, or perhaps because many were expressed at a modest level, or perhaps because developmentally regulated genes have traditionally been more scientifically exciting. Whatever the cause, for a type of gene that is often used as a conceptual or experimental foil, housekeeping genes have fallen by the wayside in terms of direct research. One reason that surely has played a part in this lack of research is that, somewhere along the line, “being regulated” became colloquially synonymous with “has enhancer(s).” Housekeeping genes, thought to have steady expression levels, are assumed to maintain these levels by virtue of a constitutively active promoter. Meanwhile, developmental genes, which have varying levels of expression, are assumed to need CRMs in order to modulate their levels of transcription over development and in different tissues.

A few people have managed to study the regulation of housekeeping genes, though. One person who studied a vital housekeeping gene, DHFR, was Dr. Peggy Farnham, despite the difficulties of obtaining funding for something assumed not to happen (B. Wold, pers. comm.). Dr. Farnham found that it did, indeed, have enhancers (Farnham and Schimke 1985). However, Dr. Farnham attributed this need for enhancers to the fact that DHFR was known to be differentially regulated in the cell cycle and did not attempt to question whether housekeeping genes broadly had enhancers.

Likewise, in the case of string in Drosophila, a gene that appeared at the level of tissues to be broadly expressed but was in fact differentially regulated at the level of cells, enhancers were found but were again written off as a peculiarity of the particular gene studied. Enhancers continued to be studied almost exclusively in the context of developmental genes over the next decade, and the preponderance of developmental gene enhancer literature and absence of housekeeping gene expression literature seemingly led many to forget the formal possibility that housekeeping genes in general might have enhancers.

Is it really possible for any gene, much less most genes, to truly be unregulated in any way in every cell type? Some cell types, like immune cells with their rearranged genomes or neurons and spermatocytes with their uniquely stripped-down metabolic requirements, surely contain a large number of “housekeeping” genes with varied levels of expression relative to the other tissues in the body. Furthermore, even genes that are known to be regulated by enhancers can appear to drive native expression patterns with their proximal promoters alone, as was once the case with myogenin (Yee and Rigby 1993), so the lack of apparent necessity for CRMs in an assay does not disprove their existence.