(A) A stack of snapshots taken through the whole volume of a single cell; the resulting maximum-intensity projection (green box) and a single slice (blue box) are fed into the image-processing algorithm for dot detection. (B) Cumulative distributions of dot counts for each of the two imaging approaches, shown across a population of cells. (C) Same distributions as in B, but normalized by the sample median. (D) Technical replicates for the single-slice approach. (E) Correlation between dGFP protein fluorescence and dGFP transcripts measured simultaneously (left), and correlation between Rex1 (unmodified allele) and dGFP (knock-in reporter on second allele) transcripts (right). r is the Pearson correlation coefficient. (F) (Left) Sorted subpopulations of the bimodal Rex1-dGFP knock-in reporter. (Right) qPCR results on these subpopulations for a subset of target genes also examined by smFISH. Values were normalized to expression levels of the housekeeping gene Gapdh and are represented as 2^(−ΔΔCt) with respect to the 'Rex1-high' subpopulation.
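The 2^(−ΔΔCt) normalization used in panel F can be illustrated with a minimal sketch. It assumes the standard form of the calculation (target Ct minus Gapdh Ct, then sample minus calibrator) and perfect doubling per cycle, which the base-2 expression implies. The function name and the Ct values are hypothetical and are not taken from the qPCR data.

```python
def relative_expression(ct_target, ct_ref, ct_target_cal, ct_ref_cal):
    """Return 2^(-ddCt) for one gene in one sample.

    ct_target, ct_ref: Ct of the target gene and of the housekeeping
    gene (Gapdh in the caption) in the sample of interest.
    ct_target_cal, ct_ref_cal: the same two Ct values in the calibrator
    sample (the 'Rex1-high' subpopulation in the caption).
    """
    d_ct_sample = ct_target - ct_ref
    d_ct_calibrator = ct_target_cal - ct_ref_cal
    dd_ct = d_ct_sample - d_ct_calibrator
    return 2.0 ** (-dd_ct)


# Hypothetical Ct values, for illustration only.
rex1_high = {"target": 24.0, "gapdh": 18.0}
rex1_low = {"target": 26.0, "gapdh": 18.0}

fold_low_vs_high = relative_expression(
    rex1_low["target"], rex1_low["gapdh"],
    rex1_high["target"], rex1_high["gapdh"],
)
print(fold_low_vs_high)  # 0.25
```

By construction the calibrator itself evaluates to 1, so every bar in such a plot reads as a fold change relative to the 'Rex1-high' subpopulation.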
[Figure image, labeled "Figure S2." in the artwork. Recoverable labels: scatter plots of Nanog vs. Rex1 transcript counts for unsynchronized (r=0.65) and synchronized (r=0.68) populations; per-gene histograms of Population Fraction vs. Transcript Count/Cell, grouped as Long-Tailed, Bimodal, and Unimodal, each annotated with the gene name and a p-value.]
Figure 2.7: mRNA distributions and correlations by smFISH
(A) Empirical distributions and MLE fits for unimodal, bimodal, and long-tailed genes. p-values are for χ2 GOF tests; p>0.05 indicates that the fit cannot be distinguished from the empirically measured distribution at the 5% significance level. Where present, solid lines represent components of the fit, and the dashed line represents the overall fit to the distribution. (B) Pairwise relationships between heterogeneously expressed genes. p-values are from the 2D KS-test. r is the Pearson correlation coefficient. (C) Correlation and marginal distributions of Rex1 and Nanog in a control population (top) and in a population synchronized by a double thymidine block and fixed immediately following the block (bottom). r is the Pearson correlation coefficient.
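The logic of an MLE fit followed by a χ2 GOF test can be sketched as below. This is an illustration only: the caption does not state which model families were fitted, so a single Poisson is used here as a stand-in, and the binning scheme, function names, and counts are all assumptions rather than the analysis behind the figure.

```python
import math


def chi2_sf(x, dof):
    """Upper-tail probability of a chi-squared variable (integer dof >= 1).

    Uses the recurrence Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1)
    for the regularized upper incomplete gamma function, with y = x / 2.
    """
    y = x / 2.0
    if dof % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    while a < dof / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


def poisson_pmf(k, lam):
    return math.exp(-lam) * lam ** k / math.factorial(k)


def chi2_gof_poisson(counts, n_bins):
    """Fit a Poisson by MLE, then test the fit with a chi-squared GOF test.

    Bins are 0, 1, ..., n_bins - 2, plus a last bin pooling all larger counts.
    """
    n = len(counts)
    lam = sum(counts) / n  # the Poisson MLE is the sample mean
    observed = [0] * n_bins
    for c in counts:
        observed[min(c, n_bins - 1)] += 1
    probs = [poisson_pmf(k, lam) for k in range(n_bins - 1)]
    probs.append(1.0 - sum(probs))  # pooled upper tail
    expected = [n * p for p in probs]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    dof = n_bins - 1 - 1  # bins minus one, minus one fitted parameter
    return stat, dof, chi2_sf(stat, dof)


# Hypothetical transcript counts per cell, for illustration only.
counts = [2, 3, 1, 4, 2, 0, 3, 2, 5, 1, 2, 3, 2, 1, 4, 3, 2, 2, 1, 3]
stat, dof, p_value = chi2_gof_poisson(counts, n_bins=6)
print(stat, dof, p_value)
```

A p-value above 0.05 from such a test means only that the data give no evidence against the fitted model at that level, which is how the p-values in the caption are meant to be read.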
[Figure image, labeled "Figure S3" in the artwork. Recoverable labels: bisulfite sequencing plots and percentage bar graphs for Mael, Dazl, and Sycp3 in Rex1-High, Rex1-Low, and Rex1-Rev cells; per-gene scatter plots of Rex1-Low Subpopulation Single CpG Methylation Fraction vs. Rex1-High Subpopulation Single CpG Methylation Fraction (both 0 to 1) for B-actin, Dazl, Dnmt3b, Dppa3, Esrrb, Fgf4, Gapdh, Klf4, Mael, Pou5f1, Prdm14, Sdha, Sox2, Sycp3, Tbx3, Tcl1, and Tet1; color scale: Relative Position of CpG, from 2kb upstream of TSS to 500bp downstream of TSS.]
Figure 2.8: Differential methylation between Rex1 states
(A) Locus-specific bisulfite sequencing plots for Rex1-high, Rex1-low, and low-to-high-reverting cells at three targets of methylation. Open circles are unmethylated CpGs, filled circles are methylated CpGs, and x's are unknown. (B) Measurements from A plotted as bar graphs for comparison. (C) Scatter plots showing how single CpGs in the promoter of a given gene change between Rex1-high and Rex1-low states. Color coding represents the position of each CpG relative to the transcriptional start site.
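How the circles in panel A reduce to the fractions in panels B and C can be sketched as follows. The encoding is an assumption made for illustration ('M' methylated, 'U' unmethylated, 'x' unknown, one string per sequenced clone), as is the choice to exclude unknown calls from the denominator; the clone data are hypothetical.

```python
def per_cpg_fraction(clones):
    """Methylated fraction at each CpG position, ignoring unknown calls.

    clones: list of equal-length strings, one per sequenced clone, with
    'M' = methylated, 'U' = unmethylated, 'x' = unknown (assumed encoding).
    Returns None for a position with no known calls.
    """
    n_sites = len(clones[0])
    fractions = []
    for i in range(n_sites):
        column = [clone[i] for clone in clones]
        methylated = column.count("M")
        known = methylated + column.count("U")
        fractions.append(methylated / known if known else None)
    return fractions


def overall_fraction(clones):
    """Methylated fraction pooled over all CpGs and clones at one locus."""
    joined = "".join(clones)
    methylated = joined.count("M")
    known = methylated + joined.count("U")
    return methylated / known if known else None


# Hypothetical locus with four CpGs and three clones.
clones = ["MMUx", "MUUM", "xMUM"]
print(per_cpg_fraction(clones))
print(overall_fraction(clones))  # 0.6
```

The per-position values correspond to one point per CpG in a scatter plot such as panel C, and the pooled value to one bar in a plot such as panel B.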
[Figure image; most label text was extracted with a shifted character encoding and is decoded here. Recoverable labels: reporter construction schematics (endogenous Nanog locus, knock-in and selection, site-specific integration, random integration, Oct4 locus on BAC, targeting of reporter and selection cassette; elements H2B-Citrine, H2B-Cerulean, Neo, Hyg, IRES, PGK, pA; cell lines NKICit, NKICit+Cer, OBACCer); Nanog mRNA vs. Citrine mRNA for NKICit in Serum+LIF; Total YFP (a.u.) and YFP slope (a.u./15min) vs. Time (hours), with cycloheximide marked; Total fluorescence and continuized trace vs. Time (hours), with frames removed around divisions, cell divisions, residual noise, threshold, significant slope changes, detected step, and intra-state segments marked; simulated Oct4 Total protein (a.u.) vs. Time (hours); Total Protein vs. Time over mRNA half-life (hour) and Burst frequency (hour⁻¹), with up-step and down-step frequencies; Production rate before division vs. Production rate after division; Fraction vs. Production rate difference (fold, higher rate/lower rate) for state-switching events and intra-state steps, p = 0.007.]
Figure 2.9: Construction and analysis of live cell reporters, and simulations based on observed kinetics
(A) Schematic of Nanog reporter (top) and Oct4 reporter (bottom) construction.
(B) Correlation between Nanog (unmodified allele) and Citrine (knock-in reporter on second allele) transcripts in the NKICit cell line. r, Pearson correlation coefficient. Light blue, presumed fraction of cells with silenced reporter cassettes (∼10% of all cells; see Supp. Info. for discussion); dark blue, remaining cell population. (C) H2B-Citrine protein degradation rate assayed by blocking translation during the movie at the time indicated. Total YFP became flat (top) with negligible slope (bottom) shortly after cycloheximide treatment. (D) Identification of sharp inflections in total fluorescence traces. (i) Frames around cell divisions are removed, and fluorescence lost during divisions is added back to the daughter trace to create a continuous trace for each lineage. (ii) A step detector spanning a 6-hour window is applied across consecutive frames of the continuous trace. (iii) For each window, a one-piece linear fit is compared with a two-piece fit that is flexible at the midpoint. The two-piece fit is considered better than the one-piece fit when two criteria are met: 1) the residual noise of the one-piece fit is higher than a threshold (see Supp. Info.), and 2) the slopes of the two pieces are significantly different. (iv) For each stretch of frames meeting both criterion 1 (magenta line indicates threshold) and criterion 2 (orange line indicates where the two-piece fit yields significantly different slopes), the window with the highest residual noise is assigned as the inflection. (v) The continuized trace is approximated by linear segments between identified points of inflection. (E) Apparent steps from simulated Oct4 expression under the bursty transcription model using parameters estimated from smFISH. (F) Protein traces were simulated under the bursty transcription model over various combinations of mRNA half-life and burst frequency; mean burst size was kept constant at 35 mRNA/burst. Gaussian noise proportional to the total protein level and equivalent in magnitude to the empirically observed frame-to-frame variation was added to the simulated traces for comparability. Arrowheads indicate detected steps on the simulated trace of the corresponding color. Note that changes in production rate around cell division events can be identified as steps either before or after the division. Red box: estimated regime for Nanog-Hi in serum+LIF. Right: variation in the frequency of detected steps over the same parameter space. (G) Production rates decrease by an average of 0.63-fold across cell divisions. Each point represents a division event. Average production rates of the 4-hour windows before and after each cell division are compared. Black dotted line: zero change; grey dotted line: 0.5-fold change; purple line: average trend. Inset: example trace indicating slope before and after division. (H) Changes in production rate over state-switching events or intra-state steps. 'Higher rate'-to-'lower rate' ratios are plotted for all steps and events, i.e., down-steps and Nanog-high-to-Nanog-low switching events are represented by the reciprocals of the rate change (p-value, KS test).
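The window comparison in panel D (iii) can be sketched in a few lines. This is a simplified illustration, not the detector used for the figure: the two-piece fit is modeled as a continuous line with a hinge at the window midpoint, the residual noise is the plain mean squared error of the one-piece fit, and a fixed cutoff on the slope difference stands in for the significance test. The function names, both thresholds, and the 15-minute frame spacing of the synthetic trace are assumptions.

```python
def solve_linear(a, b):
    """Solve a small dense system a x = b (Gaussian elimination, pivoting)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / m[r][r]
    return x


def least_squares(columns, y):
    """Coefficients minimizing the squared error of y ~ sum_j coef[j] * columns[j]."""
    k = len(columns)
    ata = [[sum(u * v for u, v in zip(columns[i], columns[j]))
            for j in range(k)] for i in range(k)]
    aty = [sum(u * v for u, v in zip(columns[i], y)) for i in range(k)]
    return solve_linear(ata, aty)


def mean_squared_error(columns, coef, y):
    pred = [sum(c * col[i] for c, col in zip(coef, columns))
            for i in range(len(y))]
    return sum((p - v) ** 2 for p, v in zip(pred, y)) / len(y)


def window_fits(t, y):
    """Return (one-piece MSE, slope before midpoint, slope after midpoint)."""
    ones = [1.0] * len(t)
    t_mid = t[len(t) // 2]
    hinge = [max(ti - t_mid, 0.0) for ti in t]  # lets the slope change at t_mid
    coef1 = least_squares([ones, t], y)
    coef2 = least_squares([ones, t, hinge], y)
    return mean_squared_error([ones, t], coef1, y), coef2[1], coef2[1] + coef2[2]


def looks_like_step(t, y, noise_threshold, min_slope_diff):
    """Both criteria: poor one-piece fit AND clearly different slopes."""
    mse1, slope_before, slope_after = window_fits(t, y)
    return mse1 > noise_threshold and abs(slope_after - slope_before) > min_slope_diff


# Synthetic 6-hour window sampled every 15 minutes (assumed frame spacing).
t = [0.25 * i for i in range(25)]
y_step = [ti if ti <= 3.0 else 3.0 + 3.0 * (ti - 3.0) for ti in t]
y_line = [2.0 * ti for ti in t]
print(looks_like_step(t, y_step, noise_threshold=0.05, min_slope_diff=0.5))  # True
print(looks_like_step(t, y_line, noise_threshold=0.05, min_slope_diff=0.5))  # False
```

Sliding such a window across consecutive frames and keeping, within each run of flagged windows, the one with the largest one-piece residual would correspond to step (iv).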
[Figure image, labeled "Figure S5." in the artwork. Recoverable labels: per-gene histograms of Population Fraction vs. Transcript Count/Cell in 2i+Serum+LIF, each annotated with the gene name and a p-value; Total reporter (a.u.) vs. Time (hours) for a Nanog-Low to Nanog-SH trace, with the state-switching event marked; simulated Total Protein vs. Time and Production rate rank vs. Time over Burst size (mRNA/burst) and Burst frequency (hour⁻¹), with up-step and down-step frequencies and mixing time by simulation; Production rate (a.u.), Production rate rank, and Total Protein (a.u.) vs. Time (hours) for "Serum+LIF: Nanog-Hi intra-state" and "2i+Serum+LIF: Nanog-SH intra-state", color-coded by production rate rank at t = 0 (Lowest to Highest).]
Figure 2.10: Quantitative analysis of how 2i+serum+LIF affects static distributions and dynamics of gene expression for pluripotency regulators
(A) smFISH transcript count distributions of factors in 2i+serum+LIF with MLE fits overlaid. p-values are for χ2 GOF tests; p>0.05 indicates that the fit cannot be distinguished from the empirically measured distribution at the 5% significance level. Where present, solid lines represent components of the fit, and the dashed line represents the overall fit to the distribution. (B) Example trace of a cell switching from Nanog-low to Nanog-SH in 2i+serum+LIF. (C) Left: simulated traces similar to Fig. 2.9F, except over various combinations of burst size and burst frequency; mRNA half-life was kept constant at 4 hours. Bottom right: rank of production rate of 30 randomly selected traces (out of a total of 200) in each simulation under the corresponding parameter combination. Traces are color-coded by the initial rank at t = 0 as in D. Top right: mixing time of protein production rate, defined as the time at which the auto-correlation of rank drops below 0.5. (D) Nanog expression dynamics of cells in serum+LIF with or without 2i. Each trace represents one cell randomly picked from a tracked lineage tree. Production rates are normalized by cell size and ranked within the group for each time point. Traces are color-coded by the initial rank at t = 0.
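The mixing-time definition in panel C can be made concrete with a short sketch. The caption does not specify how the auto-correlation of rank is estimated, so the version below makes an assumption: it correlates each cell's rank at t = 0 with its rank at each later time point across the group, and reports the first time at which that correlation falls below the cutoff. Function names and traces are hypothetical.

```python
def ranks(values):
    """Rank values from 0 (lowest) upward; ties are broken by position."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0] * len(values)
    for rank, idx in enumerate(order):
        out[idx] = rank
    return out


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def mixing_time(traces, times, cutoff=0.5):
    """First time at which the rank correlation with t = 0 drops below cutoff.

    traces[c][k] is the production rate of cell c at times[k].
    Returns None if the correlation never drops below the cutoff.
    """
    rank0 = ranks([trace[0] for trace in traces])
    for k, t in enumerate(times):
        rank_k = ranks([trace[k] for trace in traces])
        if pearson(rank0, rank_k) < cutoff:
            return t
    return None


# Hypothetical production rates for four cells at three time points.
times = [0, 1, 2]
traces = [
    [1.0, 1.0, 4.0],
    [2.0, 2.0, 3.0],
    [3.0, 3.0, 2.0],
    [4.0, 4.0, 1.0],
]
print(mixing_time(traces, times))  # 2
```

In this toy example the rank order is preserved at the second time point and fully inverted at the third, so the reported mixing time is the third time point.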
Direction of switch   Neither sister switched   Only one sister switched   Both sisters switched   Expected number of sister pairs that both switched**
NLo-to-NHi            169                       7                          7                       [0 - 1]
NHi-to-NLo            139                       15                         2                       [0 - 2]

Table 2.1: State-switching events show no correlation between sister cells
Data shown in Table 2.1 are combined results from two independent experiments. Analysis of individual data sets yields the same conclusion.
* Data points are discarded if one of the cells in a sister pair was lost or not traceable in the movie.
** Confidence interval obtained by random permutation test with 100,000 trials. Green indicates that the observed frequency of sister pairs in which both cells switched falls within the 95% C.I.
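A random permutation test of this kind can be sketched as below. The table notes do not describe the exact permutation scheme, so this version makes an assumption: switch labels are shuffled among all cells, adjacent cells are treated as sister pairs, and the interval is read off the 2.5th and 97.5th percentiles of the shuffled counts. An interval from this sketch therefore need not match the table. The function name is hypothetical, and the trial count is reduced from 100,000 to keep the example fast.

```python
import random


def both_switched_interval(n_neither, n_one, n_both, trials=2000, seed=0):
    """95% interval for the number of pairs in which both sisters switch,
    if switching cells were distributed among sister pairs at random."""
    n_pairs = n_neither + n_one + n_both
    n_switched = n_one + 2 * n_both
    labels = [1] * n_switched + [0] * (2 * n_pairs - n_switched)
    rng = random.Random(seed)
    counts = []
    for _ in range(trials):
        rng.shuffle(labels)
        # After shuffling, cells 2i and 2i + 1 are treated as sister pair i.
        both = sum(labels[i] & labels[i + 1] for i in range(0, 2 * n_pairs, 2))
        counts.append(both)
    counts.sort()
    lo = counts[int(0.025 * (trials - 1))]
    hi = counts[int(0.975 * (trials - 1))]
    return lo, hi


# Counts from the NLo-to-NHi row: neither, only one, both.
lo, hi = both_switched_interval(169, 7, 7)
print(lo, hi)
```

An observed count of both-switched pairs that falls outside such an interval would suggest that sister cells switch together more or less often than chance predicts.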