Molecular and Cellular Biology, November 2005, p. 9447-9459, Vol. 25, No. 21
0270-7306/05/$08.00+0 doi:10.1128/MCB.25.21.9447-9459.2005
Copyright © 2005, American Society for Microbiology. All Rights Reserved.
Curriculum in Genetics and Molecular Biology (1), Department of Biochemistry and Biophysics (2), Department of Biology (3), Carolina Center for the Genome Sciences, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599 (4)
Received 6 June 2005/ Returned for modification 5 July 2005/ Accepted 9 August 2005
Regulation of accessibility to the DNA template is likely to be mediated in large part through differential regulation of nucleosome occupancy. Promoter regions of S. cerevisiae exhibit reduced nucleosome occupancy genome-wide (2, 26), and these differences in nucleosome occupancy are important for promoter accessibility (47). Furthermore, in S. cerevisiae, promoter and nonregulatory chromatin can be biochemically fractionated, indicating that those regions have distinct physical properties (36). Nucleosomes can be moved or displaced from specific genomic regions by several general mechanisms, including nucleosome-remodeling complexes like SWI/SNF and RSC (34), binding of activators to DNA (3, 4, 35), transcriptional elongation by RNA polymerase II (RNAP II) (20, 26, 46), and inherent properties of DNA sequence (47). Template accessibility and nucleosome occupancy can also be mediated by posttranslational modification of the N-terminal histone tails, most notably acetylation (27, 48). Although chromatin context may be defined in part by regional differences in histone modifications, no chromatin mark has been shown to correspond specifically to coding or regulatory regions throughout the genome. Here, we present evidence that dimethylation of histone H3 at lysine 36 (H3K36me2), which is mediated by the methyltransferase Set2p (25, 51), may provide such a mark.
Set2p interacts with the C-terminal domain (CTD) of RNAP II (22, 29, 56), and this interaction is regulated by the phosphorylation state of the CTD. Serine 5 (Ser5) of the CTD repeat is phosphorylated by Kin28p during initiation of transcription, while serine 2 (Ser2) and Ser5 are phosphorylated by Ctk1p during elongation (8, 17, 19, 30). Set2p associates preferentially with Ser2/Ser5 phosphorylated repeats of the RNAP II CTD, and deletion of CTK1 abolishes H3K36me2 (22, 56). Set2p-RNAP II interactions are also dependent on the Paf1 complex (Paf1p, Rtf1p, Cdc73p, Ctr9p, and Leo1p) (22), which also associates with RNAP II (21, 40, 49). This and other biochemical data suggest that Set2p associates with RNAP II specifically during transcription elongation (18, 22, 29, 45, 56). Chromatin immunoprecipitation (ChIP) assays followed by quantitative PCR on a few selected loci have supported this assertion, showing that H3K36me2 is generally restricted to the transcribed regions of RNAP II-regulated genes (1, 18, 22, 45).
While there is strong evidence that Set2p is associated with elongating polymerase, the physiological functions of Set2p and H3K36me2 are still unknown. Evidence suggesting a function for Set2p in transcriptional elongation comes from results showing either sensitivity or resistance of set2Δ strains to the elongation inhibitor 6-azauracil. These phenotypes are similar to those exhibited by strains defective for genes encoding elongation factors like Chd1p, Iswi1p, and Fkh1p (22, 28, 29, 45, 57). A role in transcriptional elongation is also supported by synthetic genetic interactions between set2Δ and deletions of all members of the Paf1 complex, the chromodomain factor Chd1p, a putative elongation factor Soh1p, and the Bre1p or Lge1p components of the histone H2B ubiquitination complex (22). However, whatever role Set2p plays in elongation is either not essential or redundant, since set2Δ strains are viable and, in many backgrounds, exhibit very mild phenotypes.
To further elucidate the cellular function of H3K36me2, we determined its pattern of distribution throughout the S. cerevisiae genome. We performed additional experiments to determine how the pattern of H3K36me2 changes in response to a change in global transcriptional state and the relationship between the H3K36me2 mark and nucleosome stability. H3K36me2 demarcates the structurally distinct regulatory and nonregulatory regions of yeast genomic chromatin and may serve as an indicator of chromatin context.
Strains and culture conditions. For H3K36me2 and histone H3 ChIPs, strain AS4 (MAT trp1-1 arg4-17 tyr7-1 ade6 ura3) was used (50). For histone H4 ChIPs, a previously described myc-tagged H4 strain constructed in strain UCC1111 [MAT ade2::his3-Δ200 leu2-Δ0 lys2-Δ0 met15-Δ0 trp1-Δ63 ura3-Δ0 adh4::URA3-TEL (VII-L) hhf2-hht2::MET15 hhf1-hht1::LEU2 pRS412 (ADE2 CEN ARS)-HHF2-HHT2] was used (37, 38). Unless otherwise described, yeast was grown to an optical density of 0.8 to 1.0 at 600 nm with shaking at 30°C in 100 ml of yeast extract-peptone-dextrose medium (1% yeast extract, 2% peptone, 2% dextrose).
Antibodies. Antibodies against histone H3 lysine 36 dimethylation have been described previously (51) and were obtained from Upstate (catalog no. 07-369). Anti-myc antibodies were also obtained from Upstate (catalog no. 05-419). The rabbit histone H3 antiserum was obtained from Abcam, Inc. (AB1791), and was raised using a peptide corresponding to amino acids 124 to 135 (CGIQLARRIRGERA) of human histone H3.
Dot blot. Peptides (KSAPSTGGVKKPHRYKPGTGK-BIOTIN) in which the residue corresponding to H3K36 (underlined) was either mono-, di-, or trimethylated were resuspended in double-distilled H2O (10 µg/µl) and serially diluted in Tris-buffered saline (TBS) (150 mM NaCl, 10 mM Tris, pH 7.6). Aliquots of 100-µl peptide-TBS solution were spotted onto polyvinylidene difluoride membranes by using a Bio-Rad dot blot apparatus. Membranes were washed in TBS and then blocked in a solution of 2.5% (wt/vol) Carnation nonfat dry milk in TBS-Tween 20 (0.1% Tween 20 in TBS) for 10 min prior to incubation with a 1:10,000 dilution of the specified antibody for 2 h at room temperature. Membranes were washed with TBS-Tween 20 for 10 min three times, incubated with anti-rabbit horseradish peroxidase-conjugated immunoglobulin G for 2 h at room temperature, and then washed again for 10 min three times prior to detection using ECL-Plus from Amersham.
ChIP assays. ChIP assays were performed as described previously (23). Briefly, whole-cell extracts were prepared from 1% formaldehyde-fixed wild-type and set2Δ cells by using lysis buffer (50 mM HEPES-KOH, pH 7.5, 300 mM NaCl, 1 mM EDTA, 1% Triton X, and 0.1% sodium deoxycholate) and sonicated to shear the chromatin (0.25- to 1-kb range). Immunoprecipitation was performed with anti-H3K36me2, anti-myc, or anti-H3. After cross-link reversal at 65°C, DNA was extracted by using a QIAGEN PCR purification kit according to the manufacturer's instructions.
DNA amplification, labeling, array hybridization, and data processing. ChIP-enriched DNA and reference DNA in all experiments were amplified as described previously (5). Briefly, two initial rounds of DNA synthesis with T7 DNA polymerase using primer 1 (5'-GTTTCCCAGTCACGATCNNNNNNNNN-3') were followed by 25 cycles of PCR with primer 2 (5'-GTTTCCCAGTCACGATC-3'). Cy3-dUTP or Cy5-dUTP was then incorporated directly with an additional 25 cycles of PCR using primer 2. Microarray hybridizations were performed using standard procedures (16). The arrays were scanned with a GenePix 4000 scanner, and data were extracted with Genepix 5.0 software. Data were normalized such that the median log2 ratio value for all quality elements on each array equaled zero, and the median of pixel ratio values was retrieved for each spot. Only spots of high quality by visual inspection, with at least 50 pixels of quality data (regression R2 of >0.6) and for which intensity of the reference signal was strong (>350 U), were used for analysis. Arrayed elements that did not meet these criteria on at least half of the arrays were excluded from analysis. All data were log transformed before further analysis. For normalization with the nucleosome occupancy data, the median log2 ratio values of H4-myc ChIP were subtracted from the median H3K36me2-ChIP ratio values. Unless otherwise noted, all data presented are nucleosome occupancy normalized in this way. While many methods of bulk nucleosome normalization are possible, all must contend with the inherent difficulties of combining ChIP data sets produced with two different antibodies (6). The method used here is simplest and provides a more realistic representation of the modification pattern than do unnormalized data. We provide all raw data (see below) so that readers may apply their preferred normalization method.
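The normalization steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the function names and the dictionary layout (array element name mapped to log2 ratio) are assumptions made for the example, and spot-quality filtering is assumed to have happened already, so that an element is simply absent from an array on which it failed.

```python
from statistics import median

def median_center(log2_ratios):
    """Shift one array's log2 ratios so that their median equals zero."""
    m = median(log2_ratios.values())
    return {element: value - m for element, value in log2_ratios.items()}

def median_across_arrays(arrays):
    """Per-element median over replicate arrays.

    Elements that passed quality filtering on fewer than half of the
    arrays are excluded, mirroring the rule stated in the text.
    """
    elements = set().union(*arrays)
    result = {}
    for element in elements:
        values = [a[element] for a in arrays if element in a]
        if 2 * len(values) >= len(arrays):
            result[element] = median(values)
    return result

def occupancy_normalize(mark_log2, occupancy_log2):
    """Subtract median H4-myc log2 ratios from median H3K36me2 log2 ratios."""
    return {
        element: mark_log2[element] - occupancy_log2[element]
        for element in mark_log2
        if element in occupancy_log2
    }
```

Because the data are in log2 space, the subtraction in `occupancy_normalize` corresponds to dividing the H3K36me2 ratio by the H4-myc ratio.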
DNA microarray preparation. Open reading frames (ORFs) and intergenic regions from yeast (S288C) were PCR amplified and printed on polylysine-coated glass slides by using a robotic arrayer as described previously (16). ORFs were generally represented by PCR products that extended from start codon to stop codon. Elements representing intergenic regions generally included all DNA between annotated ORFs, with the fragments divided such that PCR products were no longer than 1.5 kb.
Locus-specific detection of ChIP enrichment. The sequences of the primers used (see Fig. 2 and 5) are shown in Table 1.
FIG. 2. Validation and fine-scale mapping of H3K36me2 distribution across 16 kb of chromosome XII. (A) Schematic of chromosome XII coordinates 1,036,089 to 1,052,141, showing the locations of primers used to interrogate three of the independent H3K36me2 ChIP-chips shown in Fig. 1. Primer sets are listed in Table 1. (B) Polyacrylamide gel analysis of PCR products generated by the primer sets shown in panel A (Test). The reference product (Ref.) corresponds to 146 bp of a large noncoding region between YEL073C and YEL072W on chromosome V. (C) Quantitation of the gel shown in panel B. Graphs show the average enrichments detected by the PCR primer sets following H3K36me2 and histone H4-myc ChIPs. The value plotted for each fragment was calculated as follows: [(w/x)/(y/z)], where all values are the sum of pixel intensities for each band. w, ChIP fragment; x, ChIP reference; y, input test fragment; z, input reference. Numbers therefore reflect relative immunoprecipitation efficiencies; values of less than 1 may be expected. Error bars illustrate the average deviations from the means. (D) H3K36me2 enrichment values after normalization with general nucleosome occupancy levels. The values plotted were calculated as the H3K36me2 value of [(w/x)/(y/z)] divided by the H4-myc value of [(w/x)/(y/z)].
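The two quantities defined in the legend for Fig. 2 reduce to simple ratios of band intensities. The sketch below uses hypothetical function names and assumes each band is summarized by a single summed pixel intensity, as the legend describes; it is meant only to make the arithmetic concrete.

```python
def relative_enrichment(w, x, y, z):
    """Relative immunoprecipitation efficiency, (w/x)/(y/z).

    w: ChIP test fragment, x: ChIP reference,
    y: input test fragment, z: input reference
    (each the summed pixel intensity of one gel band).
    """
    return (w / x) / (y / z)

def occupancy_normalized_enrichment(k36_bands, h4_bands):
    """H3K36me2 enrichment divided by H4-myc enrichment at the same locus.

    Each argument is a (w, x, y, z) tuple of band intensities.
    """
    return relative_enrichment(*k36_bands) / relative_enrichment(*h4_bands)
```

For instance, a test fragment twice as strong as the reference in the ChIP lane but equal to it in the input lane gives a relative enrichment of 2.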
FIG. 5. Validation and fine-scale mapping of H3K36me2 acquisition upon gene activation. (A) Schematic of the PHM7 locus, showing the locations of primers used to interrogate three of the independent H3K36me2 ChIP-chips shown in Fig. 4B, before and after heat shock. Primer sets are listed in Table 1. (B) Polyacrylamide gel analysis of PCR products generated by the primer sets shown in panel A (Test). The reference product (Ref.) is the same as that used for Fig. 2. (C) Quantitation of the gel shown in panel B. Graphs show the average enrichments detected by the PCR primer sets following H3K36me2 and histone H4-myc ChIPs, both before and after heat shock. The value plotted for each fragment was calculated as described in the legend for Fig. 2. (D) H3K36me2 enrichment values before and after heat shock after normalization with general nucleosome occupancy levels, calculated as described in the legend for Fig. 2. H3K36me2 levels increase after heat shock. Note that the increase is confined primarily to areas R, S, and T, which represent the center and 3' end of the gene, but not areas P and Q, which represent, respectively, the promoter and coding sequences at the 5' end.
TABLE 1. PCR primers used in Fig. 2 and 5
FIG. 1. H3K36me2 is restricted to RNAP II-transcribed regions genome-wide. (A) The indicated amounts (top) of the specified peptides (right) were blotted onto a polyvinylidene difluoride membrane and probed with a 1:10,000 dilution of the H3K36me2 antiserum. To verify the presence of the H3K36me1 and H3K36me3 peptides, antibodies specific to each of those peptides were used in parallel to probe identical blots (see Fig. S1 in the supplemental material). (B) Colors (scale at bottom) represent the median of ratios [log2(H3K36me2 ChIP signal intensity/normalized reference signal intensity)] recorded from all arrayed elements in the indicated Saccharomyces Genome Database functional class (numbers of arrayed elements in parentheses). Data were derived from 12 independent wild-type and 8 set2Δ H3K36me2 ChIP experiments (biological replicates). Intergenic regions are organized into the following three categories: double, upstream of two divergent genes; single, upstream of one gene; non, upstream of zero genes. (C) A histogram showing the distribution of H3K36me2 ChIP median log2 ratios in a set2Δ strain. Ratios in panels C to H were normalized to bulk nucleosome occupancy by subtracting median H4-myc log2 ratios (eight independent experiments, wild type) or anti-H3 ChIPs (five independent experiments, set2Δ) from median H3K36me2 ChIP log2 ratios (same 12 ChIPs as mentioned for panel B). (D) Same as panel C, but for a wild-type strain. (E) Same as panel D, but instead of computational normalization, plotted ratios were derived from direct hybridization of H3K36me2 ChIP versus H4-myc ChIP (four independent ChIP sets). Note that for panels C to E, a positive correlation between the ratios reported at double promoters and the ratios reported at adjacent ORFs was observed (data not shown). Therefore, any high ratios reported at double promoters may, at least in part, be attributed to high ratios in adjacent ORFs. This is likely due to the limited resolution of the ChIP-chip procedure and our microarrays. (F) Same as panel C, but for noncoding regions. (G) Same as panel D, but for noncoding regions. (H) Same as panel E, but for noncoding regions.
In S. cerevisiae, the noncoding regions downstream of two convergently transcribed genes are almost always completely transcribed, often on both strands, by the converging polymerases (15). We found that these regions, which correspond to 3' untranslated regions (UTRs), were enriched by H3K36me2 ChIPs at a level equal to or greater than the enrichment observed at ORFs (Fig. 1B). To confirm that our ChIPs were reflections of H3K36me2 levels, we performed ChIP experiments using extracts from set2Δ strains. Very little DNA was recovered from these ChIPs, and analysis of the DNA that was recovered revealed none of the specific patterns described above (Fig. 1B). We therefore interpret the efficiency of DNA recovery at each locus after H3K36me2 ChIP to reflect relative H3K36me2 levels. The evidence presented thus far supports the hypothesis that regions of the genome transcribed by RNAP II are enriched for H3K36me2.
Chromatin upstream of ORFs is H3K36me2 deficient. In further support of the hypothesis that H3K36me2 is restricted to transcribed regions, the lowest levels of H3K36me2 were found in chromatin upstream of two divergently transcribed genes ("double promoters"), which is not expected to be transcribed by RNAP II (Fig. 1B). On the other hand, "single promoters" are expected to be partially transcribed since they contain the 3' UTR of the upstream gene. As predicted, single promoters exhibit a level of enrichment lower than that observed for ORFs and 3' UTRs but higher than that observed for double promoters (Fig. 1B). These experiments provide evidence that dimethylation of histone H3 at lysine 36 is absent from regions of the genome that are not transcribed by RNAP II.
The genomic pattern of H3 lysine 36 dimethylation persists after normalization for general nucleosome occupancy. Nucleosome occupancy is generally lower in noncoding regions upstream of genes than in ORFs (2-4, 26, 43, 52). Thus, we wondered if the pattern we observed with the H3K36me2-specific antiserum was a reflection, at least in part, of general nucleosome occupancy. To ensure that our results were specific to the H3K36me2 modification, we normalized our H3K36me2 distribution data to general nucleosome occupancy. We prepared extracts from yeast cells in which the only source of histone H4 was tagged with the myc epitope and performed ChIP assays using anti-myc antibodies. Histone H4 ChIPs were performed on five independent yeast cultures. Consistent with published data (26), results of the histone occupancy ChIPs revealed that nucleosomes were more enriched in the coding region of genes than in intergenic regions (Fig. 1B). Indistinguishable results were obtained with nucleosome ChIP-chips using an antibody specific to the C terminus of histone H3 (data not shown). In parallel, H3K36me2 ChIPs were performed using the same extracts. Even without normalization, the qualitative differences between the distribution of H3K36me2 and general nucleosome occupancy indicated that the H3K36me2 pattern was indeed distinct. Specifically, noncoding regions downstream of convergently transcribed genes were enriched by the H3K36me2 ChIP at a level nearly equal to the enrichment observed at ORFs, whereas in histone H3 or H4 ChIPs, ORFs were more strongly enriched than 3' UTRs (Fig. 1B).
For further data analysis, we chose the simplest possible normalization routine by subtracting the median log2 ratio values of the H4-myc ChIP-chip data from the median ratio values of H3K36me2 ChIP-chip data (see Materials and Methods). After normalization, the clear enrichment of transcribed genomic regions and corresponding depletion of regulatory regions of the genome persisted (Fig. 1C, D, F, and G). As a test of the validity of this normalization approach, we performed direct comparative hybridizations between DNA enriched by H3K36me2 ChIP and DNA enriched with H4-myc ChIP. The data obtained from direct comparative hybridizations were essentially identical to the computationally normalized H3K36me2 data (Fig. 1E and H).
Higher resolution, locus-specific ChIPs confirm that H3K36me2 is concentrated in chromatin within coding regions and at the 3' end of genes. In H3K36me2 ChIPs, chromatin downstream of convergently transcribed genes was by far the most highly enriched class of noncoding chromatin (Fig. 1G). This suggested that Set2p is active throughout the entire transcript length, providing a possible mechanism for distinguishing nonregulatory intergenic regions from promoters. To validate this observation, we interrogated our ChIP results with PCR primers that represent regulatory and transcribed chromatin across a 16-kb region on chromosome XII (Fig. 2). Again we found that the chromatin heavily enriched by H3K36me2 ChIPs corresponded to coding regions and to regions lying downstream of two convergently transcribed genes. For example, SST2 and LEU3 are both transcribed under the conditions assayed. Their 3' UTRs are each about 450 bp in length (15) and are represented by the primer sets B, C, and D in Fig. 2. Chromatin covered by these primer sets is among the most heavily enriched in the tested region.
Levels of histone H3K36 dimethylation do not correlate with transcription frequency. Localization of H3K36me2 to chromatin in the body of the RNAP II-transcribed genes is consistent with the earlier studies showing that Set2p associates with the elongating form of RNAP II. We wondered if the frequency of transcription correlated with the degree of H3K36 dimethylation. The transcription frequency (also called transcription rate) for each S. cerevisiae gene has been calculated based on measurements of steady-state RNA levels and RNA half-lives in exponentially growing yeast cells at 30°C (14). We compared these published transcription frequency values to the results of 12 independent H3K36me2 ChIP-chip experiments. We found that among genes with measurable transcription frequencies (>0 mRNA/hour), the level of H3K36me2 enrichment did not correlate with transcription frequency (Fig. 3A). Genes with transcription frequencies ranging from 1 to 120 mRNAs/hour were consistently enriched in the H3K36me2 ChIPs. For example, despite low rates of transcription, genes like BUD14 and TPK2 (both 1.8 mRNAs/hour) were enriched in H3K36me2 ChIPs (97th and 95th ChIP percentiles, respectively) as highly as were heavily transcribed genes like HXK2 (71 mRNAs/hour, 96th ChIP percentile). These results suggest that H3K36 dimethylation occurs chiefly in the initial instance or early instances of gene transcription, with subsequent transcription playing at most a maintenance role.
FIG. 3. H3K36me2 levels in ORFs do not correlate with transcription frequencies. (A) Genes were sorted by their transcription rates during mitotic growth (x axis) (14), and a moving average (mov avg) (window [win] = 40, step = 1) of the degrees of their enrichment in H3K36me2 ChIPs was plotted (percentile rank, y axis). In this plot, H3K36me2 ChIP values are not normalized to overall nucleosome occupancy. (B) Same as panel A, except that both H3K36me2 ChIPs and H4-myc ChIPs are plotted.
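The sort-then-smooth procedure behind plots of this kind can be sketched as follows. The function names are hypothetical, ties in the percentile ranking are broken by input order (an assumption; the legend does not say how ties were handled), and the default window of 40 with a step of 1 follows the legend.

```python
def percentile_ranks(values):
    """Percentile rank (0 to 100) of each value within the list."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    for position, index in enumerate(order):
        ranks[index] = 100.0 * (position + 1) / len(values)
    return ranks

def moving_average(series, window=40, step=1):
    """Mean of each full window of `window` consecutive values."""
    return [
        sum(series[i:i + window]) / window
        for i in range(0, len(series) - window + 1, step)
    ]

def sorted_moving_average(sort_keys, values, window=40, step=1):
    """Order `values` by `sort_keys` (e.g. transcription rate), then smooth."""
    ordered = [v for _, v in sorted(zip(sort_keys, values), key=lambda p: p[0])]
    return moving_average(ordered, window, step)
```

A flat smoothed curve across the sorted genes is what the absence of a correlation between the sort key and the smoothed quantity looks like in such a plot.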
We then asked whether nucleosomes are less stable on highly transcribed chromatin in the absence of H3K36me2. We measured bulk nucleosome occupancy by performing histone H3 ChIPs in a set2Δ strain. However, we found that nucleosomes are lost from highly transcribed chromatin equally in set2Δ and wild-type strains, indicating that nucleosome occupancy is not directly affected by H3K36me2 (see Fig. S2 in the supplemental material). Therefore, additional mechanisms may work to stabilize H3K36me2 nucleosomes, or H3K36me2 may be an indicator, but not a cause, of transcription-stable nucleosomes.
H3K36 dimethylation correlates with "on" or "off" transcriptional state. The hypothesis that H3K36 dimethylation is stable and occurs chiefly in the initial instance of gene transcription predicts no correlation between H3K36 dimethylation level and transcriptional rate (as we observed) but does predict a positive correlation between H3K36 dimethylation level and the on or off transcription state of a gene. We defined a gene as on if it had a measurable transcription rate (>0 mRNA/hour) (14) and off if it did not (0 mRNA/hour) (14). To test this prediction, we ranked the ORFs according to their enrichment levels in H3K36me2 ChIP experiments and divided them equally into 10 bins, such that the least enriched 10% of ORFs were in bin 1, and so on up to the most enriched 10% in bin 10. We then simply asked what proportion of genes in each bin was on (Fig. 4A). The results show that genes that were not enriched by H3K36me2 ChIPs were more likely than others to be off and that the likelihood of any given gene to be off decreased with increasing H3K36me2 ChIP enrichment. No such trend was observed with H4-myc ChIPs (Fig. 4A) or with H3K36me2 ChIPs performed with a set2Δ strain (data not shown). This result provides evidence that H3K36me2 is not linked with how often a gene is transcribed per se but rather with the occurrence of transcription.
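The binning analysis described in this section can be sketched as below. The function name and argument layout are assumptions for illustration; the on/off rule (rate greater than 0 versus rate equal to 0) and the 10 equal bins ordered from least to most enriched follow the text.

```python
def fraction_on_by_bin(enrichment, rate, n_bins=10):
    """Fraction of genes that are "on" in each enrichment bin.

    enrichment: ChIP enrichment value per gene
    rate: transcription rate per gene (mRNAs/hour); > 0 counts as "on"
    Bins are returned from least enriched (first) to most enriched (last).
    Requires at least n_bins genes so that no bin is empty.
    """
    order = sorted(range(len(enrichment)), key=lambda i: enrichment[i])
    n = len(order)
    fractions = []
    for b in range(n_bins):
        members = order[b * n // n_bins:(b + 1) * n // n_bins]
        n_on = sum(1 for i in members if rate[i] > 0)
        fractions.append(n_on / len(members))
    return fractions
```

Under the hypothesis stated in the text, the returned fractions would rise from the first bin to the last for H3K36me2 data and stay roughly constant for H4-myc data.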
FIG. 4. H3K36me2 levels correlate with "on" or "off" transcriptional state. (A) ORFs were divided equally into 10 bins according to their degrees of enrichment in H3K36me2 ChIPs (normalized to H4-myc) or H4-myc ChIPs. The ORFs in each bin were then classified as either "on" (>0 mRNAs/hr) or "off" (0 mRNAs/hr). The percentage of genes in each bin that was classified as "on" is shown. We note that ORF size and transcription state are not independent variables, since shorter ORFs tend to be in the off state, even after removal of SGD-annotated dubious ORFs. However, analysis of ORFs of similar size rather than all ORFs reveals the same pattern shown here (Fig. S3 in the supplemental material). Note that 85% of genes are transcribed at detectable levels during mitotic growth (14, 15), so the uniform H4-myc data reflect a neutral relationship with transcription state. (B) Genes were sorted according to the degrees of their transcript level change upon heat shock (x axis) (10). A moving average (mov avg) (window [win] = 40, step = 1) of relative H3K36me2 ChIP enrichment (normalized to general histone H4 occupancy measured in parallel) at each ORF (y axis) is plotted before and after heat shock. Most genes are unchanged upon heat shock and are therefore clustered near the center of the graph. Points to the right represent ORFs that are activated by heat shock, and those to the left represent ORFs repressed by heat shock.
χ² test; P = 8.6 × 10⁻⁵). We also observed a relative loss of H3K36me2 in the ORFs of repressed genes. Of genes that were on during log phase and that remained active (log2 expression ratios of >1), only 46% (1,099/2,388) had increased H3K36me2 levels. In contrast, of genes that were on during log phase and repressed fourfold or more after heat shock, 65% (210/332) had decreased levels of H3K36me2 (χ² test; P = 0.0014). By examining H3K36me2 ChIP data that were not normalized to nucleosome occupancy (data not shown), we found that this relative decrease is attributable in part to bulk nucleosome replenishment at repressed genes (26) rather than loss of H3K36me2 on existing nucleosomes.
To further confirm our ChIP-chip data, we performed quantitative ChIP analysis before and after heat shock along the length of PHM7 (Fig. 5A). PHM7 is repressed in logarithmically growing cell cultures (0 mRNA per hour) but is induced by a factor of 7 during heat shock (10). The results showed that this gene acquired the H3K36me2 mark after heat shock and only in the 3' region of the ORF (primer sets R and S) (Fig. 5B and C). This result persisted after we normalized for bulk histone occupancy changes following heat shock (Fig. 5D).
A positive correlation between gene length and measured H3K36me2 ChIP enrichment suggests that H3K36 dimethylation is initiated at a fixed distance from the start of transcription. Previous studies at selected genes, including ADH1, PYK1, PMA1, and SCC2, have shown that dimethylation of H3K36 is initiated after transcriptional initiation, concomitant with association of Set2p with the elongating polymerase (1, 18, 22). If this mechanism operates genome-wide, and if the interval between transcriptional initiation and Set2p association is constant regardless of gene length, longer genes will appear to be enriched by our H3K36me2 ChIPs to a greater extent than shorter genes. The reason for this predicted relationship is illustrated in Fig. 6A and the corresponding legend and is a consequence of the fact that the DNA on our microarrays covers each ORF from the start codon to the stop codon, regardless of length.
FIG. 6. Evidence that H3K36 dimethylation begins at a set distance from the initiation of transcription genome-wide. (A) The hypothesis that H3K36 dimethylation begins at a determined and consistent distance from transcriptional initiation predicts that higher ratios will be reported for longer ORFs. Note that "input chromatin" is used as a reference for these experiments, and the null expectation is that the raw signal intensities in both channels will be proportional to ORF length, resulting in neutral ratios for all ORFs. The prediction of higher ratios for longer ORFs is based entirely on the relative proportion of the arrayed element that is dimethylated, not the absolute length of the arrayed element or genomic feature per se. For example, if the entire ORF were modified, or if the distance at which the modification began from transcriptional initiation were proportional to ORF length, equal ratios would be obtained for long and short ORFs. Blue circles, nucleosomes not dimethylated at H3K36. (B) ORFs were sorted by length, and moving averages (mov avg) (window [win] = 40, step = 1) of their ratios of enrichment in H3K36me2 ChIPs (not normalized to H4-myc) and H4-myc ChIPs (from this study and reference 26) were plotted.
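The prediction illustrated in Fig. 6A can be made concrete with a toy model. The sketch below assumes, purely for illustration, that the measured ratio tracks the fraction of the arrayed ORF that carries the mark and that the mark starts a fixed number of base pairs into the ORF; the function names and the onset distance used in any call are hypothetical and are not values reported in the study.

```python
def modified_fraction_fixed_onset(orf_length_bp, onset_bp):
    """Fraction of an ORF marked when the mark starts a fixed distance in."""
    if orf_length_bp <= 0:
        raise ValueError("ORF length must be positive")
    return max(0, orf_length_bp - onset_bp) / orf_length_bp

def modified_fraction_proportional_onset(orf_length_bp, onset_fraction):
    """Fraction marked when the onset distance scales with ORF length.

    The result does not depend on orf_length_bp, which is the
    'equal ratios for long and short ORFs' case described in the legend.
    """
    if orf_length_bp <= 0:
        raise ValueError("ORF length must be positive")
    onset_bp = onset_fraction * orf_length_bp
    return (orf_length_bp - onset_bp) / orf_length_bp
```

With a fixed onset, the marked fraction rises with ORF length and flattens as it approaches 1, so a positive length dependence that levels off for long ORFs is what this model predicts.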
2,000 bp is also a predicted feature of a mark that begins at a set distance from the transcriptional start, since the proportion of the gene that is not modified becomes smaller with increasing length. No such relationship between ORF length and enrichment is observed with H3K36me2 ChIPs performed from set2Δ extracts (data not shown). We did observe a weak relationship between ORF length and apparent bulk nucleosome occupancy for the H4-myc ChIPs performed in this study (the effect was even less pronounced in reference 26). This may be due to nucleosome loss very near to the site of transcriptional initiation, which would be predicted to have this effect. In any case, the relationship between enrichment and length was much stronger for the H3K36me2 ChIPs, suggesting a defined boundary for the initiation of H3K36me2 inside the ORFs genome-wide. This conclusion is further supported by the H3K36me2 ChIP profile across 16 kb of chromosome XII (Fig. 2D). For example, primer set A is situated 154 bp from the LEU3 transcription start site and does not detect significant enrichment, whereas chromatin represented by primer set E is situated 400 bp after the start site of SST2 and is highly enriched. Likewise, for the relatively long FMP27 gene, primer set K at the 5' end of the coding region does not report enrichment, but the downstream primer sets L, M, and N show steady enrichment of the chromatin at the 3' end of the coding region. In the course of this analysis, we noted that short genes, on the whole, tend to be more frequently transcribed than long ones. Therefore, we wondered whether the correlations between length and ratio reported here confounded the conclusions presented in Fig. 3. To test this possibility, only ORFs greater than 1,000 bp in length, which do not show a relationship between length and transcription frequency, were used in the same analysis shown in Fig. 3. The resulting plot was indistinguishable from the one presented (data not shown).
Evidence that H3K36 dimethylation ends upon termination of transcription. The results presented thus far provide evidence that dimethylation of H3K36 is restricted to transcribed genomic regions. A corollary to that hypothesis is that H3K36 dimethylation ends upon transcriptional termination. This hypothesis predicts that the smaller the interval between two convergently transcribed genes, the higher the measured ratio of enrichment in our H3K36me2 ChIPs. This is because these shorter regions are likely to be entirely transcribed, while as the distance between the two upstream genes grows, it becomes progressively less likely that the entire intergenic region will be transcribed. This would result in unmodified nucleosomes toward the center of the fragment, resulting in lower ratios (illustrated in Fig. 7A). Note that this prediction of the relationship between size and enrichment is the opposite of the previously described scenario for ORFs.
FIG. 7. Evidence that H3K36 dimethylation ends at transcriptional termination genome-wide. (A) The hypothesis that H3K36 dimethylation ends upon transcriptional termination predicts that higher ratios will be reported for shorter nonpromoters (see the text for details). Blue circles, nucleosomes not dimethylated at H3K36. (B) Nonpromoters were sorted by length, and moving averages (mov avg) (window [win] = 40, step = 1) of their degrees of enrichment expressed as percentile ranks in H3K36me2 ChIPs (not normalized to H4-myc) and H4-myc ChIPs were plotted. (C) Same as panel B but for double promoters. (D) Same as panel B but for single promoters.
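The opposite length dependence predicted for regions between convergent genes can be illustrated with a similar toy model. It assumes, hypothetically, that each flanking gene's transcription (and hence the mark) extends a fixed distance into the intergenic region from its own side; the function name and the distance parameter are illustrative assumptions, not measured quantities.

```python
def nonpromoter_modified_fraction(region_length_bp, readthrough_bp):
    """Fraction of a convergent intergenic region carrying the mark.

    The mark is assumed to extend `readthrough_bp` into the region from
    each of the two flanking genes; any gap in the middle is unmarked.
    """
    if region_length_bp <= 0:
        raise ValueError("region length must be positive")
    return min(1.0, 2 * readthrough_bp / region_length_bp)
```

Short regions come out fully marked, and the marked fraction falls as the region lengthens, which is the trend the hypothesis predicts for nonpromoters.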
extracts (data not shown).
H3K36me2 is lacking in transcriptionally silent chromatin and in chromatin transcribed by RNA polymerase III. To explore the possibility of other mechanisms of H3K36 dimethylation, we examined chromatin at telomeres and mating-type loci, two types of loci that are generally transcriptionally silent but serve specialized genomic functions. Both regions exhibit high nucleosome occupancy but are lacking in H3K36me2 (Fig. 8). We also asked whether H3K36 dimethylation was specific to chromatin transcribed by RNAP II or whether other polymerases might support cotranscriptional modification. Due to the repetitive nature of the RNAP I-transcribed rRNA genes, we were unable to make conclusions regarding H3K36me2 levels at these loci. However, we examined the RNAP III-transcribed tRNA loci and found that these regions were not enriched by our H3K36me2 ChIPs (Fig. 8). Although general nucleosome occupancy was also very low in regions transcribed by RNAP III, these results are consistent with the lack of evidence linking Set2p to RNA polymerase III. Therefore, Set2p's function in dimethylation of H3K36 appears to be mediated exclusively through its association with RNAP II.
FIG. 8. H3K36me2 levels are very low or absent in transcriptionally silent and RNAP III-transcribed chromatin. Colors (scale at bottom) represent the medians of reported ChIP ratios, as described in the legend for Fig. 1B.
Challenges in determining the global distribution of chromatin modifications. Before discussion of the biological function of H3K36me2, it is worth mentioning some of the challenges that are inherent to any experiment that aims to determine the distribution of a histone modification genome-wide. In this study, we used ChIP to specifically enrich for genomic regions that contain H3K36me2 nucleosomes and then interpreted the efficiency of DNA recovery at each locus to reflect the relative amount of H3K36 dimethylation at each locus. Using this approach, several factors could create nonbiological variation in results, including the effects of fixation, epitope accessibility, antibody specificity, microarray content, and underlying bulk nucleosome occupancy. These challenges have been discussed at length in recent reviews (6, 13, 32, 54).
This study includes important advances in addressing some of these issues. First, we thoroughly demonstrated the specificity of our H3K36me2 antibody by dot blot against H3K36me0, H3K36me1, and H3K36me3 peptides (Fig. 1A; see also Fig. S1 in the supplemental material), Western blots derived from whole-cell and nuclear extracts (data not shown), and control ChIPs in set2Δ strains (Fig. 1B). Second, we used DNA microarrays that cover the entire genome on a single slide. This represents a significant improvement over many published studies that used arrays representing only the ORFs or only noncoding intergenic regions or that split ChIP samples and hybridized them independently to separate arrays representing only the ORFs or the intergenic regions. Use of a whole-genome array was essential to most of the conclusions presented here (13). Third, the H3K36me2 data have been normalized to bulk nucleosome occupancy, using data from H3 or H4-myc ChIPs that were performed in parallel from the same extract. This is important because recent studies have shown that nucleosome occupancy throughout the yeast genome is heterogeneous (2, 26), and if left unaccounted for, misleading patterns could emerge. To our knowledge, this is the first instance of modified-nucleosome ChIP data being normalized to apparent bulk nucleosome occupancy genome-wide. Finally, we followed up each of our ChIP-chip experiments with high-resolution PCR-based detection at individual loci, which provided additional information and confirmed the conclusions drawn from the array results.
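To make the third point concrete, the following Python sketch shows one simple way such a normalization could be computed. It assumes that each region has a positive enrichment ratio from the H3K36me2 ChIP and from the bulk histone (H3 or H4-myc) ChIP, and that normalization is a per-region log2 ratio; the exact arithmetic is an assumption for illustration, and the function and region names are hypothetical.

```python
import math


def normalize_to_occupancy(mod_ratios, bulk_ratios):
    """Return log2(modification ratio / bulk histone ratio) for each region.

    mod_ratios:  region name -> enrichment ratio from the H3K36me2 ChIP
    bulk_ratios: region name -> enrichment ratio from the H3 or H4-myc ChIP

    Regions missing from the bulk data, or with a nonpositive ratio in
    either data set, are skipped rather than guessed at.
    """
    normalized = {}
    for region, mod in mod_ratios.items():
        bulk = bulk_ratios.get(region)
        if bulk is None or mod <= 0 or bulk <= 0:
            continue
        normalized[region] = math.log2(mod / bulk)
    return normalized
```

With this convention, a region whose H3K36me2 signal is high only because it is densely packed with nucleosomes scores near zero, while a region enriched for the mark beyond its bulk occupancy scores above zero.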
How is the intricate genomic pattern of histone H3K36 dimethylation specified? The general mechanism of directing Set2p to specific genomic regions by piggybacking on elongating RNAP II through association with a doubly modified CTD (Ser2/Ser5) is entirely sufficient to explain the pattern of H3K36me2 we observed throughout the genome. This is an important conclusion because it indicates that Set2p modifies chromatin only when associated with RNAP II and not, for example, on soluble histone H3 prior to chromatin assembly. More specifically, the genome-wide analysis shows that H3K36me2 occurs at a determined distance from the initiation of transcription, regardless of ultimate transcript length (Fig. 6). This result is consistent with PCR-based ChIP assays performed on single genes (1, 18, 22, 45) and with locus-specific results presented here that imply H3K36 dimethylation begins approximately two nucleosomes downstream of the start codon. In addition, the data indicate that Set2p chromatin-modifying activity stops upon transcriptional termination (Fig. 7). We observed no relationship between the presence of introns and H3K36 dimethylation levels (data not shown).
What is the biologically relevant function of H3K36 dimethylation? Our results indicate that levels of H3K36me2 are not correlated with the frequency of transcription but rather with the occurrence of transcription per se (Fig. 3 and Fig. 4). This result suggests that H3K36 methylation does not generally act as a "rheostat" for gene transcription. In S. cerevisiae, set2Δ strains are viable and, in many backgrounds, exhibit only mild phenotypes. So, what does H3K36 methylation do?
At individual loci, Set2p has been shown to act as a transcriptional repressor (25, 51). In one of these studies, Set2p caused repression of GAL4 but not of other examined genes, and in the other, Set2p was artificially tethered to a promoter, which resulted in transcriptional repression. So while Set2p may act as a transcriptional repressor at individual genes or have the capacity to repress transcription if inappropriately tethered at promoters, a general role for repression of transcription at gene promoters is not consistent with the genomic pattern reported here.
Given Set2p's established interaction with elongating polymerase and the genomic pattern of H3K36me2 reported here, it is easy to envision a role for Set2p and H3K36me2 in transcriptional elongation. Several lines of evidence suggest that this is the case. Perhaps the most convincing is synthetic genetic array analysis, which revealed growth defects when a set2Δ mutant was combined with deletions of any of the five components of the Paf1p complex or of the transcription elongation factors Chd1p or Soh1p (22). It has also been shown that deletion of genes encoding either of two components of the Paf1 complex, Rtf1p or Cdc73p, resulted in a decrease in the recruitment of Set2p across the PMA1 gene and abolished H3K36 dimethylation at that locus (22). In addition to these findings, studies involving 6-azauracil help to confirm a role for Set2 as an elongation factor (18, 22, 28). However, it is still not clear exactly how Set2p or H3K36me2 might participate in the process of elongation itself. It remains possible that Set2p's association with elongation is solely a mechanism to control the distribution of H3K36 dimethylation, rather than an indication of any direct participation in the transcription elongation process. In this case, the defects in elongation observed in the absence of H3K36me2 could be indirect consequences of a failure to recruit chromatin-modifying enzymes or other factors important for transcriptional elongation to ORFs.
An epigenetic mark to distinguish regulatory and nonregulatory chromatin genome-wide. A mark such as H3K36me2 could also function as a "molecular memory" of transcription patterns that are specified at only one point during development or the life cycle but must be maintained afterwards. This concept of transcriptional memory is similar to what has been proposed for S. cerevisiae H3K4me3, which remains stable on chromatin long after the transcription of chromatin at that locus has ceased (39). The putative "memory" role of H3K36 methylation, which appears to be similarly stable, could be accomplished by the ability of this mark to physically affect the chromatin fiber or, more likely, through the recruitment of other remodeling factors that alter chromatin structure (41, 44). For example, it has been recently observed that histone H3 and H4 acetylation is generally lower in coding regions than in promoters and that this global acetylation pattern is regulated by the protein Eaf3p (42). Eaf3 is a subunit of both the NuA4 histone acetylase complex and the Rpd3 histone deacetylase complex (9, 11). Reid et al. proposed that "Eaf3 might recognize some feature of chromatin (e.g., nucleosome conformation or nonhistone protein) that is distinct between promoters and coding regions" (42). H3K36 methylation could be just such a distinguishing feature of coding and noncoding chromatin.
One intriguing possibility along these lines is that Set2p mediates H3K36 dimethylation to create a stable epigenetic mark that generally serves to distinguish regulatory and nonregulatory chromatin genome-wide. What function might such a distinction serve? As described in the introduction, coding and noncoding chromatin exhibit several biologically important differences whose underlying physical basis remains unexplained. Higher nucleosome occupancy in the body of genes may serve to prevent nonproductive transcription factor-DNA interactions by occluding binding sites that occur in coding regions. Conversely, nucleosomes in promoter regions may be more prone to low occupancy or disassembly, thereby exposing binding sites and directing transcription factors to appropriate targets (47). These two tendencies, acting in concert, would have the effect of reducing the "sequence space" that must be searched by any given factor before an appropriate target is found.
In S. cerevisiae, about 85% of genes are transcribed at detectable levels during mitotic growth (14, 15), meaning that H3K36me2 distinguishes regulatory and nonregulatory chromatin throughout most of the genome. In actively transcribed chromatin domains, upstream regulatory sequences are clearly distinguished by their lack of the H3K36me2 mark. Therefore, the H3K36me2 mark, which is conserved throughout eukaryotic evolution, represents the first physical mark that has been shown to distinguish regulatory sequences from coding and nonregulatory intergenic sequences genome-wide. A transcription-coupled mark that does not correlate with transcription rate and is correlated with stabilized nucleosomes on transcribed chromatin, both properties of H3K36me2 described here, may be a general feature of eukaryotic chromatin that contributes to the mechanism of context-dependent targeting of DNA-associated proteins.
This work was supported by NIH grants to B.D.S. (R01GM68088) and J.D.L. (K22HG002577). B.D.S. is a Pew Scholar in the Biomedical Sciences.
Supplemental material for this article may be found at http://mcb.asm.org/.