Molecular and Cellular Biology, November 2005, p. 9674-9686, Vol. 25, No. 21
0270-7306/05/$08.00+0 doi:10.1128/MCB.25.21.9674-9686.2005
Copyright © 2005, American Society for Microbiology. All Rights Reserved.
Department of Biochemistry,¹ Howard Hughes Medical Institute, Division of Nucleic Acids Enzymology, University of Medicine and Dentistry of New Jersey, Robert Wood Johnson Medical School, 683 Hoes Lane, Piscataway, New Jersey 08854,² Department of Biomedical Informatics, The Ohio State University, 3184 Graves Hall, 333 W. 10th Ave., Columbus, Ohio 43210³
Received 22 November 2004/ Returned for modification 10 December 2004/ Accepted 5 August 2005
Yet there are a few examples where the core promoter apparently plays a very specific role. Early examples include studies of the myoglobin and simian virus 40 TATA boxes (86) and the hsp70 TATA box (72) that indicated a requirement for TATA boxes of a particular sequence. The TdT promoter was also shown to become inactive upon the inclusion of a TATA box (20). Lastly, some activators display preferences for particular core promoter architectures (7, 18, 56).
This perception of core promoters lacking diversity is being revised with the findings of other core promoter elements such as the BRE (41) and elements located downstream of the transcriptional start site such as the DPE (5, 6, 40, 89), the downstream core element (DCE) (43), and the motif 10 element (MTE) (48). The most studied of these downstream elements is the DPE. The DPE was discovered during the analysis of TFIID interactions with several Drosophila promoters (5, 6). The DPE is centered at approximately +30 relative to the transcriptional start site. Thus far, it has been shown to function only in TATA-less promoter contexts and is at least partially redundant in the context of a TATA box (6). Additional indications of core promoter specificity arose from studies showing that specific enhancers require a specific core promoter structure containing either a TATA box or DPE (7, 56).
A second class of downstream elements, architecturally distinct from the DPE, also exists. Initially discovered in the human ß-globin promoter, the DCE consists of three subelements. Point mutations in each subelement are found in human ß-thalassemia patients (this paper and references 2, 8, 24, 31, 57, and 90). As with the DPE, experiments suggested that TFIID recognized the DCE in vitro (43). Here we confirm and extend these observations by showing that the adenovirus major late promoter (Ad MLP), Ad E3, Ad IX, and the herpes simplex virus (HSV) UL38 promoters contain functional DCE sequences, thus demonstrating that the DCE is not simply restricted to the promoters of the ß-globin locus. Computer analysis of promoter databases indicates that DCE sequences are found in a variety of promoters, further suggesting that the DCE represents a novel, widely distributed core promoter element. We demonstrate that DCE function can be recapitulated in a highly purified transcription system and requires the general transcriptional machinery together with the TBP-associated factor (TAF) subunits of the TFIID complex. We also show that TAF1 is the major TAF species that can be cross-linked to wild-type but not mutant DCE subelements. Our results show that the DCE represents a distinct class of downstream elements and underscore the intricacies in transcriptional regulation at the level of core promoter elements.
Template construction. MLP scanning mutants, a mutant TATA box, and +7/9 ß-globin mutants were constructed as described previously (43). The mutations are as follows: +1 to +3 (+1/3) ACT to GGG, +4/6 CTC to GGG, +7/9 TTC to GGG, +10/12 CGC to AAA, +13/15 ATC to GGG, +16/18 GCT to AAA, +19/21 GTC to AAA, +22/24 TGC to AAA, +25/27 GAG to CCC, +28/30 GGC to TTT, and +31/33 CAG to TTT; the ß-globin mutant TATA box was changed from CATAA to CGCGC, and ß-globin +7/9 GCT was changed to AAA. All other ß-globin templates are from reference 43. MLP DNA PCR upstream and downstream primers contained XbaI and XhoI restriction sites, which were used to insert the MLP DNA into the XbaI-XhoI-digested pSP72 vector (Promega). Ad E3 and IX promoter regions were amplified from the adenovirus type 2 genome by PCR using primers containing 5' EcoRI and 3' XhoI adapters (5'-CGGAATTCCGGGAAGTGAAATCTGAATAAT-3' and 5'-CCGCTCGAGCGGCTTCGGTAATAACACCTCCG-3'), digested with EcoRI and XhoI, and subcloned into the pSP72 vector (Promega) linearized with the same enzymes. To generate the mutant E3 promoter, wild-type E3 promoter in the pSP72 vector was amplified using the following two sets of primers (SP6 primer and 5'-ACCAAGAGAGGAAAACACCGACTCGTC-3', and T7 and 5'-GACGAGTCGGTGTTTTCCTCTCTTGGT-3'). Those PCR products were used as templates for another PCR with SP6 and T7 primers, digested with EcoRI and XhoI, and subcloned into the pSP72 vector (Promega) linearized with the same enzymes. The IX mutant promoter was amplified using the following sets of primers (SP6 and 5'-AAACGAGTTGGCAAACATGGCGGCGGC-3' and T7 and 5'-GCCGCCGCCATGTTTGCCAACTCGTTT-3').
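The triplet substitutions listed above are enough to reconstruct the wild-type Ad MLP sequence from +1 to +33 and to enumerate the scanning mutants. The following is a minimal sketch of that bookkeeping; the function and variable names are hypothetical and are not taken from the original work.

```python
# Triplets as listed in the text for the Ad MLP scanning mutants:
# (first position, wild-type triplet, mutant triplet).
MLP_TRIPLETS = [
    (1, "ACT", "GGG"), (4, "CTC", "GGG"), (7, "TTC", "GGG"),
    (10, "CGC", "AAA"), (13, "ATC", "GGG"), (16, "GCT", "AAA"),
    (19, "GTC", "AAA"), (22, "TGC", "AAA"), (25, "GAG", "CCC"),
    (28, "GGC", "TTT"), (31, "CAG", "TTT"),
]


def wild_type_sequence(triplets):
    """Concatenate the wild-type triplets, in positional order, into +1..+33."""
    return "".join(wt for _, wt, _ in sorted(triplets))


def scanning_mutants(triplets):
    """Return the wild-type sequence and a dict of named triplet mutants."""
    wt_seq = wild_type_sequence(triplets)
    mutants = {}
    for start, wt, mut in triplets:
        i = start - 1  # position +1 is string index 0
        if wt_seq[i:i + 3] != wt:
            raise ValueError(f"triplet at +{start} does not match the wild type")
        name = f"+{start}/{start + 2}"  # naming used in the text, e.g. +7/9
        mutants[name] = wt_seq[:i] + mut + wt_seq[i + 3:]
    return wt_seq, mutants


wt_seq, mutants = scanning_mutants(MLP_TRIPLETS)
```

Each mutant differs from the reconstructed wild-type sequence at exactly the three substituted positions, which makes it easy to check a template list against the table of mutations.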
Statistical analysis. A total of 1,871 nonredundant human promoter sequences 600 bp long (−499 to +100 bp around the TSS) from the Eukaryotic Promoter Database (EPD) release 75 (4, 68) (http://www.epd.isb-sib.ch/) and 8,793 promoter sequences 1,200 bp long (−1,000 to +200 bp) from the Database of Transcriptional Start Sites (DBTSS) (59, 74, 75) (http://dbtss.hgc.jp/index.html) were used for statistical analyses. The software package Promoter Classifier (available at our website, http://bmi.osu.edu/~ilya/promoter_classifier/) (21, 22) was used for statistical analysis. To divide the promoter databases into TATA+/TATA− and Inr+/Inr− subsets, the respective position weight matrices were applied (4). Since there are no matrices for DPE, we matched five out of five letters for the DPE consensuses (73).
To define an interval (window) for a functional position of the TATA box, Inr, and DPE core elements, we considered the distribution of each element's occurrence frequencies along the promoters. For both databases, we found unambiguous maxima for the occurrence frequencies of the centers of the TATA and Inr elements at positions −28 and +1, respectively, which is consistent with the known functional positions of these elements. The occurrence frequency of the TATA box is substantially larger in the window from −33 to −23 bp than in the surrounding area. We consider this window the functional window for the TATA box. For the Inr element, the functional window is −5 to +6 bp since the +1 position in the EPD is defined with an accuracy of ±5 bp. Since DPE works in cooperation with an Inr if positioned 27 bp downstream from it (5), we applied the window from (28 − 5) to (28 + 5) bp for the DPE. A more detailed description of the statistical analysis procedure can be found elsewhere (21).
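The idea of a functional window can be illustrated with a short sketch that asks whether an exact motif occurs at any position inside a window given in promoter coordinates. This is not the Promoter Classifier code. The helper names are hypothetical, the coordinate convention (+1 is the start site and there is no position 0) is an assumption, and the window is applied to the first base of the motif rather than to its center for simplicity.

```python
def to_index(pos, tss_index):
    """Map a promoter coordinate to a 0-based string index.

    Assumes +1 is the base at tss_index and that coordinate 0 is not used.
    """
    if pos == 0:
        raise ValueError("coordinate 0 is not used in this convention")
    return tss_index + pos - 1 if pos > 0 else tss_index + pos


def motif_in_window(seq, motif, first, last, tss_index):
    """True if an exact copy of motif starts at any coordinate in [first, last]."""
    for pos in range(first, last + 1):
        if pos == 0:
            continue  # skip the unused coordinate
        i = to_index(pos, tss_index)
        if i >= 0 and seq[i:i + len(motif)] == motif:
            return True
    return False


# Toy promoter: 40 filler bases upstream, then the Ad MLP from +1 onward.
promoter = "G" * 40 + "ACTCTCTTCCGCATCGCTGTCTGCGAGGGCCAGC"
has_sii = motif_in_window(promoter, "CTGT", 16, 21, tss_index=40)
```

Counting how many promoters in a database return true for a given motif and window gives the in-window count used in the significance calculation.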
To calculate the statistical significance of the occurrence frequency of a subelement in the functional window, we calculated a parameter, the statistical significance (dS), measured in units of standard deviation (SD), as dS = (Nin − Nout)/SD, where Nin is the number of promoters containing an element or combination inside its functional window and Nout is a statistically expected number obtained for the promoter sequence segments outside the functional window.
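The significance measure can be sketched numerically as follows. The expression for the standard deviation is not recoverable from the text here, so this sketch assumes SD = sqrt(Nout), as for a Poisson-distributed count; that choice is an assumption made for illustration and may differ from the procedure in reference 21. The function name is hypothetical.

```python
import math


def significance_ds(n_in, n_out):
    """Return dS = (N_in - N_out) / SD.

    ASSUMPTION: SD is taken as sqrt(N_out), i.e. the expected count is
    treated as Poisson distributed. The source text does not confirm this.
    """
    if n_out <= 0:
        raise ValueError("the expected count N_out must be positive")
    return (n_in - n_out) / math.sqrt(n_out)


# Example with made-up counts: 150 promoters observed, 100 expected.
example_ds = significance_ds(150, 100)
```

Under this assumption, an element found in 150 promoters when 100 are expected lies 5 SD above expectation.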
Site-specific protein-DNA photo-cross-linking. Purification of FLAG-tagged TFIID complexes was essentially as described previously (11). To generate single-stranded mutant DNAs for photo-cross-linking, Ad MLP in pSP72 was digested with XhoI, filled in at the recessed end, and digested with XbaI. M13mp19 RF DNA was digested with EcoRI, filled in at the recessed end, and digested with XbaI. DNAs were ligated and transformed into XL1-Blue cells. Single-stranded template DNAs were prepared as described in Sambrook et al. (66a). Derivatized promoter DNA fragments containing cross-linking agent at positions +9, +19, and +31 of the template strand and radiophosphorylated, derivatized oligodeoxyribonucleotides (10 pmol) were prepared essentially as described previously (39). Photo-cross-linking reactions were performed as described previously (39). Assignment of cross-linked product to TAF1/TAFII250 was confirmed by immunoprecipitation with monoclonal antibody to TAF1/TAFII250 (Santa Cruz) as previously described (39). TFIID/TFIIA binding reactions were as described previously (43). The TFIID used was identical to that used for the in vitro transcription assays.
In order to accurately define the Ad MLP downstream element, we conducted a scanning mutagenesis from +1 to +36 (Fig. 1A). The standard errors for transcription assays using nuclear extracts are typically in the 15 to 20% range (43, 44). Thus, mutants giving rise to 80 to 85% or more of wild-type levels of transcription cannot be distinguished from the wild type. Applying this statistical cutoff to Fig. 1A excludes mutations at positions +10/12, +13/15, and +25/27. However, reductions from mutations at positions +1/3, +4/6, +7/9, +16/18, +19/21, +28/30, and +31/33 all indicate the presence of regulatory elements. As expected, mutations within the Inr element (+1/3 and +4/6) decreased Ad MLP transcription (35, 58). However, the mutagenesis revealed that three other regions, in addition to the Inr element, are important for optimal transcription: +7 to +9, +16 to +21, and +28 to +33 (Fig. 1A). Although quantifications of the +22/24 mutant are not shown, transcription of this mutant was near wild-type levels in two experiments.
FIG. 1. The adenovirus major late promoter contains a DCE type of downstream element. A. Scanning mutagenesis of the adenovirus MLP from +1 to +36. Triplet mutations from +1 to +36 were inserted into the Ad MLP templates and assayed by in vitro transcriptions using crude HeLa nuclear extracts. Transcription products were detected by primer extension. Relative transcription levels (RTL) represent the mutant transcription levels relative to the wild-type transcription level (WT). Mean values and standard deviations are calculated from the relative transcription levels (mutant versus wild type) from three to four experiments. Quantifications of the +22/24 and +34/36 mutants were approximately 80 to 90% of the wild-type MLP (n = 2) (data not shown). B. Sequence alignment of the wild-type human ß-globin and adenovirus major late promoters. Boxed sequences, arrows, and numbering indicate positions of deleterious mutations in each promoter (panels A and C and reference 43). Bases in red indicate the positions of known ß-thalassemia point mutations in the human ß-globin promoter (2, 8, 24, 31, 57, 90), except for +13 (13). Blue-colored bases indicate the sequences representing DCE subelements. C. The ß-globin DCE extends to positions +7/9 and does not show any erythroid cell specificity. Triplet mutations of the human ß-globin promoter (43) were assayed by in vitro transcription using crude HeLa nuclear extracts. ßWT, wild-type ß-globin; 2,3GG, deleterious mutation in the human ß-globin Inr element (44); mTATA, defective TATA box created by the replacement of the wild-type ß-globin TATA box sequence with four TATA box ß-thalassemia point mutations. D. Mutation of ß-globin subelement II (ßmSII) CTGT to AAAA. Two different preparations of a ß-globin template containing mutations in all four base pairs of subelement II were assayed in an in vitro transcription assay using HeLa nuclear extracts (NE). The same templates were assayed using TBP in the reconstituted in vitro transcription assay as a negative control (see the text and Fig. 2).
We have observed some quantitative differences when comparing the mouse erythroid and HeLa nuclear extract data of the ß-globin scanning mutants. This was especially noticeable with mutations at +19/21 and +22/24. Quantifications of the HeLa data were 0.79 ± 0.22 and 0.65 ± 0.18 (means ± standard deviations) for mutants with mutations at +19/21 and +22/24, respectively, while our previous work gave 0.60 ± 0.009 and 0.43 ± 0.04, respectively. We were concerned either that the subelement II (SII) sequence may behave differently in the two extracts or that we needed to revisit our conclusion that CTGT represents a functional cis element. This is especially a concern when looking at the magnitude of the +19/21 mutant in HeLa extracts (0.79 ± 0.22), which approaches wild-type levels (the +19/21 mutant contains a mutation in the CT positions of the CTGT SII sequence). Therefore, we next mutated all four positions of SII (Fig. 1D) (CTGT to AAAA). We examined two independent mutant templates in transcription assays performed in vitro using HeLa nuclear extracts. Transcription from the quadruple mutant was decreased considerably, to a greater degree than with either scanning mutant alone. Reconstituted in vitro transcriptions with TBP showed no difference, indicating that this was not a general nonspecific defect in the promoter DNA (see below). This result is consistent with our interpretation that the CTGT sequence represents a functional cis element. Moreover, the differences observed between the extract and TBP-reconstituted transcription system further stress the idea that the CTGT element is functionally important for optimal transcription of the (two) promoters analyzed.
From the sequence comparison and the effects of the mutations on transcription, we conclude that the ß-globin and Ad major late promoters each contain a DCE. We refer to each of the three regions of the DCE as subelements and infer minimal specific sequences for each, based on the alignment of the two promoters: SI is CTTC, SII is CTGT, and SIII is AGC (Fig. 1B).
TFIID is necessary and sufficient for DCE function. The next step in the analysis of the DCE was to ask whether we could recapitulate DCE function using a highly purified transcription system in vitro. This system contains recombinant TFIIA, TFIIB, TFIIE, TFIIF, highly purified TFIIH, RNA polymerase II, and either recombinant TBP or affinity-purified TFIID. Previous results indicated that TFIID physically contacts DCE subelements in the ß-globin promoter (43). If TFIID is responsible for DCE function, then we would expect TFIID but not TBP to reconstitute the observed differences between the wild-type and mutant templates. TBP-dependent transcriptions of ß-globin subelements II and III (+22/24 and +31/33, respectively) and the +13/15 mutant were indistinguishable from that of the wild-type template (Fig. 2A, top panel, and 1D). However, a TATA box mutation was severely compromised, as expected. Replacing TBP with TFIID showed reductions with all three subelement mutants as well as the TATA box mutant (middle panel of Fig. 2A). We have also reconstituted SI function with TFIID (bottom panel in Fig. 2A) (+7/9 and +10/12). These two SI mutants (+7/9 and +10/12) showed decreased transcription relative to that of the wild-type template, when TFIID was used. Two positive controls, an Inr mutant (2,3GG) and the SIII mutant (+31/33), also showed a decrease in transcription.
FIG. 2. TFIID functionally recapitulates ß-globin and MLP DCE activity in vitro. A. TFIID is necessary for ß-globin DCE function. Using the highly purified in vitro transcription system, the three major ß-globin DCE subelement mutants (Fig. 1B and C and reference 43) show decreased transcription using affinity-purified TFIID (two bottom panels) but not recombinant TBP (top panel). The mutant TATA box (mTATA) serves as a positive control. The second TFIID panel illustrates decreases in transcription using an Inr mutant (2,3GG), two mutations in SI (+7/9 and +10/12), and the SIII mutation (+31/33). WT and ßWT, wild-type ß-globin. B. MLP Inr mutant (+1/3) and DCE subelement mutants (Fig. 1A and B) show decreased transcription in vitro using the highly purified transcription system and affinity-purified TFIID but not recombinant TBP. C. Quantifications of the MLP TFIID-dependent transcriptions. Mean relative transcription levels (RTL) and standard deviations are calculated for three to four experiments.
FIG. 3. DCE subelement III can function independently of subelements I and II. A. HSV UL38 wild-type (WT) and mutant promoter sequences. Predicted SIII sequences are in blue, and mutated bases are in red (26). In vitro transcriptions of the UL38 WT promoter and DAS mutants 1 and 3 (das1 and das3) were performed using HeLa nuclear extracts (NE) or a highly purified transcription system containing either TBP or affinity-purified TFIID. Below the in vitro transcription are quantifications of two sets of the in vitro transcription experiments. B. Adenovirus promoter sequences containing DCE sequences. Shown are several promoters from adenovirus 5 (GenBank accession numbers BK000408 and X02996). The promoters are indicated on the left. The arrow indicates the +1 transcriptional start site. The vertical boxes indicate positions that fit the Inr consensus sequence (YYANA/TYY [73]). The three boxes and blue lettering indicate, in the case of the MLP, experimentally derived positions of the three DCE subelements (Fig. 1). In the case of the remaining promoters, the boxes and blue lettering indicate predicted DCE subelements. C. In vitro transcription analysis of the wild-type adenovirus E3 promoter and a mutation of the predicted SIII (E3 mAGC; AGC is mutated to TTT). Shown are transcriptions using HeLa nuclear extracts (NE) or using the highly purified reconstituted transcription system (RTS) containing either recombinant TBP or affinity-purified TFIID. D. Titration of the adenovirus IX promoter containing either the wild-type promoter sequences (IX) or a mutation of the predicted SIII AGC sequence (IX mAGC; AGC is mutated to TTT). Shown are transcriptions using HeLa nuclear extracts. E. DCE subelement III function has severe spatial constraints. 
Wild-type and mutant E3 in vitro transcriptions using HeLa nuclear extracts are compared to the deletion or insertion of 3 base pairs between the subelement III AGC and the start site of transcription (−3AGC and +3AGC, respectively; the insertion of AAA was at +24 for +3AGC, and deletion of the TCG sequence was at +21 to +23 for −3AGC). F. Analysis of Ad MLP spacing mutants. The Ad MLP SIII was moved 4 bp closer to SII to establish a 10-bp interval between SII and SIII as in the ß-globin promoter (4SIII; the GGCC from +28 to +31 was deleted). In this context, SIII was also mutated to TTT (4mSIII). Shown are results of duplicate in vitro transcriptions (using HeLa NE) of these two templates, compared to the wild-type Ad MLP (MLP).
Several viral promoters contain DCE subelement III. One question arising from this analysis is whether any individual subelement can function independently of the others. Previous work showed that a region downstream of the start site of the HSV UL38 and US11 promoters, termed the downstream activation sequence (DAS), was necessary for the proper expression of these promoters, both during viral infection and in vitro (26, 27). Our sequence analysis suggested that the DAS contains the AGC of SIII of the DCE (Fig. 3A). We analyzed two UL38 promoter DAS mutants, das1 and das3, using nuclear extracts in vitro and confirmed the results of Guzowski et al. (26), who showed that these mutants were defective for UL38 transcription (Fig. 3A, NE lanes). In the das3 mutant, two out of three positions of the AGC (at +26 to +28) were mutated along with two adjacent guanines (AGCGG to AATAA [Fig. 3A]). These results suggest that the HSV UL38 DAS element is a DCE subelement III and that this subelement can function independently of SI and SII of the DCE. If the DAS is actually DCE subelement III, then we would expect to be able to reconstitute its activity with TFIID but not TBP, as we observed previously. TBP-dependent transcriptions using the highly purified reconstituted in vitro system showed no differences between the wild-type UL38 promoter and das1 and das3 (Fig. 3A, TBP lanes). However, when we substituted TFIID for TBP, we observed that the das1 and das3 mutants were now defective transcriptionally (Fig. 3A, TFIID lanes) and that the defects closely resembled those observed with nuclear extracts (compare the lanes with NE with those of TFIID in Fig. 3A). Thus, by sequence identity, position, and requirement for TFIID, we conclude that the DAS element is DCE SIII.
Since the Ad MLP contains a DCE, we asked whether any other adenovirus promoters also contained DCEs. The TATA box-containing E1b and E4 promoters do not have DCE sequences (Fig. 3B), and this correlates with the absence of TFIID DNase I protections downstream of the TATA box on these promoters (12, 32, 92). Although we did not find a promoter that contained all three subelements, the TATA box-containing IX and E3 promoters included DCE subelement III sequences in the expected neighborhood of +30 (Fig. 3B). We also did not find in these promoters any sequences resembling a DPE.
We amplified and cloned the Ad E3 and Ad IX promoter DNA fragments from adenovirus genomic DNA and then mutated only the predicted subelement III AGC sequence in both promoters. Transcriptions using nuclear extracts in vitro were used to compare transcription levels from wild-type and mutant promoters (Fig. 3C and D). As predicted, mutation of the AGC in both the E3 (0.35 ± 0.12) and IX (0.019 ± 0.01) promoters resulted in significant reductions in RNA levels in vitro. These data are significant in two respects. First, they suggest that our initial derivation of the sequence of subelement III is correct in that it accurately predicted the presence of subelement III in two other promoters. Second, it is clear from the UL38, E3, and IX promoters that subelement III can function as a separate distinct element.
If these are DCE subelements, then we would expect SIII function to be recapitulated using the highly purified reconstituted transcription system in vitro. As with the UL38 promoter, we found that the function of Ad E3 subelement III is recapitulated not with TBP, but only with the TFIID complex (Fig. 3C). These data strongly suggest that one or more TAF components of TFIID are capable of a sequence-specific interaction with SIII (see below).
DCE subelement III spacing requirements. We had previously established that the insertion of 5 nucleotides between the subelements in the ß-globin promoter had very deleterious effects on transcription in vitro (43). In a similar vein, we moved SIII in the Ad E3 promoter 3 base pairs closer to or farther from the start site of transcription. In this case, the effects were equally deleterious, severely decreasing transcription (Fig. 3E). We suggest then that the critical spacing parameter established by TFIID may not be the relative distance between SII and SIII, which is one interpretation of the spacing data (43). Instead, the critical factor may be the distance of SIII from the rest of the core promoter, either the TATA box or Inr or some combination of the two. This is especially apparent in the Ad E3 promoter, where there are no other subelements. Similar spacing constraints have been observed for both DPE and MTE downstream elements and the Inr (5, 48).
To further address the spacing properties of the DCE subelement III, we reexamined the Ad MLP. We have previously noted that in the ß-globin promoter, SII and SIII are separated by 10 bp but that in the Ad MLP, the distance is approximately 14 base pairs. Is it possible by moving the Ad MLP SIII AGC closer to SII to increase its contribution to the magnitude of transcription? Note that this also serves to test our interpretation of the E3 spacing mutants. Is it important to maintain the spacing between SII and SIII or, instead, the spacing between SIII and either the TATA box and/or Inr?
We constructed two Ad MLP templates; in one, 4 nucleotides were removed between SII and SIII (4SIII). The second is the corresponding mutation in SIII (AGC to TTT) in this mutant (4mSIII). We compared these and the wild-type constructs to assess the function of SIII in this new spacing context (Fig. 3F). We found that 4SIII shows a decrease in transcription relative to that of the wild-type Ad MLP. The experiment also shows that 4SIII still contains a functional SIII, since mutation of it (4mSIII) further abrogates transcription. Since moving SIII closer to SII did not increase transcription and instead showed less transcription than the wild-type configuration, we conclude that it is the spacing between SIII and the remaining core promoter elements (TATA box and Inr) that is critical for SIII function. We suggest that this optimal spacing resides in the neighborhood of +32. While an SIII positioned away from its optimal position is capable of functioning, transcription levels are compromised.
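The effect of the 4-nucleotide deletion on the position of SIII can be traced with a small sketch. The sequence from +1 to +33 follows from the triplets listed in Materials and Methods; the C at +34 is inferred here from the statements that +31/33 is CAG and that SIII is AGC, and should be read as an assumption. The helper names are hypothetical.

```python
# Ad MLP from +1 to +34. Positions +1..+33 come from the listed triplets;
# the C at +34 is INFERRED from SIII = AGC overlapping the +31/33 CAG triplet.
MLP = "ACTCTCTTCCGCATCGCTGTCTGCGAGGGCCAGC"


def find_positions(seq, motif):
    """Return 1-based start positions of every exact occurrence of motif."""
    k = len(motif)
    return [i + 1 for i in range(len(seq) - k + 1) if seq[i:i + k] == motif]


def delete_region(seq, first, last):
    """Delete the bases from position first to position last (1-based, inclusive)."""
    return seq[:first - 1] + seq[last:]


# Spacing mutant: remove GGCC (+28 to +31), as described for the 4SIII template.
spacing_mutant = delete_region(MLP, 28, 31)
# Same template with SIII itself mutated (AGC to TTT), as for 4mSIII.
spacing_mutant_msiii = spacing_mutant.replace("AGC", "TTT")

siii_before = find_positions(MLP, "AGC")
siii_after = find_positions(spacing_mutant, "AGC")
```

In this reconstruction the AGC starts at +32 in the wild-type promoter and at +28 after the deletion, while SI (CTTC) and SII (CTGT) keep their positions.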
TAF1/TAFII250 photo-cross-links to the three DCE subelements in a sequence-specific manner. The functional data (Fig. 2) and previous work (43) suggested a physical, sequence-specific interaction of TFIID with the DCE. To identify the TAF(s) responsible for this interaction, we utilized a photo-cross-linking protocol developed previously (39, 41, 42). Briefly, a photo-cross-linking adduct, which protrudes from the sugar-phosphate backbone, was incorporated into the Ad MLP DNA probe at the site of interest, and the DNA was labeled with 32P immediately adjacent to the adduct position. The resulting Ad MLP DNA probe was incubated with affinity-purified TFIID (in the presence of TFIIA to facilitate TFIID binding to promoter sequences [36]) and exposed to UV irradiation to cross-link the DNA to a neighboring protein(s). Following nuclease digestion, the "cross-linked" labeled proteins were analyzed by denaturing polyacrylamide gel electrophoresis (PAGE) and autoradiography. Figure 4A shows the results of an initial scan in the neighborhood of subelement I, where we detected an ~200-kDa species that cross-linked to positions +9 and +11. We next extended this analysis to the two remaining subelements. Again, a predominant species of ~200 kDa was cross-linked in a UV-dependent manner to wild-type Ad MLP promoter sequences containing the photo-cross-linking adduct at position +9, +19, or +31, which represent the positions of DCE subelements I, II, and III, respectively (Fig. 4B). In each case, no protein was detected when we used Ad MLP templates containing mutations in each of the three subelements (mSI, mSII, and mSIII), which were shown to be transcriptionally defective (Fig. 1). We also note the presence of several minor (cross-linked) proteins of ~120 kDa and 50 kDa; the nature of these proteins remains to be elucidated, but they possibly represent minor interactions of TAFs with the three subelements.
FIG. 4. TAF1/TAFII250 photo-cross-linking is observed with wild-type but not mutant MLP DCE subelements in the adenovirus MLP. A. Photo-cross-linking adducts were placed at the indicated positions relative to the transcriptional start site on the template strand of the wild-type adenovirus MLP and incubated with affinity-purified TFIID and rTFIIA (see Materials and Methods). PAGE and autoradiography were performed on the resulting cross-linked products. B. Photo-cross-linking adducts were placed at positions +9, +19, and +31 of the template strand of the adenovirus MLP wild-type or mutant DCE subelements (WT SI, mSI, etc.). The mutant subelement sequences are identical to those used in Fig. 1A. PAGE and autoradiography were performed on the resulting cross-linked products, with (lanes 2, 3, 5, 6, 8, 9) or without (lanes 1, 4, and 7) irradiation with UV light. Binding reactions were as described for panel A. C. TFIIA/TFIID photo-cross-linked products (at positions +9, +19, and +31 of the template strand) were dissociated and immunoprecipitated with either protein A-agarose beads alone (lane 1), an anti-FLAG antibody (α-FLAG; Sigma) plus protein A-agarose beads (lanes 2, 4, and 6), or an anti-TAF1/TAFII250 antibody plus protein A-agarose beads (lanes 3, 5, and 7). The immunoprecipitates were then analyzed by PAGE and autoradiography.
Statistical analysis of human promoter databases. To begin to establish the prevalence of the DCE in the mammalian genome, we analyzed two human promoter databases and searched for DCE subelements (see Materials and Methods). We utilized the Eukaryotic Promoter Database (EPD) and the Database of Transcriptional Start Sites (DBTSS). Importantly, the EPD specifically restricts entries to promoters whose start sites have been determined experimentally (4, 68). The DBTSS start sites were determined from full-length cDNAs (59, 74, 75), and a similar strategy was used to construct the Drosophila database used in the initial identification of the MTE (48, 55). Such a strategy is valid, not only because of the identification of the MTE, unknown at the time of that analysis, but also because searches of that database revealed TATA- and DPE-containing promoters (55). As indicated below, our analysis of the EPD and DBTSS yielded similar results, thus indicating that the DBTSS has as accurate a representation of promoter sequences as the EPD, whose transcription start sites were determined with traditional methods (4, 59). The advantage of our approach was demonstrated with a statistical analysis of core promoter elements and their combinations using databases of human promoters (21).
To examine the distribution of subelements, we calculated their actual number, their percentage, and the statistical significance of promoter sequences containing a corresponding motif (SI, SII, or SIII) at the expected positions for different subsets of promoters (Table 1). The promoter database was divided into the following sets (see Materials and Methods): TATA+/TATA− (promoters with/without a TATA box at any position in the window from −33 to −23 bp from +1 [sections 1 and 2 in Table 1]); Inr+/Inr− (promoters with/without the Inr element at any position in the window from −5 to +6 bp [sections 3 and 4]); DPE+/DPE− (promoters with/without DPE at any position in the window from +23 to +28 bp [sections 5 and 6]); TATA+ Inr− (promoters with TATA but without Inr elements [section 7]); Inr+ TATA− (promoters with Inr but without TATA elements [section 8]); and TATA+ Inr+/TATA− Inr− (promoters with/without the TATA and Inr elements with an optimal spacing between them [sections 9 and 10]). A promoter is considered as having an optimal combination of the TATA box and Inr element if the distance between their centers is 25 to 30 bp (17, 58). For comparison, we also calculated the averaged percentage of each motif in a randomly generated DNA sequence (section 11). We considered two scenarios of SI and SII motifs: exact matches to the experimentally derived sequences (i.e., CTTC for SI and CTGT for SII) (Fig. 1) and motifs with any 3 out of 4 bases matching the experimentally derived sequences.
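The two matching scenarios for SI and SII (exact match, or any 3 of 4 bases) can be sketched as follows, together with the chance that a random 4-mer satisfies each criterion. The probability calculation assumes independent, equally frequent bases, which is an assumption made here for illustration; it is an analytic per-position estimate and is not the randomly generated sequence used for section 11 of Table 1. The function names are hypothetical.

```python
from math import comb


def matches_at_least(segment, motif, min_matches):
    """True if segment agrees with motif in at least min_matches positions."""
    if len(segment) != len(motif):
        return False
    return sum(a == b for a, b in zip(segment, motif)) >= min_matches


def random_match_probability(motif_len, min_matches, p=0.25):
    """Chance that a random k-mer matches a fixed motif in >= min_matches bases.

    ASSUMPTION: bases are independent and each matches with probability p
    (0.25 for uniform base composition).
    """
    return sum(
        comb(motif_len, m) * p ** m * (1 - p) ** (motif_len - m)
        for m in range(min_matches, motif_len + 1)
    )


p_exact = random_match_probability(4, 4)         # exact 4-of-4 match
p_three_of_four = random_match_probability(4, 3)  # at least 3 of 4 bases
```

Relaxing the criterion from 4 of 4 to 3 of 4 raises the per-position chance of a random match from 1/256 to 13/256 under these assumptions, which is why the relaxed scenario needs to be compared against a random background.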
TABLE 1. Absolute numbers, percentages, and statistical significance of promoter sequences containing one of the DCE subelements
Although the occurrence frequency of SII downstream of +1 is larger than in regions upstream of +1 when the entire EPD or DBTSS is analyzed, there is no indication of a local occurrence frequency maximum at the expected positions from +16 to +21 relative to +1 (data not shown). However, the presence of SII shows a weak preference for the TATA−, Inr+, DPE, and Inr+ TATA− subsets of promoters (Table 1, rows 3 and 4 in sections 2, 3, 5, and 8).
The occurrence frequency of subelement SIII is above a random distribution in the proximal downstream area, especially at positions from +24 to +34 bp when the entire EPD or DBTSS is examined. The statistical significance of the SIII presence in this interval of positions is extremely high (8.4 SD for EPD and 40.3 SD for DBTSS), suggesting the functional significance of this element in many promoters. The percentages of promoters having subelement SIII at any of those positions are 26.5% and 25.3% for the EPD and DBTSS, respectively. As we showed above, SIII functions as part of the DCE at positions from +30 to +34 and can stand alone at positions from +24 to +27. Therefore, we calculated the percentage of promoter sequences containing SIII in these two intervals for different subsets of promoters (Table 1, rows 5 and 6). From Table 1 (row 5, AGC [+30 to +34]), we see that the presence of SIII correlates strongly with the presence of an Inr element (compare sections 3 and 4). Similar to the correlation between SI and the optimal spacing of the TATA box and Inr, the promoters having the TATA+ Inr+ optimal combination are more likely to have SIII (compare sections 9 and 10). For SIII at positions +24 to +27, the dependence on the presence of the TATA box and Inr element is more definite (compare section 1 to section 2 for the TATA box and section 3 to section 4 for the Inr). The percentage of SIII in TATA+ Inr+ (row 6, AGC [+24 to +27]) promoters is substantially higher than in the TATA− Inr− subset of promoters (sections 9 and 10). The presence of the DPE is not compatible with the presence of SIII in either of the intervals considered (compare sections 5 and 6 in rows 5 and 6). The conclusion of this analysis is that SIII sequences alone are found in a statistically significant number of promoters at the position predicted from our experimental data.
The positional distribution of all three subelements is graphically presented in Fig. S1 in the supplemental material. We also examined how many promoters have various combinations of subelements (see Table S2 in the supplemental material) and the degeneracy of the DCE subelements in the two promoter databases (see Table S3 in the supplemental material).
A second interpretation of the TAF1 photo-cross-linking data is that alterations in DCE sequences might disrupt TAF cross-linking indirectly. These sequence changes may alter DNA bending (which is also sequence specific) such that the TFIID (TAF1)-DNA complex is affected, which in turn perturbs transcription. It is also possible that the TAF1 cross-links simply reflect proximity and not a sequence-specific interaction of TAF1 with DCE DNA sequences. Instead, as mentioned above, the DNA structure might be altered or there may be another TAF responsible for DNA binding. The inability of this hypothetical TAF(s) to bind a mutated DCE sequence might then result in additional structural changes in TFIID such that the TAF1 cross-linking is lost.
A comparison of the DCE and DPE reveals several important differences. The DPE consists of one element centered around +30 (and possible additional sequence information at +24), whose consensus sequence is A/G G A/T C/T GT (40). The DCE is strikingly different in sequence, architecture, and factor requirements. The DCE consists of three subelements, and each is distinct from the DPE sequence: SI is CTTC, SII is CTGT, and SIII is AGC. SI resides approximately from +6 to +11, SII from +16 to +21, and SIII from +30 to +34. SIII may display more positional variability since it is found from +26 to +28 in the HSV UL38 and Ad E3 promoters. Note that the presence of SI and SIII does not correlate with the presence of DPE (compare rows 2 to 6 in sections 5 and 6 in Table 1), suggesting that the functions of these two subelements and the DPE are mutually exclusive. We also amend the conclusions in reference 43 to now suggest that the β-globin DCE is more extensive than previously recognized. SI seems to comprise positions +8 to +11, and it is not clear whether to consider the +13/15 mutation a defect in SI or whether it represents an additional subelement.
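The sequence definitions above lend themselves to a simple scan, sketched here for illustration only. The subelement sequences and windows are those stated in the text; the DPE window of +28 to +33 is an assumption (the text says only that the DPE is centered around +30), and the names `classify` and `window` are hypothetical. The input is assumed to be the downstream sequence beginning at +1.

```python
import re

# Subelement sequences and windows as given in the text (relative to +1).
DCE_SUBELEMENTS = {
    "SI": ("CTTC", 6, 11),
    "SII": ("CTGT", 16, 21),
    "SIII": ("AGC", 30, 34),
}

# DPE consensus A/G G A/T C/T GT written as a regular expression.
DPE_PATTERN = re.compile(r"[AG]G[AT][CT]GT")

# Assumed DPE window; the text states only "centered around +30".
DPE_WINDOW = (28, 33)

def window(downstream, first, last):
    """Return positions first..last, where downstream[0] is the +1 base."""
    return downstream[first - 1:last]

def classify(downstream):
    """Report which downstream elements match within their windows."""
    seq = downstream.upper()
    hits = {
        name: motif in window(seq, first, last)
        for name, (motif, first, last) in DCE_SUBELEMENTS.items()
    }
    hits["DPE"] = DPE_PATTERN.search(window(seq, *DPE_WINDOW)) is not None
    return hits
```

A scan of this kind reports only sequence matches; whether a match is functional depends on promoter context, as the mutational and cross-linking data indicate.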
Photo-cross-linking data indicate that TAF1 interacts with the DCE in a sequence-dependent manner. These data can be contrasted with the observations of Burke and Kadonaga (5), who concluded that dTAFII40/60 (TAF9/TAF6) are necessary for sequence-specific recognition of the DPE. This contrast between the DCE and DPE shows that downstream elements will function by contacting distinct TAFs, and we suggest that other TAFs will recognize additional, as-yet-undiscovered core promoter elements (for example, the recently described MTE [48]).
Despite the expanded definition of the subelements, the distances between the subelements are maintained at 10-base-pair intervals (43). This is most readily seen in the spacing between the β-globin subelements II and III and the spacing of SI and SII of the Ad MLP DCE. The exception is SIII of the Ad MLP, which is 14 bp from SII. However, SIII function in the β-globin, UL38, and E3 promoters is accurately recapitulated with TFIID. Previous work indicated that the spacing between the subelements may be functionally relevant, as 5-base-pair insertions between each of the three subelements were deleterious to transcription (43). Our analysis here, though, indicates that another interpretation is more likely. We found that movement of SIII in the Ad E3 and Ad MLP promoters was deleterious for promoter function (Fig. 3). Since Ad E3 (or Ad IX) does not appear to have SI and SII, we argue that it is much more likely to be the distance from either the TATA box or the Inr element to SIII that is important. In any case, it is apparent that TFIID and the resident TAFs impose architectural constraints on the position of the DCE subelements.
The derivation of sequences of the DCE subelements explains additional data as well. There are two β-thalassemia mutations at +10 (T) (2) and +20 (C to G) (24), and these can now be explained as mutations in SI and SII. The sequence derivations also help to explain the intermediate effects seen in the scanning mutagenesis of both the β-globin and Ad major late promoters as partial defects in a subelement. Indeed, similar partial defects can be seen in the analysis of the MTE (48), and we suggest that these are likely characteristics of downstream elements in general. The presence of a DCE in the Ad MLP explains the observations of several groups that TFIID has extended interactions from the TATA box at −30 downstream to +40, whereas the TFIID interaction with the adenovirus E4 promoter, which does not have any DCE subelements (Fig. 3B), is centered only around the TATA box (12, 52, 67, 92).
Finally, the identification of a TAF1 interaction with the DCE has several significant implications. Our data support the interpretation that TFIID serves as a core promoter recognition complex (9, 25, 28, 36, 37, 81, 82). Additionally, it offers a mechanistic explanation as to why certain core promoter regions are TAF1 responsive (29, 47, 53, 64, 70, 71, 76, 78, 79, 84, 85), and we suggest that some of these promoters will contain DCE subelements. Interestingly, HSV late gene expression, including the UL38 gene, is markedly reduced in a TAF1 temperature-sensitive cell line at the nonpermissive temperature (14). We suggest that this down-regulation is due to the presence of DCE subelements in these late gene promoters. Others have noted that several such late genes do contain DAS elements, within which we find DCE SIII (Fig. 3), and these were proposed to be the basis for the regulation of late gene expression (38, 63, 88).
The photo-cross-linking of TAF1 to wild-type DCE DNA but not mutant DCE DNA suggests that TAF1 contains distinct DNA-binding domains in addition to its kinase, acetyltransferase, and ubiquitinase activities (16, 51, 60), as well as having bromodomains which bind acetylated proteins (33, 50). Several reports have predicted that TAF1 contains an HMG box with extensive similarities to HMG1 (30, 64, 69, 87). Some members of this family of HMG proteins bind nonspecifically to DNA and instead recognize DNA structural features such as bent sequences (HMG1/2), while others in this family do recognize specific DNA sequences (LEF-1, SRY) (3).
In Fig. 5 we propose a model of downstream element function that explains the distinctions between the DCE and DPE. The DCE is readily explained simply via its physical and functional interactions with TFIID via TAF1. TAF1 is capable of sequence-specific DNA binding and is the third example of TAF/sequence-specific core promoter DNA recognition (TAF1/TAF2 Inr recognition and TAF6/9 DPE recognition) (5, 9, 36, 37, 81, 83), although other TAF/promoter DNA contacts have been described previously (54), especially downstream contacts involving TAF1 and TAF2 (19, 23, 61, 62, 77, 91). In contrast, the DPE is recognized by TAF6/TAF9 (5). In order to accommodate core promoters containing a DCE or DPE, we suggest that TFIID assumes different conformations on the DCE- and DPE-dependent promoters (compare Fig. 5A to B and C). These different TFIID conformations lead to the formation of unique PICs (PICDCE and PICDPE), which, due to the architecture of the PIC, may require additional factors to attain a productive transcription initiation complex (as is the case with PICDPE [45]). Alternatively, others have suggested that different TFIID complexes with differing TAF contents exist (10).
FIG. 5. Models depicting the interaction of TFIID with the DPE and DCE. In the three models, TAF1 and TAF2 are jointly responsible for Inr element recognition (for a review, see reference 73). A. DPE sequence recognition is established by TAF6/TAF9 components of TFIID. This interaction results in a unique TFIID conformation that in turn results in the formation of a DPE-specific PIC (PICDPE). B. TFIID interacts with the DCE subelements via TAF1. Again, this results in a TFIID conformation that is different from that of the TFIID/DPE interaction. This also leads to the formation of a DCE-specific PIC (PICDCE). C. A similar interaction of TFIID occurs with promoters containing only SIII of the DCE. In all three cases, these unique PICs consist of their own unique set of factors and cofactors that, in the end, manifest themselves as different regulatory phenomena.
B.A.L. was supported by NIH/NIDDK grant K01 DK60001, and D.R. was supported by NIH grant GM 64844. D.R. is an investigator of the Howard Hughes Medical Institute.
Supplemental material for this article may be found at http://mcb.asm.org/.
T substitution at nt 101 in a conserved DNA sequence of the promotor region of the beta-globin gene is associated with "silent" beta-thalassemia. Blood 73:1705-1711.
2 gene. J. Virol. 66:4855-4863.
Copyright © 2010 by the American Society for Microbiology.