Molecular and Cellular Biology, May 2003, p. 3152-3162, Vol. 23, No. 9
0270-7306/03/$08.00+0 DOI: 10.1128/MCB.23.9.3152-3162.2003
Copyright © 2003, American Society for Microbiology. All Rights Reserved.
Verna and Marrs McLean Department of Biochemistry and Molecular Biology,¹ Department of Molecular and Human Genetics, Baylor College of Medicine,³ Institute of Biosciences and Technology, Center for Genome Research, Texas A & M University, Houston, Texas 77030²
Received 11 October 2002/ Returned for modification 26 November 2002/ Accepted 3 February 2003
DM is the most common inherited neuromuscular disorder in adults, affecting 1 in 8,000 people worldwide (21). The DM mutation, which is autosomally dominant, arises due to expansion of a CTG repeat located in the 3' untranslated region of the gene for DM protein kinase (DMPK) (78). The mechanism by which the expanded CTG repeats cause the characteristic features of the disease is not well defined and is likely to be complicated. It has been shown previously that transcripts from the mutant allele are retained in the nucleus (12), that the distribution of DMPK cytoplasmic mRNA isoforms is altered (76), that expression of the adjacent SIX5 homeodomain gene is decreased (34), and that binding of CUG binding proteins to mutant DMPK mRNAs (75) can alter the splicing of other, unrelated mRNAs (42, 56). The lack of DM patients with point mutations in the DMPK gene suggests that simple haploinsufficiency does not account for the disease phenotype (78), a conclusion supported by the relatively minor phenotype displayed by DMPK-knockout mice (28).
The size distributions of normal and mutant DM alleles, along with extensive analyses of CAG · CTG repeats (hereafter, CAG repeats) in sperm (39, 84), suggest that the instability thresholds for CTG repeats in DM and for CAG repeats in other diseases are around 30 repeats (80). Typically, there is a bias for male germ line transmission to generate the first clinically recognized disorder in a DM pedigree (40) and a bias for female germ line transmission to generate congenital DM, the most severe form of the disease (20). It is unclear whether germ line instability of CTG repeats in DM results from meiotic or from mitotic events. Instability of CAG triplet repeats, which are associated with other diseases, can occur before meiosis (39), during arrest in prophase I (32), and in postmeiotic haploid cells (35). It is clear that CTG instability can occur during mitotic divisions, based on changes in repeat length in different tissues from DM patients (22, 43, 83), in patient-derived cell lines upon serial passage (1, 82), and in tissues and cell lines from transgenic mouse models (19, 79).
The ability of long CTG repeats to form unusual secondary structures is the likely basis for expansion at the DMPK locus (81). Several in vitro studies have shown that CTG and CAG repeats can form hairpins (18, 73, 74) and slipped-strand DNA duplexes (55). In addition, studies of Escherichia coli (11, 30, 67) and Saccharomyces cerevisiae (48) suggest that CTG and CAG repeats form unusual DNA structures in cells. One of the strongest links between unusual secondary structures and disease progression is the effect of nucleotide changes within the repeat sequence. Interruptions that decrease the propensity of the repeat to form secondary structures also reduce the frequency of expansions and the potential for disease (7, 37, 73).
Both E. coli and S. cerevisiae provide useful model systems to probe genetic influences on triplet repeat instability, especially of CTG and CAG repeats. Virtually every process that exposes single strands of DNA destabilizes triplet repeats, including transcription (6, 70), nucleotide excision repair (50, 53), mismatch repair (29, 61, 68, 71), replication (23, 30, 44, 67), and recombination (17, 24, 25, 26, 27, 58, 59). CTG and CAG triplet repeats also cause double-strand DNA breaks in yeast (17, 26). Differences in assays and differences in frequencies of contractions and expansions make it difficult to rank order these processes in terms of their effects on repeat instability. Nevertheless, it is thought that errors in mismatch repair correspond to small changes in repeat length, errors in replication correspond to small and intermediate changes, and errors during recombination correspond to intermediate and large changes (7, 78). Studies in these systems have shaped present models for replication-based and recombination-based instability of triplet repeats (7, 60, 78).
The causes of triplet repeat instability in mammalian cells are difficult to investigate directly. Changes in repeat length upon serial passage of cell lines derived from patients (1, 82) or transgenic mice (16, 19) have been characterized, and the influences of mismatch repair (36, 79) and replication (9, 51) have been examined. Present approaches in mammalian cells are limited, however, because different sizes of repeats cannot be readily tested in the same genetic background, different genomic contexts and orientations of the repeats are difficult to compare, and different genetic influences cannot be easily examined. An additional significant restriction is a lack of sensitivity, which is limited to a frequency of 10⁻² to 10⁻³ for changes in repeat length that must be screened for in the cell population.
In this paper we describe a model system in CHO cells that is designed to overcome some of these limitations. We have deposited CTG repeats from the DM gene into an intron in one copy of a tandemly duplicated pair of APRT genes at their endogenous locus in CHO cells. By selecting for homologous recombination between the duplicated copies of the APRT gene, we can examine changes to the inserted CTG repeats in a population of cells that have experienced a nearby recombination event. Within this selected population we show that long CTG repeats experience large contractions and generate a high frequency of rearrangements that extend beyond the repeat tract. These distinctive changes were not observed for long tracts of CTG repeats in replicating cells that were not recombinant at APRT. Instead, replicating cells displayed a high frequency of expansions and contractions that usually involved just one to three triplets. These results indicate that homologous recombination dramatically destabilizes long CTG repeats in CHO cells.
Construction of cell lines. Site-specific recombination involving the FLP recombinase and the FLP recombinase recombination target (FRT) site was used to generate tandemly duplicated APRT genes, as described previously (46). The APRT- gene in the RMP41 cell line carries a point mutation in exon 2 that eliminates APRT function and removes an EcoRV recognition sequence (63). The RMP41 cell line also carries an FRT site in intron 2, which does not affect the function of the APRT gene (46). Plasmids pRW3502, pRW3504, and pRW3506 each carry an FRT site adjacent to the tracts of inserted CTG repeats. FLP recombinase-mediated site-specific recombination between the FRT sites in these plasmids and the FRT site in RMP41 cells was used to generate APRT+ colonies with the tandemly duplicated gene structures shown in Fig. 1. Sequencing of the triplet repeats in each cell line showed that GS3502 carried (CTG)17 and GS3504 carried (CTG)98, as expected. Cell line GS3506, however, carried (CTG)183 instead of the expected (CTG)175. The presence of 183 CTG repeats, instead of the expected 175, presumably reflects some instability that occurred in the growth of the plasmid in E. coli or in the targeting and establishment of the cell line. In GS3506 the G-to-A changes are located in CTG repeats 27 and 70 (instead of CTG repeats 27 and 67). In each of these cell lines, the upstream copy of APRT is inactive due to the point mutation in exon 2 and the exon 5 truncation, whereas the downstream APRT gene, which carries the CTG triplet repeats in intron 2, is functional. Structures of all cell lines were verified by Southern blotting after digestion with restriction enzymes diagnostic for the predicted structure.
FIG. 1. Structures of the APRT locus in parental CHO cells and in colonies isolated under various selections. (A) Molecular structures of the tandem duplication of APRT sequences at the endogenous locus. The locations of the CTG triplet repeats in the parental cell lines are shown above their common site of insertion in the second intron of the downstream, functional APRT gene (the five exons of APRT are shown as boxes). The sequences immediately surrounding the upstream FRT site (black triangle) and downstream FRT site (open triangle) are different and therefore distinguishable. The upstream copy of APRT is nonfunctional by virtue of a truncated fifth exon and a point mutation in exon 2 that eliminates an EcoRV site (filled box). The upstream and downstream copies share 6.8 kb of homology indicated by brackets: 4.5 kb upstream of the APRT gene (thick line) and 2.3 kb of homology within the gene itself. (B) Molecular structures of the APRT locus in APRT- and TK- APRT- colonies. Products were distinguished by a combination of Southern blotting and PCR analysis. Conversions have a structure like the parental tandem duplication, except that some lose the insert as part of the conversion process (status of the insert is indicated by +/-). CTG+ conversions were distinguished from mutations by PCR analysis. Mutations are assumed to carry point mutations or small deletions elsewhere in the APRT gene; however, they were not further characterized. Crossovers have a single copy of the APRT gene whose digestion pattern depends on whether the insert (+) was retained or lost (-). Rearrangements yield a Southern blot pattern that does not correspond to conversions, crossovers, or mutations.
Fluctuation analysis (38, 41) was carried out by using 12 parallel cultures grown from initial populations of 50 to 100 cells for each rate determination, as described previously (63). The numbers of APRT- or TK- APRT- colonies in parallel cultures were used to calculate rates by the method of the median (38). A single colony was picked from each parallel culture to ensure that all analyzed colonies arose independently.
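The rate calculation can be sketched in code. The following is a minimal illustration, assuming that the method of the median cited above is the Lea-Coulson estimator, in which the expected number of events per culture, m, satisfies r/m - ln(m) = 1.24 for a median colony count r, and that the rate is m divided by the final number of cells per culture. The function names, the colony counts, and the final culture size are hypothetical and are not taken from this study; some treatments also include a factor of ln 2, which is omitted here.

```python
import math
from statistics import median

def events_per_culture(r_median, lo=1e-9, hi=1e9, tol=1e-12):
    """Solve r/m - ln(m) - 1.24 = 0 for m by bisection (assumed Lea-Coulson median estimator)."""
    def f(m):
        return r_median / m - math.log(m) - 1.24
    # f decreases monotonically in m, so a single root lies between lo and hi
    for _ in range(200):
        mid = math.sqrt(lo * hi)  # geometric midpoint suits the wide bracket
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
        if hi / lo - 1 < tol:
            break
    return math.sqrt(lo * hi)

def rate_from_cultures(colony_counts, final_cells_per_culture):
    """Rate per cell per generation, taken here as m / N (an assumption)."""
    m = events_per_culture(median(colony_counts))
    return m / final_cells_per_culture

# Hypothetical colony counts from 12 parallel cultures (not data from this study)
counts = [4, 7, 9, 10, 10, 10, 10, 12, 15, 21, 30, 58]
rate = rate_from_cultures(counts, final_cells_per_culture=1e7)
```

Using the median rather than the mean keeps a single jackpot culture, such as the hypothetical count of 58 above, from dominating the estimate.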
Southern analyses, PCR analysis, and DNA sequencing. Southern analyses were carried out according to standard protocols (62). The probe for Southern analysis was the 3.9-kb BamHI fragment containing the entire APRT gene, labeled by random priming with [32P]dCTP. PCR analysis of the recombination products was carried out as previously described (63). The distinction between conversions and point mutations was based on restriction digestion of a PCR product that includes exon 2. The copy of exon 2 in the downstream APRT gene was specifically amplified by using one primer that was located in unique sequences adjacent to the CTG repeat in intron 2. Convertants give rise to PCR products that are not cleaved by EcoRV because they have picked up the EcoRV mutation from the upstream copy of the APRT gene. By contrast, colonies that have acquired point mutations elsewhere in the APRT gene give rise to a PCR fragment that is cleaved by EcoRV.
PCR amplification of triplet repeat fragments for sequencing used one primer in intron 2 and a second primer at the 3' end of the gene, which is unique to the downstream copy. This choice of primers allowed the downstream insertion site to be specifically amplified. In addition, it was found that embedding the CTG repeats in a larger PCR fragment gave more reliable sequencing results. The locations of PCR primers used for analysis of rearrangements and their sequences are available on request. DNA sequencing was carried out on amplified PCR fragments to determine the numbers of triplet repeats and to decipher the structures of the rearrangement junctions.
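Counting repeats in a sequenced fragment can be illustrated with a short sketch. This is not the procedure used in this study, which is not described beyond direct sequencing. It assumes that the G-to-A changes read as CTA triplets within the tract, and the function name, the flanking bases, and the reads are hypothetical.

```python
import re

def describe_ctg_tract(seq, interruption="CTA"):
    """Locate the longest in-frame run of CTG (or interrupting) triplets.

    Returns (number_of_triplets, 1-based positions of interrupting triplets).
    """
    seq = seq.upper()
    best = ""
    for match in re.finditer(f"(?:CTG|{interruption})+", seq):
        if len(match.group()) > len(best):
            best = match.group()
    # Split the tract into triplets and record those that are not CTG
    triplets = [best[i:i + 3] for i in range(0, len(best), 3)]
    interruptions = [i + 1 for i, t in enumerate(triplets) if t != "CTG"]
    return len(triplets), interruptions

# Hypothetical reads: arbitrary flanking bases around an uninterrupted tract
# and around a tract with two interruptions
short_read = "GGATCC" + "CTG" * 17 + "AAGCTT"
long_read = ("GGATCC" + "CTG" * 26 + "CTA" + "CTG" * 42 + "CTA"
             + "CTG" * 113 + "AAGCTT")
```

A flanking sequence that happened to continue the CTG or CTA frame would be counted as part of the tract, so real reads would need the tract boundaries checked against the known insertion site.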
Statistical analysis. Means were compared by the two-tailed t test. Standard errors of the means are reported in Table 2 and were used to determine the propagated errors reported in Table 4. For t test comparisons of the means in Table 4, however, the propagated error was recalculated by using standard deviations (each of which is the standard error × the square root of the sample size). Distributions were compared by the chi-square test. For two-by-two comparisons with 1 df, the Yates adjustment was used to compensate for the tendency of such comparisons to exaggerate significance. This adjustment consists of changing the observed frequencies by half a unit to give smaller deviations from the expected values, thereby giving a larger, more realistic P value (8). For all comparisons a P value of 0.05 was used to accept or reject the null hypothesis, which was that the means or distributions were the same. All calculations for the statistical tests were performed by using the PHStat add-in for Excel.
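The Yates adjustment for a two-by-two comparison can be sketched as follows. This is an independent illustration rather than the PHStat calculation itself. It uses the standard shortcut formula for a 2 × 2 table and the identity that the upper-tail probability of a chi-square variable with 1 df equals erfc(√(χ²/2)). The function name is hypothetical, and the example counts are the large contractions reported in Results (5 of 95 recombinants versus 0 of 252 colonies from replicating populations).

```python
import math

def yates_chi_square(a, b, c, d):
    """Chi-square with Yates continuity correction for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # Shrinking |ad - bc| by n/2 is equivalent to moving each observed count
    # half a unit toward its expected value
    diff = max(abs(a * d - b * c) - n / 2, 0.0)
    chi2 = n * diff ** 2 / (row1 * row2 * col1 * col2)
    # Upper-tail probability of a chi-square variable with 1 df
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Rows: recombinants, replicating colonies; columns: large contraction, no large contraction
chi2, p = yates_chi_square(5, 90, 0, 252)
```

With these counts the sketch evaluates to a chi-square of about 10 and a P value of about 0.002, in line with the value reported for this comparison.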
TABLE 2. Rates of APRT- and TK- APRT- colony formation
TABLE 4. Individual rates of formation of different types of colonies
We chose to test CTG repeats in the same orientation in which they occur in the DM gene, that is, so that the RNA transcript carries CUG repeats. In contrast to the DMPK mRNA, which retains the CUG sequences, the CUG repeats are absent from the APRT mRNA because they are removed by splicing, along with the rest of the sequences in intron 2. Because cells that carry the CTG repeats in the otherwise wild-type copy of the gene are phenotypically APRT+, the CUG repeats in the RNA evidently do not interfere with normal splicing, nor do they have any observable effects on cell growth. We tested three lengths of CTG repeats. (CTG)17 is within the range of repeats in healthy individuals and was anticipated to be unaffected by DNA replication and homologous recombination. By contrast, (CTG)98 and (CTG)183 are in the range found in affected individuals and have been demonstrated previously to cause instability in E. coli (30).
CTG repeat stability in replicating cells. To measure the influence of replication on CTG repeat stability, we grew the (CTG)17, (CTG)98, and (CTG)183 cell lines (GS3502, GS3504, and GS3506, respectively) through about 30 cell doublings and then isolated individual colonies from each cell line. Individual colonies were grown to about 10⁶ cells and analyzed by PCR amplification with flanking primers. Out of 43 colonies from the (CTG)17 cell line, 118 colonies from the (CTG)98 cell line, and 134 colonies from the (CTG)183 cell line, no colonies gave rise to a PCR fragment that differed substantially from its siblings, suggesting that these repeat lengths are reasonably stable during cell proliferation (data not shown). However, a slight unevenness in the alignment of bands was evident in gels from (CTG)183 colonies, as is apparent in the long gel run shown in Fig. 2.
FIG. 2. Agarose gel electrophoresis of PCR fragments generated by amplification across the CTG repeats in individual colonies isolated from a population of proliferating GS3506 cells. The individual colonies correspond to those in Table 1, and the PCR products shown here (marked by CTG) were the ones that were isolated and subjected to DNA sequence analysis. The bands in the outside lanes show standard 2.5- and 3.0-kb markers from a commercial ladder. To accentuate the slight differences between bands, the fragments were electrophoresed through about 20 cm of gel. The source of the faint band at around 2.5 kb in all lanes is unknown.
TABLE 1. Lengths of CTG triplet repeats in colonies from a population of proliferating cells
Initially, we measured the rates of production of APRT- and TK- APRT- cells to determine whether CTG repeats might stimulate homologous recombination, as long CTG repeats have been shown elsewhere to do in E. coli and yeast (17, 24, 25, 26, 27, 58, 59). As shown in Table 2, the (CTG)17, (CTG)98, and (CTG)183 cell lines yielded APRT- and TK- APRT- colonies at rates that were not substantially different from the mean rates for the other cell lines, which each carried the identical tandemly duplicated APRT locus, but with different DNA inserts in the downstream copy of the gene. Thus, CTG repeats of these lengths do not dramatically stimulate homologous recombination in mammalian cells.
To measure the contributions of homologous recombination, mutation, and rearrangement to the measured rates, independent APRT- and TK- APRT- colonies were isolated and examined by Southern blotting and PCR analysis. A combination of these two approaches was used to identify the molecular structure of the APRT locus and to assign each colony to a specific category of event. An example of a Southern blot analysis of BamHI-cleaved DNA from several APRT- colonies derived from the (CTG)98 cell line is shown in Fig. 3B. Colonies 194, 198, 207, 208, 210, and 218 (lanes 1, 5, 10, 11, 12, and 16, respectively) have Southern blot patterns that are consistent either with a conversion event that has retained the CTG sequence or with a point mutation elsewhere in the APRT gene. Subsequent PCR analyses showed that colonies 198, 207, and 218 (lanes 5, 10, and 16, respectively) are conversion events and that colonies 194, 208, and 210 (lanes 1, 11, and 12, respectively) arose by point mutation. Colonies 195 and 196 (lanes 2 and 3) are conversion events that have lost the repeats. Colonies 215 and 217 (lanes 13 and 15) are crossover events that have retained the CTG repeat, and colony 199 (lane 6) is a crossover event that has lost the repeat. Colonies 200 and 203 (lanes 7 and 8) are rearrangements. The results of all such analyses are listed in Table 3.
FIG. 3. Southern analysis of APRT- colonies isolated from the (CTG)98 cell line, GS3504. (A) Restriction map for conversions and crossovers. Arrows indicate the sites at which BamHI cleaves, and the sizes of the resulting fragments are indicated in kilobases. Because a BamHI site is located adjacent to the CTG sequence in the insert, the Southern blot pattern depends on whether the insert is retained or lost. A conversion that has lost the insert yields fragments of 12.3 and 4.0 kb. If the insert is retained, the 4.0-kb fragment is replaced by a pair of fragments at 1.3 and 3.1 kb. Similarly, a crossover that has lost the insert yields a single band at 4.0 kb, whereas a crossover that has retained the repeat yields bands at 1.3 and 3.1 kb. (B) Southern blot of BamHI-digested DNA from APRT- colonies isolated from the (CTG)98 cell line, GS3504. DNAs from individual colonies were digested with BamHI, and the fragments were resolved by gel electrophoresis and made visible by Southern blotting. Numbers at the side indicate the lengths of fragments in kilobases. Numbers at the top identify the individual colonies. The structures of GS3504-200 and GS3504-203, which are rearrangements, are shown in more detail in Fig. 4.
TABLE 3. Distribution of types of products among APRT- and TK- APRT- colonies
As expected, most of the APRT- colonies for all CTG cell lines arose by homologous recombination. The proportion of homologous recombinants for the (CTG)17 cell line (87%) is not significantly less than the 95% for the pooled data from the other cell lines (P = 0.07); however, the proportions of homologous recombinants for the (CTG)98 and (CTG)183 cell lines (72 and 82%, respectively) are significantly less (P = 7 × 10⁻⁷ and P = 4 × 10⁻⁴, respectively). This suggests that these two cell lines generate homologous recombinants at lower rates and/or generate mutations and rearrangements at higher rates.
To determine the rates for specific types of events for each cell line, the percentage of each event (Table 3) was multiplied by the overall rate of APRT- or TK- APRT- colony formation (Table 2). The individual rates are presented in Table 4. All three CTG cell lines have significantly lower rates of conversion relative to the pooled data from the other cell lines [(CTG)17, P = 0.02; (CTG)98, P = 0.0004; (CTG)183, P = 0.001]. In addition, the rates of crossover recombination in the APRT- population and in the TK- APRT- population are significantly elevated for the (CTG)183 cell line (P = 0.005 for both). These changes are evident in the ratios of conversions to crossovers, which decrease from 6.1 in cell lines that do not contain CTG repeats to 3.8 for the (CTG)17 cell line, 2.5 for the (CTG)98 cell line, and 0.9 for the (CTG)183 cell line.
Effects of homologous recombination on CTG repeats. These results indicate that CTG repeats affect homologous recombination processes in their vicinity; however, they do not address the issue of CTG repeat stability during homologous recombination. One way that instability might manifest itself would be as changes in repeat length in APRT- and TK- APRT- recombinants. To detect such changes, conversion and crossover recombinants that retained their CTG repeats were screened by PCR. No changes were found in recombinants isolated from the (CTG)17 cell line. However, in addition to the expected small variations detected in replicating populations of (CTG)98 and (CTG)183 cell lines (data not shown), five large contractions were found among the 95 CTG-positive recombinants from the (CTG)98 and (CTG)183 cell lines. Contractions to 58, 76, and 81 repeats were isolated from the (CTG)98 cell line; contractions to 96 and 157 repeats were isolated from the (CTG)183 cell line. These contractions were all associated with homologous recombination events: three with conversions and two with crossovers. The contraction to 76 repeats, which arose in a conversion event, is shown in lane 10 in Fig. 3B. Five large contractions among 95 recombinants (5%) represent a significant increase (P = 0.002) over replicating cell populations, which yielded no large changes out of 252 colonies (<0.4%). These results indicate that long CTG repeats yield large contractions at a >10-fold-higher rate during homologous recombination than during replication.
A second way that instability might manifest itself is as rearrangements that disrupt the function of the APRT gene, or of both the TK and APRT genes. Among 176 APRT- and TK- APRT- colonies isolated from the (CTG)98 and (CTG)183 cell lines, 17 were rearrangements, accounting for nearly 10% of all colonies (Table 3). In contrast, the (CTG)17 cell line yielded no rearrangements out of 69 colonies, and the other cell lines (33, 63, 65, 66) have yielded no rearrangements out of 429 APRT- and TK- APRT- colonies (Table 3). Seventeen rearrangements out of 176 colonies (10%) from the (CTG)98 and (CTG)183 cell lines is significantly different (P = 2 × 10⁻¹²) from 0 rearrangements out of 498 colonies (<0.2%) from the (CTG)17 cell line and the pooled data from the other cell lines. Thus, in a population that has been selected for a nearby homologous recombination event, long CTG repeats are associated with a >50-fold-higher frequency of rearrangements (10% compared with <0.2%).
To determine whether there might be a direct link between recombination and the formation of rearrangements, 12 of the 17 rearrangements were examined by more extensive Southern blotting and PCR analyses (Fig. 4). These rearrangements include a variety of deletions and insertions, but their common feature is that they all involve the CTG sequence. In 7 of the 12 rearrangements, one or both of the restriction sites that flank the triplet repeat were deleted, indicating that the rearrangement encompasses the CTG repeat or ends within or adjacent to it. In the other five rearrangements, extra DNA was present at the site of the repeat.
FIG. 4. Molecular structures of rearrangements. The structure of the APRT locus in the parental (CTG)98 and (CTG)183 cell lines, GS3504 and GS3506, respectively, is indicated at the top. B and H indicate sites of BamHI and HindIII cleavage, respectively. The CTG tracts in these cell lines carry a BamHI site to the left of the CTG sequence and a HindIII site to the right. Numbers to the right identify individual colonies. Southern blotting was carried out on BamHI-digested DNA and on HindIII-digested DNA for each colony. PCR analyses were used to refine estimates of the positions of the ends of the rearrangements. Boxes above the site of the original CTG sequence indicate various inserted sequences. Open boxes in GS3504-658 and GS3506-859 indicate insertions that have not been further characterized. For GS3504-200, GS3504-639, GS3504-657, GS3506-289, and GS3506-295 the rearrangement junction fragment was isolated and sequenced.
GS3504-639 from the (CTG)98 cell line appears to have been derived from a conversion event that went awry. The downstream copy of the APRT gene now resembles the upstream copy: it includes the mutation in exon 2, the FRT site from the upstream sequence, the 3' truncation of exon 5, and the contiguous plasmid sequences (Fig. 4). This homologous copy of the upstream sequence is joined nonhomologously through its plasmid sequences to the downstream APRT flanking sequences. GS3504-657 from the (CTG)98 cell line has picked up 34 bp from the upstream FRT sequence, which is joined nonhomologously to the remaining two CTG repeats and homologously to sequences on the 3' side of the downstream FRT site (Fig. 4). GS3506-295 from the (CTG)183 cell line has retained its CTG repeat but carries an insert in place of a 65-bp deletion (Fig. 4). The insert consists of inverted copies of two discontinuous segments of upstream sequences. One segment includes plasmid sequences and contiguous APRT sequences extending from the truncated fifth exon into intron 3. The other segment is an inverted 47-bp copy of sequences around the upstream FRT site. These two segments are linked by a 6-bp sequence whose source is unclear.
The five sequenced rearrangements contain a total of eight nonhomologous junctions. For six of those junctions the sequences of the DNAs that formed the junction are known, allowing the homology at each of these junctions to be determined. These junctions display the microhomology that is typical of nonhomologous junctions in mammalian cells (64). One junction had one nucleotide of homology, one junction had two nucleotides of homology, and four junctions had three nucleotides of homology. Thus, the rearrangements generated during homologous recombination are complex, with elements of both homologous and nonhomologous recombination in evidence.
In contrast to the small changes observed in replicating cells, large changes were common among homologous recombinants. Once again, the changes were confined to the longer CTG repeats, with (CTG)17 repeats being stable. The (CTG)98 and (CTG)183 cell lines showed two kinds of instability in response to recombination. Large contractions of these CTG repeats (greater than 15 repeats) were stimulated more than 10-fold by nearby homologous recombination events. High frequencies of recombination-stimulated repeat instability have also been reported for E. coli and yeast. In experiments where both contractions and expansions could be assayed, some studies have reported a two- to eightfold preponderance of contractions (26, 58, 59), whereas others have shown mainly expansions (17, 24, 25). Given the small number of characterized examples in the present study, we can conclude only that recombination-associated instability in our system is biased in favor of contractions.
The most striking manifestation of recombination-associated repeat instability, however, was a novel class of rearrangements that extended outside the CTG repeats. These occurred at a frequency more than 50-fold above the normal frequency of rearrangements (33, 63, 65, 66). Three of the five sequenced rearrangements had inserted DNA from the upstream copy of the APRT gene at the site of the CTG repeat. In two of these cases, one end of the rearrangement was clearly formed by a homologous recombination event, while the other end was formed by a nonhomologous event. These footprints of recombination at the sites of rearrangements involving the CTG repeats argue that homologous recombination is responsible for these events.
Rearrangements associated with triplet repeats have been noted previously for E. coli (24, 25, 57) and in patients with fragile X syndrome (13, 15, 45, 47, 69). The characterized fragile X rearrangements include deletions that encompass the CGG repeat (13, 69) and deletions that have one end within the repeat (45, 47), which are similar to the types of rearrangements that we have identified for CTG repeats. Like the more common CGG repeat expansion, deletions are observed at the fragile X MR gene (FMR1) because they can eliminate the function of FMR1, which is the basis for the disease phenotype. By contrast, deletions of CTG repeats and flanking DNA sequences have not been observed as the cause of DM, presumably because inactivation of the DMPK gene does not lead to the disease (28, 78). Our results suggest that careful examination of somatic tissues from DM patients would uncover cells with rearrangements of the type described here. Analysis of patient samples, which usually focuses on changes in repeat number with flanking PCR primers, may overlook these rearrangements. In general, rearrangements triggered by long tracts of triplet repeats might be expected to contribute to the progression of diseases such as fragile X syndrome and Friedreich ataxia, which are caused by loss-of-function mutations, but not of diseases such as DM and Huntington disease, which are caused by gain-of-function mutations (10, 78).
For both replication and transcription, the effects on triplet repeat stability in E. coli and yeast (6, 17, 23, 44, 53, 67) are accompanied by a reciprocal effect on the process itself. Triplet repeats have been shown elsewhere to interfere, for example, with the progress of DNA polymerase (31, 77) and RNA polymerase (54). These reciprocal effects are presumed to have the same root cause: the unusual structure adopted by the repeat. In this study we have shown not only that CTG triplet repeats are significantly destabilized by homologous recombination but also that CTG repeats have a reciprocal effect on homologous recombination. The most dramatic effect was observed for the (CTG)183 cell line, which displayed a three- to fourfold-lower rate of conversion and a two- to threefold-higher rate of crossover. In this cell line, an additional difference was evident among the crossover class of recombinants. In previous studies with other inserted sequences (63), the insert was retained (25 examples) about as often as it was lost (29 examples). Equal retention versus loss was also observed for (CTG)17 (12 versus 12) and for (CTG)98 (9 versus 13); however, for (CTG)183 there was a significant bias (P = 0.002) toward loss of the repeat (10 retained versus 41 lost).
In yeast, long triplet repeat sequences have been shown previously to cause frequent double-strand DNA breaks, which stimulate homologous recombination (3, 17, 26). In our studies, no substantial increase in homologous recombination was observed, suggesting that CTG repeats do not induce frequent DNA breaks at the APRT locus in CHO cells. Unlike E. coli and S. cerevisiae, however, which repair breaks almost exclusively by homologous recombination, mammalian cells repair breaks by both homologous and nonhomologous processes. It was shown previously that I-SceI-induced double-strand breaks at the site where the CTG repeat sequences were inserted lead to a 100-fold increase in homologous recombinants and to a 1,000-fold increase in rearrangements (64). In the present studies we observed more than a 50-fold increase in rearrangements for long CTG repeats. If these rearrangements had resulted from CTG-induced breaks and the ratio of break resolution by homologous and nonhomologous recombination were maintained, we might have expected a fivefold increase in homologous recombination, which would have been readily detected. This line of reasoning does not rule out a low frequency of CTG-induced breaks in CHO cells, but it does encourage us to think about other ways by which long CTG repeats might influence homologous recombination.
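The expected fivefold increase follows from proportional scaling. As a worked illustration, under the assumption stated in the text that CTG-induced breaks would be resolved in the same ratio as I-SceI-induced breaks:

```latex
\[
\frac{\text{fold increase in homologous recombinants}}{\text{fold increase in rearrangements}}
  = \frac{100}{1{,}000} = 0.1,
\qquad
50 \times 0.1 = 5 .
\]
```

A 50-fold increase in rearrangements would therefore be accompanied by roughly a fivefold increase in homologous recombinants if breaks at the repeat were the common cause.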
Focusing on the (CTG)183 cell line, where the effects were most evident, and on known recombination activities, we can envision the following pathway. We assume that recombination is initiated at a spontaneous double-strand break (that is, one not induced by the CTG repeat) and the ends are stripped normally, as outlined in Fig. 5. When the repeat becomes single stranded, however, we propose that it forms a hairpin, which triggers removal of the single-stranded tail by Ercc1/Xpf endonuclease (14). This nuclease is known to remove single-strand tails from standard DNA hairpins (14) but has not been tested against triplet repeat hairpins. The resulting hairpin cap might reasonably be expected to alter the ability of this intermediate to participate in conversions and crossovers. For example, if the terminal hairpin blocked strand invasion by preventing the loading of Rad51 (4, 49, 72), then gene conversion, which occurs predominantly by synthesis-dependent strand annealing (SDSA), would be inhibited, as it is in the (CTG)183 cell line. By contrast, formation of crossovers, which occurs mainly by single-strand annealing (SSA) and is not dependent on Rad51 (52), would not be inhibited and might be increased, as it is in the (CTG)183 cell line, if recombination intermediates are diverted from the more common SDSA pathway. This proposed scenario would also account for the preferential loss of repeats from the crossover class of recombinants in the (CTG)183 cell line. Finally, it is possible that the hairpin cap shunts this abnormal intermediate toward resolution by a nonhomologous pathway, leading to rearrangements. This model makes several testable predictions.
FIG. 5. Speculative model for interaction of CTG repeats with homologous recombination. Although the upstream and downstream copies of the two APRT genes are not shown explicitly, the CTG repeats are present in the top strand of the downstream copy, as illustrated. To generate APRT- crossovers by SSA requires a double-strand break between the mutant and wild-type copies of exon 2, that is, to the left of the CTG repeat, as shown. Conversions by SDSA can be generated in a variety of ways that depend on invasion of a 3' end, priming of DNA synthesis, release of the extended strand, and pairing with sequences on the other side of the break. The depicted events depart from the standard models for SSA and SDSA (52) at the formation of a hairpin by the triplet repeats and cleavage of the hairpin by the Ercc1/Xpf (E/X) endonuclease. It is hypothesized that a single strand ending in a hairpin is compromised for the strand invasion step in the SDSA pathway. If the initial break is in APRT sequences to the right of the repeat, the right-hand end will not have a hairpin and conversion may proceed normally.
This work was supported by National Institutes of Health grants GM38219 (J.H.W.), GM52982 (R.D.W.), NS37554 (R.D.W.), and ES11347 (R.D.W.) and by funds from the Robert A. Welch Foundation (R.D.W.) and the Muscular Dystrophy Association (J.H.W.).