Similar literature
 Found 20 similar documents (search time: 0 ms)
1.
Abstract The influence of local base composition on mutations in chloroplast DNA (cpDNA) is studied in detail and the resulting, empirically derived, mutation dynamics are used to analyze both base composition and codon usage bias. A 4 × 4 substitution matrix is generated for each of the 16 possible flanking base combinations (contexts) using 17,253 noncoding sites, 1309 of which are variable, from an alignment of three complete grass chloroplast genome sequences. It is shown that substitution bias at these sites is correlated with flanking base composition and that the A+T content of these flanking sites as well as the number of flanking pyrimidines on the same strand appears to have general influences on substitution properties. The context-dependent equilibrium base frequencies predicted from these matrices are then applied to two analyses. The first examines whether or not context dependency of mutations is sufficient to generate average compositional differences between noncoding cpDNA and silent sites of coding sequences. It is found that these two classes of sites exist, on average, in very different contexts and that the observed mutation dynamics are expected to generate significant differences in overall composition bias that are similar to the differences observed in cpDNA. Context dependency, however, cannot account for all of the observed differences: although silent sites in coding regions appear to be at the equilibrium predicted, noncoding cpDNA has a significantly lower A+T content than expected from its own substitution dynamics, possibly due to the influence of indels. The second study examines the codon usage of low-expression chloroplast genes. When context is accounted for, codon usage is very similar to what is predicted by the substitution dynamics of noncoding cpDNA. However, certain codon groups show significant deviation when followed by a purine in a manner suggesting some form of weak selection other than translation efficiency. 
Overall, the findings indicate that a full understanding of mutational dynamics is critical to understanding the role selection plays in generating composition bias and sequence structure.
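
The equilibrium base frequencies mentioned in this abstract can be illustrated with a minimal sketch. It assumes a single 4 × 4 matrix of per-step substitution probabilities for one flanking context; the numbers in `P`, the function name `equilibrium_frequencies`, and the use of power iteration are illustrative assumptions, not values or methods taken from the paper.

```python
# Hypothetical substitution probability matrix for ONE flanking context.
# Rows = current base, columns = new base, order A, C, G, T.
# The values are invented for illustration; they are not from the paper.
P = [
    [0.97, 0.01, 0.01, 0.01],
    [0.02, 0.95, 0.01, 0.02],
    [0.02, 0.01, 0.95, 0.02],
    [0.01, 0.01, 0.01, 0.97],
]


def equilibrium_frequencies(matrix, max_iterations=10000, tol=1e-12):
    """Power iteration: apply pi <- pi * P until pi stops changing."""
    n = len(matrix)
    pi = [1.0 / n] * n
    for _ in range(max_iterations):
        nxt = [sum(pi[i] * matrix[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(pi, nxt)) < tol:
            return nxt
        pi = nxt
    return pi


freqs = equilibrium_frequencies(P)
at_content = freqs[0] + freqs[3]  # predicted equilibrium A+T content
```

With this toy matrix, C and G are lost twice as fast as they are gained from A or T, so the predicted equilibrium is A+T-rich. Repeating the calculation for each of the 16 contexts would give context-dependent predictions of the kind the abstract describes.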

2.
Chloroplast DNA evolution in potato (Solanum tuberosum L.)   Total citations: 1; self-citations: 1; citations by others: 0
Summary A deletion specific to chloroplast (ct) DNA of potato (Solanum tuberosum ssp. tuberosum) was determined by comparative sequence analysis. The deletion was 241 bp in size, and was not flanked by direct repeats. Five small, open reading frames were found in the corresponding regions of ctDNAs from wild potato (S. tuberosum ssp. andigena) and tomato (Lycopersicon esculentum). Comparison of the sequences of 1.35-kbp HaeIII ctDNA fragments from potato, tomato, and tobacco (Nicotiana tabacum) revealed the following: the locations of the 5′ ends of both rubisco large subunit (rbcL) and ATPase beta subunit (atp) mRNAs were probably the same as those of spinach (Spinacia oleracea); the promoter regions of the two genes were highly conserved among the four species; and the 5′ untranslated regions diverged at high rates. A phylogenetic tree for the three potato cultivars, one tomato cultivar, and one tobacco cultivar has been constructed by the maximum parsimony method from DNA sequence data, demonstrating that the rate of nucleotide substitution in potato ctDNA is much slower than that in tomato ctDNA. This fact might be due to the differences in the method of propagation between the two crops.

3.
Estimation of evolutionary distances between nucleotide sequences   Total citations: 11; self-citations: 0; citations by others: 11
A formal mathematical analysis of the substitution process in nucleotide sequence evolution was done in terms of the Markov process. By using matrix algebra theory, the theoretical foundation of Barry and Hartigan's (Stat. Sci. 2:191–210, 1987) and Lanave et al.'s (J. Mol. Evol. 20:86–93, 1984) methods was provided. Extensive computer simulation was used to compare the accuracy and effectiveness of various methods for estimating the evolutionary distance between two nucleotide sequences. It was shown that the multiparameter methods of Lanave et al. (J. Mol. Evol. 20:86–93, 1984), Gojobori et al. (J. Mol. Evol. 18:414–422, 1982), and Barry and Hartigan (Stat. Sci. 2:191–210, 1987) are preferable to others for the purpose of phylogenetic analysis when the sequences are long. However, when sequences are short and the evolutionary distance is large, Tajima and Nei's (Mol. Biol. Evol. 1:269–285, 1984) method is superior to others.

4.
We studied the substitution patterns in 7661 well-conserved human–mouse alignments corresponding to the intergenic regions of human chromosome 22. Alignments with a high average GC content tend to have a higher human GC content than mouse GC content, indicating a lack of stationarity. Segmenting the alignments into four groups of GC content and fitting the general reversible substitution model (REV) separately gave significantly better fits than the overall fit and the levels of fit are close to that expected under an REV model. In addition, most of the fitted rate matrices are not of the HKY type but are remarkably strand-symmetric, and we constructed a number of substitution matrices that should be useful for genomic DNA sequence alignment. We did not find obvious signs of temporal inhomogeneity in the substitution rates and concluded that the conserved intergenic regions in human chromosome 22 and mouse appear to have evolved from their common ancestors via a process that is approximately reversible and strand-symmetric, assuming site homogeneity and independence.

5.
A model of nucleotide substitution that allows the transition/transversion rate bias to vary across sites was constructed. We examined the fit of this model using likelihood-ratio tests by analyzing 13 protein coding genes and 1 pseudogene. Likelihood-ratio testing indicated that a model that allows variation in the transition/transversion rate bias across sites provided a significant improvement in fit for most protein coding genes but not for the pseudogene. When the analysis was repeated with parameters estimated separately for first, second, and third codon positions, strong heterogeneity was uncovered for the first and second codon positions; the variation in the transition/transversion rate was generally weaker at the third codon position. The transition rate bias and branch lengths are underestimated when variation in the transition/transversion rate was not accommodated, suggesting that it may be important to accommodate variation in the pattern of nucleotide substitution for accurate estimation of evolutionary parameters. Received: 4 November 1997 / Accepted: 19 May 1998

6.
Maximum likelihood (ML) phylogenies based on 9,957 amino acid (AA) sites of 45 proteins encoded in the plastid genomes of Cyanophora, a diatom, a rhodophyte (red algae), a euglenophyte, and five land plants are compared with respect to several properties of the data, including between-site rate variation and aberrant amino acid composition in individual species. Neighbor-joining trees from AA LogDet distances and ML analyses are seen to be congruent when site rate variability was taken into account. Four feasible trees are identified in these analyses, one of which is preferred, and one of which is almost excluded by statistical criteria. A transition probability matrix for the general reversible Markov model of amino acid substitutions is estimated from the data, assuming each of these four trees. In all cases, the tree with diatom and rhodophyte as sister taxa was clearly favored. The new transition matrix based on the best tree, called cpREV, takes into account distinct substitution patterns in plastid-encoded proteins and should be useful in future ML inferences using such data. A second rate matrix, called cpREV*, based on a weighted sum of rate matrices from different trees, is also considered. Received: 3 June 1999 / Accepted: 26 November 1999

7.
Nucleotide substitutions (i.e., point mutations) are the primary driving force in generating DNA variation upon which selection can act. Substitutions called transitions, which entail exchanges between purines (A=adenine, G=guanine) or between pyrimidines (C=cytosine, T=thymine), typically outnumber transversions (i.e., exchanges between a purine and a pyrimidine) in a DNA strand. With an increasing number of plant studies revealing a transversion rather than transition bias, we chose to perform a detailed substitution analysis for the plant family Cucurbitaceae using data from several short plastid DNA sequences. We generated a phylogenetic tree for 19 taxa of the tribe Benincaseae and related genera and then scored conservative substitution changes (e.g., those not exhibiting homoplasy or reversals) from the unambiguous branches of the tree. Neither the transition nor (A+T)/(G+C) biases found in previous studies were supported by our overall data. More importantly, we found a novel and symmetrical substitution bias in which Gs had been preferentially replaced by A, As by C, Cs by T, and Ts by G, resulting in the GACTG substitution series. Understanding this pattern will lead to new hypotheses concerning plastid evolution, which in turn will affect the choices of substitution models and other tree-building algorithms for phylogenetic analyses based on nucleotide data.
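
The transition/transversion distinction defined in this abstract can be sketched in a few lines. The function name `classify` and the list `gactg_series` are hypothetical, introduced only to show how each step of the reported G, A, C, T, G series falls into one of the two classes.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify(old, new):
    """Return 'transition' or 'transversion' for a substitution old -> new."""
    if old == new:
        raise ValueError("not a substitution: both bases are identical")
    pair = {old, new}
    if not pair <= (PURINES | PYRIMIDINES):
        raise ValueError("expected bases from A, C, G, T")
    # Same chemical class (purine<->purine or pyrimidine<->pyrimidine).
    same_class = pair <= PURINES or pair <= PYRIMIDINES
    return "transition" if same_class else "transversion"


# Steps of the G -> A -> C -> T -> G series reported in the abstract.
gactg_series = [("G", "A"), ("A", "C"), ("C", "T"), ("T", "G")]
labels = [classify(old, new) for old, new in gactg_series]
```

Under this classification the series alternates between the two classes, which is consistent with the abstract's statement that the overall data support neither a transition bias nor a transversion bias.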

8.
It is widely accepted that comparing restriction profiles and maps of chloroplast DNA provides valuable information concerning inter- and/or intraspecific relationships among plant species. Such chloroplast DNA analysis was applied to species and strains in Sesamum, which is a genus of approximately 38 species and contains a large number of strains of the cultivated sesame, S. indicum. Our chloroplast DNA investigations of 22 species and strains showed that (i) among four species (S. capense, S. radiatum, S. schinzianum and S. indicum), the chloroplast genome of S. capense was most distantly related to that of the cultivated species, S. indicum; (ii) chloroplast DNA polymorphism was also recognized among eight cultivated strains collected from various regions in the tropical zone, but not among eight different varieties grown in the temperate zone; and (iii) the chloroplast DNA alterations observed could be attributed to site gains or losses, with the exception of the alteration detected within the inverted repeat sequences in S. capense chloroplast DNA. These results demonstrate the presence of chloroplast genome diversity among Sesamum species and strains, suggesting the usefulness of chloroplast DNA analysis for elucidating the species relationships in the genus Sesamum and the origin and evolutionary process of the cultivated sesame species. The present paper is based on the contribution which was read in a symposium entitled “Organellar DNA Variations in Higher Plants and Their Taxonomic Significance”, at the 50th Annual Meeting of the Botanical Society of Japan in Shizuoka on October 2, 1990, under the auspices of the Japan Society of Plant Taxonomists.

9.
The variability of cocoa (Theobroma cacao) cytoplasmic genomes has been investigated. A total of 177 cocoa clones was surveyed for restriction fragment length polymorphism (RFLP) in chloroplast DNA and in mitochondrial DNA using two restriction endonucleases and various heterologous cytoplasmic probes. A high level of polymorphism was found for the mitochondrial genome. This study points up a structuring of the species that fits with the distinction between the Criollo and Forastero populations. In contrast to all previous analyses, a higher level of polymorphism is found among the Criollo clones while the Forastero clones form quite a homogeneous group.

10.
Substitution rates were estimated for the coding and noncoding regions of the hepatitis delta virus (HDV). The estimated rates of synonymous substitution in HDV were lower than the rates of substitution at nonsynonymous sites and in the noncoding region. HDV has lower synonymous substitution rates than the hepatitis C virus, though both are RNA viruses. The relatively low rate of synonymous substitution in HDV may be due to a strong preference of G and C nucleotides at third codon positions. Variation in substitution rate among HDV lineages may be correlated with the clinical development of the HDV-induced hepatitis. The phylogenetic tree inferred for 24 HDV strains reveals similarities between lineages isolated from the same geographic region. Correspondence to: W.-H. Li

11.
Substitutions occurring in noncoding sequences of the plant chloroplast genome violate the independence of sites that is assumed by substitution models in molecular evolution. The probability that a substitution at a site is a transversion, as opposed to a transition, increases significantly with increasing A + T content of the two adjacent nucleotides. In the present study, this dependency of substitutions on local context is examined further in a number of noncoding regions from the chloroplast genome of members of the grass family (Poaceae). Two features were examined: the influence of specific neighboring bases, as opposed to the general A + T content, on transversion proportion, and an influence on substitutions by nucleotides other than the two immediately adjacent to the site of substitution. In both cases, a significant effect was found. In the case of specific nucleotides, transversion proportion is significantly higher at sites with a pyrimidine immediately 5′ on either strand. Substitutions at sites of the type YNR, where N is the site of substitution, have the highest rate of transversion. This specific effect is secondary to the A + T content effect such that, in terms of proportion of substitutions that are transversions, the nucleotides are ranked T > A > C > G as to their effect when they are immediately 5′ to the site of substitution. In the case of nucleotides other than the immediate neighbors, a significant influence on substitution dynamics is observed in the case where the two neighboring bases are both A and/or T. Thus, substitutions are primarily, but not exclusively, influenced by the composition of the two nucleotides that are immediately adjacent. These results indicate that the pattern of molecular evolution of the plant chloroplast genome is extremely complex as a result of a variety of inter-site dependencies. Received: 18 October 1996 / Accepted: 12 April 1997
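
The kind of tally behind a "transversion proportion by flanking A + T content" statement can be sketched as follows. This is a simplified illustration on two aligned toy sequences, not the authors' procedure: the sequences, the function name `transversion_by_context`, and the rule of scoring only sites whose two neighbours are unchanged are all assumptions made for the example.

```python
from collections import defaultdict

PURINES = set("AG")
BASES = set("ACGT")


def is_transversion(a, b):
    """True when exactly one of the two bases is a purine."""
    return (a in PURINES) != (b in PURINES)


def transversion_by_context(seq1, seq2):
    """Map flanking A+T count (0, 1 or 2) -> [transversions, substitutions]."""
    tally = defaultdict(lambda: [0, 0])
    for i in range(1, len(seq1) - 1):
        a, b = seq1[i], seq2[i]
        left, right = seq1[i - 1], seq1[i + 1]
        # Score only variable sites whose two neighbours are unchanged.
        if a == b or left != seq2[i - 1] or right != seq2[i + 1]:
            continue
        if not {a, b, left, right} <= BASES:
            continue  # skip gaps and ambiguity codes
        at_count = (left in "AT") + (right in "AT")
        tally[at_count][1] += 1
        if is_transversion(a, b):
            tally[at_count][0] += 1
    return dict(tally)


# Toy aligned sequences (invented): three variable sites.
context_counts = transversion_by_context("TATGCGCACA", "TTTGTGCAAA")
```

In this toy alignment both substitutions flanked by two A/T bases are transversions, while the one flanked by two G bases is a transition. A real analysis would need many sites per context and a statistical test before drawing any conclusion.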

12.
Rates of synonymous and nonsynonymous nucleotide substitutions and codon usage bias (ENC) were estimated for a number of nuclear and chloroplast genes in a sample of centric and pennate diatoms. The results suggest that DNA evolution has taken place, on average, at a slower rate in the chloroplast genes than in the nuclear genes: a rate variation pattern similar to that observed in land plants. Synonymous substitution rates in the chloroplast genes show a negative association with the degree of codon usage bias, suggesting that genes with a higher degree of codon usage bias have evolved at a slower rate. While this relationship has been shown in both prokaryotes and multicellular eukaryotes, it has not been demonstrated before in diatoms. Received: 3 June 1998 / Accepted: 11 August 1998

13.
This study presents the first global, 1-Mbp-level analysis of patterns of nucleotide substitutions along the human lineage. The study is based on the analysis of a large amount of repetitive elements deposited into the human genome since the mammalian radiation, yielding a number of results that would have been difficult to obtain using the more conventional comparative method of analysis. This analysis revealed substantial and consistent variability of rates of substitution, with the variability ranging up to twofold among different regions. The rates of substitutions of C or G nucleotides with A or T nucleotides vary much more sharply than the reverse rates, suggesting that much of that variation is due to differences in mutation rates rather than in the probabilities of fixation of C/G vs. A/T nucleotides across the genome. For all types of substitution we observe substantially more hotspots than coldspots, with hotspots showing substantial clustering over tens of Mbp’s. Our analysis revealed that GC-content of surrounding sequences is the best predictor of the rates of substitution. The pattern of substitution appears very different near telomeres compared to the rest of the genome and cannot be explained by the genome-wide correlations of the substitution rates with GC content or exon density. The telomere pattern of substitution is consistent with natural selection or biased gene conversion acting to increase the GC-content of the sequences that are within 10–15 Mbp away from the telomere. Reviewing Editor: Dr. Jerzy Jurka
This revised version was published online in July 2005 with corrected page numbers.

14.
In infectious disease epidemiology, it is useful to know how quickly genetic markers of pathogenic agents evolve while inside hosts. We propose a modular framework with which these genotype change rates can be estimated. The estimation scheme requires a model of the underlying process of genetic change, a detection scheme that filters this process into observable quantities, and a monitoring scheme that describes the timing of observations. We study a linear "birth-shift-death" model for change in transposable element genotypes, obtaining maximum-likelihood estimators for various detection and monitoring schemes. The method is applied to serial genotypes of the transposon IS6110 in Mycobacterium tuberculosis. The estimated birth rate of 0.0161 (events per copy of the transposon per year) and death rate of 0.0108 are both significantly larger than the estimated shift rate of 0.0018. The sum of these estimates, which corresponds to a "half-life" of 2.4 years for a typical strain that has 10 copies of the element, substantially exceeds a previous estimate of 0.0135 total changes per copy per year. We consider experimental design issues that enable the precision of estimates to be improved. We also discuss extensions to other markers and implications for molecular epidemiology.
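
The quoted half-life can be reproduced from the three rate estimates with a short calculation. The sketch below assumes that changes occur independently per copy with exponentially distributed waiting times, so that a strain's total rate is the per-copy rate multiplied by the copy number; that reading is an assumption made here and is not spelled out in the abstract.

```python
import math

# Per-copy, per-year rate estimates quoted in the abstract.
birth_rate = 0.0161
death_rate = 0.0108
shift_rate = 0.0018

per_copy_rate = birth_rate + death_rate + shift_rate  # total changes per copy per year
copies = 10                                           # "typical strain" in the abstract
strain_rate = copies * per_copy_rate                  # changes per strain per year

# Time at which the probability of an unchanged genotype drops to 1/2,
# assuming an exponential waiting time with rate `strain_rate`.
half_life_years = math.log(2) / strain_rate
```

The summed per-copy rate is 0.0287 per year, roughly twice the earlier estimate of 0.0135, and the resulting half-life rounds to the 2.4 years stated in the abstract.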

15.
Summary The course of evolutionary change in DNA sequences has been modeled as a Markov process. The Markov process was represented by discrete time matrix methods. The parameters of the Markov transition matrices were estimated by least-squares direct-search optimization of the fit of the calculated divergence matrix to that observed for two aligned sequences. The Markov process corrected for multiple and parallel substitutions of bases at the same site. The method avoided the incorrect assumption of all previously described methods that the divergence between two present-day sequences is twice the divergence of either from the common and unknown ancestral sequence. The three previous methods were shown to be equivalent. The present method also avoided the undesirable assumptions that sequence composition has not changed with time and that the substitution rates in the two descendant lineages were the same. It permitted simultaneous estimation of ancestral sequence composition and, if applicable, of different substitution rates for the two descendant lineages, provided the total number of estimated parameters was less than 16. Properties of the Markov chain were discussed. It was proved for symmetric substitution matrices that all elements of the equilibrium divergence matrix equal 1/16, and that the total difference in the divergence matrix at epoch k equals the total change in the common substitution matrix at epoch 2k for all values of k. It was shown how to resolve an ambiguity in the assignment of two different substitution rates to the two descendant lineages when four or more similar sequences are available. The method was applied to the divergence matrix for codon site 3 for the mouse and rabbit beta-globins. This observed divergence matrix was significantly asymmetric and required at least two different substitution rates. This result could be achieved only by using different asymmetric substitution matrices for the two lineages.
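
The 1/16 equilibrium result for symmetric substitution matrices can be checked numerically. The sketch below assumes the divergence matrix is the joint frequency of base pairs at aligned sites, D[i][j] = sum over ancestral bases a of pi[a] * Pk[a][i] * Pk[a][j], with both lineages sharing one matrix P applied for k epochs. The matrix values, the ancestral composition `pi`, and the helper names are invented for illustration and are not taken from the paper.

```python
# Hypothetical symmetric substitution matrix, order A, C, G, T.
# Transitions (A<->G, C<->T) get 0.03 per epoch, transversions 0.01.
P = [
    [0.95, 0.01, 0.03, 0.01],
    [0.01, 0.95, 0.01, 0.03],
    [0.03, 0.01, 0.95, 0.01],
    [0.01, 0.03, 0.01, 0.95],
]
pi = [0.4, 0.1, 0.2, 0.3]  # arbitrary ancestral base composition (assumption)


def matmul(a, b):
    """Plain-Python product of two square matrices."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def matrix_power(matrix, k):
    """Multiply the matrix by itself k times (k >= 1)."""
    result = matrix
    for _ in range(k - 1):
        result = matmul(result, matrix)
    return result


def divergence_matrix(ancestral, p_k):
    """Joint frequency of base i in lineage 1 and base j in lineage 2."""
    n = len(p_k)
    return [[sum(ancestral[a] * p_k[a][i] * p_k[a][j] for a in range(n))
             for j in range(n)] for i in range(n)]


D_long = divergence_matrix(pi, matrix_power(P, 2000))
```

After many epochs every entry of `D_long` approaches 1/16 whatever the ancestral composition, because a symmetric stochastic matrix is doubly stochastic and its powers converge to the uniform matrix.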

16.
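Complete mitochondrial cytochrome b (Cyt b) sequences and partial control region (D-loop) sequences were obtained from 49 individuals of Neosalanx taihuensis. The partial D-loop sequences ranged from 648 to 680 bp in length. We identified a tandem repeat near the front end, one extended termination-associated sequence (ETAS), three conserved sequence blocks in the central conserved domain (CSB-F, CSB-E, CSB-D), and one conserved sequence block in the conserved sequence block domain (CSB-1); this structure is similar to that reported for other fishes. A comparison of the relative evolutionary rates of the Cyt b and D-loop fragments showed that the overall proportion of polymorphic sites in the D-loop (0.83%) was lower than that in the Cyt b fragment (1.31%). With the mean relative evolutionary rate of the Cyt b gene set to 1, Bayesian MCMC simulation gave an interval estimate of 1.000 ± 0.131 for the relative rate of Cyt b and 0.859 ± 0.261 for the D-loop, indicating that the D-loop of N. taihuensis evolves more slowly than Cyt b and that the variance of its posterior distribution is also relatively large. Cyt b thus has a relatively higher evolutionary rate than the D-loop and conforms more closely to the molecular clock hypothesis. The Cyt b gene can therefore be considered more suitable than the D-loop for studies of molecular ecology and phylogeographic patterns within N. taihuensis and among closely related species.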

17.
The amino acid sequences of proteins provide rich information for inferring distant phylogenetic relationships and for predicting protein functions. Estimating the rate matrix of residue substitutions from amino acid sequences is also important because the rate matrix can be used to develop scoring matrices for sequence alignment. Here we use a continuous time Markov process to model the substitution rates of residues and develop a Bayesian Markov chain Monte Carlo method for rate estimation. We validate our method using simulated artificial protein sequences. Because different local regions such as binding surfaces and the protein interior core experience different selection pressures due to functional or stability constraints, we use our method to estimate the substitution rates of local regions. Our results show that the substitution rates are very different for residues in the buried core and residues on the solvent-exposed surfaces. In addition, residues on the binding surfaces have substitution rates very different from those of residues in the rest of the protein. Based on these findings, we further develop a method for protein function prediction by surface matching using scoring matrices derived from estimated substitution rates for residues located on the binding surfaces. We show with examples that our method is effective in identifying functionally related proteins that have overall low sequence identity, a task known to be very challenging.

18.
Primula cuneifolia Ledeb. (Primulaceae), we analyzed intraspecific variation of the nucleotide sequences of non-coding regions of chloroplast DNA: the intergenic spacers between trnT (UGU) and the trnL (UAA) 5′exon, the trnL (UAA) 3′exon and trnF (GAA), and atpB and rbcL. In 20 populations of P. cuneifolia, 22 nucleotide substitutions and five insertions/deletions were inferred, and their genetic distances ranged from 0.001 to 0.008. Eight distinct haplotypes could be recognized and each haplotype was found to be geographically structured. Three major clades (the Northern, Hokkaido and Southern clades) were revealed in phylogenetic analyses of the haplotypes. The haplotypes of the Northern clade had a wider distribution area in the populations of Mt. Rausu and Rishiri Island of eastern and northern Hokkaido in Japan, northward to Unalaska Island in the Aleutians, and those of the Hokkaido clade were distributed in the populations of central Hokkaido and Mt. Iwaki of the northern Honshu in Japan; in addition, those of the Southern clade were observed only in the populations of the central Honshu. It was shown that the genetic diversifications of the Southern clade were higher than those of the Northern and Hokkaido clades. Furthermore, it was shown that the topology within the Southern clade was hierarchical, and the haplotypes of the Southern populations in the clade were derivative. From these results, we concluded that the cpDNA haplotypes of the three clades in P. cuneifolia arose and assumed the present distribution areas through several cycles of glacial advance and retreat in the Pleistocene. Received 24 June 1998/ Accepted in revised form 28 December 1998

19.
Polygonum cuspidatum in Japan, we analyzed the chloroplast DNA sequences of a region from the rbcL to the accD gene (ca. 1,420 bp), and found nucleotide variations at 22 sites in 68 samples. The phylogenetic relationship deduced from the sequence variations revealed the existence of at least five groups. The first group consisted of P. cuspidatum var. cuspidatum in the central part of Honshu; in Nagano, Yamanashi, and Shizuoka. The second, a sister of the first, consisted of those plants in Shizuoka-Itoigawa Line. The third group consisted of plants in the northern part of Japan including P. sachalinense in Hokkaido, P. cuspidatum var. cuspidatum in Aomori and var. uzensis in Akita. The fourth consisted of var. uzensis in the Tohoku District. The fifth consisted of var. terminalis in the Izu Islands. P. cuspidatum are differentiated according to their distribution, and two varieties, var. terminalis and var. uzensis, are differentiated genetically. Polygonum sachalinensis, a distinct species morphologically, fell into the accessions of P. cuspidatum on the phylogenetic tree obtained in the present study. Received 9 July 2000/ Accepted in revised form 11 October 2000

20.
A strong negative correlation between the rate of amino-acid substitution and codon usage bias in Drosophila has been attributed to interference between positive selection at nonsynonymous sites and weak selection on codon usage. To further explore this possibility we have investigated polymorphism and divergence at three kinds of sites (synonymous, nonsynonymous and intronic) in relation to codon bias in D. melanogaster and D. simulans. We confirmed that protein evolution is one of the main explanatory parameters for interlocus codon bias variation (r² approximately 40%). However, intron or synonymous diversities, which could have been expected to be good indicators of local interference [here defined as the additional increase of drift due to selection on tightly linked sites, also called 'genetic draft' by Gillespie (2000)] did not covary significantly with codon bias or with protein evolution. Concurrently, levels of polymorphism were reduced in regions of low recombination rates whereas codon bias was not. Finally, while nonsynonymous diversities were very well correlated between species, neither synonymous nor intron diversities observed in D. melanogaster were correlated with those observed in D. simulans. Altogether, our results suggest that the selective constraint on the protein is a stable component of gene evolution while local interference is not. The pattern of variation in genetic draft along the genome therefore seems to be unstable through evolutionary time and should be considered a minor determinant of codon bias variance. We argue that selective constraints for optimal codon usage are likely to be correlated with selective constraints on the protein, both between codons within a gene, as previously suggested, and also between genes within a genome.
