Similar literature
 20 similar documents found (search time: 0 ms)
1.
We report here a novel method for predicting melting temperatures of DNA sequences based on a molecular-level hypothesis on the phenomena underlying the thermal denaturation of DNA. The model presented here attempts to quantify the energetic components stabilizing the structure of DNA such as base pairing, stacking, and ionic environment which are partially disrupted during the process of thermal denaturation. The model gives a Pearson product-moment correlation coefficient (r) of ∼0.98 between experimental and predicted melting temperatures for over 300 sequences of varying lengths ranging from 15-mers to genomic level and at different salt concentrations. The approach is implemented as a web tool (www.scfbio-iitd.res.in/chemgenome/Tm_predictor.jsp) for the prediction of melting temperatures of DNA sequences.

2.
We demonstrate quantitatively that, as predicted by evolutionary theory, sequences of homologous proteins from different species converge as we go further and further back in time. The converse, a non-evolutionary model, can be expressed as probabilities, and the test works for chloroplast, nuclear and mitochondrial sequences, as well as for sequences that diverged at different time depths. Even on our conservative test, the probability that chance could produce the observed levels of ancestral convergence for just one of the eight datasets of 51 proteins is ≈1×10^−19 and combined over 8 datasets is ≈1×10^−132. By comparison, there are about 10^80 protons in the universe, hence the probability that the sequences could have been produced by a process involving unrelated ancestral sequences is about 10^50 lower than picking, among all protons, the same proton at random twice in a row. A non-evolutionary control model shows no convergence, and only a small number of parameters are required to account for the observations. It is time that researchers insisted that doubters put up testable alternatives to evolution.
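
The order-of-magnitude comparison in this abstract can be checked with base-10 logarithms. The sketch below assumes that "picking the same proton twice in a row" means a second random draw matching the first, so its probability is one in the number of protons. The variable names are illustrative and do not come from the paper.

```python
# Work in log10 so the tiny probabilities never underflow.
LOG10_PROTONS = 80         # about ten to the 80th protons (figure from the abstract)
LOG10_P_COMBINED = -132    # combined probability over the 8 datasets (from the abstract)

# Assumption: the second draw must match the first, so p = 1 / (number of protons).
log10_p_same_proton = -LOG10_PROTONS

# Orders of magnitude by which the combined probability is smaller.
gap = log10_p_same_proton - LOG10_P_COMBINED
print(f"combined probability is about 10^{gap} times smaller")
```

Under this reading the gap is 52 orders of magnitude, which agrees with the abstract's rounded "about" figure of 50.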

3.
We describe the development and testing of a simple statistical mechanics methodology for duplex DNA applicable to sequences of any composition and extensible to genomes. The microstates of a DNA sequence are modeled in terms of blocks of basepairs that are assumed to be fully closed (paired) or open. This approach generates an ensemble of bubblelike microstates that are used to calculate the corresponding partition function. The energies of the microstates are calculated as additive contributions from hydrogen bonding, basepair stacking, and solvation terms parameterized from a comprehensive series of molecular dynamics simulations including solvent and ions. Thermodynamic properties and nucleotide stability constants for DNA sequences follow directly from the partition function. The methodology was tested by comparing computed free energies per basepair with the experimental melting temperatures of 60 oligonucleotides, yielding a correlation coefficient of −0.96. The thermodynamic stability of genic/nongenic regions was tested in terms of nucleotide stability constants versus sequence for the Escherichia coli K-12 genome. It showed clear differentiation of the genes from promoters and captured genic regions with a sensitivity of 0.94. The statistical thermodynamic model presented here provides a seemingly new handle on the challenging problem of interpreting genomic sequences.
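
The idea of summing Boltzmann weights over bubble-like microstates can be illustrated with a toy model. This is a minimal sketch, not the authors' method: it assumes at most one contiguous open block per microstate, and the per-basepair energies in `PAIR_ENERGY` are hypothetical values chosen only so that G:C is more stable than A:T, standing in for the paper's simulation-derived hydrogen-bonding, stacking and solvation terms.

```python
import math

K_B = 0.0019872  # Boltzmann constant in kcal/(mol*K)

# Hypothetical stabilization energies (kcal/mol); NOT the paper's parameters.
PAIR_ENERGY = {"A": -1.0, "T": -1.0, "G": -2.0, "C": -2.0}


def microstates(n):
    """Yield the closed/open pattern of every single-bubble microstate."""
    yield [True] * n  # fully closed duplex
    for start in range(n):
        for end in range(start + 1, n + 1):
            # basepairs start..end-1 form one contiguous open block
            yield [not (start <= k < end) for k in range(n)]


def closed_probabilities(seq, temperature):
    """Boltzmann-weighted probability that each basepair is closed."""
    beta = 1.0 / (K_B * temperature)
    z = 0.0
    closed_weight = [0.0] * len(seq)
    for state in microstates(len(seq)):
        # additive energy: only closed basepairs contribute
        energy = sum(PAIR_ENERGY[b] for b, closed in zip(seq, state) if closed)
        weight = math.exp(-beta * energy)
        z += weight  # the partition function is the sum of all weights
        for k, closed in enumerate(state):
            if closed:
                closed_weight[k] += weight
    return [w / z for w in closed_weight]


cold = closed_probabilities("ATATGCGC", 280.0)
hot = closed_probabilities("ATATGCGC", 370.0)
print([round(p, 3) for p in cold])
print([round(p, 3) for p in hot])
```

In this toy setting every basepair is less likely to be closed at the higher temperature, and the A:T-rich end opens more readily than the G:C-rich end, which is the qualitative behaviour a per-nucleotide stability profile is meant to capture.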

5.
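A study of the non-independence of nucleotide substitutions during DNA sequence evolution   Total citations: 4, self-citations: 2, citations by others: 2
杨子恒 (Yang Ziheng) 《遗传学报》 (Acta Genetica Sinica) 1990,17(5):354-359
This paper reviews methods for estimating the number of nucleotide substitutions between DNA sequences and examines models of DNA evolution by comparing histone genes from seven species. The base composition at the third codon position of the H2A gene varies greatly among species and is strongly positively correlated with the base composition at the first position of the H2A gene, at the first and third positions of the H4 gene, and in the upstream and downstream sequences of H2A. This suggests that species-specific regional constraints act during DNA sequence evolution. Possible causes are a rise in GC content in higher eukaryotes, or chromosomal recombination that places these homologous sequences in different isochores, where they experience different selection and mutation pressures. Correlation analysis of nucleotide substitutions at the positions within a codon shows that substitutions at different positions are not independent, possibly because a single substitution event changes several positions. The implications of these results for the inference of phylogenetic trees are discussed.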

6.
A Model for DNA Sequence Evolution within Transposable Element Families   Total citations: 5, self-citations: 2, citations by others: 3
J. F. Y. Brookfield 《Genetics》1986,112(2):393-407
A quantitative model is proposed for the expected degree of relationship between copies of a family of transposable elements in a finite population of hosts. Special cases of the model (in which the process of homogenization of element copies either is or is not limited by transposition rate) are presented and illustrated, using data on mobile sequences from different species. It is shown that transposition will be expected, in large populations, to result in only a rather distant relationship between transposable elements at different genomic sites. Possible inadequacies of the model are suggested and quantified.

7.
We suggest hypotheses to account for two major features of chromosomal organization in higher eukaryotes. The first of these is the general restriction of crossing over in the neighborhood of centromeres and telomeres. We propose that this is a consequence of selection for reduced rates of unequal exchange between repeated DNA sequences for which the copy number is subject to stabilizing selection: microtubule binding sites, in the case of centromeres, and the short repeated sequences needed for terminal replication of a linear DNA molecule, in the case of telomeres. An association between proximal crossing over and nondisjunction would also favor the restriction of crossing over near the centromere. The second feature is the association between highly repeated DNA sequences of no obvious functional significance and regions of restricted crossing over. We show that highly repeated sequences are likely to persist longest (over evolutionary time) when crossing over is infrequent. This is because unequal exchange among repeated sequences generates single copy sequences, and a population that becomes fixed for a single copy sequence by drift remains in this state indefinitely (in the absence of gene amplification processes). Increased rates of exchange thus speed up the process of stochastic loss of repeated sequences.

8.
Biotechnological and biomolecular advances have introduced novel uses for DNA such as DNA computing, storage, and encryption. For these applications, DNA sequence design requires maximal desired (and minimal undesired) hybridizations, which are the product of a single new DNA strand from two single DNA strands. Here, we propose a novel constraint to design DNA sequences based on thermodynamic properties. Existing constraints for DNA design are based on the Hamming distance, a constraint that does not address the thermodynamic properties of the DNA sequence. Using a unique, improved genetic algorithm, we designed DNA sequence sets which satisfy different distance constraints and employ a free energy gap based on a minimum free energy (MFE) to gauge DNA sequences based on set thermodynamic properties. When compared to the best constraints of the Hamming distance, our method yielded better thermodynamic qualities. We then used our improved genetic algorithm to obtain lower-bound DNA sequence sets. Here, we discuss the effects of novel constraint parameters on the free energy gap.
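
For context, the Hamming-distance baseline that this abstract contrasts with its thermodynamic constraint can be sketched as follows. This shows only a common form of the combinatorial check (direct distance plus distance to the reverse complement), not the authors' free energy gap. The function `satisfies_distance` and the candidate sequences are hypothetical.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def satisfies_distance(seqs, d):
    """True if every pair differs in at least d positions, both directly
    and against the other sequence's reverse complement."""
    for i, s in enumerate(seqs):
        for t in seqs[i + 1:]:
            if hamming(s, t) < d or hamming(s, reverse_complement(t)) < d:
                return False
    return True


# Hypothetical candidate set, for illustration only.
candidates = ["ACGTACGT", "TTGGCCAA", "AGAGTCTC"]
print(satisfies_distance(candidates, 4))
```

A check like this counts mismatches only. Two sets that pass the same distance threshold can still differ widely in hybridization free energy, which is the gap the abstract's constraint is meant to address.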

9.
The cattle genome contains several distinct centromeric satellites with interrelated evolutionary histories. We compared these satellites in Bovini species that diverged 0.2 to about 5 Myr ago. Quantification of hybridization signals by phosphor imaging revealed a large variation in the relative amounts of the major satellites. In the genome of water buffalo this has led to the complete deletion of satellite III. Comparative sequencing and PCR-RFLP analysis of satellites IV, 1.711a, and 1.711b from the related Bos and Bison species revealed heterogeneities in 0.5 to 2% of the positions, again with variations in the relative amounts of sequence variants. Restriction patterns generated by double digestions suggested a recombination of sequence variants. Our results are compatible with a model of the life history of satellites during which homogeneity of interacting repeat units is both cause and consequence of the rapid turnover of satellite DNA. Initially, a positive feedback loop leads to a rapid saltatory amplification of homogeneous repeat units. In the second phase, mutations inhibit the interaction of repeat units and coexisting sequence variants amplify independently. Homogenization by the spreading of one of the variants is prevented by recombination and the satellite is eventually outcompeted by another, more homogeneous tandem repeat sequence. Received: 21 July 2000 / Accepted: 30 October 2000

10.
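This paper evaluates stochastic models of nucleotide substitution during DNA sequence evolution, and examines and extends them to cases in which the substitution rate is not constant over time and across sites. Lanave et al. (1984) proposed a model that was claimed to make no assumptions about the pattern of substitution, but we show that it in fact assumes the substitution process to be reversible. Calculations with the 2-p, 4-p and 6-p models show that variation in substitution rate among sites causes the estimated number of substitutions to be seriously underestimated, and the larger the number of substitutions, the larger the bias. Variation in the pattern of substitution among sites also leads to underestimation, but the bias is not serious.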
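
The underestimation caused by rate variation among sites can be reproduced with a small numerical sketch. For simplicity it uses the one-parameter Jukes-Cantor model rather than the 2-p, 4-p and 6-p models of the paper, and the two site classes in `RATES` are hypothetical. Because the expected proportion of differing sites is a concave function of distance, pooling fast and slow sites and applying a single-rate correction gives an estimate below the true mean distance.

```python
import math


def jc_expected_p(d):
    """Expected proportion of differing sites under Jukes-Cantor at distance d."""
    return 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))


def jc_distance(p):
    """Jukes-Cantor estimate of substitutions per site from a proportion p."""
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def estimated_distance(true_d, rates):
    """Pool site classes whose relative rates average 1, then apply the
    single-rate correction to the pooled proportion of differences."""
    pooled_p = sum(jc_expected_p(r * true_d) for r in rates) / len(rates)
    return jc_distance(pooled_p)


RATES = [0.2, 1.8]  # hypothetical slow and fast site classes, mean rate 1
for true_d in (0.1, 0.5, 1.0):
    est = estimated_distance(true_d, RATES)
    print(f"true {true_d:.2f}  estimated {est:.3f}  bias {true_d - est:.3f}")
```

With equal rates the estimate recovers the true distance. With unequal rates the shortfall grows as the true distance increases, in line with the abstract's statement that the bias is larger when more substitutions have occurred.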

11.
Recent studies of mitochondrial DNA sequences have indicated the requirement for substantial revisions of the morphological understanding of the phylogeny of Megachiroptera (Pteropodidae). There is disagreement between studies as to what these revisions might be. This investigation was undertaken to expand the number of studied species and to add the first data from a nuclear gene sequence. For 12S ribosomal DNA (aligned length of 405 positions), 75 Megachiroptera (50 species in 20 genera) and two outgroup species were sequenced. For the oncogene c-mos (aligned length of 488 bases), 56 Megachiroptera (42 species in 19 genera) were sequenced and three eutherians from GenBank used as outgroups. The root of the megachiropteran phylogeny cannot be determined with the present data. Nyctimene, the only studied insectivorous genus (Paranyctimene not being included), plus Notopteris, the only long-tailed megachiropteran, form the sister clade to the other genera in combined analyses. Several alternative rootings are not rejected by the data, suggesting a rapid early radiation. Generic distributions indicate that this may have occurred in Melanesia. The results confirm that the subfamily Macroglossinae is not monophyletic, with the long-tongued phenotype arising at least twice, and support the existence of a major clade including a monophyletic endemic African component and biogeographically neighboring genera such as Rousettus and Eonycteris. The phylogenetic position of one African genus, Eidolon, remains uncertain. A cynopterine section (excluding Nyctimene and Myonycteris) is supported, albeit weakly, as a monophyletic group. Pteropus and the related, possibly polyphyletic genus Pteralopex, are unexpectedly basal compared to previous molecular studies.

12.
13.
P. Marjoram  P. Donnelly 《Genetics》1994,136(2):673-683
We consider the effect on the distribution of pairwise differences between mitochondrial DNA sequences of the incorporation into the underlying population genetics model of two particular effects that seem realistic for human populations. The first is that the population size was roughly constant before growing to its current level. The second is that the population is geographically subdivided rather than panmictic. In each case these features tend to encourage multimodal distributions of pairwise differences, in contrast to existing, unimodal datasets. We argue that population genetics models currently used to analyze such data may thus fail to reflect important features of human mitochondrial DNA evolution. These may include selection on the mitochondrial genome, more realistic mutation mechanisms, or special population or migration dynamics. Particularly in view of the variability inherent in the single available human mitochondrial genealogy, it is argued that until these effects are better understood, inferences from such data should be rather cautious.
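
The distribution of pairwise differences discussed in this abstract is simple to compute from aligned sequences. The sketch below is a minimal illustration with made-up sequences. It assumes all sequences are aligned and of equal length, and the function name is hypothetical.

```python
from collections import Counter
from itertools import combinations


def mismatch_distribution(seqs):
    """Map each pairwise-difference count to the fraction of sequence pairs showing it."""
    counts = Counter(
        sum(a != b for a, b in zip(s, t)) for s, t in combinations(seqs, 2)
    )
    n_pairs = sum(counts.values())
    return {k: counts[k] / n_pairs for k in sorted(counts)}


# Hypothetical aligned sequences, for illustration only.
sample = ["ACGTAC", "ACGTAT", "ACGAAT", "TCGAAT"]
print(mismatch_distribution(sample))
```

Whether the resulting histogram has one peak or several is the feature the abstract uses to compare population models with observed data.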

14.

Background

Comparative DNA sequence analysis provides insight into evolution and helps construct a natural classification reflecting the Tree of Life. The growing numbers of organisms represented in DNA databases challenge tree-building techniques and the vertical hierarchical classification may obscure relationships among some groups. Approaches that can incorporate sequence data from large numbers of taxa and enable visualization of affinities across groups are desirable.

Methodology/Principal Findings

Toward this end, we developed a procedure for extracting diagnostic patterns in the form of indicator vectors from DNA sequences of taxonomic groups. In the present instance the indicator vectors were derived from mitochondrial cytochrome c oxidase I (COI) sequences of those groups and further analyzed on this basis. In the first example, indicator vectors for birds, fish, and butterflies were constructed from a training set of COI sequences, then correlations with test sequences not used to construct the indicator vector were determined. In all cases, correlation with the indicator vector correctly assigned test sequences to their proper group. In the second example, this approach was explored at the species level within the bird grouping; this also gave correct assignment, suggesting the possibility of automated procedures for classification at various taxonomic levels. A false-color matrix of vector correlations displayed affinities among species consistent with higher-order taxonomy.

Conclusions/Significance

The indicator vectors preserved DNA character information and provided quantitative measures of correlations among taxonomic groups. This method is scalable to the largest datasets envisioned in this field, provides a visually-intuitive display that captures relational affinities derived from sequence data across a diversity of life forms, and is potentially a useful complement to current tree-building techniques for studying evolutionary processes based on DNA sequence data.
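
The abstract does not say how the indicator vectors are built, so the following is only one plausible reading, offered as a sketch: each aligned sequence is one-hot encoded, a group's indicator vector is taken to be the mean of its training encodings, and a test sequence is assigned to the group with the highest Pearson correlation. All function names and the toy sequences are assumptions.

```python
import math

BASES = "ACGT"


def one_hot(seq):
    """Encode a sequence as a flat 0/1 vector with four entries per position."""
    return [1.0 if base == b else 0.0 for base in seq for b in BASES]


def indicator_vector(training_seqs):
    """Average the one-hot vectors of a group's aligned training sequences."""
    vectors = [one_hot(s) for s in training_seqs]
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def assign(test_seq, indicators):
    """Return the group whose indicator vector correlates best with test_seq."""
    encoded = one_hot(test_seq)
    return max(indicators, key=lambda g: pearson(encoded, indicators[g]))


# Hypothetical toy groups; real COI alignments are hundreds of bases long.
training = {
    "group_a": ["ACGTACGT", "ACGTACGA", "ACGTTCGT"],
    "group_b": ["TTGCAAGC", "TTGCAAGG", "TAGCAAGC"],
}
indicators = {name: indicator_vector(seqs) for name, seqs in training.items()}
print(assign("ACGTACGG", indicators))
```

Computing `pearson` between every pair of indicator vectors would give the kind of correlation matrix that the abstract displays in false color.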

15.
Analyses of genomic DNA sequences have shown in previous works that base pairs are correlated at large distances with scale-invariant statistical properties. We show in the present study that these correlations between nucleotides (letters) result in fact from long-range correlations (LRC) between sequence-dependent DNA structural elements (words) involved in the packaging of DNA in chromatin. Using the wavelet transform technique, we perform a comparative analysis of the DNA text and of the corresponding bending profiles generated with curvature tables based on nucleosome positioning data. This exploration through the optics of the so-called "wavelet transform microscope" reveals a characteristic scale of 100-200 bp that separates two regimes of different LRC. We focus here on the existence of LRC in the small-scale regime (< 200 bp). Analysis of genomes in the three kingdoms reveals that this regime is specifically associated with the presence of nucleosomes. Indeed, small-scale LRC are observed in eukaryotic genomes and to a lesser extent in archaeal genomes, in contrast with their absence in eubacterial genomes. Similarly, this regime is observed in eukaryotic but not in bacterial viral DNA genomes. There is one exception for genomes of Poxviruses, the only animal DNA viruses that do not replicate in the cell nucleus and do not present small-scale LRC. Furthermore, no small-scale LRC are detected in the genomes of all examined RNA viruses, with one exception in the case of retroviruses. Altogether, these results strongly suggest that small-scale LRC are a signature of the nucleosomal structure. Finally, we discuss possible interpretations of these small-scale LRC in terms of the mechanisms that govern the positioning, the stability and the dynamics of the nucleosomes along the DNA chain.
This paper is mainly devoted to a pedagogical presentation of the theoretical concepts and physical methods which are well suited to perform a statistical analysis of genomic sequences. We review the results obtained with the so-called wavelet-based multifractal analysis when investigating the DNA sequences of various organisms in the three kingdoms. Some of these results have been announced in B. Audit et al. [1, 2].

16.
T. E. Kijima  Hideki Innan 《Genetics》2013,195(3):957-967
A population genetic simulation framework is developed to understand the behavior and molecular evolution of DNA sequences of transposable elements. Our model incorporates random transposition and excision of transposable element (TE) copies, two modes of selection against TEs, and degeneration of transpositional activity by point mutations. We first investigated the relationships between the behavior of the copy number of TEs and these parameters. Our results show that when selection is weak, the genome can maintain a relatively large number of TEs, but most of them are less active. In contrast, with strong selection, the genome can maintain only a limited number of TEs but the proportion of active copies is large. In such a case, there could be substantial fluctuations of the copy number over generations. We also explored how DNA sequences of TEs evolve through the simulations. In general, active copies form clusters around the original sequence, while less active copies have long branches specific to themselves, exhibiting a star-shaped phylogeny. It is demonstrated that the phylogeny of TE sequences could be informative to understand the dynamics of TE evolution.

17.
Surnames are inherited in much the same way as biological traits like alleles of one locus. Assuming the heritability of surnames, a simple stochastic model for X, the total number of occurrences of a surname, can be arrived at by considering the branching process mechanism: the Consul distribution, defined by the probability mass function [formula missing from the extracted text] for x = 1, 2, 3,… and zero otherwise, where either (i) m is a positive integer when 0 ≤ θ ≤ 1 such that θ ≤ mθ ≤ 1, or (ii) m≤0, θ ≤0 such that mθ 1. Some applications of the model to real data are also considered.
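
The probability mass function did not survive in this listing. The sketch below uses the form usually given for the Consul distribution in the statistical literature, P(X = x) = (1/x) C(mx, x−1) θ^(x−1) (1−θ)^(mx−x+1), which is also the total-progeny distribution of a branching process with binomial(m, θ) offspring started from one individual. This formula is an assumption taken from that general literature, not recovered from the source, and should be checked against the original paper. The parameter values are hypothetical.

```python
from math import comb


def consul_pmf(x, m, theta):
    """P(X = x) for the Consul distribution in its usual textbook form
    (assumed here; the source listing lost the formula)."""
    if x < 1:
        return 0.0
    return comb(m * x, x - 1) / x * theta ** (x - 1) * (1.0 - theta) ** (m * x - x + 1)


M, THETA = 2, 0.2  # hypothetical values satisfying M * THETA < 1
probs = {x: consul_pmf(x, M, THETA) for x in range(1, 201)}
total = sum(probs.values())
mean = sum(x * p for x, p in probs.items())
print(f"sum = {total:.6f}, mean = {mean:.4f}")
```

For m = 1 the expression reduces to a geometric distribution, and for mθ < 1 the mean equals 1 / (1 − mθ), the expected total progeny of a subcritical branching process.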

18.
We present a model for genome size evolution that takes into account both local mutations such as small insertions and small deletions, and large chromosomal rearrangements such as duplications and large deletions. We introduce the possibility of undergoing several mutations within one generation. The model, albeit minimalist, reveals a non-trivial spontaneous dynamics of genome size: in the absence of selection, an arbitrarily large part of genomes remains beneath a finite size, even for a duplication rate 2.6-fold higher than the rate of large deletions, and even if there is also a systematic bias toward small insertions compared to small deletions. Specifically, we show that the condition of existence of an asymptotic stationary distribution for genome size non-trivially depends on the rates and mean sizes of the different mutation types. We also give upper bounds for the median and other quantiles of the genome size distribution, and argue that these bounds cannot be overcome by selection. Taken together, our results show that the spontaneous dynamics of genome size naturally prevents it from growing infinitely, even in cases where intuition would suggest an infinite growth. Using quantitative numerical examples, we show that, in practice, a shrinkage bias appears very quickly in genomes undergoing mutation accumulation, even though DNA gains and losses appear to be perfectly symmetrical at first sight. We discuss this spontaneous dynamics in the light of the other evolutionary forces proposed in the literature and argue that it provides them with a stability-related size limit below which they can act.

19.
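On the basis of the full symmetry group of DNA, we first give a one-to-one correspondence between all symmetry operations of the regular tetrahedron and the elements of the base transformation group; we then derive symmetry principles for identifying hydrophobic or hydrophilic codons; finally, we discuss the symmetry of ambiguous codon sequences.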
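
A correspondence of this kind can be illustrated by counting. The sketch below assumes that the base transformation group is the group of all 24 permutations of the four bases, which the abstract does not state explicitly. It groups those permutations by cycle type and recovers the class sizes 1, 6, 3, 8 and 6, matching the classes of the full symmetry group of the regular tetrahedron acting on its four vertices (identity, 6 reflections, 3 twofold rotations, 8 threefold rotations, 6 fourfold improper rotations).

```python
from collections import Counter
from itertools import permutations

BASES = "ACGT"


def cycle_type(perm):
    """Sorted cycle lengths of a permutation given as the images of BASES."""
    mapping = dict(zip(BASES, perm))
    seen, lengths = set(), []
    for start in BASES:
        if start in seen:
            continue
        length, current = 0, start
        while current not in seen:  # follow the cycle containing start
            seen.add(current)
            current = mapping[current]
            length += 1
        lengths.append(length)
    return tuple(sorted(lengths))


class_sizes = Counter(cycle_type(p) for p in permutations(BASES))
for ctype, size in sorted(class_sizes.items()):
    print(ctype, size)
```

Which base is placed on which vertex is a labeling choice, so the sketch shows only that the two groups have the same size and class structure, not the specific mapping used in the paper.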

20.