Similar articles
Found 20 similar articles (search time: 609 ms)
1.
Because microsatellite loci are abundant in the human genome and are highly polymorphic in most global populations, such loci have become very popular in studies on reconstructing evolutionary relationships among contemporary human populations. We have made an assessment of the efficiency of recovery of true evolutionary relationships using simulated data of microsatellite loci and a variety of distance measures. We find that allele frequency data on about 30 microsatellite loci and the use of D_A (Nei et al. 1983) or D_C (Cavalli-Sforza and Edwards 1967) distance measures with the UPGMA clustering algorithm can recover true short-term evolutionary relationships with a high degree of accuracy, unless the effective sizes of the populations or mutation rates or both are very small.
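As an illustration of the D_A measure named in the abstract above, the following is a minimal sketch and not the authors' simulation code. It assumes the usual definition D_A = 1 - (1/r) Σ_j Σ_i sqrt(x_ij · y_ij), where r is the number of loci and x_ij, y_ij are the frequencies of allele i at locus j in the two populations. The function name, the data layout (one dict per locus) and the example frequencies are all hypothetical.

```python
from math import sqrt


def nei_da(pop_x, pop_y):
    """D_A distance between two populations.

    pop_x and pop_y are lists with one dict per locus; each dict maps an
    allele label to its frequency in that population (assumed layout).
    """
    if len(pop_x) != len(pop_y) or not pop_x:
        raise ValueError("both populations need the same, non-empty set of loci")
    total = 0.0
    for freqs_x, freqs_y in zip(pop_x, pop_y):
        # Only alleles present in both populations contribute to the sum.
        shared = set(freqs_x) & set(freqs_y)
        total += sum(sqrt(freqs_x[a] * freqs_y[a]) for a in shared)
    return 1.0 - total / len(pop_x)


# Made-up frequencies: identical at locus 1, no shared allele at locus 2.
pop_a = [{"10": 0.5, "11": 0.5}, {"7": 1.0}]
pop_b = [{"10": 0.5, "11": 0.5}, {"8": 1.0}]
print(nei_da(pop_a, pop_b))
```

The resulting pairwise values would then be collected into a distance matrix and passed to a clustering method such as UPGMA.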

2.
Microsatellites (simple sequence repeats, SSRs) remain popular molecular markers for studying neutral genetic variation. Two alternative models outline how new microsatellite alleles evolve. The infinite alleles model (IAM) assumes that all possible alleles are equally likely to result from a mutation, while the stepwise mutation model (SMM) describes microsatellite evolution as the stepwise addition or subtraction of single repeat units. Genetic relationships between individuals can be analyzed with higher precision when assuming the SMM scenario, with allele size differences as a proxy of genetic distance. If population structure is not known in advance, an empirical data analysis usually includes (a) estimating proximity between individual SSR profiles with a selected dissimilarity measure and (b) determining the putative genetic structure of a given set of individuals using methods of clustering and/or ordination on the obtained dissimilarity matrix. We developed new dissimilarity indices between SSR profiles of haploid, diploid, or polyploid organisms assuming different mutation models and compared the performance of these indices for determining genetic structure with population data and with simulations. More specifically, we compared SMM with a constant or variable mutation rate at different SSR loci to IAM using data from natural populations of a freshwater bryozoan Cristatella mucedo (diploid), wheat leaf rust Puccinia triticina (dikaryon), and wheat powdery mildew Blumeria graminis (monokaryon). We show that inferences about population genetic structure are sensitive to the assumed mutation model. With simulations, we found that Bruvo's distance generally performs poorly, while the new metrics capture the differences in the genetic structure of the populations.

3.
The most commonly used measure of evolutionary distance in molecular phylogenetics is the number of nucleotide substitutions per site. However, this number is not necessarily the most efficient for reconstructing a phylogenetic tree. In order to evaluate the accuracy of an evolutionary distance, D(t), for obtaining the correct tree topology, an accuracy index, A(t), was proposed. This index is defined as A(t) = D'(t)/sqrt(V[D(t)]), where D'(t) is the first derivative of D(t) with respect to evolutionary time and V[D(t)] is the sampling variance of the evolutionary distance. Using A(t), namely, finding the condition under which A(t) gives the maximum value, we can obtain an evolutionary distance which is efficient for obtaining the correct topology. Under the assumption that the transversional changes do not occur as frequently as the transitional changes, we obtained the evolutionary distances which are expected to give the correct topology more often than are the other distances.

4.
Accuracy of estimated phylogenetic trees from molecular data   Cited by: 27 (self-citations: 0; citations by others: 27)
The accuracies and efficiencies of three different methods of making phylogenetic trees from gene frequency data were examined by using computer simulation. The methods examined are UPGMA, Farris' (1972) method, and Tateno et al.'s (1982) modified Farris method. In the computer simulation eight species (or populations) were assumed to evolve according to a given model tree, and the evolutionary changes of allele frequencies were followed by using the infinite-allele model. At the end of the simulated evolution five genetic distance measures (Nei's standard and minimum distances, Rogers' distance, Cavalli-Sforza's f_θ, and the modified Cavalli-Sforza distance) were computed for all pairs of species, and the distance matrix obtained for each distance measure was used for reconstructing a phylogenetic tree. The phylogenetic tree obtained was then compared with the model tree. The results obtained indicate that in all tree-making methods examined the accuracies of both the topology and branch lengths of a reconstructed tree (rooted tree) are very low when the number of loci used is less than 20 but gradually increase with increasing number of loci. When the expected number of gene substitutions (M) for the shortest branch is 0.1 or more per locus and 30 or more loci are used, the topological error as measured by the distortion index (dT) is not great, but the probability of obtaining the correct topology (P) is less than 0.5 even with 60 loci. When M is as small as 0.004, P is substantially lower. In obtaining a good topology (small dT and high P) UPGMA and the modified Farris method generally show a better performance than the Farris method. The poor performance of the Farris method is observed even when Rogers' distance, which obeys the triangle inequality, is used. The main reason for this seems to be that the Farris method often gives overestimates of branch lengths. For estimating the expected branch lengths of the true tree UPGMA shows the best performance.
For this purpose Nei's standard distance gives a better result than the others because of its linear relationship with the number of gene substitutions. Rogers' or Cavalli-Sforza's distance gives a phylogenetic tree in which the parts near the root are condensed and the other parts are elongated. It is recommended that more than 30 loci, including both polymorphic and monomorphic loci, be used for making phylogenetic trees. The conclusions from this study seem to apply also to data on nucleotide differences obtained by the restriction enzyme techniques.
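For readers unfamiliar with Nei's standard distance mentioned above, here is a rough sketch under the common definition D = -ln(J_XY / sqrt(J_X · J_Y)), where J_X and J_Y are the sums over loci of squared allele frequencies within each population and J_XY is the sum of cross-products. The function name, data layout and frequencies are illustrative assumptions, not taken from the study. Monomorphic loci are included in the sums, in line with the abstract's recommendation.

```python
from math import log, sqrt


def nei_standard_distance(pop_x, pop_y):
    """Nei's standard genetic distance from per-locus allele frequencies.

    pop_x and pop_y are lists with one dict per locus (allele -> frequency).
    Raises ValueError if the populations share no alleles at any locus,
    because the distance is then undefined (log of zero).
    """
    j_x = j_y = j_xy = 0.0
    for freqs_x, freqs_y in zip(pop_x, pop_y):
        j_x += sum(p * p for p in freqs_x.values())
        j_y += sum(p * p for p in freqs_y.values())
        j_xy += sum(freqs_x[a] * freqs_y[a] for a in set(freqs_x) & set(freqs_y))
    # The number of loci cancels in the ratio, so sums can stand in for means.
    identity = j_xy / sqrt(j_x * j_y)
    return -log(identity)


# Made-up data: one monomorphic locus plus one polymorphic locus.
pop_a = [{"A": 1.0}, {"B": 0.5, "C": 0.5}]
pop_b = [{"A": 1.0}, {"B": 1.0}]
print(nei_standard_distance(pop_a, pop_b))
```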

5.
Uncovering the correct phylogeny of closely related species requires analysis of multiple gene genealogies or, alternatively, genealogies inferred from the multiple alleles found at highly polymorphic loci, such as microsatellites. However, a concern in using microsatellites is that constraints on allele sizes may occur, resulting in homoplasious distributions of alleles, leading to incorrect phylogenies. Seven microsatellites from the pathogenic fungus Coccidioides immitis were sequenced for 20 clinical isolates chosen to represent the known genetic diversity of the pathogen. An organismal phylogeny for C. immitis was inferred from microsatellite-flanking sequence polymorphisms and other restriction fragment length polymorphism-containing loci. Two microsatellite genetic distances were then used to determine phylogenies for C. immitis, and the trees found by these three methods were compared. Congruence between the organismal and microsatellite phylogenies occurred when microsatellite distances were based on simple allele frequency data. However, complex mutation events at some loci made distances based on stepwise mutation models unreliable. Estimates of times of divergence for the two species of C. immitis based on microsatellites were significantly lower than those calculated from flanking sequence, most likely due to constraints on microsatellite allele sizes. Flanking-sequence insertions/deletions significantly decreased the accuracy of genealogical information inferred from microsatellite loci and caused interspecific length homoplasies at one of the seven loci. Our analysis shows that microsatellites are useful phylogenetic markers, although care should be taken to choose loci with appropriate flanking sequences when they are intended for use in evolutionary studies.

6.
We investigated the occurrence of intracolonial genetic variability (IGV) in Pocillopora corals in the southwestern Indian Ocean. Ninety‐six colonies were threefold‐sampled from three sites in Reunion Island. Nubbins were genotyped using 13 microsatellite loci, and their multilocus genotypes compared. Over 50% of the colonies presented at least two different genotypes among their three nubbins, and IGV was abundant at all sites (from 36.7% to 58.1%). To define the threshold distinguishing mosaicism from chimerism, we developed a new method based on different evolution models by computing the number of different alleles for the infinite allele model (IAM) and Bruvo's distance for the stepwise mutation model (SMM). Colonies were considered chimeras if their nubbins differed by more than four alleles and if the pairwise Bruvo's distance was higher than 0.12. Thus 80% of the IGV colonies were mosaics and 20% chimeras (representing almost 10% of the total sampling). IGV seems widespread in scleractinians and, beyond the disadvantages of this phenomenon reported in several studies, it should also bring benefits. Next steps are to identify these benefits and to understand the processes leading to IGV, as well as the factors influencing them.

7.
Statistical properties of the symmetric stepwise-mutation model for microsatellite evolution are studied under the assumption that the number of repeats is strictly bounded above and below. An exact analytic expression is found for the expected products of the frequencies of alleles separated by k repeats. This permits characterization of the asymptotic behavior of our distances D_1 and (δμ)^2 under range constraints. Based on this characterization we develop transformations that partially restore linearity when allele size is restricted. We show that the appropriate transformation cannot be applied in the case of varying mutation rates (β) and range constraints (R) because of statistical difficulties. In the special case of no variation in β and R across loci, however, the transformation simplifies to a usable form and results in a distance much more linear with time than distances developed for an infinite range. Although analytically incorrect in the case of variation in β and R, the simpler transformation is surprisingly insensitive to variation in these parameters, suggesting that it may have considerable utility in phylogenetic studies.

8.
Variable numbers of tandem repeats (VNTRs) are a class of highly informative and widely dispersed genetic markers. Despite their wide application in biological science, little is known about their mutational mechanisms or population dynamics. The objective of this work was to investigate four summary measures of VNTR allele frequency distributions: number of alleles, number of modes, range in allele size and heterozygosity, using computer simulations of the one-step stepwise mutation model (SMM). We estimated these measures and their probability distributions for a wide range of mutation rates and compared the simulation results with predictions from analytical formulations of the one-step SMM. The average heterozygosity from the simulations agreed with the analytical expectation under the SMM. The average number of alleles, however, was larger in the simulations than the analytical expectation of the SMM. We then compared our simulation expectations with actual data reported in the literature. We used the sample size and observed heterozygosity to determine the expected value, 5th and 95th percentiles for the other three summary measures, allelic size range, number of modes and number of alleles. The loci analyzed were classified into three groups based on the size of the repeat unit: microsatellites (1-2 base pair (bp) repeat unit), short tandem repeats [(STR) 3-5 bp repeat unit], and minisatellites (15-70 bp repeat unit). In general, STR loci were most similar to the simulation results under the SMM for the three summary measures (number of alleles, number of modes and range in allele size), followed by the microsatellite loci and then by the minisatellite loci, which showed deviations in the direction of the infinite allele model (IAM). Based on these differences, we hypothesize that these three classes of loci are subject to different mutational forces.

9.
Landry PA, Koskinen MT, Primmer CR. Genetics 2002, 161(3): 1339-1347
Numerous studies have relied on microsatellite DNA data to assess the relationships among populations in a phylogenetic framework, converting microsatellite allelic composition of populations into evolutionary distances. Among other coefficients, (δμ)^2 and R_ST are often employed because they make use of the differences in allele sizes on the basis of the stepwise mutation model. While it has been recognized that some microsatellites can yield disproportionate interpopulation distance estimates, no formal investigation has been conducted to evaluate to what extent such loci could affect the topology of the corresponding dendrograms. Here we show that single loci, displaying extremely large among-population variance, can greatly bias the topology of the phylogenetic tree, using data from European grayling (Thymallus thymallus, Salmonidae) populations. Importantly, we also demonstrate that the inclusion of a single disproportionate locus will lead to an overestimation of the stability of trees assessed using bootstrapping. To avoid this bias, we introduce a simple statistical test for detecting loci with significantly disproportionate variance prior to phylogenetic analyses and further show that exclusion of offending loci eliminates the false increase in phylogram stability.

10.
The relative efficiencies of the maximum-likelihood (ML), neighbor-joining (NJ), and maximum-parsimony (MP) methods in obtaining the correct topology and in estimating the branch lengths for the case of four DNA sequences were studied by computer simulation, under the assumption either that there is variation in substitution rate among different nucleotide sites or that there is no variation. For the NJ method, several different distance measures (Jukes-Cantor, Kimura two-parameter, and gamma distances) were used, whereas for the ML method three different transition/transversion ratios (R) were used. For the MP method, both the standard unweighted parsimony and the dynamically weighted parsimony methods were used. The results obtained are as follows: (1) When the R value is high, dynamically weighted parsimony is more efficient than unweighted parsimony in obtaining the correct topology. (2) However, both weighted and unweighted parsimony methods are generally less efficient than the NJ and ML methods even in the case where the MP method gives a consistent tree. (3) When all the assumptions of the ML method are satisfied, this method is slightly more efficient than the NJ method. However, when the assumptions are not satisfied, the NJ method with gamma distances is slightly better in obtaining the correct topology than is the ML method. In general, the two methods show more or less the same performance. The NJ method may give a correct topology even when the distance measures used are not unbiased estimators of nucleotide substitutions. (4) Branch length estimates of a tree with the correct topology are affected more easily than topology by violation of the assumptions of the mathematical model used, for both the ML and the NJ methods. Under certain conditions, branch lengths are seriously overestimated or underestimated. The MP method often gives serious underestimates for certain branches.
(5) Distance measures that generate the correct topology, with high probability, do not necessarily give good estimates of branch lengths. (6) The likelihood-ratio test and the confidence-limit test, in Felsenstein's DNAML, for examining the statistical significance of branch length estimates are quite sensitive to violation of the assumptions and are generally too liberal to be used for actual data. Rzhetsky and Nei's branch length test is less sensitive to violation of the assumptions than is Felsenstein's test. (7) When the extent of sequence divergence is ≤ 5% and when ≥ 1,000 nucleotides are used, all three methods show essentially the same efficiency in obtaining the correct topology and in estimating branch lengths. (ABSTRACT TRUNCATED AT 400 WORDS)
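Two of the distance measures named in the abstract above have simple closed forms, sketched below for a pair of aligned sequences. This is an illustrative sketch rather than the simulation code used in the study: the function names are hypothetical, and the sequences are assumed to be gap-free, equal in length, and limited to A, C, G and T. The Jukes-Cantor distance is d = -(3/4) ln(1 - 4p/3) for a proportion p of differing sites; the Kimura two-parameter distance is d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q), with P and Q the proportions of transitional and transversional differences.

```python
from math import log

# Purine-purine and pyrimidine-pyrimidine changes are transitions.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def difference_proportions(seq1, seq2):
    """Return (P, Q): proportions of transitional and transversional sites."""
    if len(seq1) != len(seq2) or not seq1:
        raise ValueError("sequences must be aligned and non-empty")
    transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a == b:
            continue
        if (a, b) in TRANSITIONS:
            transitions += 1
        else:
            transversions += 1
    n = len(seq1)
    return transitions / n, transversions / n


def jukes_cantor(seq1, seq2):
    """Jukes-Cantor distance; math.log raises ValueError at saturation."""
    p_ts, p_tv = difference_proportions(seq1, seq2)
    p = p_ts + p_tv
    return -0.75 * log(1 - 4 * p / 3)


def kimura_2p(seq1, seq2):
    """Kimura two-parameter distance; math.log raises ValueError at saturation."""
    p_ts, p_tv = difference_proportions(seq1, seq2)
    return -0.5 * log(1 - 2 * p_ts - p_tv) - 0.25 * log(1 - 2 * p_tv)


# Made-up sequences with one transition (A>G) and one transversion (A>C).
print(jukes_cantor("AAAAAAAAAA", "GAAAAAAAAC"))
print(kimura_2p("AAAAAAAAAA", "GAAAAAAAAC"))
```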

11.
Accuracy of phylogenetic trees estimated from DNA sequence data   Cited by: 4 (self-citations: 1; citations by others: 3)
The relative merits of four different tree-making methods in obtaining the correct topology were studied by using computer simulation. The methods studied were the unweighted pair-group method with arithmetic mean (UPGMA), Fitch and Margoliash's (FM) method, the distance Wagner (DW) method, and Tateno et al.'s modified Farris (MF) method. An ancestral DNA sequence was assumed to evolve into eight sequences following a given model tree. Both constant and varying rates of nucleotide substitution were considered. Once the DNA sequences for the eight extant species were obtained, phylogenetic trees were constructed by using corrected (d) and uncorrected (p) nucleotide substitutions per site. The topologies of the trees obtained were then compared with that of the model tree. The results obtained can be summarized as follows: (1) The probability of obtaining the correct rooted or unrooted tree is low unless a large number of nucleotide differences exists between different sequences. (2) When the number of nucleotide substitutions per sequence is small or moderately large, the FM, DW, and MF methods show a better performance than UPGMA in recovering the correct topology. The former group of methods is particularly good for obtaining the correct unrooted tree. (3) When the number of substitutions per sequence is large, UPGMA is at least as good as the other methods, particularly for obtaining the correct rooted tree. (4) When the rate of nucleotide substitution varies with evolutionary lineage, the FM, DW, and MF methods show a better performance in obtaining the correct topology than UPGMA, except when a rooted tree is to be produced from data with a large number of nucleotide substitutions per sequence. (ABSTRACT TRUNCATED AT 250 WORDS)
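UPGMA, the first of the four methods compared above, repeatedly merges the two closest clusters and averages their distances to the remaining clusters, weighted by cluster size. The sketch below is a minimal illustration that returns only the topology, as nested tuples; the function name, the input layout and the toy distances are assumptions made for this example, and branch lengths are omitted.

```python
from itertools import combinations


def upgma(labels, dist):
    """Return the UPGMA topology as nested 2-tuples.

    labels: list of unique taxon names.
    dist: dict mapping a pair (a, b) of labels to their distance; each
          unordered pair is expected to appear once.
    """
    # Active clusters, each mapped to the number of leaves it contains.
    size = {name: 1 for name in labels}
    # Order-independent lookup of distances between active clusters.
    d = {frozenset(pair): value for pair, value in dist.items()}
    while len(size) > 1:
        # Find the closest pair among the active clusters.
        a, b = min(combinations(size, 2), key=lambda pair: d[frozenset(pair)])
        merged = (a, b)
        for c in size:
            if c == a or c == b:
                continue
            # Size-weighted average: the defining step of UPGMA.
            d[frozenset((merged, c))] = (
                size[a] * d[frozenset((a, c))] + size[b] * d[frozenset((b, c))]
            ) / (size[a] + size[b])
        size[merged] = size.pop(a) + size.pop(b)
    return next(iter(size))


# Made-up ultrametric distances for four taxa.
toy = {
    ("A", "B"): 2.0, ("A", "C"): 6.0, ("B", "C"): 6.0,
    ("A", "D"): 10.0, ("B", "D"): 10.0, ("C", "D"): 10.0,
}
print(upgma(["A", "B", "C", "D"], toy))
```

With these toy distances A and B join first, C joins next, and D attaches last. Because UPGMA assumes a constant rate along all lineages, it can mislead when rates vary, which matches result (4) of the abstract.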

12.
The genetic relationships of five Indian horse breeds, namely Marwari, Spiti, Bhutia, Manipuri and Zanskari, were studied using microsatellite markers. The DNA samples of 189 horses of these breeds were amplified by polymerase chain reaction using 25 microsatellite loci. The total number of alleles varied from five to 10 with a mean heterozygosity of 0.58 ± 0.05. Spiti and Zanskari were the most closely related breeds, whereas Marwari and Manipuri were the most distant, with Nei's D_A genetic distances of 0.071 and 0.186, respectively. In a neighbour-joining dendrogram of these breeds and a Thoroughbred horse outgroup based on Nei's D_A genetic distances, the four pony breeds of Spiti, Bhutia, Manipuri and Zanskari clustered together and then with the Marwari breed. All the Indian breeds clustered independently from Thoroughbreds. The genetic relationships of Indian horse breeds to each other correspond to their geographical/environmental distribution.

13.
An expression is obtained for the time-dependent variance of the microsatellite genetic distance (δμ)^2 when the mutation rate is allowed to vary randomly among loci. An estimator is presented for the coefficient of variation, C_w, in the mutation rate. Estimated values of C_w from genetic distances between African and non-African populations were less than 100%. Caveats to this conclusion are discussed.

14.
We investigated genetic variation at six microsatellite (simple sequence repeat) loci in yellow baboons (Papio hamadryas cynocephalus) at two localities: the Tana River Primate Reserve in eastern Kenya and Mikumi National Park, central Tanzania. The six loci (D1S158, D2S144, D4S243, D5S1466, D16S508, and D17S804) were all originally cloned from and characterized in the human genome. These microsatellites are polymorphic in both baboon populations, with the average heterozygosity across loci equal to 0.731 in the Tana River sample and 0.787 in the Mikumi sample. The genetic differentiation between the two populations is substantial. Kolmogorov–Smirnov tests indicate that five of the six loci are significantly different in allele frequencies in the two populations. The mean F_ST across loci is 0.069, and Shriver's measure of genetic distance, which was developed for microsatellite loci (Shriver et al., 1995), is 0.255. This genetic distance is larger than corresponding distances among human populations residing in different continents. We conclude that (a) the arrays of alleles present at these six microsatellite loci in two geographically separated populations of yellow baboons are quite similar, but (b) the two populations exhibit significant differences in allele frequencies. This study illustrates the potential value of human microsatellite loci for analyses of population genetic structure in baboons and suggests that this approach will be useful in studies of other Old World monkeys.

15.
Microsatellites are now used ubiquitously as genetic markers. One important application is to the assessment of population subdivision and phylogenetic relatedness. Such applications require a method of estimation of genetic distance. Here we examine the most widely used measure of microsatellite genetic distance, Goldstein et al.'s delta-mu squared, (δμ)^2, with respect to a large data set of 213 markers typed across samples from four diverse human populations. We find that (δμ)^2 yields plausible interpopulation distances. For the first time, we report significant interpopulation differences in mean microsatellite length, although the effect of these differences on (δμ)^2 is negligible. However, we also show that the method is extremely sensitive to one or two loci that contribute extreme values, even when a sample size of >200 loci is used. Some of these extreme loci can be removed on the grounds that some alleles carry large indels, but for others there is no clear justification for exclusion a priori. Our data suggest a rather recent African/non-African split, with an upper limit of some 70,000-80,000 years ago.
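The sensitivity to extreme loci described above follows directly from how the distance is built. The sketch below assumes the usual form of delta-mu squared, the average over loci of the squared difference in mean allele size (in repeat units) between two populations. The function names, data layout and numbers are invented for illustration and are not taken from the 213-marker data set.

```python
def mean_allele_size(freqs):
    """Mean repeat count at one locus; freqs maps repeat count -> frequency."""
    return sum(size * freq for size, freq in freqs.items())


def delta_mu_squared(pop_x, pop_y):
    """Return (distance, per-locus contributions) for two populations.

    pop_x and pop_y are lists with one frequency dict per locus.
    """
    if len(pop_x) != len(pop_y) or not pop_x:
        raise ValueError("both populations need the same, non-empty set of loci")
    per_locus = [
        (mean_allele_size(x) - mean_allele_size(y)) ** 2
        for x, y in zip(pop_x, pop_y)
    ]
    return sum(per_locus) / len(per_locus), per_locus


# Made-up data: two ordinary loci and one locus with a 10-repeat shift.
pop_a = [{10: 1.0}, {12: 0.5, 14: 0.5}, {20: 1.0}]
pop_b = [{11: 1.0}, {13: 1.0}, {30: 1.0}]
distance, contributions = delta_mu_squared(pop_a, pop_b)
print(distance, contributions)
```

In this toy case the third locus supplies 100 of the 101 units summed across loci, so it alone determines the distance. Inspecting the per-locus contributions is one simple way to spot such loci before building a tree.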

16.
Telomere function is essential to maintaining the physical integrity of linear chromosomes and healthy human aging. The probability of forming proper telomere structures depends on the length of the telomeric DNA tract. We attempted to identify common genetic variants associated with log relative telomere length using genome-wide genotyping data on 3,554 individuals from the Nurses' Health Study and the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial that took part in the National Cancer Institute Cancer Genetic Markers of Susceptibility initiative for breast and prostate cancer. After genotyping 64 independent SNPs selected for replication in additional Nurses' Health Study and Women's Genome Health Study participants, we did not identify genome-wide significant loci; however, we replicated the inverse association of log relative telomere length with the minor allele variant [C] of rs16847897 at the TERC locus (per allele β = −0.03, P = 0.003) identified by a previous genome-wide association study. We did not find evidence for an association with variants at the OBFC1 locus or other loci reported to be associated with telomere length. With this sample size we had >80% power to detect β estimates as small as ±0.10 for SNPs with minor allele frequencies of ≥0.15 at genome-wide significance. However, power is greatly reduced for β estimates smaller than ±0.10, such as those for variants at the TERC locus. In general, common genetic variants associated with telomere length homeostasis have been difficult to detect. Potential biological and technical issues are discussed.

17.
The landscape genetics of a widespread and locally adapted species increases the understanding of the adaptive evolutionary process of local flora and vegetation. In this study, we selected Cotinus coggygria, a widespread and locally adapted species in China's warm-temperate zone, to investigate its landscape genetic pattern. We used eight microsatellite loci to examine the adaptive genetic variations of C. coggygria. A total of 43 microsatellite alleles were genotyped in 142 individual plants from 16 wild populations. The data demonstrated significant population differentiation within C. coggygria, which was caused by geographical distance, human activities, and precipitation. Five ecologically relevant microsatellite alleles, all related to precipitation, were identified by association analysis. Our results indicate that precipitation is an important factor that drives adaptive genetic differentiation in C. coggygria.

18.
Takezaki N, Nei M. Genetics 2008, 178(1): 385-392
Microsatellite DNA loci or short tandem repeats (STRs) are abundant in eukaryotic genomes and are often used for constructing phylogenetic trees of closely related populations or species. These phylogenetic trees are usually constructed by using some genetic distance measure based on allele frequency data, and there are many distance measures that have been proposed for this purpose. In the past the efficiencies of these distance measures in constructing phylogenetic trees have been studied mathematically or by computer simulations. Recently, however, allele frequencies of 783 STR loci have been compiled from various human populations. We have therefore used these empirical data to investigate the relative efficiencies of different distance measures in constructing phylogenetic trees. The results showed that (1) the probability of obtaining the correct branching pattern of a tree (P_C) is generally highest for D_A distance; (2) F_ST*, standard genetic distance (D_S), and F_ST/(1-F_ST) give similar P_C-values, F_ST* being slightly better than the other two; and (3) (δμ)^2 shows P_C-values much lower than the other distance measures. To have reasonably high P_C-values for trees similar to ours, at least 30 loci with a minimum of 15 individuals are required when D_A distance is used.

19.
One of the key issues concerning the application of microsatellite DNA data in evolutionary studies is how the number of loci applied may influence the stability of genetic distances and corresponding phylograms. While computer simulations have suggested that over 30 microsatellites are required for accurate evolutionary inference, we show that a median of only six loci have been generally applied in studies of wild populations. Factors contributing to this contrast include: i) uncertainty regarding the potential benefits that can be gained from a realistic increase in the number of loci used; and ii) the lack of empirical studies assessing the influence of the number of microsatellites on the reliability of genetic distance estimation and phylogeny construction. In order to address these issues, we applied resampling techniques to microsatellite data in widely distributed populations of European grayling (Thymallus thymallus, Salmonidae). In agreement with expectations based on simulated data, we demonstrate empirically that the stability of commonly used genetic distances (D_CE, D_A and (δμ)^2) and the corresponding neighbor-joining phylograms is positively associated with the number of microsatellites utilized. For instance, increasing the number of loci from six to 17 resulted in a striking 75% increase in the proportion of D_CE phylogram nodes supported by a bootstrap estimate of over 70%. Our results demonstrate that even moderately increasing the number of loci can be very beneficial, a finding extremely relevant for studies of natural populations for which optimally high microsatellite numbers are out of reach. Furthermore, the number of loci most commonly used to date may lead to erroneous inference of the evolutionary relationships between populations.

20.
In many studies involving microsatellite cross-species amplification, primers designed for one (source) species are used to amplify homologous loci in related (target) species. However, it is not clear how closely related the species must be to attain significant success. Genetic divergence is a clear and easy way to assess similarity between species and provides an accurate measure of their evolutionary distance. Eight Mediterranean target species of the family Serranidae were analysed using twelve primers developed for Serranus cabrilla. Additionally, two mitochondrial genes (12S rRNA and 16S rRNA) were chosen on the basis of their extensive use in phylogenetic and evolutionary analyses to compute genetic divergence between the species. Significant negative correlations were found between genetic divergence and both cross-species amplification and maintained polymorphism of microsatellite markers, which could be generalized by gathering information from different fish studies. The success of obtaining amplifiable and polymorphic microsatellite loci can be approximated a priori from the mtDNA genetic divergence between a given source and target species using our inferred regression equations. Electronic Supplementary Material: The online version of this article (doi:) contains supplementary material, which is available to authorized users.
