Similar articles
 Found 20 similar articles (search time: 390 ms)
1.
Compositional distributions in the three codon positions of the coding sequences of 12 fully sequenced, publicly available prokaryotic genomes were investigated. A universal compositional correlation was observed in most of the genomes under investigation irrespective of their overall genomic GC contents. In all the genomes, the GC contents at the first codon positions are always greater than the overall GC contents of the genomes, whereas the reverse is true for the second codon positions. GC contents at the third codon positions are higher than the overall genomic GC contents in high-GC genomes, and the opposite situation was found in low-GC genomes except for Helicobacter pylori. In high-GC genomes, the GC contents at the first + second codon positions are less than the GC contents at the third codon positions, and they are low in low-GC genomes except for Helicobacter pylori. The distributions of the four bases at the three different positions were also investigated for all 12 organisms. In the first codon positions, G is the most dominant base in high-GC genomes and A is the most dominant base in low-GC genomes; in both cases, purine bases (A + G) predominate at the first codon position. In the second codon position, A is the most dominant base in most of the organisms and G is the least dominant base in all the organisms. There is no unique regular pattern of individual bases at the third codon positions; however, there are significant differences in the (G + C) contents of the third codon positions among the different organisms. Calculations of dinucleotide frequencies in the 12 organisms indicate that in GC-rich genomes the GG, GC, CC, and CG dinucleotides are the most dominant, whereas the reverse is true in the case of low-GC genomes. Biological implications of these results are discussed in this paper.
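The position-specific GC contents and dinucleotide frequencies compared in this abstract are simple to compute. The sketch below is illustrative only and is not the authors' pipeline: the function names `gc_by_codon_position` and `dinucleotide_frequencies` and the toy sequence are assumptions introduced here, and a real analysis would read annotated coding sequences from the genome files.

```python
from collections import Counter


def gc_by_codon_position(cds: str) -> dict:
    """Return overall GC and GC at codon positions 1, 2 and 3 as fractions."""
    seq = cds.upper()
    seq = seq[: len(seq) - len(seq) % 3]  # drop a trailing partial codon
    if not seq:
        raise ValueError("sequence is shorter than one codon")

    def gc(bases: str) -> float:
        return sum(b in "GC" for b in bases) / len(bases)

    return {
        "GC": gc(seq),
        "GC1": gc(seq[0::3]),  # first codon positions
        "GC2": gc(seq[1::3]),  # second codon positions
        "GC3": gc(seq[2::3]),  # third codon positions
    }


def dinucleotide_frequencies(seq: str) -> dict:
    """Relative frequencies of overlapping dinucleotides in one strand."""
    seq = seq.upper()
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


toy_cds = "ATGGCCGAAACCTAA"  # made-up sequence, for illustration only
positions = gc_by_codon_position(toy_cds)
dinucs = dinucleotide_frequencies(toy_cds)
```

Comparing `positions["GC1"]`, `positions["GC2"]` and `positions["GC3"]` with `positions["GC"]` across many genes gives the kind of per-genome comparison the abstract describes.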

2.
Summary. We have investigated the compositional properties of coding sequences from cold-blooded vertebrates and compared them with those from warm-blooded vertebrates. Moreover, we have studied the compositional correlations of coding sequences with the genomes in which they are contained, as well as the compositional correlations among the codon positions of the genes analyzed. The distributions of GC levels of the third codon positions of genes from cold-blooded vertebrates are distinctly different from those of warm-blooded vertebrates in that they do not reach the high values attained by the latter. Moreover, coding sequences from cold-blooded vertebrates are either equal or, in most cases, lower in GC (not only in third, but also in first and second codon positions) than homologous coding sequences from warm-blooded vertebrates; higher values are exceptional. These results at the gene level are in agreement with the compositional differences between cold-blooded and warm-blooded vertebrates previously found at the whole-genome (DNA) level (Bernardi and Bernardi 1990a,b). Two linear correlations were found: one between the GC levels of coding sequences (or of their third codon positions) and the GC levels of the genomes of cold-blooded vertebrates containing them; and another between the GC levels of third and first + second codon positions of genes from cold-blooded vertebrates. The first correlation applies to the genomes (or genome compartments) of all vertebrates and the second to the genes of all living organisms. These correlations are tantamount to a genomic code.
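A linear correlation of the kind described here (GC at third codon positions against GC at first + second positions) can be fitted by ordinary least squares. The following is a minimal sketch under stated assumptions: the function name `linear_fit` is hypothetical, and the numbers are invented placeholders chosen only to exercise the code; they are not data from the study.

```python
import math


def linear_fit(xs, ys):
    """Least-squares slope, intercept and Pearson r for paired values."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


# Placeholder values (NOT measured data): one GC1+2 / GC3 pair per gene.
gc12 = [0.30, 0.40, 0.50, 0.60, 0.70]
gc3 = [0.20, 0.40, 0.60, 0.80, 1.00]
slope, intercept, r = linear_fit(gc12, gc3)
```

With real per-gene values, a slope above 1 would reflect the wider range covered by third codon positions relative to first + second positions.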

3.
Codon usage and genome composition   (cited by: 17; self-citations: 0; citations by others: 17)
Summary. The GC levels of codon third positions from 49 genomes covering a wide phylogenetic range are linearly correlated with the GC levels of the corresponding genomes. Three different relationships have been found: one for prokaryotes and viruses, one for lower eukaryotes, and one for vertebrates. All points not fitting the first relationship can be brought into quasi-coincidence with it when plotted against the GC levels of coding sequences.

4.
Summary. The compositional distributions of coding sequences and DNA molecules (in the 50-100-kb range) are remarkably narrower in murids (rat and mouse) than in humans (as well as in all other mammals explored so far). In murids, both distributions begin at higher and end at lower GC values. A comparison of homologous coding sequences from murids and humans revealed that their different compositional distributions are due to differences in GC levels in all three codon positions, particularly of genes located at both ends of the distribution. In turn, these differences are responsible for differences in both codon usage and amino acids. When GC levels at first + second codon positions and third codon positions, respectively, of murid genes are plotted against the corresponding GC levels of homologous human genes, linear relationships (with very high correlation coefficients and slopes of about 0.78 and 0.60, respectively) are found. This indicates a conservation of the order of GC levels in homologous genes from humans and murids. (The same comparison for mouse and rat genes indicates a conservation of GC levels of homologous genes.) A similar linear relationship was observed when plotting GC levels of corresponding DNA fractions (as obtained by density gradient centrifugation in the presence of a sequence-specific ligand) from mouse and human. These findings indicate that orderly compositional changes affecting not only coding sequences but also noncoding sequences took place since the divergence of murids. Such directional fixations of mutations point to the existence of selective pressures affecting the genome as a whole.

5.
The compositional distributions of large (main-band) DNA fragments from eight birds belonging to eight different orders (including both paleognathous and neognathous species) are very broad and extremely close to each other. These findings, which are paralleled by the compositional similarity of homologous coding sequences and their codon positions, support the idea that birds are a monophyletic group. The compositional distribution of third codon positions of genes from chicken, the only avian species for which a relatively large number of coding sequences is known, is very broad and bimodal, the minor GC-richer peak reaching 100% GC. The very high compositional heterogeneity of avian genomes is accompanied (as in the case of mammalian genomes) by a very high speciation rate compared to cold-blooded vertebrates, which are characterized by genomes that are much less heterogeneous. The higher GC levels attained by avian compared to mammalian genomes might be correlated with the higher body temperature (41–43°C) of birds compared to mammals (37°C). A comparison of GC levels of coding sequences and codon positions from man and chicken revealed very close average GC levels and standard deviations. Homologous coding sequences and codon positions from man and chicken showed a surprisingly high degree of compositional similarity which was, however, higher for GC-poor than for GC-rich sequences. This indicates that GC-poor isochores of warm-blooded vertebrates reflect the composition of the isochores of the genome of the common reptilian ancestor of mammals and birds, which underwent only a small compositional change at the transition from cold- to warm-blooded vertebrates. In contrast, the GC-rich isochores of birds and mammals are the result of large compositional changes at the same evolutionary transition, which were in part different in the two classes of warm-blooded vertebrates. Correspondence to: G. Bernardi

6.
Romero H, Zavala A, Musto H. Gene 2000, 242(1-2):307-311
It is widely accepted that compositional pressure is the only factor shaping codon usage in unicellular species displaying extremely biased genomic compositions. This seems to be the case in the prokaryotes Mycoplasma capricolum, Rickettsia prowazekii and Borrelia burgdorferi (GC-poor), and in Micrococcus luteus (GC-rich). However, in the GC-poor unicellular eukaryotes Dictyostelium discoideum and Plasmodium falciparum, there is evidence that selection, acting at the level of translation, influences codon choices. This finding is intriguing for two reasons: (1) the genomic GC levels of the above-mentioned eukaryotes are lower than the GC% of any studied bacterium, and (2) bacteria usually have larger effective population sizes than eukaryotes, and hence natural selection is expected to overcome the randomizing effects of genetic drift more efficiently among prokaryotes than among eukaryotes. In order to gain new insight into this problem, we analysed the patterns of codon preferences of the nuclear genes of Entamoeba histolytica, a unicellular eukaryote characterised by an extremely AT-rich genome (GC = 25%). The overall codon usage is strongly biased towards A and T in the third codon positions, and among the presumed highly expressed sequences, there is an increased relative usage of a subset of codons, many of which are C-ending. Since an increase in C in third codon positions is 'against' the compositional bias, we conclude that codon usage in E. histolytica, as in D. discoideum and P. falciparum, is the result of an equilibrium between compositional pressure and selection. These findings raise the question of why strongly compositionally biased eukaryotic cells may be more sensitive to the (presumed) slight differences among synonymous codons than compositionally biased bacteria.

7.
Zoubak S, Rynditch A, Bernardi G. Gene 1992, 119(2):207-213
The compositional distributions of genomes, genes (and their third codon positions) and long terminal repeats from retroviruses of warm-blooded vertebrates are characterized by a striking bimodality which is accompanied by a remarkable compositional homogeneity within each retroviral genome. A first, major class of retroviral genomes is GC-rich, whereas a second, minor class is GC-poor. Representative expressed viral genomes from the two classes integrate in GC-rich and GC-poor isochores, respectively, of host genomes. The first class comprises all oncoviruses (except B-types and some D-types), the second, lentiviruses, spumaviruses, as well as B-type and some D-type oncoviruses (e.g., mouse mammary tumor virus and simian retroviruses type D, respectively). The compositional bimodal distribution of retroviral genomes and the accompanying compositional homogeneity within each retroviral genome appear to be the result of the compositional evolution of retroviral genomes in their integrated form.

8.
To study possible codon usage and base composition variation in bacteriophages, fourteen mycobacteriophages were used as a model system, and both parameters were determined and compared in all these phages and their plating bacterium, M. smegmatis. As all the organisms are GC-rich, the GC contents at third codon positions were in fact found to be higher than those at the second codon positions as well as at the first + second codon positions in all the organisms, indicating that directional mutational pressure is strongly operative at the synonymous third codon positions. The Nc plot indicates that codon usage variation in all these organisms is governed by forces other than compositional constraints. Correspondence analysis suggests that: (i) there is codon usage variation among the genes and genomes of the fourteen mycobacteriophages and M. smegmatis, i.e., codon usage patterns in the mycobacteriophages are phage-specific rather than M. smegmatis-specific; (ii) synonymous codon usage patterns of Barnyard, Che8, Che9d, and Omega are more similar to one another than to those of the remaining mycobacteriophages and M. smegmatis; (iii) codon usage bias in the mycobacteriophages is mainly determined by mutational pressure; and (iv) the genes of comparatively GC-rich genomes are more biased than those of GC-poor genomes. Translational selection in determining the codon usage variation in highly expressed genes can be inferred from the predominant occurrence of C-ending codons in the highly expressed genes. Cluster analysis based on codon usage data also shows that there are two distinct branches for the fourteen mycobacteriophages and that there is codon usage variation even among the phages of each branch.

9.
The number of completely sequenced archaeal genomes has become sufficient for a large-scale bioinformatic study. We have conducted analyses for each coding region from 36 archaeal genomes using the original CGS algorithm by calculating the total GC content (G+C), GC content in first, second and third codon positions as well as in fourfold and twofold degenerate sites from third codon positions, levels of arginine codon usage (Arg2: AGA/G; Arg4: CGX), levels of amino acid usage and the entropy of amino acid content distribution. In archaeal genomes with strong GC pressure, arginine is coded preferably by GC-rich Arg4 codons, whereas in most archaeal genomes with G+C < 0.6, arginine is coded preferably by AT-rich Arg2 codons. In the genome of Haloquadratum walsbyi, which is closely related to GC-rich archaea, GC content has decreased mostly in third codon positions, while the Arg4 > Arg2 bias still persists. Proteomes of archaeal species carry characteristic amino acid biases: levels of isoleucine and lysine are elevated, while levels of alanine, histidine, glutamine and cysteine are relatively decreased. Numerous genomic and proteomic biases observed can be explained by the hypothesis that strong mutational AT pressure previously existed in the common predecessor of all archaea.

10.
Most prokaryotic genomes display strand compositional asymmetries, but the reasons for these biases remain unclear. When the distribution of gene orientation is biased, as it often is, this may induce a bias in composition, as codon frequencies are not identical. We show here that this effect can be estimated and removed, and that the residual base skews are the highest at third base codon positions and lower at first and second positions. This strongly suggests that compositional asymmetries result from 1) a replication-related mutational bias that is filtered through selective pressure and/or from 2) an uneven distribution of gene orientation. In most cases, the mutational bias alters the codon usage and amino acid frequencies of the leading and the lagging strand. However, these features are not ubiquitous amongst prokaryotes, and the biological reasons for them remain to be found.
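Base skews at individual codon positions can be measured with the usual GC-skew formula, (G − C) / (G + C), applied to each position separately. The sketch below shows only that raw calculation; it does not reproduce the authors' correction for gene-orientation bias, and the function name `gc_skew_by_codon_position` and the toy sequence are assumptions made for illustration.

```python
def gc_skew_by_codon_position(cds: str) -> dict:
    """Return (G - C) / (G + C) for codon positions 1, 2 and 3."""
    seq = cds.upper()
    seq = seq[: len(seq) - len(seq) % 3]  # drop a trailing partial codon
    skews = {}
    for offset in range(3):
        bases = seq[offset::3]
        g, c = bases.count("G"), bases.count("C")
        # Report 0.0 when a position contains neither G nor C.
        skews[offset + 1] = (g - c) / (g + c) if (g + c) else 0.0
    return skews


toy_cds = "ATGGCCGAAACCTAA"  # made-up sequence, for illustration only
skews = gc_skew_by_codon_position(toy_cds)
```

In practice the skews would be computed separately for genes on the leading and lagging strands, which is where the asymmetry discussed in the abstract shows up.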

11.
Differences in the base composition of genomes can occur because of GC pressure, purine-loading pressure (AG pressure) and RNY pressure, for which there are possible functional explanations, and because of the more abstract pressures exerted by individual bases. The graphical approach of Muto and Osawa was used to analyse how bacteriophages and bacteria balance potentially conflicting pressures on their genomes. Phages generally respond to AG pressure by increasing A while keeping T constant, and by decreasing C while keeping G constant. In contrast, bacteria generally increase both A and T, the former more so, and decrease both G and C, the latter more so. These differences largely occur at third codon positions, which are more responsive than first and second codon positions to AG pressure and GC pressure. Phages respond to AG pressure more in the third codon position than bacteria, whereas bacteria respond more in the first codon position than phages. Conversely, bacteria respond to GC pressure more in the third codon position than phages, whereas phages respond more in the first codon position than bacteria. As GC pressure increases, A is traded for C and AG pressure decreases; first and second codon positions, having more A than T, are most responsive to this negative effect of increased GC pressure; third positions either do not respond (phages) or respond weakly (bacteria). In a set of 48 phage-host pairs, degrees of purine loading were less correlated between phage and host than were GC percentages. These results suggest that pressures on conventional and genome phenotypes operate differentially in phages and bacteria, generating both general differences in base composition and specific differences characteristic of particular phage-host pairs. The reciprocal relationship between GC pressure and AG pressure implies that effects attributed to GC pressure may actually be due to AG pressure, and vice versa.

12.
Wada and colleagues have shown that, whether prokaryotic or eukaryotic, each gene has a "homostabilising propensity" to adopt a relatively uniform GC percentage (GC%). Accordingly, each gene can be viewed as a "microisochore" occupying a discrete GC% niche of relatively uniform base composition amongst its fellow genes. Although first, second and third codon positions usually differ in GC%, each position tends to maintain a uniform, gene-specific GC% value. Thus, within a genome, genic GC% values can cover a wide range. This is most evident at third codon positions, which are least constrained by amino acid encoding needs. In 1991, Wada and colleagues further noted that, within a phylogenetic group, genomic GC% values can also cover a wide range. This is again most evident at third codon positions. Thus, the dispersion of GC% values among genes within a genome matches the dispersion of GC% values among genomes within a phylogenetic group. Wada described the context-independence of plots of different codon position GC% values against total GC% as a "universal" characteristic. Several studies relate this to recombination. We have confirmed that third codon positions usually relate more to the genes that contain them than to the species. However, in genomes with extreme GC% values (low or high), third codon positions tend to maintain a constant GC%, thus relating more to the species than to the genes that contain them. Genes in an extreme-GC% genome collectively span a smaller GC% range, and mainly rely on first and second codon positions for differentiation as "microisochores". Our results are consistent with the view that differences in GC% serve to recombinationally isolate both genome sectors (facilitating gene duplication) and genomes (facilitating genome duplication, e.g. speciation). In intermediate-GC% genomes, conflict between the needs of the species and the needs of individual genes within that species is minimal. 
However, in extreme-GC% genomes there is a conflict, which is settled in favour of the species (i.e. group selection) rather than in favour of the gene (genic selection).

13.
Naya H, Romero H, Carels N, Zavala A, Musto H. FEBS Letters 2001, 501(2-3):127-130
In unicellular species codon usage is determined by mutational biases and natural selection. Among prokaryotes, the influence of these factors is different if the genome is skewed towards AT or GC, since in AT-rich organisms translational selection is absent. On the other hand, in AT-rich unicellular eukaryotes the two factors are present. In order to understand if GC-rich genomes display a similar behavior, the case of Chlamydomonas reinhardtii was studied. Since we found that translational selection strongly influences codon usage in this species, we conclude that there is not a common pattern among unicellular organisms.

14.
Summary. We have analyzed the correlation that exists between the GC levels of third and first or second codon positions for about 1400 human coding sequences. The linear relationship that was found indicates that the large differences in GC level of third codon positions of human genes are paralleled by smaller differences in GC levels of first and second codon positions. Whereas third codon position differences correspond to very large differences in codon usage within the human genome, the first and second codon position differences correspond to smaller, yet very remarkable, differences in the amino acid composition of encoded proteins. Because GC levels of codon positions are linearly correlated with the GC levels of the isochores harboring the corresponding genes, both codon usage and amino acid composition are different for proteins encoded by genes located in isochores of different GC levels. Furthermore, we have also shown that a linear relationship with a unity slope and a correlation coefficient of 0.77 exists between GC levels of introns and exons from the 238 human genes currently available for this analysis. Introns are, however, about 5% lower in GC, on average, than exons from the same genes.

15.
We have investigated the genome organization in the flatworm Schistosoma mansoni. First, we analyzed the compositional distributions of the three codon positions. Second, we investigated the correlations that exist between (1) the GC levels of exons against flanking regions, (2) the GC levels of third codon positions against flanking regions, (3) the dinucleotide frequencies of exons against flanking regions, and (4) the GC levels of 5′ against 3′ regions. The modality of the distribution of third codon positions, together with the significant correlations found, leads us to propose that the nuclear genome of this species is compositionally compartmentalized.

16.
The vertebrate genome: isochores and evolution   (cited by: 18; self-citations: 6; citations by others: 12)

17.
Okayasu T, Sorimachi K. Amino Acids 2009, 36(2):261-271
We recently classified 23 bacteria into two types based on their complete genomes; “S-type” as represented by Staphylococcus aureus and “E-type” as represented by Escherichia coli. Classification was characterized by concentrations of Arg, Ala or Lys in the amino acid composition calculated from the complete genome. Based on these previous classifications, not only prokaryotic but also eukaryotic genome structures were investigated by amino acid compositions and nucleotide contents. Organisms consisting of 112 bacteria, 15 archaea and 18 eukaryotes were classified into two major groups by cluster analysis using GC contents at the three codon positions calculated from complete genomes. The 145 organisms were classified into “AT-type” and “GC-type” represented by high A or T (low G or C) and high G or C (low A or T) contents, respectively, at every third codon position. Reciprocal changes between G or C and A or T contents at the third codon position occurred almost synchronously in every codon among the organisms. Correlations between amino acid concentrations (Ala, Ile and Lys) and the nucleotide contents at the codon position were obtained in both “AT-type” and “GC-type” organisms, but with different regression coefficients. In certain correlations of amino acid concentrations with GC contents, eukaryotes, archaea and bacteria showed different behaviors; thus these kingdoms evolved differently. All organisms are basically classifiable into two groups having characteristic codon patterns; organisms with low GC and high AT contents at the third codon position and their derivatives, and organisms with an inverse relationship.

18.
Veitia RA. Genomics 2004, 83(3):502-507
A compositional analysis of a sample of 50 zebrafish proteins containing at least one alanine run and of their open reading frames (ORFs) has been performed. The sample of poly(Ala) proteins showed a tendency to have runs of other amino acids (His/H, Gln/Q, Ser/S, Pro/P). Their ORFs and the first and second codon positions had higher GC contents than a reference gene set. The "universal" correlation between the GC content of the first + second and third codon positions (GC1+2 vs GC3) does not hold, but I provide an explanation in terms of genomic heterogeneity. A significant correlation between AHQS content and GC3 was obtained, reflecting codon bias favoring G/C at the third codon position of these amino acids. A correspondence analysis (COA) of relative synonymous codon usage showed that the poly(Ala) proteins have a biased distribution along the second axis of the COA, which correlates with gene expression in zebrafish. A comparison with human is undertaken.
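Relative synonymous codon usage (RSCU), the quantity fed into the correspondence analysis here, is the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The following is a minimal sketch, assuming the standard genetic code; the function name `rscu` and the toy sequence are illustrative choices, not taken from the paper.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds: str) -> dict:
    """RSCU per codon for amino acids with two or more synonymous codons."""
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        families[aa].append(codon)

    values = {}
    for aa, family in families.items():
        total = sum(counts[c] for c in family)
        # Skip stop codons, single-codon amino acids (Met, Trp) and
        # amino acids that do not occur in the sequence.
        if aa == "*" or len(family) < 2 or total == 0:
            continue
        expected = total / len(family)
        for codon in family:
            values[codon] = counts[codon] / expected
    return values


toy_cds = "GCCGCCGCTGCAAAAAAG"  # made-up sequence: 4 Ala codons, 2 Lys codons
usage = rscu(toy_cds)
```

A value above 1 marks a codon used more often than expected under equal usage; a table of such values per gene is the usual input to a correspondence analysis.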

19.
Genomes of the herpes simplex viruses are extremely enriched with GC. The elevated G+C level in genomes of the simplex viruses is a result of their long-term evolution under the influence of mutational pressure. We counted the rates of nucleotide substitutions from the gene coding for the major capsid protein (MCP) (G+C = 0.68, 3GC = 0.89) of human simplex virus 1 (HSV-1) to the MCP gene (G+C = 0.70, 3GC = 0.91) of HSV-2 (the first pair of genes) and from the same MCP gene of HSV-1 to the homologous gene (G+C = 0.73, 3GC = 0.99) from cercopithecine herpes virus 16 (the second pair of genes). The rates of transitions from A-T to G-C base pairs increase 2.17-, 3.09-, and 1.27-fold in the first, second, and third codon positions, respectively, when those rates are compared between the second and the first pair of genes (the growth of GC-richness is only 3%). This effect is due to the approximately 90% GC-richness of the third codon positions in all those genes. Transitions caused by the strong mutational pressure (from A-T to G-C base pairs) have a low probability of occurring in the third positions, but a high probability of occurring in the first and second positions. For the MCP gene of human herpesvirus 3, the probability of a transition caused by mutational pressure occurring in the third codon position is 2.36 times higher than in the MCP gene of HSV-1, and 3 times higher than in the MCP gene of HSV-2. These data could provide an explanation for the rarely occurring relapses of herpes zoster infection and the frequently occurring relapses of herpes simplex infection.

20.
The purpose of our work was to analyze the influence of strong mutational GC-pressure on the ratio between nonsynonymous (DN) and synonymous (DS) distances (DN/DS ratio). As material, we used the genes coding for ICP0 from five completely sequenced genomes of simplexviruses. The DN/DS ratio, total GC-content (G + C), and GC-content in first, second, and third codon positions (1GC, 2GC, and 3GC, respectively) were calculated separately for exon 2, the nonconserved part of exon 3, and the conserved part of exon 3 from ICP0 genes. Results showed that DN exceeds DS only in the conserved part of exon 3 of ICP0 genes from cercopithecine herpesvirus 2 and cercopithecine herpesvirus 16. However, the cause of this result (DN/DS = 2.54) is the GC-pressure acting on coding regions with 3GC = 99% rather than the biological process called positive selection. Only in these two viruses, because of the strong GC-pressure, has 3GC reached 99% in the conserved part of ICP0 exon 3, so that nucleotide substitutions that increase the GC-content practically cannot occur in third codon positions, where most substitutions are synonymous. In this case, GC-pressure has a substrate for nucleotide substitutions only in first and second codon positions, where most substitutions are nonsynonymous.
