Similar articles
1.
G D'Onofrio  G Bernardi 《Gene》1992,110(1):81-88
We have investigated the compositional distributions of third codon positions of genes from the 16 prokaryotes and seven eukaryotes for which the largest numbers of coding sequences are available in data banks. In prokaryotes, both narrow and broad distributions were found. In eukaryotes, distributions were very broad (except for Saccharomyces cerevisiae) and remarkably different for different genomes. In low-GC genomes, third codon positions were lower in GC than first + second codon positions and trailed towards high GC; the opposite situation was found for high-GC genomes. In all genomes, first codon positions were higher in GC than second codon positions. We then investigated the compositional correlations between third and first + second codon positions in prokaryotic genomes (the 16 mentioned above plus 87 additional ones) and in genome compartments of eukaryotes. A general, common relationship was found, which also holds within the same (heterogeneous) genomes. This universal correlation is due to the fact that the relative effects of compositional constraints on different codon positions are the same, on the average, whatever the genome under consideration.

2.
Genome-wide analysis of sequence divergence patterns in 12,024 human-mouse orthologous pairs reveals, for the first time, that the trends in nucleotide and amino acid substitutions in orthologs of high and low GC composition are highly asymmetric and polarized to opposite directions. The entire dataset has been divided into three groups on the basis of the GC content at third codon sites of human genes: high, medium, and low. High-GC orthologs exhibit significant bias in favor of the replacements, Thr --> Ala, Ser --> Ala, Val --> Ala, Lys --> Arg, Asn --> Ser, Ile --> Val etc., from mouse to human, whereas in low-GC orthologs, the reverse trends prevail. In general, in the high-GC group, residues encoded by A/U-rich codons of mouse proteins tend to be replaced by the residues encoded by relatively G/C-rich codons in their human orthologs, whereas the opposite trend is observed among the low-GC orthologous pairs. The medium-GC group shares some trends with high-GC group and some with low-GC group. The only significant trend common in all groups of orthologs, irrespective of their GC bias, is (Asp)(Mouse) --> (Glu)(Human) replacement. At the nucleotide level, high-GC orthologs have undergone a large excess of (A/T)(Mouse) --> (G/C)(Human) substitutions over (G/C)(Mouse) --> (A/T)(Human) at each codon position, whereas for low-GC orthologs, the reverse is true.
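The nucleotide-level comparison described above can be illustrated with a minimal sketch that tallies (A/T) --> (G/C) versus (G/C) --> (A/T) differences between two aligned sequences. The function name and the toy sequences are assumptions made for illustration; the abstract does not describe the authors' actual pipeline, and a real analysis would work from codon-aware alignments of orthologous pairs.

```python
AT_BASES = set("AT")
GC_BASES = set("GC")


def directional_substitutions(mouse: str, human: str) -> dict:
    """Count AT->GC and GC->AT differences between two aligned sequences.

    Positions with gaps or ambiguous symbols are skipped because they
    fall into neither base set.
    """
    if len(mouse) != len(human):
        raise ValueError("aligned sequences must have equal length")
    counts = {"AT_to_GC": 0, "GC_to_AT": 0}
    for m, h in zip(mouse.upper(), human.upper()):
        if m in AT_BASES and h in GC_BASES:
            counts["AT_to_GC"] += 1
        elif m in GC_BASES and h in AT_BASES:
            counts["GC_to_AT"] += 1
    return counts


# Hypothetical toy alignment, not real orthologs.
print(directional_substitutions("ATGAAA", "GTGAAG"))
```

Running the same count separately on first, second and third codon positions would give the per-position comparison the abstract refers to.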

3.
Codon usage in Clonorchis sinensis was analyzed using 12,515 codons from 38 coding sequences. Total GC content was 49.83%, and GC1, GC2 and GC3 contents were 56.32%, 43.15% and 50.00%, respectively. The effective number of codons converged at 51-53 codons. When plotted against total GC content or GC3, codon usage was distributed in relation to GC3 biases. Relative synonymous codon usage for each codon revealed a single major trend, which was highly correlated with GC content at the third position when codons began with A or U at the first two positions. In codons beginning with G or C base at the first two positions, the G or C base rarely occurred at the third position. These results suggest that codon usage is shaped by a bias towards G or C at the third base, and that this is affected by the first and second bases.
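GC1, GC2 and GC3, as reported above, are the G+C fractions at the first, second and third codon positions. The following is a minimal sketch of how such values could be computed for a single in-frame coding sequence; the function name and example sequence are assumptions, and the abstract does not say which software was used.

```python
def gc_by_codon_position(cds: str) -> dict:
    """Return total GC fraction and GC fraction at codon positions 1-3.

    The sequence is assumed to start in frame; a trailing partial codon
    is ignored. RNA input (U) is treated the same as DNA (T).
    """
    seq = cds.upper().replace("U", "T")
    seq = seq[: len(seq) - len(seq) % 3]  # keep whole codons only
    if not seq:
        raise ValueError("sequence is shorter than one codon")
    result = {}
    for pos in range(3):
        bases = seq[pos::3]  # every third base, starting at this position
        result[f"GC{pos + 1}"] = sum(b in "GC" for b in bases) / len(bases)
    result["GC"] = sum(b in "GC" for b in seq) / len(seq)
    return result


# Hypothetical two-codon example: ATG GCC
print(gc_by_codon_position("ATGGCC"))
```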

4.
Differences in the base composition of genomes can occur because of GC pressure, purine-loading pressure (AG pressure) and RNY pressure, for which there are possible functional explanations, and because of the more abstract pressures exerted by individual bases. The graphical approach of Muto and Osawa was used to analyse how bacteriophages and bacteria balance potentially conflicting pressures on their genomes. Phages generally respond to AG pressure by increasing A while keeping T constant, and by decreasing C while keeping G constant. In contrast, bacteria generally increase both A and T, the former more so, and decrease both G and C, the latter more so. These differences largely occur at third codon positions, which are more responsive than first and second codon positions to AG pressure and GC pressure. Phages respond to AG pressure more in the third codon position than bacteria, whereas bacteria respond more in the first codon position than phages. Conversely, bacteria respond to GC pressure more in the third codon position than phages, whereas phages respond more in the first codon position than bacteria. As GC pressure increases, A is traded for C and AG pressure decreases; first and second codon positions, having more A than T, are most responsive to this negative effect of increased GC pressure; third positions either do not respond (phages) or respond weakly (bacteria). In a set of 48 phage-host pairs, degrees of purine loading were less correlated between phage and host than were GC percentages. These results suggest that pressures on conventional and genome phenotypes operate differentially in phages and bacteria, generating both general differences in base composition and specific differences characteristic of particular phage-host pairs. The reciprocal relationship between GC pressure and AG pressure implies that effects attributed to GC pressure may actually be due to AG pressure, and vice versa.

5.
To study possible variation in codon usage and base composition in bacteriophages, fourteen mycobacteriophages were used as a model system, and both parameters were determined and compared in all these phages and in their plating bacterium, M. smegmatis. As all the organisms are GC-rich, GC contents at third codon positions were in fact found to be higher than at second codon positions, as well as at first + second codon positions, in all the organisms, indicating that directional mutational pressure is strongly operative at the synonymous third codon positions. The Nc plot indicates that codon usage variation in all these organisms is governed by forces other than compositional constraints. Correspondence analysis suggests that: (i) there is codon usage variation among the genes and genomes of the fourteen mycobacteriophages and M. smegmatis, i.e., codon usage patterns in the mycobacteriophages are phage-specific rather than M. smegmatis-specific; (ii) the synonymous codon usage patterns of Barnyard, Che8, Che9d, and Omega are more similar to one another than to those of the remaining mycobacteriophages and M. smegmatis; (iii) codon usage bias in the mycobacteriophages is mainly determined by mutational pressure; and (iv) the genes of comparatively GC-rich genomes are more biased than those of GC-poor genomes. A role for translational selection in the codon usage variation of highly expressed genes can be inferred from the predominance of C-ending codons in those genes. Cluster analysis based on codon usage data also shows that the fourteen mycobacteriophages fall into two distinct branches and that codon usage varies even among the phages of each branch.

6.
The frequencies of occurrence of the four bases in the first, second and third codon positions and in the total coding sequences have been calculated from the codon usage table published in 1990 by Ikemura et al. The distribution of frequencies is further analysed in detail by a graphic technique presented recently by us. Formulas expressing the frequencies of the four bases in the first and second codon positions in terms of the frequencies of amino acids are given. The graphic analysis shows that, for 90 species, purine bases are dominant in the first codon position and in most cases G is the most dominant base. In the second codon position A is the most dominant base, while G is the least dominant base. In the third codon position the G + C content varies from 0.1 to 0.9, keeping the A + C content equal to 1/2 and G content equal to that of C, approximately. If the frequencies of bases A, C, G and U in the total coding sequences are denoted by a, c, g and u, respectively, the inequality $a^2 + c^2 + g^2 + u^2 < 1/3$ is found to hold for each of the 90 species, including human and E. coli.
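The sum of squared base frequencies in the inequality above is easy to evaluate for any sequence. Because a + c + g + u = 1, the sum can never fall below 1/4 (reached when all four bases are equally frequent), so the reported bound of 1/3 leaves only a narrow window. The sketch below is an illustration under assumed names and a toy sequence; it is not the graphic technique the authors describe.

```python
def base_frequency_sum_of_squares(seq: str) -> float:
    """Return a^2 + c^2 + g^2 + u^2 for the base frequencies of seq.

    DNA input (T) is treated as RNA (U); other symbols are ignored.
    """
    rna = seq.upper().replace("T", "U")
    counts = {base: rna.count(base) for base in "ACGU"}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G, U or T")
    return sum((n / total) ** 2 for n in counts.values())


# Hypothetical sequence with equal base frequencies: the minimum, 0.25.
value = base_frequency_sum_of_squares("ACGU")
print(value, value < 1 / 3)
```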

7.
Okayasu T  Sorimachi K 《Amino acids》2009,36(2):261-271
We recently classified 23 bacteria into two types based on their complete genomes; “S-type” as represented by Staphylococcus aureus and “E-type” as represented by Escherichia coli. Classification was characterized by concentrations of Arg, Ala or Lys in the amino acid composition calculated from the complete genome. Based on these previous classifications, not only prokaryotic but also eukaryotic genome structures were investigated by amino acid compositions and nucleotide contents. Organisms consisting of 112 bacteria, 15 archaea and 18 eukaryotes were classified into two major groups by cluster analysis using GC contents at the three codon positions calculated from complete genomes. The 145 organisms were classified into “AT-type” and “GC-type” represented by high A or T (low G or C) and high G or C (low A or T) contents, respectively, at every third codon position. Reciprocal changes between G or C and A or T contents at the third codon position occurred almost synchronously in every codon among the organisms. Correlations between amino acid concentrations (Ala, Ile and Lys) and the nucleotide contents at the codon position were obtained in both “AT-type” and “GC-type” organisms, but with different regression coefficients. In certain correlations of amino acid concentrations with GC contents, eukaryotes, archaea and bacteria showed different behaviors; thus these kingdoms evolved differently. All organisms are basically classifiable into two groups having characteristic codon patterns; organisms with low GC and high AT contents at the third codon position and their derivatives, and organisms with an inverse relationship.

8.
We have investigated the compositional properties of coding sequences from cold-blooded vertebrates and we have compared them with those from warm-blooded vertebrates. Moreover, we have studied the compositional correlations of coding sequences with the genomes in which they are contained, as well as the compositional correlations among the codon positions of the genes analyzed. The distributions of GC levels of the third codon positions of genes from cold-blooded vertebrates are distinctly different from those of warm-blooded vertebrates in that they do not reach the high values attained by the latter. Moreover, coding sequences from cold-blooded vertebrates are either equal, or, in most cases, lower in GC (not only in third, but also in first and second codon positions) than homologous coding sequences from warm-blooded vertebrates; higher values are exceptional. These results at the gene level are in agreement with the compositional differences between cold-blooded and warm-blooded vertebrates previously found at the whole genome (DNA) level (Bernardi and Bernardi 1990a,b). Two linear correlations were found: one between the GC levels of coding sequences (or of their third codon positions) and the GC levels of the genomes of cold-blooded vertebrates containing them; and another between the GC levels of third and first + second codon positions of genes from cold-blooded vertebrates. The first correlation applies to the genomes (or genome compartments) of all vertebrates and the second to the genes of all living organisms. These correlations are tantamount to a genomic code.

9.
The number of completely sequenced archaeal genomes has become sufficient for a large-scale bioinformatic study. We have conducted analyses for each coding region from 36 archaeal genomes using the original CGS algorithm by calculating the total GC content (G+C), GC content in first, second and third codon positions as well as in fourfold and twofold degenerate sites from third codon positions, levels of arginine codon usage (Arg2: AGA/G; Arg4: CGX), levels of amino acid usage and the entropy of amino acid content distribution. In archaeal genomes with strong GC pressure, arginine is coded preferably by GC-rich Arg4 codons, whereas in most archaeal genomes with G+C < 0.6, arginine is coded preferably by AT-rich Arg2 codons. In the genome of Haloquadratum walsbyi, which is closely related to GC-rich archaea, GC content has decreased mostly in third codon positions, while the Arg4 > Arg2 bias still persists. Proteomes of archaeal species carry characteristic amino acid biases: levels of isoleucine and lysine are elevated, while levels of alanine, histidine, glutamine and cysteine are relatively decreased. Numerous genomic and proteomic biases observed can be explained by the hypothesis that strong mutational AT pressure previously existed in the common predecessor of all archaea.

10.
The nucleotide contents of the three codon positions show a number of statistical pairwise correlations, some of which are universal for all analysed genomes. Among the most prominent of these correlations are negative correlations between G and T contents found in genes of all species analysed. The pair A/C, which is complementary to G/T, shows a similar negative correlation in genes of most species. In the genes of several species, including all mammalian genes studied, positive correlations between A and T contents, and between G and C contents, are found. Since these regularities are observed in all three codon positions, they are connected with the amino-acid content of proteins. Such correlations may originate from features of the mutation process and/or from a check on the translation reading frame. The well-known preference for G in the first codon position and its deficiency in the second is accompanied by the opposite bias in T content. In the third codon position there is no general nucleotide preference, but its content is often biased with regard to the GC content of the gene. G and T contents in this case are always shifted in opposite directions. Several ideas are put forward to explain this preference.

11.
Base composition varies among and within eukaryote genomes. Although mutational bias and selection were initially invoked, more recently GC-biased gene conversion (gBGC) has been proposed to play a central role in shaping nucleotide landscapes, especially in yeast, mammals, and birds. gBGC is a kind of meiotic drive in favor of G and C alleles, associated with recombination. Previous studies have also suggested that gBGC could be at work in grass genomes. However, these studies were carried out on third codon positions, which can undergo selection on codon usage. As most preferred codons end in G or C in grasses, gBGC and selection can be confounded. Here we investigated further the forces that might drive GC content evolution in the rice genus using both coding and noncoding sequences. We found that recombination rates correlate positively with equilibrium GC content and that selfing species (Oryza sativa and O. glaberrima) have significantly lower equilibrium GC content compared with more outcrossing species. As recombination is less efficient in selfing species, these results suggest that recombination drives GC content. We also detected a positive relationship between expression levels and GC content in third codon positions, suggesting that selection favors codons ending with G or C bases. However, the correlation between GC content and recombination cannot be explained by selection on codon usage alone, as it was also observed in noncoding positions. Finally, analyses of polymorphism data ruled out the hypothesis that genomic variation in GC content is due to mutational processes. Our results suggest that both gBGC and selection on codon usage affect GC content in the Oryza genus and likely in other grass species.

12.
The GC contents of 2670 prokaryotic genomes that belong to diverse phylogenetic lineages were analyzed in this paper. These genomes had GC contents that ranged from 13.5% to 74.9%. We analyzed the distance of base frequencies at the three codon positions, codon frequencies, and amino acid compositions across genomes with respect to the differences in the GC content of these prokaryotic species. We found that although the phylogenetic lineages were remote among some species, a similar genomic GC content forced them to adopt similar base usage patterns at the three codon positions, codon usage patterns, and amino acid usage patterns. Our work demonstrates that in prokaryotic genomes: a) base usage, codon usage, and amino acid usage change with GC content with a linear correlation; b) the distance of each usage has a linear correlation with the GC content difference; and c) GC content is more essential than phylogenetic lineage in determining base usage, codon usage, and amino acid usage. This work is exceptional in that we adopted intuitively graphic methods for all analyses, and we used these analyses to examine as many as 2670 prokaryotes. We hope that this work is helpful for understanding common features in the organization of microbial genomes.

13.
Guo X  Bao J  Fan L 《FEBS letters》2007,581(5):1015-1021
Two gene classes characterized by high and low GC content have been found in rice and other cereals, but not dicot genomes. We used paralogs with high and low GC contents in rice and found: (a) a greater increase in GC content at exonic fourfold-redundant sites than at flanking introns; (b) with reference to their orthologs in Arabidopsis, most substitution sites between the two kinds of paralogs are found at 2- and 4-degenerate sites with a T-->C mode, while A-->C and A-->G play major roles at 0-degenerate sites; and (c) high-GC genes have greater bias and codon usage is skewed toward codons that are preferred in highly expressed genes. We believe this is strong evidence for selectively driven codon usage in rice. Another cereal, maize, also showed the same trend as in rice. This represents a potential evolutionary process for the origin of genes with a high GC content in rice and other cereals.

14.
Zhang SH  Wang L 《Genomics》2011,97(5):330-331
It has been reported that there is a majority triplet profile among genomes, which was considered as a reflection of general mechanisms of genome evolution (Albrecht-Buehler, 2007). However, there are actually, according to our further analysis and at least among prokaryotic genomes, two common triplet profiles: one is from low-GC content genomes; the other is from high-GC content genomes. Both common profiles would be direct reflections of GC content variations and strand symmetry of genomic sequences.

15.
Patterns of codon usage bias in three dicot and four monocot plant species
Codon usage in nuclear genes of four monocot and three dicot species was analyzed to find general patterns in codon choice of plant species. Codon bias was correlated with GC content at the third codon position. GC contents were higher in monocot species than in dicot species at all codon positions. The high GC contents of monocot species might be the result of relatively strong mutational bias that occurred in the lineage of the Poaceae species. In both dicot and monocot species, the effective number of codons (ENCs) for most genes was similar to that for the expected ENCs based on the GC content at the third codon positions. G and C ending codons were detected as the "preferred" codons in monocot species, as in Drosophila. Also, many "preferred" codons are the same in dicot species. Pyrimidine (C and T) is used more frequently than purine (G and A) in four-fold degenerate codon groups.
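The abstract compares observed ENC values with those expected from third-position GC content but does not say which formula was used. A commonly used choice is Wright's (1990) null expectation, ENC = 2 + s + 29 / (s^2 + (1 - s)^2), where s is GC3; the sketch below assumes that formula, and the function name is hypothetical.

```python
def expected_enc(gc3: float) -> float:
    """Expected effective number of codons under Wright's (1990) null curve.

    gc3 is the G+C fraction at synonymous third codon positions (0 to 1).
    Assumes codon choice is shaped by base composition alone.
    """
    if not 0.0 <= gc3 <= 1.0:
        raise ValueError("gc3 must be a fraction between 0 and 1")
    return 2.0 + gc3 + 29.0 / (gc3 ** 2 + (1.0 - gc3) ** 2)


# The curve peaks at gc3 = 0.5 and falls towards either extreme.
for s in (0.0, 0.5, 1.0):
    print(s, expected_enc(s))
```

Genes whose observed ENC lies close to this curve are usually read as being shaped mainly by compositional bias, which is how the comparison in the abstract is typically interpreted.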

16.
Analysis of codon usage pattern is important to understand the genetic and evolutionary characteristics of genomes. We have used bioinformatic approaches to analyze the codon usage bias (CUB) of the genes located in human Y chromosome. Codon bias index (CBI) indicated that the overall extent of codon usage bias was low. The relative synonymous codon usage (RSCU) analysis suggested that approximately half of the codons out of 59 synonymous codons were most frequently used, and possessed a T or G at the third codon position. The codon usage pattern was different in different genes as revealed from correspondence analysis (COA). A significant correlation between effective number of codons (ENC) and various GC contents suggests that both mutation pressure and natural selection affect the codon usage pattern of genes located in human Y chromosome. In addition, Y-linked genes have significant difference in GC contents at the second and third codon positions, expression level, and codon usage pattern of some codons like the SPANX genes in X chromosome.
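RSCU, as used above, is conventionally defined as the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally; the 59 codons are the 64 minus the three stop codons, Met and Trp. A minimal sketch for a single synonymous family follows. The function name and the counts are made up for illustration and are not taken from the study.

```python
def rscu(counts: dict, family: list) -> dict:
    """Relative synonymous codon usage for one synonymous codon family.

    counts maps codons to observed counts; family lists the synonymous
    codons of one amino acid. A value of 1.0 means no bias; values above
    1.0 mark codons used more often than expected under equal usage.
    """
    observed = [counts.get(codon, 0) for codon in family]
    total = sum(observed)
    if total == 0:
        return {codon: 0.0 for codon in family}  # amino acid absent
    expected = total / len(family)
    return {codon: n / expected for codon, n in zip(family, observed)}


# Hypothetical counts for the four alanine codons.
alanine = ["GCT", "GCC", "GCA", "GCG"]
print(rscu({"GCT": 10, "GCC": 30}, alanine))
```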

17.
武伟  刘洪斌  张泽  鲁成 《生物信息学》2007,5(3):102-105
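Using data from 93 arthropod mitochondrial genomes, we analyzed the base composition of the mitochondrial genomes and its effect on amino acid composition. The results show that: (1) arthropod mitochondrial genomes have a low GC content with a narrow range (13.28%~39.64%). The correlation between genomic GC content and GC content at the third codon position (r=0.9432, p<0.01) is stronger than the correlations at the first and second codon positions. (2) Reciprocal C<->T and A<->G substitutions can be observed at all three codon positions. (3) A comparison of the two sequences NC.004529 and NC.003979 shows that changes in base composition lead to changes in amino acid composition; these changes appear not only between different species but also between different genes within the same genome, and the influences may be mutual. This indicates that base changes in arthropod mitochondrial genomes result from the combined action of multiple factors.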
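The value r=0.9432 reported above is a correlation coefficient between genomic GC content and third-position GC content across genomes. Assuming it is a Pearson coefficient (the abstract does not name the statistic), a minimal standard-library sketch looks like this; the function name and the data points are hypothetical.

```python
import math


def pearson_r(xs: list, ys: list) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


# Made-up genomic GC and GC3 fractions for four genomes.
genome_gc = [0.15, 0.22, 0.30, 0.38]
gc3 = [0.05, 0.14, 0.27, 0.41]
print(round(pearson_r(genome_gc, gc3), 4))
```

Repeating the calculation with GC1 and GC2 in place of GC3 would give the per-position comparison the abstract describes.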

18.
Genomes of the herpes simplex viruses are extremely GC-rich. The elevated G+C level in the genomes of the simplex viruses is a result of their long-term evolution under mutational pressure. We counted the rates of nucleotide substitutions from the gene coding for the major capsid protein (MCP) (G+C = 0.68, 3GC = 0.89) of human simplex virus 1 (HSV-1) to the MCP gene (G+C = 0.70, 3GC = 0.91) of HSV-2 (the first pair of genes), and from the same MCP gene of HSV-1 to the homologous gene (G+C = 0.73, 3GC = 0.99) from cercopithecine herpes virus 16 (the second pair of genes). The rates of transitions from A-T to G-C base pairs are 2.17-, 3.09-, and 1.27-fold higher in the first, second, and third codon positions, respectively, in the second pair of genes than in the first pair (the growth of GC-richness is only 3%). This effect is due to the approximately 90% GC-richness of the third codon positions in all those genes. Transitions caused by the strong mutational pressure (from A-T to G-C base pairs) have a low probability of occurring in the third positions, but a high probability of occurring in the first and second positions. For the MCP gene of human herpes 3, the probability of a transition caused by mutational pressure occurring in the third codon position is 2.36 times higher than in the MCP gene of HSV-1, and 3 times higher than in the MCP gene of HSV-2. These data could explain why relapses of herpes Zoster infection are rare while relapses of herpes simplex infection are frequent.

19.
《Gene》1998,215(2):405-413
Biases in the codon usage and base compositions at three codon sites in different genes of A+T-rich Gram-negative bacterium Haemophilus influenzae and G+C-rich Gram-positive bacterium Mycobacterium tuberculosis have been examined to address the following questions: (1) whether the synonymous codon usage in organisms having highly skewed base compositions is totally dictated by the mutational bias as reported previously (Sharp, P.M., Devine, K.M., 1989. Codon usage and gene expression level in Dictyostelium discoideum: highly expressed genes do `prefer' optimal codons. Nucleic Acids Res. 17, 5029–5039), or is also controlled by translational selection; (2) whether preference of G in the first codon positions by highly expressed genes, as reported in Escherichia coli (Gutierrez, G., Marquez, L., Marin, A., 1996. Preference for guanosine at first codon position in highly expressed Escherichia coli genes. A relationship with translational efficiency. Nucleic Acids Res. 24, 2525–2527), is true in other bacteria; and (3) whether the usage of bases in three codon positions is species-specific. Results presented here show that even in organisms with high mutational bias, translational selection plays an important role in dictating the synonymous codon usage, though the set of optimal codons is chosen in accordance with the mutational pressure. The frequencies of G-starting codons are positively correlated with the level of expression of genes, as estimated by their Codon Adaptation Index (CAI) values, in M. tuberculosis as well as in H. influenzae, in spite of the latter having an A+T-rich genome. The present study on the codon preferences of two organisms with oppositely skewed base compositions thus suggests that the preference of G-starting codons by highly expressed genes might be a general feature of bacteria, irrespective of their overall G+C contents. The ranges of variations in the frequencies of individual bases at the first and second codon positions of genes of both H. influenzae and M. tuberculosis are similar to those of E. coli, implying that though the composition of all three codon positions is governed by a selection-mutation balance, the mutational pressure has little influence in the choice of bases at the first two codon positions, even in organisms with highly biased base compositions.

20.
Wada and colleagues have shown that, whether prokaryotic or eukaryotic, each gene has a "homostabilising propensity" to adopt a relatively uniform GC percentage (GC%). Accordingly, each gene can be viewed as a "microisochore" occupying a discrete GC% niche of relatively uniform base composition amongst its fellow genes. Although first, second and third codon positions usually differ in GC%, each position tends to maintain a uniform, gene-specific GC% value. Thus, within a genome, genic GC% values can cover a wide range. This is most evident at third codon positions, which are least constrained by amino acid encoding needs. In 1991, Wada and colleagues further noted that, within a phylogenetic group, genomic GC% values can also cover a wide range. This is again most evident at third codon positions. Thus, the dispersion of GC% values among genes within a genome matches the dispersion of GC% values among genomes within a phylogenetic group. Wada described the context-independence of plots of different codon position GC% values against total GC% as a "universal" characteristic. Several studies relate this to recombination. We have confirmed that third codon positions usually relate more to the genes that contain them than to the species. However, in genomes with extreme GC% values (low or high), third codon positions tend to maintain a constant GC%, thus relating more to the species than to the genes that contain them. Genes in an extreme-GC% genome collectively span a smaller GC% range, and mainly rely on first and second codon positions for differentiation as "microisochores". Our results are consistent with the view that differences in GC% serve to recombinationally isolate both genome sectors (facilitating gene duplication) and genomes (facilitating genome duplication, e.g. speciation). In intermediate-GC% genomes, conflict between the needs of the species and the needs of individual genes within that species is minimal. However, in extreme-GC% genomes there is a conflict, which is settled in favour of the species (i.e. group selection) rather than in favour of the gene (genic selection).
