Similar literature
1.
The higher plant Arabidopsis thaliana (Arabidopsis) is an important model for identifying plant genes and determining their function. To assist biological investigations and to define chromosome structure, a coordinated effort to sequence the Arabidopsis genome was initiated in late 1996. Here we report one of the first milestones of this project, the sequence of chromosome 4. Analysis of 17.38 megabases of unique sequence, representing about 17% of the genome, reveals 3,744 protein coding genes, 81 transfer RNAs and numerous repeat elements. Heterochromatic regions surrounding the putative centromere, which has not yet been completely sequenced, are characterized by an increased frequency of a variety of repeats, new repeats, reduced recombination, lowered gene density and lowered gene expression. Roughly 60% of the predicted protein-coding genes have been functionally characterized on the basis of their homology to known genes. Many genes encode predicted proteins that are homologous to human and Caenorhabditis elegans proteins.

2.
Sequence and analysis of chromosome 2 of the plant Arabidopsis thaliana (total citations: 21; self-citations: 0; citations by others: 21)
Arabidopsis thaliana (Arabidopsis) is unique among plant model organisms in having a small genome (130-140 Mb), excellent physical and genetic maps, and little repetitive DNA. Here we report the sequence of chromosome 2 from the Columbia ecotype in two gap-free assemblies (contigs) of 3.6 and 16 megabases (Mb). The latter represents the longest published stretch of uninterrupted DNA sequence assembled from any organism to date. Chromosome 2 represents 15% of the genome and encodes 4,037 genes, 49% of which have no predicted function. Roughly 250 tandem gene duplications were found in addition to large-scale duplications of about 0.5 and 4.5 Mb between chromosomes 2 and 1 and between chromosomes 2 and 4, respectively. Sequencing of nearly 2 Mb within the genetically defined centromere revealed a low density of recognizable genes, and a high density and diverse range of vestigial and presumably inactive mobile elements. More unexpected is what appears to be a recent insertion of a continuous stretch of 75% of the mitochondrial genome into chromosome 2.

3.
Arabidopsis thaliana is an important model system for plant biologists. In 1996 an international collaboration (the Arabidopsis Genome Initiative) was formed to sequence the whole genome of Arabidopsis and in 1999 the sequence of the first two chromosomes was reported. The sequence of the last three chromosomes and an analysis of the whole genome are reported in this issue. Here we present the sequence of chromosome 3, organized into four sequence segments (contigs). The two largest (13.5 and 9.2 Mb) correspond to the top (long) and the bottom (short) arms of chromosome 3, and the two small contigs are located in the genetically defined centromere. This chromosome encodes 5,220 of the roughly 25,500 predicted protein-coding genes in the genome. About 20% of the predicted proteins have significant homology to proteins in eukaryotic genomes for which the complete sequence is available, pointing to important conserved cellular functions among eukaryotes.

4.
A physical map of the mouse genome (total citations: 1; self-citations: 0; citations by others: 1)
A physical map of a genome is an essential guide for navigation, allowing the location of any gene or other landmark in the chromosomal DNA. We have constructed a physical map of the mouse genome that contains 296 contigs of overlapping bacterial clones and 16,992 unique markers. The mouse contigs were aligned to the human genome sequence on the basis of 51,486 homology matches, thus enabling use of the conserved synteny (correspondence between chromosome blocks) of the two genomes to accelerate construction of the mouse map. The map provides a framework for assembly of whole-genome shotgun sequence data, and a tile path of clones for generation of the reference sequence. Definition of the human-mouse alignment at this level of resolution enables identification of a mouse clone that corresponds to almost any position in the human genome. The human sequence may be used to facilitate construction of other mammalian genome maps using the same strategy.

5.
Short interspersed nuclear elements (SINEs) are widespread among eukaryotic genomes. They are repetitive DNA sequences that have been amplified by retrotransposition. In this study, a class of SINEs was isolated from the Opsariichthys bidens genome and named Opsar. Sequence analysis confirmed that Opsar is a new class of typical SINEs derived from tRNA molecules. Using the tRNA-derived region of Opsar in a BLASTN search, we further identified Zb-SINEs in the zebrafish genome, which comprise two groups: Zb-SINE-A and Zb-SINE-B. The Zb-SINE-A group comprises subfamilies -A1 to -A5, and the Zb-SINE-B group is a dimer of the tRNAAla-derived region and shares a similar dimeric composition with Alu. Zb-SINEs are composed of three distinct regions: a 5′ end tRNA-derived region, a tRNA-unrelated region and a 3′ end AT-rich region. The flanking regions are AT rich. The average length of Zb-SINE elements is about 340 bp. Zb-SINEs account for as much as 0.1% of the whole zebrafish genome. About 70% of the Zb-SINEs are on chromosomes 11, 18, and 19. These Zb-SINEs were characterized by PCR and dot hybridization. The distribution pattern of Zb-SINEs in the genome strongly supports the master genes model. The tRNA-derived regions of Opsar and Zb-SINEs were compared with the tRNAAla gene, and they showed 76% similarity, indicating that Opsar and Zb-SINEs originated from an inactive tRNAAla sequence or a tRNAAla-like sequence. In view of the evolutionary status of zebrafish in the Cyprinidae, we deduced that Zb-SINEs are a very old class of interspersed sequences.

7.
The International Human Genome Sequencing Consortium (IHGSC) recently completed a sequence of the human genome. As part of this project, we have focused on chromosome 8. Although some chromosomes exhibit extreme characteristics in terms of length, gene content, repeat content and fraction segmentally duplicated, chromosome 8 is distinctly typical in character, being very close to the genome median in each of these aspects. This work describes a finished sequence and gene catalogue for the chromosome, which represents just over 5% of the euchromatic human genome. A unique feature of the chromosome is a vast region of approximately 15 megabases on distal 8p that appears to have a strikingly high mutation rate, which has accelerated in the hominids relative to other sequenced mammals. This fast-evolving region contains a number of genes related to innate immunity and the nervous system, including loci that appear to be under positive selection; these include the major defensin (DEF) gene cluster and MCPH1, a gene that may have contributed to the evolution of expanded brain size in the great apes. The data from chromosome 8 should allow a better understanding of both normal and disease biology and genome evolution.

8.
9.
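To characterize the complete chloroplast genome of Chara globularis (球状轮藻) and to investigate its phylogenetic relationships within the Characeae, this study assembled and analyzed its complete chloroplast genome using high-throughput sequencing. The results show that the C. globularis chloroplast genome is 180,652 bp long, has a GC content of 26.6%, and has a typical quadripartite circular structure, closely resembling that of Chara vulgaris (普生轮藻). A total of 137 genes were annotated in the C. globularis chloroplast genome, comprising 94 protein-coding genes, 37 tRNA genes and 6 rRNA genes; this is 2 protein-coding genes and 3 tRNA genes more than in Nitella hyalina (无色丽藻). Compared with higher plants, it carries four distinctive genes: rpl12, trnL(gag), rpl19 and ycf20. A total of 87 SSR loci were detected in the chloroplast genome, the great majority of them composed of A and T. In addition, the genome contains 24,989 codons, with codon usage biased toward A and T, and leucine (Leu) is the amino acid encoded by the largest number of codons. A phylogenetic tree constructed with the neighbor-joining (NJ) method from the complete chloroplast genomes of five species, including C. globularis, shows that C. globularis is most closely related to C. vulgaris. This study resolves the complete chloroplast genome of C. globularis and uses the available data to establish its phylogenetic position.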

10.
Chromosome 9 is highly structurally polymorphic. It contains the largest autosomal block of heterochromatin, which is heteromorphic in 6-8% of humans, whereas pericentric inversions occur in more than 1% of the population. The finished euchromatic sequence of chromosome 9 comprises 109,044,351 base pairs and represents >99.6% of the region. Analysis of the sequence reveals many intra- and interchromosomal duplications, including segmental duplications adjacent to both the centromere and the large heterochromatic block. We have annotated 1,149 genes, including genes implicated in male-to-female sex reversal, cancer and neurodegenerative disease, and 426 pseudogenes. The chromosome contains the largest interferon gene cluster in the human genome. There is also a region of exceptionally high gene and G + C content including genes paralogous to those in the major histocompatibility complex. We have also detected recently duplicated genes that exhibit different rates of sequence divergence, presumably reflecting natural selection.

11.
Sequence and analysis of rice chromosome 4 (total citations: 1; self-citations: 0; citations by others: 1)
Feng Q  Zhang Y  Hao P  Wang S  Fu G  Huang Y  Li Y  Zhu J  Liu Y  Hu X  Jia P  Zhang Y  Zhao Q  Ying K  Yu S  Tang Y  Weng Q  Zhang L  Lu Y  Mu J  Lu Y  Zhang LS  Yu Z  Fan D  Liu X  Lu T  Li C  Wu Y  Sun T  Lei H  Li T  Hu H  Guan J  Wu M  Zhang R  Zhou B  Chen Z  Chen L  Jin Z  Wang R  Yin H  Cai Z  Ren S  Lv G  Gu W  Zhu G  Tu Y  Jia J  Zhang Y  Chen J  Kang H  Chen X  Shao C  Sun Y  Hu Q  Zhang X  Zhang W  Wang L  Ding C  Sheng H  Gu J  Chen S  Ni L  Zhu F  Chen W  Lan L  Lai Y  Cheng Z  Gu M  Jiang J  Li J  Hong G  Xue Y  Han B 《Nature》2002,420(6913):316-320
Rice is the principal food for over half of the population of the world. With its genome size of 430 megabase pairs (Mb), the cultivated rice species Oryza sativa is a model plant for genome research. Here we report the sequence analysis of chromosome 4 of O. sativa, one of the first two rice chromosomes to be sequenced completely. The finished sequence spans 34.6 Mb and represents 97.3% of the chromosome. In addition, we report the longest known sequence for a plant centromere, a completely sequenced contig of 1.16 Mb corresponding to the centromeric region of chromosome 4. We predict 4,658 protein coding genes and 70 transfer RNA genes. A total of 1,681 predicted genes match available unique rice expressed sequence tags. Transposable elements have a pronounced bias towards the euchromatic regions, indicating a close correlation of their distributions to genes along the chromosome. Comparative genome analysis between cultivated rice subspecies shows that there is an overall syntenic relationship between the chromosomes and divergence at the level of single-nucleotide polymorphisms and insertions and deletions. By contrast, there is little conservation in gene order between rice and Arabidopsis.

12.
Chromosome 13 is the largest acrocentric human chromosome. It carries genes involved in cancer including the breast cancer type 2 (BRCA2) and retinoblastoma (RB1) genes, is frequently rearranged in B-cell chronic lymphocytic leukaemia, and contains the DAOA locus associated with bipolar disorder and schizophrenia. We describe completion and analysis of 95.5 megabases (Mb) of sequence from chromosome 13, which contains 633 genes and 296 pseudogenes. We estimate that more than 95.4% of the protein-coding genes of this chromosome have been identified, on the basis of comparison with other vertebrate genome sequences. Additionally, 105 putative non-coding RNA genes were found. Chromosome 13 has one of the lowest gene densities (6.5 genes per Mb) among human chromosomes, and contains a central region of 38 Mb where the gene density drops to only 3.1 genes per Mb.

13.
Species of malaria parasite that infect rodents have long been used as models for malaria disease research. Here we report the whole-genome shotgun sequence of one species, Plasmodium yoelii yoelii, and comparative studies with the genome of the human malaria parasite Plasmodium falciparum clone 3D7. A synteny map of 2,212 P. y. yoelii contiguous DNA sequences (contigs) aligned to 14 P. falciparum chromosomes reveals marked conservation of gene synteny within the body of each chromosome. Of about 5,300 P. falciparum genes, more than 3,300 P. y. yoelii orthologues of predominantly metabolic function were identified. Over 800 copies of a variant antigen gene located in subtelomeric regions were found. This is the first genome sequence of a model eukaryotic parasite, and it provides insight into the use of such systems in the modelling of Plasmodium biology and disease.

14.
After the completion of a draft human genome sequence, the International Human Genome Sequencing Consortium has proceeded to finish and annotate each of the 24 chromosomes comprising the human genome. Here we describe the sequencing and analysis of human chromosome 3, one of the largest human chromosomes. Chromosome 3 comprises just four contigs, one of which currently represents the longest unbroken stretch of finished DNA sequence known so far. The chromosome is remarkable in having the lowest rate of segmental duplication in the genome. It also includes a chemokine receptor gene cluster as well as numerous loci involved in multiple human cancers such as the gene encoding FHIT, which contains the most common constitutive fragile site in the genome, FRA3B. Using genomic sequence from chimpanzee and rhesus macaque, we were able to characterize the breakpoints defining a large pericentric inversion that occurred some time after the split of Homininae from Ponginae, and propose an evolutionary history of the inversion.

15.
Human chromosome 12 contains more than 1,400 coding genes and 487 loci that have been directly implicated in human disease. The q arm of chromosome 12 contains one of the largest blocks of linkage disequilibrium found in the human genome. Here we present the finished sequence of human chromosome 12, which has been finished to high quality and spans approximately 132 megabases, representing approximately 4.5% of the human genome. Alignment of the human chromosome 12 sequence across vertebrates reveals the origin of individual segments in chicken, and a unique history of rearrangement through rodent and primate lineages. The rate of base substitutions in recent evolutionary history shows an overall slowing in hominids compared with primates and rodents.

16.
A primary physical map of rice chromosome 12 was constructed using marker-based chromosome landing and chromosome walking. A BAC library from IR64 was screened using 84 RFLP markers, 4 STS markers and 6 microsatellite markers on chromosome 12 by colony hybridization and polymerase chain reaction (PCR) amplification. A total of 59 contigs consisting of 419 BAC clones, including 5 single clones, were physically aligned on rice chromosome 12, with the largest BAC contig covering 855 kb. The whole physical map had a size of ∼16 Mb and covered about 52% of rice chromosome 12. This physical map will certainly be helpful for map-based cloning of agronomically and biologically important genes and for understanding the genome structure of the chromosome. Foundation item: Supported by the Rockefeller Foundation. Biography: FU Bin-Ying (1965-), male, Ph.D. candidate. Research direction: plant molecular genetics.

17.
18.
We report a high-quality draft of the genome sequence of the grey, short-tailed opossum (Monodelphis domestica). As the first metatherian ('marsupial') species to be sequenced, the opossum provides a unique perspective on the organization and evolution of mammalian genomes. Distinctive features of the opossum chromosomes provide support for recent theories about genome evolution and function, including a strong influence of biased gene conversion on nucleotide sequence composition, and a relationship between chromosomal characteristics and X chromosome inactivation. Comparison of opossum and eutherian genomes also reveals a sharp difference in evolutionary innovation between protein-coding and non-coding functional elements. True innovation in protein-coding genes seems to be relatively rare, with lineage-specific differences being largely due to diversification and rapid turnover in gene families involved in environmental interactions. In contrast, about 20% of eutherian conserved non-coding elements (CNEs) are recent inventions that postdate the divergence of Eutheria and Metatheria. A substantial proportion of these eutherian-specific CNEs arose from sequence inserted by transposable elements, pointing to transposons as a major creative force in the evolution of mammalian gene regulation.

19.
20.
Krasilnikov AS  Yang X  Pan T  Mondragón A 《Nature》2003,421(6924):760-764
RNase P is the only endonuclease responsible for processing the 5' end of transfer RNA by cleaving a precursor and leading to tRNA maturation. It contains an RNA component and a protein component and has been identified in all organisms. It was one of the first catalytic RNAs identified and the first that acts as a multiple-turnover enzyme in vivo. RNase P and the ribosome are so far the only two ribozymes known to be conserved in all kingdoms of life. The RNA component of bacterial RNase P can catalyse pre-tRNA cleavage in the absence of the RNase P protein in vitro and consists of two domains: a specificity domain and a catalytic domain. Here we report a 3.15-Å resolution crystal structure of the 154-nucleotide specificity domain of Bacillus subtilis RNase P. The structure reveals the architecture of this domain, the interactions that maintain the overall fold of the molecule, a large non-helical but well-structured module that is conserved in all RNase P RNA, and the regions that are involved in interactions with the substrate.
