Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
Protein names and how to find them   Cited by: 6 (self-citations: 0, citations by others: 6)
A prerequisite for all higher-level information extraction tasks is the identification of unknown names in text. Today, when large corpora can consist of billions of words, it is of utmost importance to develop accurate techniques for the automatic detection, extraction and categorization of named entities in these corpora. Although named entity recognition might be regarded as a solved problem in some domains, it still poses a significant challenge in others. In this work we focus on one of the more difficult tasks, the identification of protein names in text. This task presents several interesting difficulties because of the named entities' variant structural characteristics, their sometimes unclear status as names, the lack of common standards and fixed nomenclatures, and the specifics of the texts in the molecular biology domain in which they appear. We describe how we approached these and other difficulties in the implementation of Yapex, a system for the automatic identification of protein names in text. We also evaluate Yapex under four different notions of correctness and compare its performance to that of another publicly available system for protein name recognition.

2.
Cloning of DNA corresponding to four different measles virus genomic regions   Cited by: 15 (self-citations: 0, citations by others: 15)

3.
4.
We have developed a quantitative assay to determine repair of structurally different DNA lesions at defined genomic sites. This assay depends on the fact that many different types of damage are repaired by the same nucleotide excision repair (NER) pathway, which includes synthesis of short DNA fragments at the sites of damage. After exposure to damaging agents, cells are treated with 5-bromodeoxyuridine (BrdUrd) to label the regions undergoing repair, with the presumption that regions that have been more efficiently repaired would incorporate more BrdUrd than regions that were less effectively repaired. Thus, the abundance of the different sequences in the BrdUrd-containing DNA would be a direct and quantitative measure of the repair rates of the corresponding regions. The BrdUrd-containing, repaired DNA was isolated by CsCl gradient centrifugation and immunoprecipitation with anti-BrdUrd antibody and was used as template in quantitative PCR in which the amount of the product was directly proportional to the amount of template. This approach was used to address the question of whether DNA repair after UV irradiation occurs in a uniform, random manner or with preferences for certain regions. We found that there was a higher repair efficiency at the 5′-end of the mouse β-globin domain in Ehrlich ascites tumor cells.

5.
6.
Genomic sequencing has provided a tremendous amount of information that can be useful in vaccine target identification. The sheer volume of information available necessitates the use of new research disciplines and techniques. Using bioinformatics, researchers sift through available data to identify appropriate candidates for biological analysis. This review provides an overview of available bioinformatic techniques for vaccine candidate identification and a few examples of how these techniques are being applied to specific bacterial pathogens.

7.
Nishida K, Kimura Y, Kawasaki T, Fujie M, Yamada T. Virology 1999, 255(2): 376-384
A physical map of the Chlorella virus CVK2 genomic DNA has been constructed based on a cosmid contig covering the entire genomic region. By using Southern blot analysis with 22 gene probes, the gene arrangement along the genome was compared between CVK2 and PBCV-1, the prototypic member of Phycodnaviridae, whose genomic sequence is now available. The major rearrangements were (1) an insertion of a 20-kbp region around the left end of CVK2 DNA, (2) a duplication of the gene for major capsid protein in CVK2 DNA, (3) deletions/insertions of some open reading frames, and (4) divergence in the terminal inverted repeat sequences. Despite these changes, extensive colinearity was revealed between most of the genes along the CVK2 and PBCV-1 genomes. These data imply that the Chlorella virus genome has an overall high degree of genomic stability, encompassing specific islands of rearrangements.

8.
Sequence organization in regulatory regions of DNA of minute virus of mice   Cited by: 4 (self-citations: 0, citations by others: 4)

9.
10.
The mouse tumor necrosis factor receptor (TNFR)-I gene was cloned, sequenced, and characterized. The nucleotide sequence analysis shows that the TNFR-I is composed of 10 exons and nine introns. The first intron includes two simple dinucleotide repeat sequences, (GA)8 and (TC)9(TG)19. The (TC)9(TG)19 tandem repeat was found to be polymorphic in its length among various mouse strains. The nucleotide sequence of the 1076 bp 5' flanking region of the TNFR-I was also determined. Various possible regulatory sequences were identified in the 5' flanking region of the TNFR-I gene. For functional analysis, the 5' flanking region of the TNFR-I gene was isolated, ligated upstream of the luciferase reporter gene, and transiently transfected into L929, HeLa, and a T cell hybridoma cell line. The results show that the isolated 5' flanking region has functional promoter activity and is responsible for constitutive expression of the TNFR-I gene. A series of truncated promoter constructs were generated and studied in a transient transfection system. Analysis of transient expression in L929 cells shows that the regions -1076/-939, -615/-425, and -425/-198 include positive regulatory elements, while the region -939/-615 may contain negative cis-acting elements for the constitutive expression of the TNFR-I. The shortest construct containing 198 bp of the 5' flanking region still has significant promoter activity, suggesting that the two GC-rich elements in this region may play an important role in the constitutive expression of the TNFR-I gene.

11.
12.
Evidence-based medicine (EBM) represents an attempt to assist healthcare providers in basing clinical decisions on the best available evidence. That evidence in the treatment realm usually takes the form of clinical trials (CTs), with the randomized controlled clinical trial (CCT or RCT) being the gold standard. Many specialties such as internal medicine have embraced EBM. Medical geneticists who care for patients with inborn errors of metabolism (IEM) have by and large not benefited from the EBM movement. IEM are rare genetic conditions, many of which are treatable. Therefore, the principles of EBM should be applicable to IEM. Notably, Archibald Cochrane, one of the founders of EBM, suffered from porphyria, an IEM. The principles of EBM as applied to IEM are explored herein. The author hypothesized that EBM has not infiltrated the specialty of medical genetics, that few controlled trials for IEM have been published, and that where CTs have been carried out in IEM they can be difficult to find with electronic bibliographic database searches. To test the hypothesis, MEDLINE searches for CTs were carried out for a few representative IEM. The search results support the hypothesis. In this article, the principles of EBM are introduced and its history reviewed as background for further discussion. Next, the dearth of an evidence base in IEM, impediments to the application of EBM to IEM, steps to be taken to improve the evidence base for IEM, and finally strategies to make it easier to find CTs for IEM in database searches are all discussed.

13.
Transposable elements (TEs) make up a major portion of most eukaryotic genomes. During evolution, TEs have introduced profound changes to genome size, structure, and function. As integral parts of genomes, the dynamic presence of TEs will continue to be a major force in reshaping genomes. Early computational analyses of TEs in genome sequences focused on filtering out “junk” sequences to facilitate gene annotation. When the high abundance and diversity of TEs in eukaryotic genomes were recognized, these early efforts transformed into the systematic genome-wide categorization and classification of TEs. The availability of genomic sequence data reversed the classical genetic approaches to discovering new TE families and superfamilies. Curated TE databases and their accurate annotation of genome sequences in turn facilitated studies on TEs in a number of frontiers, including: (1) TE-mediated changes of genome size and structure, (2) the influence of TEs on genome and gene functions, (3) TE regulation by the host, (4) the evolution of TEs and their population dynamics, and (5) genomic-scale studies of TE activity. Bioinformatics and genomic approaches have become an integral part of large-scale studies on TEs, either to extract information with pure in silico analyses or to assist wet lab experimental studies. The current revolution in genome sequencing technology facilitates further progress in the existing frontiers of research and the emergence of new initiatives. The rapid generation of large sequence datasets at record low costs on a routine basis is challenging the computing industry on storage capacity and manipulation speed and the bioinformatics community for improvement in algorithms and their implementations.

14.
The comet assay is a sensitive method for measuring DNA strand breaks in eukaryotic cells. After embedding in agarose, cells are lysed and electrophoresed at high pH. DNA loops containing breaks (in which supercoiling is relaxed) escape from the nucleoid comet head to form a tail. Oligonucleotide probes were designed for 5' and 3' regions of the genes for dihydrofolate reductase (DHFR) and O6-methylguanine DNA methyltransferase (MGMT), both from the Chinese hamster, and the human tumour suppressor p53 gene. Alternate ends were labelled with either biotin or fluorescein. These probes were hybridized to the DNA of comets from Chinese hamster ovary (CHO) cells or human lymphocytes treated with H2O2 or photosensitizer plus light to induce oxidative damage. Amplification with Texas red- and fluorescein-tagged antibodies led, in the case of p53 in human cells, to red and green signals located in the comet tail (as well as in the head), indicating the presence of breaks in the vicinity of the gene. However, only one end of the MGMT gene appeared in the tail and almost no signals from the DHFR gene, either red or green, were in the tail of comets from CHO cells. Restriction on movement from the head to tail may result from the presence of a 'matrix-associated region' in the gene. The kinetics of repair of oxidative damage were followed; strand breaks in the p53 gene were repaired more rapidly than total DNA. Thus, fluorescent in situ hybridization in combination with the comet assay provides a powerful method for studying repair of specific genes in relation to chromatin structure.

15.
Research Institute of Clinical Psychiatry, All-Union Mental Health Research Center, Academy of Medical Sciences of the USSR, Moscow. (Presented by Academician of the Academy of Medical Sciences of the USSR M. E. Vartanyan.) Translated from Byulleten' Éksperimental'noi Biologii i Meditsiny, Vol. 110, No. 11, p. 525, November, 1990.

16.
17.
18.
The origin of the severe acute respiratory syndrome-coronavirus (SARS-CoV) remains unclear. Evidence based on Bayesian scanning plots and phylogenetic analysis using maximum likelihood (ML) and Bayesian methods indicates that SARS-CoV, for the largest part of the genome (approximately 80%), is more closely related to Group II coronavirus sequences, whereas in three regions in the ORF1ab gene it shows no apparent similarity to any of the previously characterized groups of coronaviruses. There is discordant phylogenetic clustering of SARS-CoV and coronavirus sequences throughout the genome, compatible with either ancient recombination events or altered evolutionary rates in different lineages, or a combination of both.

19.
20.
Nucleotide sequencing of approximately 400 basepairs upstream from exon 1 of the DPB1 gene and sequence-specific oligonucleotide hybridisation identified eight polymorphic nucleotide positions which were in linkage disequilibrium (LD) with DPA1 and DPB1 alleles. Substitutions at two sites (-230 and -224) formed three genotypes (DP-PRO1-3). DP-PRO1 was the most common genotype and was in LD with DPA1*0103, *0202 and DPB1*0401, *0501. DP-PRO2 was observed in LD with DPB1*02012, *1601, *1701 and DPA1*0104. DP-PRO3 was in LD with DPB1*0901, *1001 and DPA1*0201. Electrophoretic Mobility Shift Assays (EMSA) performed with restriction enzyme fragments showed substitutions at -230 and -224 not to be involved in binding nuclear proteins. Six substitutions were found on a single genotype (DP-PRO4) which was observed in seven samples; 67% of DP-PRO4 inferred haplotypes were HLA-A2-B46, DRB1*0901, DQB1*03032, DPA1*0401, DPB1*1301. Three of the substitutions occurred in conserved regulatory region boxes, W', X and Y, and three in the signal and leader sequences. EMSA competitive binding assays performed with oligonucleotide probes for the substitutions showed no difference in binding affinity for W' and X probes. The DP-PRO4 Y box had a decreased nuclear protein binding affinity compared to DP-PRO1-3. Whether the sum of the differences in DP-PRO4 relates to a change in the cell surface expression of HLA-DP is yet to be determined.
