Similar literature
1.
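This paper presents a statistical analysis of the peptide bonds in protein sequences and computes the dipeptide conformational parameters P_α, P_β, P_c and the tripeptide conformational parameters Q_α, Q_β, Q_c. On this basis, rules are proposed for predicting secondary structure from the amino acid sequence. The prediction accuracy reaches 90%, better than the Chou-Fasman method. This result indicates that dipeptide (tripeptide) correlations are clearly important in the formation of protein secondary structure.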
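As a rough illustration of what a dipeptide conformational parameter might look like, the sketch below computes a Chou-Fasman-style propensity for each dipeptide. The abstract does not give the exact definition of P_α, so the rule used here (a dipeptide counts as helical when both residues are labeled `H`, and its frequency in helix is normalized by the overall helical fraction) is an assumption, as are the function name and the toy data.

```python
from collections import Counter

def dipeptide_propensity(sequences, structures, state="H"):
    """Chou-Fasman-style propensity of each dipeptide for one state.

    Assumption: a dipeptide is "in" the state when both of its residues
    carry that label; the source abstract does not state its exact rule.
    """
    total = Counter()     # all occurrences of each dipeptide
    in_state = Counter()  # occurrences with both residues in `state`
    for seq, ss in zip(sequences, structures):
        for i in range(len(seq) - 1):
            pair = seq[i:i + 2]
            total[pair] += 1
            if ss[i] == state and ss[i + 1] == state:
                in_state[pair] += 1
    n_total = sum(total.values())
    n_state = sum(in_state.values())
    if n_total == 0 or n_state == 0:
        return {}
    background = n_state / n_total  # overall fraction of dipeptides in `state`
    return {p: (in_state[p] / total[p]) / background for p in total}

# Toy data, invented for illustration only.
params = dipeptide_propensity(["AAGG"], ["HHCC"])
```

Values above 1 would mark dipeptides that favor the state, and values below 1 those that avoid it.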

2.
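Single-strand conformation polymorphism (SSCP) analysis is a simple and rapid method for detecting DNA mutations, with wide application in mutation detection, genetic analysis, and evolutionary studies. However, its mutation detection rate varies with the DNA sequence and generally reaches only 70% to 80%, mainly because some base substitutions affect the conformation of single-stranded DNA too little to be detected by SSCP. Computer predictions of DNA secondary structure were compared with experimental results, and the two were found to agree closely. This result indicates that computational prediction of single-stranded DNA secondary structure can assist the design of PCR-SSCP analyses and raise the SSCP mutation detection rate.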

3.
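This paper proposes a computational method for predicting the minimum-free-energy secondary structure of single-stranded nucleic acid molecules. The method is based on the maximum C-matching principle for topological planar graphs and on existing thermodynamic data for the folded conformations of single-stranded nucleic acids. To illustrate the algorithm's capability, secondary structures were predicted by computer for an mRNA fragment of the immunoglobulin γ1 heavy chain (459 nucleotide residues), a fragment of Escherichia coli 16S rRNA (567-883), and an RNA fragment of poliovirus (1-740), and were compared with existing structural models. The computer-predicted secondary structure of the central domain of E. coli 16S rRNA is essentially consistent with the structural model proposed by Noller and Woese.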
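The abstract does not describe its matching algorithm in detail, so the sketch below swaps in a much simpler, well-known relative: a Nussinov-style dynamic program that maximizes the number of nested base pairs. It ignores the thermodynamic data the paper relies on, so it is not a minimum-free-energy method; the function name and the `min_loop` parameter are illustrative assumptions.

```python
def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs (Nussinov-style recursion).

    This counts pairs only; it does NOT use free-energy parameters.
    """
    pairs = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"),
             ("G", "U"), ("U", "G")}  # Watson-Crick plus G-U wobble
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j]: maximum number of pairs within seq[i..j] (inclusive)
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case 1: position j stays unpaired
            # case 2: j pairs with k, leaving at least min_loop bases between
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in pairs:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]

print(max_base_pairs("GGGAAAUCC"))  # a small hairpin: three stacked pairs
```

A free-energy method would replace the `+ 1` per pair with stacking and loop energies and minimize instead of maximize.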

4.
Zhang Bin, Yin Jingyuan, Xue Dan 《生物信息学》2011,9(3):224-228,234
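Protein secondary structure plays an important role in the study of protein function. Principal component analysis (PCA) is used to reduce the dimensionality and noise of the basic physicochemical properties of amino acids and their secondary structure propensities, and a radial basis function (RBF) neural network is then used to predict protein secondary structure. PCA turns the original 20 × 12 matrix into a 20 × 4 matrix, greatly reducing the dimensionality of the network input. In simulation, with a window size of 21 and a spread of 7, the prediction accuracy reaches 71.81%. The experimental results show that RBF neural networks can be used effectively for protein secondary structure prediction.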
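The 20 × 12 to 20 × 4 reduction can be sketched with a plain SVD-based PCA. This is a minimal sketch, not the authors' implementation: the property matrix below is random placeholder data, and `pca_reduce` is a hypothetical helper name.

```python
import numpy as np

def pca_reduce(features, n_components):
    """Project the rows of `features` onto the top principal components."""
    centered = features - features.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

rng = np.random.default_rng(0)
properties = rng.normal(size=(20, 12))  # placeholder, NOT real amino acid data
reduced = pca_reduce(properties, 4)     # one 4-dimensional row per amino acid
print(reduced.shape)
```

If each residue in a 21-residue window were encoded by its row of this matrix (an assumption about the encoding), the network input would shrink from 21 × 12 = 252 values to 21 × 4 = 84.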

5.
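This paper applies, for the first time, a scoring function based on position weight matrices to protein secondary structure prediction. Using the CB513 database as the benchmark, 11-residue and 21-residue fragments are first extracted from the database and divided into three sets according to the secondary structure type of the central residue. Position weight matrices are then built for the three sets, both for the 20 amino acids and for 6 amino acid classes obtained by reduction according to hydrophobicity. Any query fragment X is compared with the three position weight matrices, the scoring function yields three scores, and X is assigned to the class with the highest score. Finally, the predictions are corrected within the allowed error range, giving a best prediction accuracy Q_3 of 80%.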
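The classify-by-highest-score step can be sketched as follows. The abstract does not specify the scoring function, so this sketch assumes log-odds scores against a uniform background with a pseudocount; the 3-residue toy fragments stand in for the 11- and 21-residue windows, and all names are hypothetical.

```python
import math
from collections import Counter

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids

def build_pwm(fragments, pseudocount=1.0):
    """Log-odds matrix per window position against a uniform background."""
    width = len(fragments[0])
    total = len(fragments) + pseudocount * len(ALPHABET)
    pwm = []
    for pos in range(width):
        counts = Counter(frag[pos] for frag in fragments)
        pwm.append({
            aa: math.log((counts[aa] + pseudocount) / total * len(ALPHABET))
            for aa in ALPHABET
        })
    return pwm

def score(pwm, fragment):
    """Sum the per-position log-odds of the fragment's residues."""
    return sum(column[aa] for column, aa in zip(pwm, fragment))

def classify(pwms, fragment):
    """Assign the fragment to the class whose matrix scores it highest."""
    return max(pwms, key=lambda label: score(pwms[label], fragment))

# Toy training sets, invented for illustration only.
pwms = {
    "H": build_pwm(["AEA", "AEL", "LEA"]),
    "E": build_pwm(["VIV", "VVV", "IVV"]),
    "C": build_pwm(["GPG", "GPS", "NPG"]),
}
print(classify(pwms, "AEA"))
```

The 6-class reduced alphabet mentioned in the abstract would only change `ALPHABET` and require mapping each residue to its class before counting.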

6.
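Computational prediction of transcription factor binding sites is an important step in studying gene transcriptional regulation, but the commonly used position-specific scoring matrix method has low specificity. Based on an in-depth analysis of the biological characteristics of binding sites, a prediction method is proposed that combines conserved sequence motifs with local conformational information. A maximal-correlation scoring matrix serves as the descriptive model of the conserved motif, and the local conformation of the site sequence is computed from dinucleotide parameter models; the two kinds of scores are combined into a multidimensional feature vector, which is trained and used for sliding-window prediction within a quadratic discriminant analysis framework. Positional information content is also introduced during prediction to optimize the likelihood score and filter candidate results. Leave-one-out tests on Escherichia coli CRP and Fis binding site data show that the improved descriptive model and the fusion of multiple information sources effectively improve the method's performance and substantially raise its specificity.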

7.
Cao Chen, Ma Kun 《生物信息学》2016,14(3):181-187
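Protein secondary structure refers to the regularly repeating conformations of the protein backbone. Correctly assigning secondary structure from atomic coordinates is the basis for analyzing protein structure and function, and secondary structure assignment plays an important role in protein classification, the discovery of functional motifs, and understanding protein folding mechanisms. Secondary structure information is also widely used in molecular visualization, protein alignment, and protein structure prediction. More than 20 secondary structure assignment methods currently exist; they fall broadly into two classes, hydrogen-bond-based and geometry-based, and the assignments produced by different methods differ considerably. Because no review of protein secondary structure assignment methods is yet available, this paper introduces and summarizes the existing methods.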

8.
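The rice metallothionein-like protein (rgMT) has highly conserved cysteine-rich domains (CR regions) at both ends and a cysteine-free spacer region in the middle, a typical three-segment structure. In this study, the tertiary structure of rgMT was modeled by combining a distance geometry algorithm with homology modeling. All possible combinations of cysteine metal-thiolate coordination in the CR regions were enumerated, constraints were specified for each combination, and 20 random conformations were generated for each. Based on whether the generated conformations could form metal-thiolate structures, 6 conformations (4 combinations for the N-terminus and 2 for the C-terminus) were finally selected from 900 random conformations as possible structural models. Separately, the secondary structure of the spacer region was predicted with the GOR method, and the spacer was then built by homology modeling. Joining the three partial models produced the overall three-dimensional conformation of rgMT. The results show that rgMT, like mammalian MT proteins, can form two independent metal-thiolate structures with no structural or energetic barriers. Because all plant metallothionein-like proteins share the typical three-segment structure, and some of them also have the same cysteine arrangement as rgMT, the rgMT three-dimensional model provides an important reference for structural studies of other plant metallothionein-like proteins.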

9.
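The active groups of active analogues can, within the range permitted by energy compensation, reach the receptor's active site and bind to each interaction point, whereas the active groups of inactive (or weakly active) analogues cannot do so because of conformational restrictions and similar factors. Based on this principle, a new program, ACSBAIA, was developed to search for the active conformations of drug molecules. The program comprises four subsystems: conformational sampling, active-conformation restriction, inactive-conformation exclusion, and activity prediction. The method was used to search for the active conformation of allylamine antifungal agents, and predictions for 2 highly active and inactive analogues were made and verified, showing the method to be sound and practical. Its use does not depend on how much is known about the receptor's three-dimensional structure, so it is especially suitable for the current situation in which the three-dimensional structures of the great majority of receptors are unknown.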

10.
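Protein structural conformations show clear regularities, and studying their distribution in a specific conformational space is important for protein structure prediction and simulation. Using 449 non-redundant high-resolution protein structures, this paper represents protein fragments by Cα-Cα distance vectors. Principal component analysis is then used to build a visual map of the fragment conformational space, onto which a single protein molecule can be mapped as a sequentially connected path. This makes it possible to analyze intuitively the distribution of fragments of each length (4 to 9 residues) and the connections among them. The maps show clear clustering and a clear correspondence between fragment types and secondary structure.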
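One plausible reading of a "Cα-Cα distance vector" is the list of all pairwise distances between the Cα atoms of a fragment; the abstract does not say whether all pairs or only a subset are used, so that choice is an assumption here, as are the function name and the toy coordinates.

```python
import numpy as np

def distance_vector(ca_coords):
    """All pairwise Cα-Cα distances of a fragment, in row-major pair order."""
    coords = np.asarray(ca_coords, dtype=float)
    i, j = np.triu_indices(len(coords), k=1)  # index pairs with i < j
    return np.linalg.norm(coords[i] - coords[j], axis=1)

# A made-up, fully extended 4-residue fragment with 3.8 Å Cα spacing.
fragment = [[0.0, 0.0, 0.0], [3.8, 0.0, 0.0], [7.6, 0.0, 0.0], [11.4, 0.0, 0.0]]
vec = distance_vector(fragment)  # k residues give k * (k - 1) / 2 distances
```

Vectors of this kind, one per fragment, are rotation- and translation-invariant, which makes them a convenient input for principal component analysis.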

11.
McGuffin LJ  Jones DT 《Proteins》2003,52(2):166-175
If secondary structure predictions are to be incorporated into fold recognition methods, an assessment of the effect of specific types of errors in predicted secondary structures on the sensitivity of fold recognition should be carried out. Here, we present a systematic comparison of different secondary structure prediction methods by measuring frequencies of specific types of error. We carry out an evaluation of the effect of specific types of error on secondary structure element alignment (SSEA), a baseline fold recognition method. The results of this evaluation indicate that missing out whole helix or strand elements, or predicting the wrong type of element, is more detrimental than predicting the wrong lengths of elements or overpredicting helix or strand. We also suggest that SSEA scoring is an effective method for assessing accuracy of secondary structure prediction and perhaps may also provide a more appropriate assessment of the "usefulness" and quality of predicted secondary structure, if secondary structure alignments are to be used in fold recognition.
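To make the distinction between error types concrete, the sketch below collapses a per-residue string into secondary structure elements and computes the usual per-residue Q3 score. It is not the SSEA scoring scheme itself, which the abstract does not define; the strings and function names are illustrative assumptions.

```python
from itertools import groupby

def to_elements(ss):
    """Collapse a per-residue string (H, E, C) into (type, length) elements."""
    return [(state, len(list(run))) for state, run in groupby(ss)]

def q3(predicted, observed):
    """Fraction of residues whose three-state label is predicted correctly."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("strings must be non-empty and of equal length")
    return sum(p == o for p, o in zip(predicted, observed)) / len(observed)

observed = "CHHHHCCEEEC"
wrong_length = "CHHHCCCEEEC"    # helix predicted one residue too short
missing_strand = "CHHHHCCCCCC"  # whole strand element missed

for name, pred in [("wrong length", wrong_length), ("missing strand", missing_strand)]:
    print(name, round(q3(pred, observed), 3), to_elements(pred))
```

In the first prediction the element list keeps the same order of types as the observed structure; in the second an entire element disappears, which is the kind of error the abstract reports as more harmful for fold recognition.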

12.
A novel method for predicting the secondary structures of proteins from amino acid sequence is presented. Protein secondary structure seqlets, which are analogous to the words of a natural language, are extracted. These seqlets capture the relationship between amino acid sequence and protein secondary structure and together form the protein secondary structure dictionary. More specifically, the dictionary is organism-specific. Protein secondary structure prediction is formulated as an integrated word segmentation and part-of-speech tagging problem. A word lattice is used to represent the results of the word segmentation, and the maximum entropy model is used to calculate the probability of a seqlet being tagged as a certain secondary structure type. The method is Markovian in the seqlets, permitting efficient exact calculation of the posterior probability distribution over all possible word segmentations and their tags by the Viterbi algorithm. The optimal segmentations and their tags are computed as the results of protein secondary structure prediction. The method is applied to predict the secondary structures of proteins of four organisms and compared with the PHD method. The results show that the performance of this method is higher than that of PHD by about 3.9% Q3 accuracy and 4.6% SOV accuracy. Combining the method with locally similar protein sequences obtained by BLAST gives better predictions. The method is also tested on the 50 CASP5 target proteins, with Q3 accuracy 78.9% and SOV accuracy 77.1%. A web server for protein secondary structure prediction has been constructed and is available at http://www.insun.hit.edu.cn:81/demos/biology/index.html.

13.
1 Introduction The prediction of protein structure and function from amino acid sequences is one of the most important problems in molecular biology. This problem is becoming more pressing as the number of known protein sequences is explored as a result of genome and other sequencing projects, and the protein sequence-structure gap is widening rapidly[1]. Therefore, computational tools to predict protein structures are needed to narrow the widening gap. Although the prediction of three dim…

14.
C A Orengo  N P Brown  W R Taylor 《Proteins》1992,14(2):139-167
A fast method is described for searching and analyzing the protein structure databank. It uses secondary structure followed by residue matching to compare protein structures and is developed from a previous structural alignment method based on dynamic programming. Linear representations of secondary structures are derived and their features compared to identify equivalent elements in two proteins. The secondary structure alignment then constrains the residue alignment, which compares only residues within aligned secondary structures and with similar buried areas and torsional angles. The initial secondary structure alignment improves accuracy and provides a means of filtering out unrelated proteins before the slower residue alignment stage. It is possible to search or sort the protein structure databank very quickly using just secondary structure comparisons. A search through 720 structures with a probe protein of 10 secondary structures required 1.7 CPU hours on a Sun 4/280. Alternatively, combined secondary structure and residue alignments, with a cutoff on the secondary structure score to remove pairs of unrelated proteins from further analysis, took 10.1 CPU hours. The method was applied in searches on different classes of proteins and to cluster a subset of the databank into structurally related groups. Relationships were consistent with known families of protein structure.

15.
Bayesian segmentation of protein secondary structure.
We present a novel method for predicting the secondary structure of a protein from its amino acid sequence. Most existing methods predict each position in turn based on a local window of residues, sliding this window along the length of the sequence. In contrast, we develop a probabilistic model of protein sequence/structure relationships in terms of structural segments, and formulate secondary structure prediction as a general Bayesian inference problem. A distinctive feature of our approach is the ability to develop explicit probabilistic models for alpha-helices, beta-strands, and other classes of secondary structure, incorporating experimentally and empirically observed aspects of protein structure such as helical capping signals, side chain correlations, and segment length distributions. Our model is Markovian in the segments, permitting efficient exact calculation of the posterior probability distribution over all possible segmentations of the sequence using dynamic programming. The optimal segmentation is computed and compared to a predictor based on marginal posterior modes, and the latter is shown to provide significant improvement in predictive accuracy. The marginalization procedure provides exact secondary structure probabilities at each sequence position, which are shown to be reliable estimates of prediction uncertainty. We apply this model to a database of 452 nonhomologous structures, achieving accuracies as high as the best currently available methods. We conclude by discussing an extension of this framework to model nonlocal interactions in protein structures, providing a possible direction for future improvements in secondary structure prediction accuracy.
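A model that is Markovian in the segments can be searched with a dynamic program over segment end positions. The sketch below finds only the single best-scoring segmentation (the Viterbi-style predictor, not the marginal posterior mode predictor the abstract prefers), and every scoring term in it is a toy assumption: independent per-residue emissions, geometric segment lengths, and uniform transitions between different states.

```python
import math

STATES = "HEC"  # helix, strand, coil

def best_segmentation(seq, emit, length_logp, trans_logp, max_len=30):
    """Return (log_score, segments); segments are (state, start, end) triples."""
    n = len(seq)
    if n == 0:
        return 0.0, []
    default = math.log(0.05)  # assumed log-probability for unlisted residues
    # best[j][s]: best log-score of seq[:j] whose last segment has state s
    best = [{s: -math.inf for s in STATES} for _ in range(n + 1)]
    back = [{s: None for s in STATES} for _ in range(n + 1)]
    for j in range(1, n + 1):
        for s in STATES:
            seg = 0.0
            # Grow the last segment leftwards so that it covers seq[i:j].
            for i in range(j - 1, max(j - max_len, 0) - 1, -1):
                seg += emit[s].get(seq[i], default)
                here = seg + length_logp(s, j - i)
                if i == 0:
                    candidates = [(here, None)]
                else:
                    # Assumption: adjacent segments must differ in state.
                    candidates = [(best[i][p] + trans_logp(p, s) + here, p)
                                  for p in STATES if p != s]
                for total, prev in candidates:
                    if total > best[j][s]:
                        best[j][s], back[j][s] = total, (i, prev)
    state = max(STATES, key=lambda s: best[n][s])
    score, j, segments = best[n][state], n, []
    while j > 0:  # follow back-pointers from the end of the sequence
        i, prev = back[j][state]
        segments.append((state, i, j))
        j, state = i, prev
    return score, segments[::-1]

# Toy parameters, invented for illustration only.
emit = {"H": {"A": math.log(0.5)}, "E": {"V": math.log(0.5)}, "C": {"G": math.log(0.5)}}
length_logp = lambda s, length: math.log(0.2) + (length - 1) * math.log(0.8)
trans_logp = lambda prev, s: math.log(0.5)
score, segments = best_segmentation("AAAAAAGGGGGGVVVVVV", emit, length_logp, trans_logp)
print(segments)
```

Replacing the `max` over candidates with a log-sum would give the forward pass needed for the per-position marginal probabilities described in the abstract.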

16.
It has been many years since position-specific residue preference around the ends of a helix was revealed. However, none of the existing secondary structure prediction methods exploited this preference, resulting in low accuracy in predicting the ends of secondary structures. In this study, we collected a relatively large data set consisting of 1860 high-resolution, non-homologous proteins from the PDB, and further analyzed the residue distributions around the ends of regular secondary structures. It was found that there exist position-specific residue preferences (PSRP) around the ends of not only helices but also strands. Based on these unique features, we proposed a novel strategy and developed a tool named E-SSpred that treats the secondary structure as a whole and builds models to predict entire secondary structure segments directly by integrating relevant features. In E-SSpred, the support vector machine (SVM) method is adopted to model and predict the ends of helices and strands according to the unique residue distributions around them. A simple linear discriminant analysis method is applied to model and predict entire secondary structure segments by integrating end-prediction results, tri-peptide composition, and length distribution features of secondary structures, as well as the prediction results of the well-known program PSIPRED. The results of fivefold cross-validation on a widely used data set demonstrate that the accuracy of E-SSpred in predicting ends of secondary structures is about 10% higher than PSIPRED, and the overall prediction accuracy (Q3 value) of E-SSpred (82.2%) is also better than PSIPRED (80.3%). The E-SSpred web server is available at http://bioinfo.hust.edu.cn/bio/tools/E-SSpred/index.html.

17.
This paper develops mathematical methods for describing and analyzing RNA secondary structures. It was motivated by the need to develop rigorous yet efficient methods to treat transitions from one secondary structure to another, which we propose here may occur as motions of loops within RNAs having appropriate sequences. In this approach a molecular sequence is described as a vector of the appropriate length. The concept of symmetries between nucleic acid sequences is developed, and the 48 possible different types of symmetries are described. Each secondary structure possible for a particular nucleotide sequence determines a symmetric, signed permutation matrix. The collection of all possible secondary structures is comprised of all matrices of this type whose left multiplication with the sequence vector leaves that vector unchanged. A transition between two secondary structures is given by the product of the two corresponding structure matrices. This formalism provides an efficient method for describing nucleic acid sequences that allows questions relating to secondary structures and transitions to be addressed using the powerful methods of abstract algebra. In particular, it facilitates the determination of possible secondary structures, including those containing pseudoknots. Although this paper concentrates on RNA structure, this formalism also can be applied to DNA.

18.
This paper develops mathematical methods for describing and analyzing RNA secondary structures. It was motivated by the need to develop rigorous yet efficient methods to treat transitions from one secondary structure to another, which we propose here may occur as motions of loops within RNAs having appropriate sequences. In this approach a molecular sequence is described as a vector of the appropriate length. The concept of symmetries between nucleic acid sequences is developed, and the 48 possible different types of symmetries are described. Each secondary structure possible for a particular nucleotide sequence determines a symmetric, signed permutation matrix. The collection of all possible secondary structures is comprised of all matrices of this type whose left multiplication with the sequence vector leaves that vector unchanged. A transition between two secondary structures is given by the product of the two corresponding structure matrices. This formalism provides an efficient method for describing nucleic acid sequences that allows questions relating to secondary structures and transitions to be addressed using the powerful methods of abstract algebra. In particular, it facilitates the determination of possible secondary structures, including those containing pseudoknots. Although this paper concentrates on RNA structure, this formalism also can be applied to DNA.

19.
Detection of common motifs in RNA secondary structures.
We describe a novel computerized system for comparison of RNA secondary structures and demonstrate its use for experimental studies. The system is able to screen a very large number of structures, to cluster similar structures and to detect specific structural motifs. In particular, the system is useful for detecting mutations with specific structural effects among all possible point mutations, and for predicting compensatory mutations that will restore the wild type structure. The algorithms are independent of the folding rules that are used to generate the secondary structures.

20.
Multiprotein systems mediate most regulatory processes in living organisms. Although the structures of the individual proteins are often defined, less is known of the structures of multiprotein systems. Computational methods for predicting interfaces, using evolutionary conservation and/or physicochemical data, have been developed. Here we consider the use of solvent accessibility, residue propensity, and hydrophobicity, in conjunction with secondary structure data, as prediction parameters. We analyze the influence of residue type and secondary structure on solvent accessibility and define a measure of "relative exposedness." Clustering abnormally high scoring residues provides a basis for predicting interaction sites. The analysis is extended to investigate abnormally exposed secondary structure elements, particularly beta-sheet strands. We show that surface-exposed beta-strands lacking protective features are more likely to be found at protein-protein interfaces, allowing us to create an algorithm with approximately 68% and approximately 75% accuracy in differentiating between interacting and edge strands in isolated beta-strands and beta-sheet strands, respectively. These methods of identifying abnormally exposed surface regions are combined in an algorithm, which, on a data set of 77 unbound and disjoint (single chain extracted from complex) structures, predicts 79% of the protein-protein interfaces correctly. If enzyme-inhibitor complexes, where the inhibitor mimics a nonprotein substrate, are excluded, the accuracy increases to 85%.
