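Classification of Multi-class Protein Homo-oligomers Based on a Weighted Auto-correlation Function Feature Extraction Method
Cite this article: Zhang Shaowu, Pan Quan, Zhao Chunhui, Cheng Yongmei. Classification of Multi-class Protein Homo-oligomers Based on a Weighted Auto-correlation Function Feature Extraction Method[J]. Journal of Biomedical Engineering, 2007, 24(4): 721-726.
Authors: Zhang Shaowu  Pan Quan  Zhao Chunhui  Cheng Yongmei
Affiliation: College of Automation, Northwestern Polytechnical University, Xi'an 710072
Funding: National Natural Science Foundation of China; Northwestern Polytechnical University research and teaching reform project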
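Abstract: We propose a new feature extraction method in which a protein sequence is represented by a feature vector built from its amino acid composition and a series of weighted auto-correlation functions of amino acid residue indices. Combined with the support vector machine (SVM) algorithm, the method is used to classify protein homodimers, homotrimers, homotetramers and homohexamers, and gives good classification results. Under the Jackknife test with the SVM algorithm, the total classification accuracies of the parameter sets QIANA, QIANB, MEEJ, ROBB and SNEP built with the new method are 77.63%, 77.16%, 76.46%, 76.70% and 75.06% respectively, which are 6.39, 5.92, 5.22, 5.46 and 3.82 percentage points higher than that of the conventional amino acid composition method (parameter set COMP). For the parameter set QIANA, the total classification accuracy of the SVM is 77.63%, which is 16.29 percentage points higher than that of the covariant discriminant algorithm. These results indicate that: (1) the new feature extraction method is effective and feasible, the feature vectors built with it contain protein quaternary structure information, and they may capture essential information about the contact surfaces buried at the interaction sites of associated subunits; (2) the SVM is very effective for classifying protein homo-oligomers.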

Keywords: Feature extraction; Weighted auto-correlation function; Support vector machine; Homo-oligomers
Revised: 2005-03-31; 2005-10-10

Classification of Multi-class Homo-oligomer Based on a Novel Method of Feature Extraction from Protein Primary Structure
Zhang Shaowu,Pan Quan,Zhao Chunhui,Cheng Yongmei.Classification of Multi-class Homo-oligomer Based on a Novel Method of Feature Extraction from Protein Primary Structure[J].Journal of Biomedical Engineering,2007,24(4):721-726.
Authors:Zhang Shaowu  Pan Quan  Zhao Chunhui  Cheng Yongmei
Affiliation: School of Automatic Control, Northwestern Polytechnical University, Xi'an 710072, China. Email: zhangsw@nwpu.edu.cn
Abstract: A novel method of feature extraction from protein primary structure is proposed and applied to classify protein homodimers, homotrimers, homotetramers and homohexamers. Each protein sequence is represented by a feature vector composed of its amino acid composition and a set of weighted auto-correlation function factors of an amino acid residue index. High classification accuracies are obtained. With the same support vector machine (SVM), the total accuracies of the QIANA, QIANB, MEEJ, ROBB and SNEP sets based on this feature extraction method are 77.63%, 77.16%, 76.46%, 76.70% and 75.06% respectively in the Jackknife test, which are 6.39, 5.92, 5.22, 5.46 and 3.82 percentage points higher than that of the COMP set based on the conventional amino acid composition method. With the same QIANA set, the total accuracy of the SVM is 77.63%, which is 16.29 percentage points higher than that of the covariant discriminant algorithm. These results show that: (1) the novel feature extraction method is effective and feasible, and the feature vectors based on it may contain more protein quaternary structure information and appear to capture essential information about the composition and hydrophobicity of residues in the surface patches buried in the interfaces of associated subunits; (2) the SVM is a powerful computational tool for classifying protein homo-oligomers.
Keywords: Feature extraction; Weighted auto-correlation function; Support vector machine (SVM); Homo-oligomers
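
The feature vector described in the abstract (amino acid composition plus weighted auto-correlation factors of a residue index) can be illustrated with a minimal Python sketch. The abstract does not give the exact formula, so everything specific here is an assumption: the auto-correlation is taken as the mean product of index values at a given lag, the Kyte-Doolittle hydropathy scale stands in for the paper's residue indices (QIANA, QIANB, MEEJ, ROBB, SNEP), and the names `feature_vector`, `max_lag` and `weight` and their default values are hypothetical.

```python
# Illustrative sketch only; not the authors' implementation.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values, used as a stand-in residue index (assumption).
KD_INDEX = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def composition(seq):
    """Return the frequency of each of the 20 standard amino acids in seq."""
    n = len(seq)
    return [seq.count(a) / n for a in AMINO_ACIDS]


def autocorrelation(seq, index, max_lag):
    """Return mean products of index values at lags 1..max_lag (assumed form)."""
    values = [index[a] for a in seq]  # raises KeyError on non-standard residues
    n = len(values)
    factors = []
    for lag in range(1, max_lag + 1):
        pairs = n - lag
        total = sum(values[i] * values[i + lag] for i in range(pairs))
        factors.append(total / pairs)
    return factors


def feature_vector(seq, index=KD_INDEX, max_lag=5, weight=0.1):
    """Concatenate the composition with weighted auto-correlation factors.

    max_lag and weight are hypothetical tuning parameters.
    """
    if len(seq) <= max_lag:
        raise ValueError("sequence must be longer than max_lag")
    weighted = [weight * r for r in autocorrelation(seq, index, max_lag)]
    return composition(seq) + weighted
```

For a sequence string `seq`, `feature_vector(seq, max_lag=30)` would give a 50-dimensional vector (20 composition terms plus 30 auto-correlation terms) that could then be passed to an SVM classifier and evaluated with a leave-one-out (Jackknife) test.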
This article is indexed in CNKI, VIP, Wanfang Data and other databases.