Similar literature
1.
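Wang Jian, Xie Dong, Yang Zhihao, Lin Hongfei. Computer Science (计算机科学), 2011, 38(12): 232-235
Research on protein relation networks has become a hot topic in biomedicine. By analyzing and clustering these networks, researchers can discover the complexes within them and better understand the principles of cellular organization. During such analysis, rendering the network topology as a graph shows the network structure intuitively, makes clustering methods easier to compare, and supports research on relation networks. Using JUNG, a network modeling and visualization toolkit, the authors designed and implemented a visualization system for protein relation networks. It parses protein relation network data in several formats, integrates several effective graph clustering algorithms, and implements a clustering algorithm that discovers complexes based on protein functional annotation. Users can conveniently inspect the original network and the clustering results in a two-dimensional network view.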

2.
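With the rapid growth of available large-scale protein interaction data, it has become possible to understand the basic components and structure of cellular mechanisms at the systems level. The greatest challenge today is how to analyze such complex interaction data to reveal the principles of cellular organization, processes, and functions. Graph-theoretic clustering is an effective means of analyzing protein interaction data. This paper describes recent progress in the cluster analysis of protein-protein interaction (PPI) networks in terms of graph models, clustering algorithms, evaluation methods, and applications. Finally, it discusses the challenges facing this line of research and directions for further work.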

3.
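Protein complexes are important for helping biologists understand cellular organization and function, and identifying complexes from protein-protein interaction (PPI) networks by computational methods is a current research focus. However, PPI networks contain many false-negative and false-positive interactions, and the set of known protein complexes is incomplete, so overcoming network noise and making better use of known complexes are key problems that complex identification urgently needs to solve. To this end, the paper proposes NOBEL, an algorithm that identifies protein complexes by supervised learning on the topological information of protein complexes. First, NOBEL builds a weighted PPI network from the biological and topological information of proteins, which reduces noise in the network. Next, it extracts topological features of complexes from both the weighted and the unweighted PPI network and trains a supervised learning model on those features, so that the model can effectively learn the information the complexes contain. Finally, the trained model is applied to the PPI network to identify protein complexes. In experiments on four real PPI networks, NOBEL improved the F-measure over seven other complex identification algorithms by at least 4.39% (Gavin), 1.32% (DIP), 2.39% (WI-PHI_core), and 2.34% (WI-PHI_extend).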

4.
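Experimentally derived protein interaction data inevitably contain false positives and false negatives, so computational methods that predict protein complexes from such data are inherently error-prone. To compensate for this inherent weakness of the data, gene expression profiles were incorporated to construct a new weighted protein network. To verify the biological significance of the network, the Markov clustering algorithm was used to predict protein complexes from both the weighted and the unweighted network, and the predicted complexes were matched against benchmark complexes. The analysis shows that the weighted network is more biologically meaningful than the unweighted one.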
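The abstract above does not say how expression profiles are turned into edge weights. The following is a minimal sketch of one common choice, assuming the weight of an interaction is the Pearson correlation of the two proteins' expression profiles, clipped at zero. The function names, the clipping rule, and the toy data are illustrative assumptions, not the paper's method.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant profile carries no co-expression signal
    return cov / (sx * sy)

def weight_edges(edges, expression):
    """Attach a co-expression weight to every PPI edge.

    Assumption: weight = max(0, Pearson r), so anti-correlated pairs
    get weight 0 instead of a negative weight.
    """
    return {(u, v): max(0.0, pearson(expression[u], expression[v]))
            for u, v in edges}

# Hypothetical toy data: three proteins, three time points.
edges = [("A", "B"), ("B", "C")]
expression = {
    "A": [1.0, 2.0, 3.0],
    "B": [2.0, 4.0, 6.0],
    "C": [3.0, 2.0, 1.0],
}
weights = weight_edges(edges, expression)
```

In this toy example the pair A-B is perfectly co-expressed and keeps a weight close to 1, while B-C is anti-correlated and is down-weighted to 0. The resulting weighted edge list could then be passed to any clustering routine that accepts edge weights.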

5.
In this paper, we describe a machine learning approach for sequence-based prediction of protein-protein interaction sites. A support vector machine (SVM) classifier was trained to predict whether or not a surface residue is an interface residue (i.e., is located in the protein-protein interaction surface), based on the identity of the target residue and its ten sequence neighbors. Separate classifiers were trained on proteins from two categories of complexes, antibody-antigen and protease-inhibitor. The effectiveness of each classifier was evaluated using leave-one-out (jack-knife) cross-validation. Interface and non-interface residues were classified with relatively high sensitivity (82.3% and 78.5%) and specificity (81.0% and 77.6%) for proteins in the antibody-antigen and protease-inhibitor complexes, respectively. The correlation between predicted and actual labels was 0.430 and 0.462, indicating that the method performs substantially better than chance (zero correlation). Combined with recently developed methods for identification of surface residues from sequence information, this offers a promising approach to predict residues involved in protein-protein interactions from sequence information alone.

6.
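A protein complex consists of two or more associated polypeptide chains and plays an important role in biological processes. When protein-protein interaction (PPI) network data are represented as a graph, finding tightly coupled protein complexes in it is very difficult, especially as PPI networks have grown greatly in size in recent years. This paper proposes a graph clustering method for protein complex detection based on symmetric nonnegative matrix factorization, which can effectively detect dense connected subgraphs in complex networks. The method is compared against several state-of-the-art methods on three PPI datasets using the same benchmark. Experimental results show that the proposed method significantly outperforms the other methods on all three datasets, which differ in size and density.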

7.
Rapidly identifying protein complexes is important for elucidating the mechanisms of macromolecular interactions and for further investigating the overlapping clinical manifestations of diseases. To date, existing computational methods mainly focus on developing unsupervised graph clustering algorithms, sometimes in combination with prior biological insights, to detect protein complexes from protein-protein interaction (PPI) networks. However, the outputs of these methods are potentially structural or functional modules within PPI networks. These modules do not necessarily correspond to the actual protein complexes that are formed via spatiotemporal aggregation of subunits. In this study, we propose a computational framework that combines supervised learning and dense subgraph discovery to predict protein complexes. The proposed framework consists of two steps. The first step reconstructs genome-scale protein co-complex networks by training a supervised learning model of l2-regularized logistic regression on experimentally derived co-complexed protein pairs. The second step infers hierarchical and balanced clusters as complexes from the co-complex networks, using either the effective but computationally intensive k-clique graph clustering method or the efficient maximum modularity clustering (MMC) algorithm. Empirical studies of cross validation and independent test show that both steps achieve encouraging performance. The proposed framework is fundamentally novel and excels over existing methods in that the complexes inferred from protein co-complex networks are more biologically relevant than those inferred from PPI networks, providing a new avenue for identifying novel protein complexes.

8.
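Meng Jun, Zhang Xin. Journal of Computer Applications (计算机应用), 2015, 35(6): 1637-1642
To address the poor performance of protein function prediction from a single data source and the incompleteness of protein interaction network information, the paper proposes MSI-RWDIM, a protein function prediction algorithm that combines multi-source data integration with a random walk on a double-index matrix. The algorithm predicts protein function from protein sequence, gene expression, and protein interaction data, building a weighted interaction network for each source according to its characteristics. It then fuses the per-source weighted networks, combines them with a functional correlation network to build the double-index matrix, and uses a random walk to compute the scores from which functions are predicted. In five-fold cross-validation on a yeast dataset, MSI-RWDIM achieved relatively high accuracy and relatively low coverage and also reduced the functional label loss rate. The results show that the overall performance of MSI-RWDIM is better than that of the commonly used k-nearest neighbor, transductive multi-label ensemble classification, and fast simultaneous weighting methods.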
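The scoring step above relies on a random walk over a weighted network. The abstract does not specify the double-index matrix, so the sketch below shows only a generic random walk with restart on a single weighted graph. It is a swapped-in, simpler technique meant to illustrate the idea; the restart probability, iteration count, and toy graph are assumptions.

```python
def random_walk_with_restart(adj, seed, restart=0.5, iters=100):
    """Return visiting probabilities of a walker that restarts at `seed`.

    adj: dict node -> dict neighbor -> weight (symmetric for undirected
    graphs). Every node is assumed to have at least one neighbor.
    """
    nodes = sorted(adj)
    start = {n: 1.0 if n == seed else 0.0 for n in nodes}
    prob = dict(start)
    for _ in range(iters):
        # With probability `restart` the walker jumps back to the seed.
        nxt = {n: restart * start[n] for n in nodes}
        # Otherwise it moves to a neighbor, proportionally to edge weight.
        for u in nodes:
            total = sum(adj[u].values())
            if total == 0:
                continue
            for v, w in adj[u].items():
                nxt[v] += (1 - restart) * prob[u] * w / total
        prob = nxt
    return prob

# Hypothetical path graph A - B - C with unit weights.
adj = {
    "A": {"B": 1.0},
    "B": {"A": 1.0, "C": 1.0},
    "C": {"B": 1.0},
}
scores = random_walk_with_restart(adj, seed="A")
```

Nodes closer to the seed receive higher scores. In a function prediction setting, such scores could be used to rank candidate functions or annotated proteins for the seed protein.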

9.
Protein complexes play important roles in integrating individual gene products to perform useful cellular functions. The increasing amount of protein-protein interaction (PPI) data has enabled us to predict protein complexes. In spite of the advances in these computational approaches and experimental techniques, it is impossible to construct an absolutely reliable PPI network. Taking into account the reliability of interactions in the PPI network, we have constructed a weighted protein-protein interaction (WPPI) network, in which the reliability of each interaction is represented as a weight using the topology of the PPI network. As overlaps are likely to have biological importance, we proposed a novel method named WN-PC (weighted network-based method for predicting protein complexes) to predict overlapping protein complexes on the WPPI network. The proposed algorithm predicts neighborhood graphs with an aggregation coefficient over a threshold as candidate complexes, and binds attachment proteins to candidate complexes. Finally, we have filtered redundant complexes which overlap other complexes to a very high extent in comparison to their density and size. A comprehensive comparison between competitive algorithms and our WN-PC method has been made in terms of the F-measure, coverage rate, and P-value. We have applied WN-PC to two different yeast PPI data sets, one of which is a huge PPI network consisting of over 6000 proteins and 200000 interactions. Experimental results show that WN-PC outperforms the state-of-the-art methods. We think that our research may be helpful for other applications in PPI networks.

10.
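Because multi-label propagation algorithms are fast and efficient at discovering community structure in social networks, the paper proposes a functional module detection algorithm for protein-protein interaction (PPI) networks that is based on a multi-label propagation mechanism and integrates multi-source biological knowledge about proteins. First, node labels are initialized by combining the functional and structural information of the PPI network. Then gene expression data are used to describe the co-expression of proteins, a label set is built according to co-expression, and labels are chosen from it so that labels propagate between nodes in a realistic and reliable way. Finally, nodes with the same identifier are assigned to the same functional module to obtain the final result. Experiments show that the algorithm not only has good time performance but is also competitive in detection accuracy.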

11.
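Detecting functional modules in protein interaction networks is important for understanding the organization and function of biological systems. The common practice is to represent a protein interaction network as a graph and mine functional modules with graph clustering algorithms. This paper uses a graph clustering algorithm based on modularity optimization to detect communities in protein interaction networks, finding 68 communities in a yeast protein interaction network with 2617 nodes and 11855 interactions. For the communities obtained, the authors first verify from a topological perspective that they are indeed densely connected subgraphs. They then analyze the overlap between these communities and the known protein complexes provided by ComplexCat in the MIPS database, finding that many complexes are completely contained in some community. Finally, they use the P-value of the cumulative hypergeometric distribution to measure how strongly a community is enriched for a particular function and annotate each community's main function with the function that has the smallest P-value, finding that most proteins in a community share the same function. The results show that the communities detected by this method are biologically meaningful.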
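The enrichment test mentioned above can be written down directly. The sketch below computes the upper-tail hypergeometric P-value for a community and picks the function with the smallest P-value. The parameter names, the data layout, and the toy annotations are illustrative assumptions; the formula is the standard cumulative hypergeometric tail.

```python
from math import comb

def hypergeom_pvalue(N, K, n, k):
    """P(X >= k) for a hypergeometric draw.

    N: proteins in the network, K: proteins annotated with the function,
    n: proteins in the community, k: annotated proteins in the community.
    """
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)

def annotate_community(community, annotations, N):
    """Return (function, p_value) for the function with the smallest P-value.

    annotations: dict function -> set of proteins carrying that function.
    """
    best = None
    for func, members in annotations.items():
        k = len(community & members)
        p = hypergeom_pvalue(N, len(members), len(community), k)
        if best is None or p < best[1]:
            best = (func, p)
    return best

# Hypothetical network of 10 proteins and one community of 3.
community = {"P1", "P2", "P3"}
annotations = {
    "ribosome biogenesis": {"P1", "P2", "P3"},
    "transport": {"P3", "P4", "P5", "P6"},
}
best = annotate_community(community, annotations, N=10)
```

A small P-value means the community contains more proteins with that function than random sampling would explain. Real analyses usually also correct for testing many functions at once, which this sketch leaves out.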

12.
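Studying protein interactions not only helps explain life processes but also provides clues for treating disease. Building on an analysis of existing computational methods for predicting protein interactions, this work combines computational methods with the life sciences: it uses well-known biological databases to obtain a large body of protein interaction data, builds a human protein interaction network, and predicts computationally which proteins may cause Parkinson's disease. Drawing on earlier algorithms, it uses an improved APM[1] algorithm to carry out prediction on the first-level and second-level protein networks.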

13.
In the post-genomic era, proteomics has achieved significant theoretical and practical advances with the development of high-throughput technologies. In particular, the rapid accumulation of protein-protein interactions (PPIs) provides a foundation for constructing protein interaction networks (PINs), which can furnish a new perspective for understanding cellular organization, processes, and functions at the network level. In this paper, we present a comprehensive survey on three main characteristics of PINs: centrality, modularity, and dynamics. 1) Different centrality measures, which are used to calculate the importance of proteins, are summarized based on the structural characteristics of PINs or on the basis of their integrated biological information; 2) Different modularity definitions and various clustering algorithms for predicting protein complexes or identifying functional modules are introduced; 3) The dynamics of proteins, PPIs and sub-networks are discussed, respectively. Finally, the main applications of PINs in complex diseases are reviewed, and the challenges and future research directions are also discussed.

14.
Given a large undirected or directed weighted data graph and a similar, smaller weighted pattern graph, the problem of weighted subgraph matching is to find a mapping of the nodes in the pattern graph to a subset of nodes in the data graph such that the sum of edge weight differences is minimum. Biological interaction networks such as protein-protein interaction networks and molecular pathways are often modeled as weighted graphs in order to account for the high false positive rate occurring intrinsically during the detection process of the interactions. Nonetheless, complex biological problems such as disease gene prioritization and conserved phylogenetic tree construction largely depend on the similarity calculation among the networks. Although several existing methods measure graph and subgraph similarity efficiently, they produce nonintuitive results due to the underlying unweighted graph model assumption. Moreover, very few algorithms exist for weighted graph matching, and those are applicable only under the restriction that the data and pattern graph sizes are equal. In this paper, we introduce a novel algorithm for weighted subgraph matching which can effectively be applied to directed/undirected weighted subgraph matching. Experimental results demonstrate the superiority and relative scalability of the algorithm over available state-of-the-art methods.
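The objective defined above (minimize the sum of edge weight differences over node mappings) can be made concrete with an exhaustive search on tiny graphs. This is not the paper's algorithm, which the abstract does not describe. It is a brute-force sketch for undirected graphs that assumes a missing data edge counts as weight 0, and it is only feasible for a handful of nodes.

```python
from itertools import permutations

def edge_weight(graph, u, v):
    """Weight of the undirected edge u-v, or 0.0 if absent (assumption)."""
    return graph.get((u, v), graph.get((v, u), 0.0))

def match_cost(pattern, data, mapping):
    """Sum of |pattern weight - data weight| over all pattern edges."""
    return sum(abs(w - edge_weight(data, mapping[u], mapping[v]))
               for (u, v), w in pattern.items())

def best_match(pattern, pattern_nodes, data, data_nodes):
    """Try every injective mapping of pattern nodes onto data nodes."""
    best_map, best_cost = None, float("inf")
    for image in permutations(data_nodes, len(pattern_nodes)):
        mapping = dict(zip(pattern_nodes, image))
        cost = match_cost(pattern, data, mapping)
        if cost < best_cost:
            best_map, best_cost = mapping, cost
    return best_map, best_cost

# Hypothetical data graph (a path a-b-c-d) and a one-edge pattern.
data = {("a", "b"): 0.9, ("b", "c"): 0.4, ("c", "d"): 0.8}
pattern = {("x", "y"): 0.8}
mapping, cost = best_match(pattern, ["x", "y"], data, ["a", "b", "c", "d"])
```

The pattern edge of weight 0.8 lands on the data edge c-d, which has the same weight, so the cost is zero. The number of candidate mappings grows factorially with graph size, which is why practical methods need heuristics or pruning.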

15.
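Construction of multi-relational protein networks and its applications   Total citations: 3 (self-citations: 0; citations by others: 3)
Because different types of interactions contribute differently to function prediction, the authors combine a protein interaction network with protein domain information to build a multi-relational protein network and assign a different traversal priority to each type of interaction. Based on the multi-relational network, they propose a protein function prediction method, FPM (Functions prediction based on multi-relational networks). For an unannotated protein, the algorithm traverses all interactions of the highest priority that are connected to the protein, forming a candidate set of neighbor nodes. It then derives the predicted function set from the neighbor set and scores and ranks each function. Comparison with other algorithms shows that FPM outperforms the other function prediction methods.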
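The traversal described above can be sketched as follows. The abstract gives neither the priority order of the interaction types nor the scoring formula, so this sketch assumes that a smaller priority number means higher priority and that a function's score is the fraction of candidate neighbors annotated with it. The interaction type names and data are hypothetical.

```python
from collections import Counter

def predict_functions(protein, interactions, priority, annotations):
    """Rank candidate functions for an unannotated protein.

    interactions: list of (u, v, interaction_type) triples.
    priority: dict interaction_type -> int, smaller = higher (assumption).
    annotations: dict protein -> set of known functions.
    """
    incident = [(u, v, t) for u, v, t in interactions if protein in (u, v)]
    if not incident:
        return []
    # Keep only the neighbors reached through the highest-priority type.
    top = min(priority[t] for _, _, t in incident)
    neighbors = {v if u == protein else u
                 for u, v, t in incident if priority[t] == top}
    # Assumed score: share of candidate neighbors carrying the function.
    counts = Counter(f for n in neighbors for f in annotations.get(n, ()))
    scored = [(f, c / len(neighbors)) for f, c in counts.items()]
    return sorted(scored, key=lambda fc: (-fc[1], fc[0]))

interactions = [("Q", "A", "domain"), ("Q", "B", "domain"), ("Q", "C", "ppi")]
priority = {"domain": 0, "ppi": 1}  # hypothetical ordering
annotations = {"A": {"F1", "F2"}, "B": {"F1"}, "C": {"F3"}}
ranked = predict_functions("Q", interactions, priority, annotations)
```

Here protein `Q` has two domain-based neighbors and one PPI neighbor. Only the domain-based neighbors are used, so `F3` is never proposed and `F1` ranks above `F2`.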

16.
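Hu Sai, Xiong Huijun, Zhao Bihai, Li Xueyong, Wang Jing. Acta Automatica Sinica (自动化学报), 2015, 41(11): 1893-1900
A protein may interact with different proteins under different conditions or at different times; this is called the dynamic nature of proteins. At different stages of molecular processing, a protein takes part in different modules and works with other proteins to perform a function. Studying dynamic protein interactions therefore helps improve the accuracy of protein function prediction. The authors combine a protein interaction network with time-series gene expression data to build a dynamic protein interaction network. To reduce the negative effect of false negatives in the PPI network on function prediction, they use domain and complex information to predict and generate new interactions and assign weights to the interactions. Based on the resulting dynamic weighted network, they propose a function prediction method, D-PIN (Dynamic protein interaction networks). Experiments on three different yeast interaction networks show that the overall performance of D-PIN is more than 14% higher than that of existing methods. The results confirm the effectiveness of the constructed dynamic weighted protein interaction network.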
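One simple way to combine a static network with time-series expression data, as described above, is to keep an interaction at a time point only when both proteins are expressed at that time point. The sketch below assumes a single fixed activity threshold shared by all proteins. The abstract does not state the actual rule, and the toy data are made up.

```python
def dynamic_networks(edges, expression, threshold):
    """Return one edge list per time point.

    edges: list of (u, v) interactions from the static PPI network.
    expression: dict protein -> list of expression values over time.
    A protein counts as active at time t if its value >= threshold
    (assumption: one global threshold for all proteins).
    """
    n_points = len(next(iter(expression.values())))
    snapshots = []
    for t in range(n_points):
        active = {p for p, profile in expression.items()
                  if profile[t] >= threshold}
        snapshots.append([(u, v) for u, v in edges
                          if u in active and v in active])
    return snapshots

# Hypothetical static network and two time points.
edges = [("A", "B"), ("B", "C")]
expression = {"A": [0.9, 0.1], "B": [0.8, 0.7], "C": [0.2, 0.9]}
snapshots = dynamic_networks(edges, expression, threshold=0.5)
```

In this example protein `B` interacts with `A` at the first time point and with `C` at the second, which is the kind of time-dependent behavior the abstract calls the dynamic nature of proteins.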

17.
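Studying the evolutionary mechanisms and models of protein interaction networks is important for understanding how biological systems evolve and how their organization forms. Many evolution models of protein interaction networks, relying on different evolutionary mechanisms, have been proposed so far. Each reproduces certain topological features observed in real protein interaction networks, but each also has limitations. This paper studies typical evolution models, analyzes and compares them in terms of construction mechanism and the topological features of the models and of real protein interaction networks, and summarizes the characteristics of each model. Finally, it offers the authors' views on the further development of protein network evolution models, providing a useful reference for a deeper understanding of such models.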

18.
Over recent years, advances in proteomic technologies have led to the rapid generation of vast volumes of data regarding many clinical applications, acquired from cells through to patients. High-throughput analysis of protein-protein interactions, quantification of protein abundance, global analysis of posttranslational modifications and other approaches have addressed the complexity of cellular regulation in health and disease, which has opened the way to systems-level clinical research. The dysregulation of cell adhesion plays a key role in many disease states. The image represents a systematic analysis of integrin adhesion complexes, which were isolated from erythroleukemia cells and analyzed by mass spectrometry (see Byron et al., Science Signaling 2011, 4, pt2). To understand how specific adhesion complexes may function, the composition of different complexes was compared using hierarchical clustering, and a protein-protein interaction network model of the core receptor-bound subcomplex was constructed. Data-driven, global investigations such as this could pave the road to new therapeutic targets and personalized medicines. Cover image created by Adam Byron (University of Manchester, Manchester, UK); cover design by SCHULZ Grafik-Design.

19.
20.
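Zhu Haiwan. Application Research of Computers (计算机应用研究), 2020, 37(2): 390-397, 420
Ant colony clustering algorithms for mining protein complexes face several problems: static PPI networks do not truly reflect the dynamic nature of cells, convergence is slow, and clustering precision and recall are low. To address them, the paper proposes FGCDACC-DPC, an ant colony clustering method based on fuzzy granularity and closeness for mining complexes in dynamic weighted PPI networks. First, a comprehensive weight metric (CWM) is designed from the topological and biological properties of the dynamic PPI network to describe the interactions between proteins accurately. Second, according to the basic characteristics of complexes, a set of dense and highly co-expressed complex cores is built, and a pick-up and drop model based on fuzzy granularity and closeness is designed to cluster the remaining nodes, which reduces computational complexity and randomness and speeds up clustering. Finally, local and global weight update strategies are built on the ideas of functional information transfer and temporal functional correlation, passing functional information between ant colony generations and between networks at different time points to improve clustering accuracy. Applied to DIP data for complex mining, FGCDACC-DPC achieves relatively high precision and recall and identifies protein complexes fairly accurately.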
