Similar literature
 18 similar documents found
1.
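Automatic topic identification across massive numbers of web pages is an important research direction in web information analysis and mining, with both theoretical and practical significance. This paper proposes an ensemble-learning framework for web page topic identification: different maximum-margin classifiers are built from heterogeneous sets of web page attributes, and ensemble learning fuses the information from these base classifiers. Tests on benchmark datasets show that the algorithm is effective for web page topic identification.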

2.
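Web page classification suffers from frequent new words and high feature dimensionality, and a new web page classification method is proposed to address these problems. First, bridge ontologies are used to integrate the domain ontologies of the classification task, building a multi-ontology semantic annotation model that reduces the dimensionality of the text features. On this basis, keywords with different class labels are clustered to solve the problem of unrecognized new words; the different weights of web page tags are also taken into account, and an improved SVM model is used to classify Chinese web pages. Experimental results show that the method improves on the performance of a traditional SVM classifier.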

3.
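To improve the currency recognition rate, a negative correlation learning algorithm is proposed to improve the generalization ability of a neural network ensemble. Images of banknotes under ultraviolet light serve as experimental samples, and a neural network ensemble trained with negative correlation learning is used to design the classifier. A total of 300 samples of six banknote denominations under different noise conditions are selected as training samples. A single neural network classifier and the neural network ensemble classifier are simulated in MATLAB, and the reliability and recognition rates obtained are compared. The results show that the neural network ensemble based on negative correlation learning works well for currency recognition: compared with a system using a single neural network and with an ensemble of independently trained individual networks, its recognition rate is about 4% higher on average.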
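The abstract does not give the loss it uses, so the following is only a minimal sketch of the commonly cited negative correlation learning formulation, in which each network's squared error is extended by a penalty that rewards disagreement with the ensemble mean. The function name `ncl_losses` and the value of `lam` are illustrative assumptions, not taken from the paper.

```python
def ncl_losses(outputs, target, lam=0.5):
    """Per-network negative correlation learning loss for one sample.

    outputs: scalar outputs of the individual networks (assumed given)
    target:  desired output
    lam:     penalty strength (hypothetical default)
    """
    mean = sum(outputs) / len(outputs)  # simple-average ensemble output
    losses = []
    for i, f in enumerate(outputs):
        # Correlation penalty: (f_i - mean) * sum over j != i of (f_j - mean)
        others = sum(g - mean for k, g in enumerate(outputs) if k != i)
        penalty = (f - mean) * others
        losses.append(0.5 * (f - target) ** 2 + lam * penalty)
    return losses
```

With `lam=0` the networks are trained independently; a larger `lam` pushes each network away from the ensemble mean, which is the mechanism the abstract credits for better generalization.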

4.
《现代电子技术》2019,(24):140-145
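To further improve the performance of short-text classification based on deep neural networks, ensemble learning is applied to five different neural network text classifiers: a convolutional neural network, a bidirectional long short-term memory network, a convolutional recurrent neural network, a recurrent convolutional neural network, and a hierarchical attention network. Two ensemble learning methods (Bagging and Stacking) are tested. The experimental results show that an ensemble of several neural network short-text classifiers outperforms a single text classification model; further experiments on pairwise ensembles verify each model's contribution to short-text classification performance.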

5.
罗会兰  杜连平 《电视技术》2012,36(23):39-42
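Because a single classifier does not fully account for the characteristics of the dataset and therefore cannot classify well, an image classification method based on an SVM ensemble built with ensemble learning is proposed. Building on the popular bag-of-words (BOW) image classification approach, the method classifies test images with different SVM classifiers produced by training and combines their results with an ensemble learning algorithm. Classification experiments with the traditional BOW-based method and with the proposed method show that the SVM ensemble clearly improves classification accuracy and is reasonably robust.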

6.
周进登  王晓丹  权文  许燕  姚旭 《电子学报》2011,39(7):1514-1522
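Error-correcting output codes (ECOC), a general ensemble framework for multi-class classification, effectively decompose a multi-class problem into binary problems and thus simplify it. When generating base classifiers, however, there is often a conflict between increasing the diversity among base classifiers and increasing the consistency between what each base classifier learns and what the ensemble classifier learns; this is called the consistent-diverse trade-off problem. One starting point for resolving it is to reduce the classification error caused by inconsistent learning while preserving diversity. Here, weighted decoding is used: the weighting coefficient matrix is re-learned to weaken or eliminate the errors produced by the base classifiers' inconsistent learning. Experiments on synthetic datasets and UCI datasets show that the optimal weighting coefficient matrix found by a genetic algorithm, with the ensemble classifier's classification error rate as the fitness function, handles the consistent-diverse trade-off better than coefficient matrices produced by other methods.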
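To make the idea of weighted decoding concrete, here is a minimal sketch of decoding an ECOC prediction with a per-class, per-classifier weight matrix. It assumes a code matrix with entries in {-1, +1} and a weight matrix that is already given; the genetic-algorithm search for the weights described in the abstract is not reproduced, and the name `weighted_decode` is hypothetical.

```python
def weighted_decode(code_matrix, weights, outputs):
    """Return the class whose codeword is closest in weighted Hamming distance.

    code_matrix[c][j]: codeword bit (-1 or +1) of class c for base classifier j
    weights[c][j]:     weight of classifier j when scoring class c (assumed given)
    outputs[j]:        prediction (-1 or +1) of base classifier j
    """
    best_class, best_dist = None, float("inf")
    for c, (row, w_row) in enumerate(zip(code_matrix, weights)):
        # (1 - m * o) / 2 is 1 when the bit disagrees and 0 when it agrees
        dist = sum(w * (1 - m * o) / 2 for m, w, o in zip(row, w_row, outputs))
        if dist < best_dist:
            best_class, best_dist = c, dist
    return best_class
```

With uniform weights this reduces to plain Hamming decoding. Lowering the weight of a base classifier for a given class reduces the cost of that classifier disagreeing with the class codeword, which is how a learned weight matrix can compensate for an unreliable base classifier.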

7.
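Because of limits on computer memory, selecting effective and optimal classifier combinations is a major research topic in machine learning. Classical ensemble classification algorithms achieve high accuracy on small datasets, but on large volumes of data the base classifiers share the resources of a single computer for both learning and classification, so computation is inefficient, which is clearly unsuitable for today's massive data. To address the fact that existing ensemble classification algorithms only suit small datasets, the characteristics of ensemble classifiers are analyzed, and a parallel ensemble classification algorithm (EMapReduce) is designed using an aggregation-based ensemble classifier and the MapReduce technique from cloud computing, so that large-scale data can be processed in parallel. Simulation experiments on an Amazon computing cluster show that the algorithm is reasonably efficient and feasible.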

8.
要丽娟  郭银芳 《激光杂志》2023,(11):147-151
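Because fiber grating sensor networks have complex structures and intrusions are hard to detect, an ensemble-learning-based intrusion detection method for such networks is studied. Support vector machines are chosen as the base classifiers of the ensemble learning algorithm; the error rate of each base classifier on intrusion samples is computed, and the importance of each base classifier is determined from its error rate. The AdaBoost ensemble learning algorithm combines the base classifiers according to their importance into a final ensemble classifier, which outputs the intrusion detection results. Experimental results show that the method accurately detects intrusions such as remote intrusion and denial-of-service intrusion, discards little data, and improves the communication performance of the network.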
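The step from error rate to classifier importance can be sketched with the standard binary AdaBoost weight, alpha = 0.5 * ln((1 - e) / e). The abstract does not state which AdaBoost variant is used, so this is an assumption; the base classifiers' votes are taken as given (-1 or +1) rather than produced by real SVMs, and both function names are hypothetical.

```python
import math


def classifier_weight(error_rate):
    """Importance of a base classifier from its weighted error rate."""
    eps = 1e-10  # clamp to avoid division by zero or log(0)
    e = min(max(error_rate, eps), 1 - eps)
    return 0.5 * math.log((1 - e) / e)


def ensemble_predict(weights, votes):
    """Sign of the importance-weighted sum of base classifier votes."""
    score = sum(a * v for a, v in zip(weights, votes))
    return 1 if score >= 0 else -1
```

A classifier with error 0.5 gets weight zero, and the weight grows as the error falls, so one accurate base classifier can outvote several weak ones.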

9.
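A classification and prediction method based on an ensemble of multiple Bayesian networks is proposed. Expert knowledge is used as a "vaccine", and Bayesian network structures are learned by combining an immune genetic algorithm with a constrained information-entropy fitness function, yielding several satisfactory Bayesian network structures that reflect the same sample dataset and balance network structure complexity. A multi-Bayesian-network classifier ensemble model is then given, which integrates the learned Bayesian networks so that they act as "experts" making a group decision to classify and predict incomplete data of unknown class, improving the generalization ability of Bayesian network classifiers. Finally, an example using the Bayesian inference tool GeNIe illustrates the soundness and effectiveness of the method.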

10.
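A new remote sensing image classification method based on ensemble learning and feature fusion   Cited by: 1 (self-citations: 1, citations by others: 0)
To meet the needs of multi-source remote sensing data classification, a new remote sensing image classification method is proposed that fuses multiple features (fully polarimetric SAR imagery, polarimetric coherency matrix features, and the spectra and textures of optical remote sensing imagery) and combines multiple classifiers. Fully polarimetric PALSAR data are preprocessed and polarimetric coherency matrix features are extracted; the gray-level co-occurrence matrix is used to compute texture parameters of the optical and SAR images such as contrast, inverse difference moment, second moment, and dissimilarity, which are combined with spectral features to form six combination strategies. Ensemble learning combines a random forest classifier, a subspace classifier, a minimum distance classifier, a support vector machine classifier, a back-propagation neural network classifier, and others to classify the remote sensing feature sets of the different combination strategies. The results show that the new method makes good use of the potential of active and passive remote sensing data for extracting different land-surface landscape types, combines the strengths of several algorithms, and effectively improves both overall accuracy and per-class accuracy.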

11.
Existing anti-phishing approaches use blacklist methods or feature-based machine learning techniques. Blacklist methods fail to detect new phishing attacks and produce a high false positive rate. Moreover, existing machine learning based methods extract features from third parties, search engines, etc. They are therefore complicated, slow, and unfit for real-time environments. To solve this problem, this paper presents a novel machine learning based anti-phishing approach that extracts features from the client side only. We have examined the various attributes of phishing and legitimate websites in depth and identified nineteen outstanding features that distinguish phishing websites from legitimate ones. These nineteen features are extracted from the URL and source code of the website and do not depend on any third party, which makes the proposed approach fast, reliable, and intelligent. Compared to other methods, the proposed approach has relatively high accuracy in detecting phishing websites, achieving a 99.39% true positive rate and 99.09% overall detection accuracy.
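The abstract does not list the nineteen features, so the sketch below only illustrates what client-side URL feature extraction can look like. The five features and the function name `url_features` are hypothetical choices and not the paper's feature set; the code relies only on Python's standard `urllib.parse` and `ipaddress` modules.

```python
import ipaddress
from urllib.parse import urlparse


def url_features(url):
    """Compute a few illustrative URL-only features (hypothetical set)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    try:
        ipaddress.ip_address(host)  # raises ValueError for non-IP hosts
        host_is_ip = True
    except ValueError:
        host_is_ip = False
    return {
        "url_length": len(url),
        "dot_count": host.count("."),
        "has_at": "@" in url,
        "host_is_ip": host_is_ip,
        "uses_https": parsed.scheme == "https",
    }
```

Because every value is derived from the URL string alone, no network request or third-party lookup is needed, which is the property the abstract emphasizes for real-time use.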

12.

In recent times, phishing has become one of the most prominent attacks faced by internet users, governments, and service-providing organizations. In a phishing attack, the attacker(s) collect the client's sensitive data (e.g., user account login details, credit/debit card numbers, etc.) by using spoofed emails or fake websites. Phishing websites are common entry points for online social engineering attacks, including numerous frauds on websites. In such attacks, the attacker(s) create website pages by copying the behavior of legitimate websites and send URL(s) to the targeted victims through spam messages, texts, or social networking. To provide a thorough understanding of phishing attacks, this paper provides a literature review of Artificial Intelligence (AI) techniques for phishing attack detection: Machine Learning, Deep Learning, Hybrid Learning, and Scenario-based techniques. The paper also compares different studies that detect phishing attacks with each AI technique and examines the strengths and shortcomings of these methodologies. Furthermore, it provides a comprehensive set of current challenges posed by phishing attacks and future research directions in this domain.


13.
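The naive Bayes classification algorithm is widely used because it is computationally efficient. Drawing on the diversity property of ensemble algorithms and the variability in how clustering algorithms choose cluster centers, this paper proposes a Bayesian ensemble algorithm based on K-medoids clustering that improves the generalization performance of naive Bayes. First, multiple naive Bayes base classifier models are trained from the sample set. Then, to increase the diversity among base classifiers, the K-medoids algorithm clusters the base classifiers' predictions on a validation set. Finally, the base classifier with the best generalization performance is selected from each cluster for ensemble learning, and the final result is obtained by simple voting. Applied to UCI datasets and compared with similar algorithms, the proposed K-medoids-based Bayesian ensemble algorithm (NBKME) improves classification accuracy.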
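The last two steps (picking the best base classifier from each cluster, then simple voting) can be sketched as follows. The sketch assumes the K-medoids cluster assignments and the validation accuracies have already been computed; training the naive Bayes models and the clustering itself are not shown, and both function names are hypothetical.

```python
from collections import Counter


def select_per_cluster(cluster_ids, accuracies):
    """Return the index of the most accurate base classifier in each cluster.

    cluster_ids[i]: cluster assigned to base classifier i (assumed given)
    accuracies[i]:  validation accuracy of base classifier i
    """
    best = {}
    for idx, (cluster, acc) in enumerate(zip(cluster_ids, accuracies)):
        if cluster not in best or acc > accuracies[best[cluster]]:
            best[cluster] = idx
    return sorted(best.values())


def majority_vote(predictions):
    """Simple voting: the label predicted most often wins."""
    return Counter(predictions).most_common(1)[0][0]
```

Selecting one member per cluster keeps classifiers that behave differently on the validation set, which is the source of the diversity the abstract relies on.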

14.
In this paper, an empirical study of the development and application of a committee of neural networks on online pattern classification tasks is presented. A multiple classifier framework is designed by adopting an Adaptive Resonance Theory-based (ART) autonomously learning neural network as the building block. A number of algorithms for combining outputs from multiple neural classifiers are considered, and two benchmark data sets are used to evaluate the applicability of the proposed system. Different learning strategies coupling offline and online learning approaches, as well as different input pattern representation schemes, including the "ensemble" and "modular" methods, are examined experimentally. The benefits and shortcomings of each approach are systematically analyzed and discussed. The results are comparable, and in some cases superior, to those from other classification algorithms. The experiments demonstrate the potential of the proposed multiple neural network systems as an alternative for handling online pattern classification tasks in possibly nonstationary environments.

15.
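A new algorithm for generating fuzzy classification rules is proposed based on ensemble learning. The antecedents and consequents of the classification rules are fuzzified; within the iterations of the adaptive boosting (AdaBoost) algorithm, the distribution over training instances is adjusted and a genetic algorithm generates the fuzzy classification rules. The training-instance distribution is introduced into the fitness function for rule learning, so that the fuzzy classification rules account for cooperation with one another already at generation time, producing a complementary rule set. This improves the overall recognition ability of the fuzzy classification rules and raises classification accuracy.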
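How the training-instance distribution can enter both the fitness function and the boosting update is sketched below using the standard binary AdaBoost rule. The fuzzy rules and the genetic algorithm are not modeled; a rule is represented only by its predictions, and the function names are hypothetical. The abstract does not specify its exact fitness function, so the distribution-weighted error here is an assumption.

```python
import math


def weighted_error(dist, labels, preds):
    """Distribution-weighted error, usable as (1 - fitness) of a candidate rule."""
    return sum(d for d, y, p in zip(dist, labels, preds) if y != p)


def update_distribution(dist, labels, preds):
    """One AdaBoost step: raise the weight of misclassified instances."""
    err = weighted_error(dist, labels, preds)  # assumed to satisfy 0 < err < 1
    alpha = 0.5 * math.log((1 - err) / err)
    raw = [d * math.exp(-alpha * y * p) for d, y, p in zip(dist, labels, preds)]
    total = sum(raw)  # normalizer so the weights remain a distribution
    return [r / total for r in raw]
```

After the update, the instances the previous rule got wrong carry half of the total weight, so the next rule is rewarded for covering exactly those instances, which is what makes the generated rules complementary.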

16.
Internet technology is pervasive today, from online social networking to online banking, and it has made people's lives more comfortable. With the growth of Internet technology, security threats to systems and networks have become relentlessly inventive. One such serious threat is "phishing", in which attackers attempt to steal the user's credentials using fake emails, fake websites, or both. Both industry and academia are working hard to develop solutions to combat phishing threats, and it is very important that organisations pay attention to end-user awareness in phishing threat prevention. The aim of our paper is therefore twofold. First, we discuss the history of phishing attacks and the attackers' motivation in detail, and provide a taxonomy of the various types of phishing attacks. Second, we provide a taxonomy of the solutions proposed in the literature to protect users from phishing, based on the attacks identified in our taxonomy. We also discuss the impact of phishing attacks on the Internet of Things (IoT). We conclude by discussing various issues and challenges that remain open in the literature and that are important in the fight against phishing threats.

17.
Although phishing is a form of cybercrime that internet users are confronted with rather frequently, many people still get deceived by these practices. Since receiving phishing e-mails is an important prerequisite of victimization, this study focusses on becoming a phishing target. More precisely, we use an integrative lifestyle exposure model to study the effects of risky online routine activities that make a target more likely to come across a motivated offender. Insights from the lifestyle exposure model are combined with propensity theories in order to determine which role impulsivity plays in phishing targeting. To achieve these objectives, data collected in 2016 from a representative sample (n = 723) were used. Support was found for a relationship between both online purchasing behavior and digital copying behavior, and phishing targeting. Moreover, a relationship was found between all online activities (except for online purchasing behavior) and impulsivity. The present study thus suggests that especially online shoppers and users who often share and use copied files online should be trained to deal with phishing attacks appropriately.

18.
Based on the similarity of layout structure between phishing sites and real sites, an approach to discovering phishing sites is presented. First, tags with link attributes are extracted as features; then, based on these features, the page tag sequence branches that identify a website are extracted. Next, in the page layout similarity method HTMLTagAntiPhish, the alignment of page tag sequence trees is converted into the alignment of page tag sequence branches, which turns a two-dimensional tree structure into a one-dimensional string structure. Finally, using coding based on the bioinformatics substitution matrix BLOSUM62, the alignment score is computed quickly to improve phishing site detection efficiency. A series of simulation experiments shows that the approach is feasible and achieves higher precision and recall.
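Once a page is reduced to a one-dimensional tag sequence, comparing two pages becomes a sequence alignment problem. The sketch below computes a global alignment score with the Needleman-Wunsch recurrence and a simple match/mismatch/gap scheme. That scheme is a stand-in chosen for brevity: the paper's BLOSUM62-based coding is not reproduced, and the function name and score values are assumptions.

```python
def alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Global alignment score of two tag sequences (Needleman-Wunsch).

    a, b: lists of tag names, for example ["a", "form", "input"]
    """
    # prev[j] holds the best score for aligning the processed prefix of a with b[:j]
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur.append(max(diag, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]
```

A suspicious page whose tag sequence scores close to that of a known legitimate page, while being served from a different domain, would be a candidate phishing site under this kind of layout-similarity test.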
