Similar Literature
1.
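An Improved χ² Statistic Based on a Category Proportion Factor and Within-Class Uniformity   Total citations: 1; self-citations: 1; citations by others: 0
张瑜, 张德贤. 《电子科技》, 2010, 23(12): 70-72
The χ² statistic for feature selection has two main limitations: it handles features with low document frequency poorly, and it overweights terms that occur rarely in a given class but frequently in other classes. To address both, the paper proposes an improved χ² feature selection method based on a category proportion factor and within-class uniformity. Experiments show that the improved method classifies better than the traditional one.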
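For orientation, the classical χ² score that this entry sets out to improve can be sketched from a 2×2 term/class contingency table. This is the standard textbook form only. The abstract does not give the formulas for the category proportion factor or the within-class uniformity, so they are not shown. The function name `chi_square` and the count names `a`, `b`, `c`, `d` are illustrative assumptions.

```python
def chi_square(a: int, b: int, c: int, d: int) -> float:
    """Classical chi-square score of a term for one class (illustrative sketch).

    a: documents in the class that contain the term
    b: documents outside the class that contain the term
    c: documents in the class that lack the term
    d: documents outside the class that lack the term
    """
    n = a + b + c + d
    denominator = (a + c) * (b + d) * (a + b) * (c + d)
    if denominator == 0:
        # Degenerate table: the term or the class covers nothing or everything.
        return 0.0
    return n * (a * d - c * b) ** 2 / denominator
```

Because the difference `a * d - c * b` is squared, a term that is absent from the class but common elsewhere scores as highly as a term that is concentrated in the class, which matches the second limitation named in the abstract.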

2.
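刘洺辛, 陈晶, 王麒媛. 《电信科学》, 2018, 34(10): 85-95
This paper proposes an improved information gain feature selection method that incorporates a sentiment lexicon. First, it improves on existing information gain feature selection, which emphasizes the document frequency of feature words while neglecting issues such as corpus balance. Second, to account for the influence of sentiment words on text classification, it proposes a sentiment-lexicon-based feature selection algorithm (information gain combining sentiment classification, IGSC). The algorithm matches sentiment words in the text and weights them, which reduces feature dimensionality and mitigates the effect of sparse text data on classification performance. Finally, the method is validated and analyzed experimentally on a tourism review dataset. The results show that the improved feature selection method raises classification accuracy, recall, and F-score, and gives comparatively stable classification.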
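As background for this entry, the unmodified information gain of a term can be sketched as the drop in class entropy once the term's presence or absence is known. This is the standard definition, not the IGSC algorithm, whose corpus-balance correction and sentiment-word weighting are not specified in the abstract. The function names and the per-class count lists are illustrative assumptions.

```python
import math


def entropy(counts):
    """Shannon entropy in bits of a distribution given as raw counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(with_term, without_term):
    """Information gain of a term (illustrative sketch).

    with_term[i]:    documents of class i that contain the term
    without_term[i]: documents of class i that lack the term
    """
    class_totals = [w + wo for w, wo in zip(with_term, without_term)]
    n = sum(class_totals)
    if n == 0:
        return 0.0
    p_term = sum(with_term) / n
    # Expected class entropy after observing whether the term is present.
    conditional = p_term * entropy(with_term) + (1 - p_term) * entropy(without_term)
    return entropy(class_totals) - conditional
```

A term that appears in every document of one class and in none of the other removes all uncertainty about the class, while a term spread evenly across classes has zero gain.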

3.
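Hot-word analysis and extraction relies mainly on feature-word weights to assess how important a word is to a document set or corpus. The traditional TF-IDF function is widely used in information technology. For text classification, researchers have proposed the TF-IGM function, which weights a term by its term frequency together with a gravity-moment measure to judge whether a feature is discriminative. This paper studies improvements to the TF-IGM function.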
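The abstract does not state the TF-IGM formula, so the following is a sketch of the form commonly given in the TF-IGM literature and should be read as an assumption: the inverse gravity moment divides a term's largest per-class frequency by the rank-weighted sum of its per-class frequencies sorted in descending order, and the weight is `tf * (1 + lam * igm)`. The names `igm`, `tf_igm`, and the default `lam=7.0` are illustrative, and the improvement studied in the paper is not shown.

```python
def igm(class_frequencies):
    """Inverse gravity moment of a term from its per-class frequencies (sketch)."""
    ranked = sorted(class_frequencies, reverse=True)
    # Rank-weighted sum: the most frequent class gets rank 1, the next rank 2, ...
    moment = sum(f * rank for rank, f in enumerate(ranked, start=1))
    if moment == 0:
        return 0.0  # the term never occurs in any class
    return ranked[0] / moment


def tf_igm(tf, class_frequencies, lam=7.0):
    """Term weight tf * (1 + lam * igm); lam is an adjustable coefficient."""
    return tf * (1.0 + lam * igm(class_frequencies))
```

Under this form, a term concentrated in a single class reaches the maximum `igm` of 1, and a term spread evenly over many classes gets a much smaller value.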

4.
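With the wide application of deep learning in natural language processing, short-text sentiment classification has advanced markedly. This paper proposes a multi-factor weighted text sentiment classification algorithm that fuses TextCNN and BiGRU. The algorithm improves word vector representations by introducing three key factors: a word's sentiment category distribution, sentiment orientation, and sentiment intensity. Short texts represented with these word vectors are fed to a TextCNN model and a BiGRU model, which extract key local features and global contextual features respectively; the two feature sets are fused linearly to classify the sentiment of Chinese short texts. The multi-factor weighted vector representation and the fused TextCNN-BiGRU model are validated on two public sentiment classification datasets, where the proposed algorithm improves short-text sentiment classification accuracy by 2% over single models.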

5.
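An Embedded Chinese Speech Synthesis System Based on Initial and Final Units   Total citations: 1; self-citations: 0; citations by others: 1
张皖志, 陶建华. 《信号处理》, 2005, 21(Z1): 216-219
Initials and finals are introduced as the basic units of the synthesis system, which fundamentally improves the compressibility of the speech corpus. Classification and regression trees (CART), combined with high-level phonological information, pre-classify the initial and final units in the corpus. An improved ISODATA clustering method then clusters and compresses the classified initials and finals, using Mel-frequency cepstral coefficients (MFCC) and the fundamental-frequency contour as features. The system introduces pronunciation chunks as spectral smoothing units, which effectively improves the quality of the synthesized speech. Listening tests and statistical analysis show that, even with the corpus size greatly compressed, the system's output approaches that of a desktop system.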

6.
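0634385 A New Spatial Indexing Method: The TG Index [journal, in Chinese] / 李昕 // 《信息技术与信息化》. 2006, (4): 112-113, 121 (D)
0634386 Content-Based Image Retrieval in PACS [journal, in Chinese] / 陈戏墨 // 《计算机工程与设计》. 2006, 27(18): 3382-3384 (L)
0634387 Research on Image Retrieval Based on Dominant Colors [journal, in Chinese] / 李贤成 // 《交通与计算机》. 2006, 24(4): 86-88, 91 (D)
0634388 Improvement and Application of the TFIDF Method Based on Text Classification [journal, in Chinese] / 张玉芳 // 《计算机工程》. 2006, 32(19): 76-78 (E)
TFIDF is a common method for representing document feature weights. It is simple and easy to apply, but it undervalues terms that occur frequently within one class; such terms can represent the textual characteristics of that class and should receive higher weights. By modifying the IDF expression in TFIDF, the method increases the weights of terms that occur frequently within a class. The improved TFIDF is used to select feature terms, and a classifier trained with a genetic algorithm is used to verify its effectiveness. The method outperforms other algorithms, and the experiments show that the improvement strategy is feasible. 10 references.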
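To make the criticism in entry 0634388 concrete, here is a minimal sketch of classical TF-IDF in the common `tf * log(N / df)` form. It is the baseline only; the modified IDF expression from the paper is not given in the abstract and is not reproduced. The function name `tf_idf` and the token-list input format are illustrative assumptions.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document (illustrative sketch).

    Weight = (term count / document length) * log(N / document frequency).
    """
    n_docs = len(documents)
    # Document frequency: number of documents in which each term appears.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights
```

The IDF factor here depends only on how many documents in the whole collection contain a term and ignores class labels, so a term that is frequent inside one class is penalized exactly as if its occurrences were spread over all classes.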

7.
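Classifying document block images is essential to understanding and analyzing document layout images. In traditional machine learning classifiers, using the image directly as input makes the number of model parameters too large to train. To overcome this, the paper designs a set of effective features for document block images and proposes a block classification algorithm based on these features and machine learning. For feature design, it extracts 32-dimensional features covering five aspects (geometry, grayscale, region, texture, and content) to strengthen the features' ability to discriminate between block classes. For classifiers, it runs experiments on the proposed features with traditional machine learning classifiers, automated machine learning methods, and deep learning. Results on public datasets show that the proposed document layout block classification algorithm is both highly accurate and very efficient. The paper also implements a simple step-by-step document layout analysis algorithm to demonstrate how well the block classifier generalizes.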

8.
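Image-based landmark detectors achieve excellent performance on static images, but their accuracy and stability drop markedly when applied to video or image sequences. The Supervision-by-Registration (SBR) algorithm uses Lucas-Kanade (LK) optical flow tracking to train a video landmark detector from unlabeled video and has achieved good results, but the LK algorithm has limitations that leave the detected landmark sequences with weak spatiotemporal coherence. To obtain accurate, stable, and coherent facial landmark sequences, the paper improves the traditional SBR network with a smoothness consistency loss function and a weight mask function. A Long Short-Term Memory (LSTM) network is added to make training more robust, and the smoothness consistency loss provides a stability constraint during training, yielding an accurate and stable facial landmark detector for video. Validation on the 300VW and Youtube Celebrities datasets shows that the improved SBR model reduces the Normalized Mean Error (NME) of facial video landmark detection from 4.74 to 4.56 and visibly reduces landmark jitter.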
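The NME figures quoted in this entry refer to a metric that can be sketched as the mean point-to-point landmark error divided by a normalizing distance. The abstract does not say which normalizer the authors use (inter-ocular distance and face bounding-box size are both common choices) or how the value is scaled, so `norm_distance` is left as a caller-supplied assumption, and the function name is illustrative.

```python
import math


def normalized_mean_error(predicted, ground_truth, norm_distance):
    """Mean Euclidean landmark error divided by a normalizing distance (sketch).

    predicted, ground_truth: equal-length lists of (x, y) landmark coordinates
    norm_distance: normalizer chosen by the caller, e.g. inter-ocular distance
    """
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("landmark lists must be non-empty and of equal length")
    total = sum(math.dist(p, g) for p, g in zip(predicted, ground_truth))
    return total / (len(predicted) * norm_distance)
```

Published NME values are often reported as percentages, in which case the result above would be multiplied by 100.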

9.
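Topic Tracking Based on Improved Weight Calculation   Total citations: 1; self-citations: 0; citations by others: 1
Topic tracking is a subtask of Topic Detection and Tracking (TDT). Its goal is to monitor a stream of news reports and identify follow-up reports related to the topics described by a few news reports given in advance. How feature term weights are calculated is an important issue in topic tracking, and the choice of method affects tracking performance. This paper presents an improved weight calculation method whose main idea is to take a feature term's position into account, using the position information as a weighting factor when computing the term's weight. Experimental results show that the method is effective and improves the performance of the tracking system.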

10.
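This paper proposes a hyperspectral anomaly detection algorithm based on the kernel eigenspace separation transform (KEST) weighted by spectral angle matching (SAM). Building on the kernel-based KEST algorithm, it uses the SAM measure to introduce a weighting factor for every sample in the difference correlation (DCOR) matrix of the neighborhood of the pixel under test in the high-dimensional feature space. Each sample's weighting factor depends on the angle between that sample's spectral vector and the center vector of the data in the detection window. This suppresses ill-behaved data in the detection window and emphasizes the contribution of the principal-component data, so that the DCOR matrix better describes the difference between the target and background data distributions. Theoretical analysis and comparative experiments on simulated and real data show that the algorithm achieves a higher detection rate than traditional anomaly detection algorithms and KEST.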
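The quantity underlying the SAM weighting in this entry is the angle between two spectral vectors, which can be sketched as follows. Only the angle itself is shown: the abstract does not specify how the angle is mapped to a weighting factor, nor how the weights enter the DCOR matrix, so neither is reproduced. The function name `spectral_angle` is an illustrative assumption.

```python
import math


def spectral_angle(x, y):
    """Angle in radians between two spectral vectors (illustrative sketch)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("spectral angle is undefined for a zero vector")
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cosine = max(-1.0, min(1.0, dot / (norm_x * norm_y)))
    return math.acos(cosine)
```

The angle is insensitive to overall brightness: scaling a spectrum by a positive constant leaves it unchanged, so a sample that differs from the window's center vector only in intensity gets an angle near zero.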

11.
Online reviews and comments are important information resources. A new model for feature selection and weighting, called the Sentiment Vector Space Model (SVSM), is proposed to predict the sentiment orientation of comments and reviews, e.g., separating positive reviews from negative ones. Unlike topic-oriented classification, feature selection for sentiment orientation prediction focuses on language characteristics. Unlike traditional sentiment classification algorithms, this model integrates grammatical knowledge and takes topic correlations into account. Features are extracted, and the similarity between these features and the topic is computed. This feature similarity is taken as a factor when evaluating the polarity of opinions. The experimental results show that the proposed model identifies sentiment orientation more effectively than most traditional techniques.

12.

Research in the financial domain has shown that the sentiment of stock news has a profound impact on trading volume, volatility, stock prices, and firm earnings. In-depth analysis of stock news is now drawn from financial reviews on various social networking and marketing sites to help improve decision making. Such reviews, however, are unstructured text and require natural language processing (NLP) to extract their sentiment. Accordingly, in this study we investigate the use of NLP tasks to improve sentiment classification when evaluating the information content of financial news in an investment decision support system. At present, feature extraction is mainly based on the occurrence frequency of words, so low-frequency linguistic features that could be critical for sentiment classification are typically ignored. In this research, we attempt to improve current sentiment analysis approaches for financial news classification by focusing on low-frequency but informative linguistic expressions. Our proposed combination of low- and high-frequency linguistic expressions contributes a novel set of features for sentiment classification. The experimental results show that optimal N-gram feature selection (a combination of optimal unigram and bigram features) improves sentiment classification accuracy compared with other types of feature sets.
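The combined unigram and bigram feature set mentioned in this entry can be sketched as a simple count over contiguous token sequences. This covers only the extraction step. How the authors select the optimal subset of features is not described in the abstract and is not shown, and the function names and the pre-tokenized input are illustrative assumptions.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the contiguous n-grams (as tuples) of a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def unigram_bigram_features(tokens):
    """Count unigram and bigram occurrences in one tokenized document (sketch)."""
    return Counter(ngrams(tokens, 1) + ngrams(tokens, 2))
```

For a three-token headline such as `["stock", "price", "fell"]`, this yields three unigram features and two bigram features, each with a count of one.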


13.
This paper focuses on how to improve aspect-level opinion mining for online customer reviews. We first propose a novel generative topic model, the Joint Aspect/Sentiment (JAS) model, to jointly extract aspects and aspect-dependent sentiment lexicons from online customer reviews. An aspect-dependent sentiment lexicon refers to the aspect-specific opinion words along with their aspect-aware sentiment polarities with respect to a specific aspect. We then apply the extracted aspect-dependent sentiment lexicons to a series of aspect-level opinion mining tasks, including implicit aspect identification, aspect-based extractive opinion summarization, and aspect-level sentiment classification. Experimental results demonstrate the effectiveness of the JAS model in learning aspect-dependent sentiment lexicons and the practical value of the extracted lexicons when applied to these tasks.

14.
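马力, 宫玉龙. 《电子科技》, 2014, 27(11): 180-184
This paper summarizes the state of research and progress in text sentiment analysis. It describes related research and techniques in detail for three sentiment analysis tasks: sentiment classification, sentiment retrieval, and sentiment extraction. It focuses on semantics-based sentiment lexicon classification methods and machine-learning-based sentiment classification methods, introduces the evaluation of text sentiment analysis, and proposes directions for future research.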

15.
16.
With the rapid development of Web 2.0, travelers have started sharing their travel experiences on websites. The expanding amount of online hotel reviews results in the problem of information overload. Therefore, the effective identification of helpful reviews has become an important research issue. In this study, online hotel reviews were collected from TripAdvisor.com, and the helpfulness of these reviews was comprehensively investigated from the aspects of review quality, review sentiment, and reviewer characteristics. Review helpfulness prediction models were also developed by using classification techniques. The results indicate that reviewer characteristics are good predictors of review helpfulness, whereas review quality and review sentiment are poor predictors of review helpfulness.

17.
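Taking the characteristics of online shopping review texts into account, this paper describes the practical application prospects of text sentiment analysis for online shopping reviews from three angles: extraction of sentiment information from review texts, its classification, and its retrieval and summarization. Extraction and classification are the foundation for retrieval and summarization, while retrieval and summarization form the interface that interacts directly with users and are the part with the greatest practical and commercial value.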

18.
Wireless Personal Communications - Predictive recommendations based on online reviews are considered one of the recent sentiment analysis tasks that classify the emotion expressed in online reviews...

19.
A growing number of social media and networking platforms are in wide use. People post online comments on them to share their opinions, and companies increasingly seek effective ways to mine what people think and feel about their products and services. Correctly understanding online customer reviews has therefore become an important issue. This study proposes a method based on aspect-oriented Petri nets (AOPN) to improve the correctness of the examination without changing any process or program. We collect comments from online reviews with Scrapy, perform sentiment analysis with SnowNLP, and examine the analysis results to improve correctness. We apply the method to a case study of online movie comments. The experimental results show that AOPN helps with sentiment analysis and with verifying its correctness.

20.
Sentiment analysis incorporates natural language processing and artificial intelligence and has evolved as an important research area. Sentiment analysis on product reviews has been used in widespread applications to improve customer retention and business processes. In this paper, we propose a method for performing an intensified sentiment analysis on customer product reviews. The method involves the extraction of two feature sets from each of the given customer product reviews, a set of acoustic features (representing emotions) and a set of lexical features (representing sentiments). These sets are then combined and used in a supervised classifier to predict the sentiments of customers. We use an audio speech dataset prepared from Amazon product reviews and downloaded from the YouTube portal for the purposes of our experimental evaluations.
