Similar literature
20 similar articles found (search time: 261 ms)
1.
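With the rapid development of big data technology, multi-label text classification has given rise to many applications in the judicial domain. A legal text usually carries multiple element labels, and these labels are often interdependent or correlated, so identifying them accurately requires multi-label classification methods. This paper therefore proposes a multi-label classification method for legal texts that fuses label relations. The method builds a label co-occurrence matrix and uses a graph convolutional network to capture the dependencies among labels; combined with a label attention mechanism, it computes the relevance between the legal text and each word of a label to obtain label-specific semantic representations of the legal text. Finally, the dependencies built from the label graph and the label-specific semantic representations are fused into a comprehensive text representation for multi-label classification. Experiments on legal datasets show that the method achieves good classification accuracy and stability.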
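The first two steps described above (a label co-occurrence matrix and one graph-convolution step over it) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the binarized adjacency, the symmetric normalization, the function names, and the toy data are all assumptions, and the label attention and fusion stages are omitted.

```python
from math import sqrt

def cooccurrence(label_sets, num_labels):
    """Count how often each pair of labels appears on the same document."""
    counts = [[0] * num_labels for _ in range(num_labels)]
    for labels in label_sets:
        for i in labels:
            for j in labels:
                counts[i][j] += 1  # the diagonal holds each label's own frequency
    return counts

def normalize_adjacency(counts):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2, as in a standard GCN layer.

    Assumption: the graph is binarized (an edge exists if two labels ever co-occur).
    """
    n = len(counts)
    a = [[1.0 if (i == j or counts[i][j] > 0) else 0.0 for j in range(n)]
         for i in range(n)]
    deg = [sum(row) for row in a]
    return [[a[i][j] / sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def gcn_layer(a_hat, features, weights):
    """One propagation step: ReLU(A_hat @ H @ W), written with plain lists."""
    n, d_in, d_out = len(features), len(weights), len(weights[0])
    agg = [[sum(a_hat[i][k] * features[k][c] for k in range(n)) for c in range(d_in)]
           for i in range(n)]
    return [[max(0.0, sum(agg[i][c] * weights[c][o] for c in range(d_in)))
             for o in range(d_out)]
            for i in range(n)]

# Hypothetical toy data: three documents annotated with labels 0, 1, 2.
label_sets = [{0, 1}, {0, 1}, {2}]
counts = cooccurrence(label_sets, 3)
a_hat = normalize_adjacency(counts)
identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
# One-hot label features and an identity weight matrix keep the example readable.
label_repr = gcn_layer(a_hat, identity, identity)
```

Labels 0 and 1 always co-occur in the toy data, so after one step their representations are mixed, while label 2 stays isolated.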

2.
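Multi-label learning is widely used in text classification, label recommendation, topic annotation, and similar tasks. Deep-learning-based multi-label learning has recently attracted wide attention. To address how to effectively mine and exploit high-order label relations in multi-label learning, this paper proposes TMLLGCN, a model that explores high-order label relations with a graph convolutional network (GCN). The model uses the GCN mapping function to generate object classifiers from data-driven label representations, thereby mining high-order label relations. First, text features are extracted with deep learning methods, and a base label-correlation matrix is obtained in a data-driven manner. To better model high-order relations and improve performance, label completion is performed on this base matrix by considering the influence of the unlabeled label set on the known label set, and the resulting correlation matrix guides information propagation among label nodes in the GCN. Finally, the extracted text features are applied to the GCN classifier that learns high-order label relations, the model is trained end to end, and label correlations and feature information are combined to produce the final prediction. Experiments on real-world multi-label datasets show that the proposed model effectively models high-order label relations and improves multi-label learning performance.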

3.
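Multi-label text classification aims to select the most relevant subset of labels from a set of candidates to tag a sample. Traditional studies tend to explore relations among labels while ignoring label semantics, which leads to incomplete information extraction; how to use label metadata to extract the key information in a sample effectively is therefore an important open problem. To address it, this paper first proposes a method for generating label semantic metadata from existing datasets: an attention model filters and cleans the mixed semantics in the samples to produce semantic information for each label, which resolves the difficulty of obtaining label semantics. Second, it proposes the combined-attention model for extracting key information from samples. The model combines label semantics and label relations to extract information jointly, and it contains an adaptive fusion unit that assigns weights to the two kinds of key information according to how critical each is to the classification result, further improving classification ability. Experiments on three English datasets show that the model outperforms state-of-the-art baselines, with a maximum gain of 5.68% in classification accuracy, and it also achieves excellent results on a real-world Chinese legal dataset.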

4.
刘苏祺, 白光伟, 沈航. Computer Science (《计算机科学》), 2016, 43(7): 224-229, 239
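Schema-level knowledge is very important for the development of the Semantic Web, yet the amount of schema-level knowledge in current Linked Open Data (LOD) is very limited. To overcome this limitation, a method is proposed for building a hierarchical taxonomy from users' self-descriptive tags in social networks. The method first designs a search-engine-based tag blocking algorithm that groups tags describing the same topic into the same tag block; it then uses a label propagation algorithm based on semi-supervised learning to mine hypernym-hyponym relations among the tags within each block; finally, it applies a greedy algorithm based on heuristic rules to build the taxonomy, yielding a large-scale, high-quality hierarchical taxonomy for social sites. Experimental results show that the method clearly improves precision, recall, and F-measure over existing related work.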

5.
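In multi-label classification, the correlation among labels is an important factor affecting classification performance. Classic multi-label methods such as BR and ML-KNN ignore the effect of label correlations on the actual classification, so their performance has remained unsatisfactory, and it degrades even further for multi-label classification of harmful information, whose categories are highly correlated. To address these problems, this paper improves RAkEL, a classic multi-label classification algorithm. It first computes the similarity coefficients between labels from the training texts, then computes a combined label similarity coefficient matrix based on a user-defined hierarchy of harmful-information categories, and finally, during the RAkEL voting process, redetermines the final label set according to the combined label similarity and the central label. Comparisons with traditional classification methods on a real corpus show that the method performs well on classifying harmful information.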

6.
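Hierarchical label text classification is a challenging task in natural language processing: each document must be correctly assigned to multiple labels that are organized in a hierarchy. In the label set, however, labels carry insufficient semantic information, and too few documents are assigned to deep-level labels, so those labels are under-trained, which causes a pronounced imbalance in label training. To address this, a hierarchical label text classification method with a deep-level label auxiliary classification task (DLAC) is proposed. The method introduces a deep-level label auxiliary classifier that, building on label semantic enhancement, exploits the text features together with the parent label nodes of the deep-level labels (that is, the rich features of shallow-level labels) to improve classification performance on deep-level labels. Comparative experiments against 11 algorithms on three datasets show that the model effectively improves classification performance on deep-level labels and achieves good results.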

7.
陈艳, 陈佳晴, 陈星. Computer Science (《计算机科学》), 2021, 48(z1): 306-312
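With the rise of machine learning, the number of operators has grown rapidly, the solution space to be searched when assembling operators has expanded, and process assembly time has grown exponentially. How to shrink the search space, and thus the assembly time, so as to assemble machine learning processes that adapt to users' functional requirements has become a hot research topic. This paper proposes a process assembly method for the machine learning domain based on hierarchical tags. First, tags are extracted from operator semantics, and a hierarchical tag model is determined according to the semantic scope each tag covers. Second, tag relations are discovered according to the machine learning domain to establish a domain assembly model, and the final domain tag model is determined according to the functional requirements specified by the user. Third, operators in the domain are bound to tag semantics to determine the in-domain operator relation model, and operators are assembled according to assembly rules to form all operator processes that satisfy the user's functional requirements. Finally, an example supporting the method is given to illustrate its feasibility, and a result verification standard is proposed to demonstrate the correctness and completeness of the results.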

8.
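In multi-label text classification, each given document corresponds to a set of relevant labels. The task currently faces three main problems: (1) insufficient joint modeling of label-text and label-label relations; (2) insufficient mining of the semantics of the labels themselves; (3) neglect of the internal structural information of labels. To address these problems, a multi-label text classification method based on joint attention and a shared semantic space is proposed. First, a fused multi-head attention mechanism is proposed that models the label-document relations and the label-label relations simultaneously, exploiting the interaction between the two while avoiding error propagation. Second, a decoupled shared-semantic-space embedding method is proposed, which improves the use of label semantic information: an encoder with shared parameters extracts the semantic representations of both labels and documents, reducing the bias between them in the correlation modeling stage. Third, a hierarchical prompt method based on prior knowledge is proposed, which uses the prior knowledge in pre-trained models to enhance label hierarchy information. Experimental results show that the method outperforms current state-of-the-art multi-label text classification models on public datasets.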

9.
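Network embedding converts nodes into low-dimensional vectors while preserving the properties of the network, so as to facilitate downstream tasks. Most existing network embedding methods focus on network structure, node attribute information, or single-level label information. However, nodes in many real-world networks carry rich hierarchical label information, which is valuable for obtaining effective network embeddings. Because information in labels at different levels is hard to relate or inherit, how to make reasonable use of hierarchical label information in network embedding to obtain more effective vector representations is a problem in urgent need of study. To address it, a new hierarchical-label-based attributed network embedding framework (HLANE) is proposed, which uses a hierarchical attention mechanism to integrate hierarchical label information into network embedding. HLANE first uses existing network embedding methods to obtain structure and/or attribute information and initialize the node embedding vectors. A hierarchical attention layer then establishes the links between parent and child nodes of the multi-level labels and uses them to guide the learning of the initial node embeddings at different levels, finally producing multi-level embedding representations of the network nodes. Experiments on real datasets show that HLANE yields better node embedding representations than the compared algorithms.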

10.
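Hyperbola extraction from ground penetrating radar (GPR) images is an important method for analyzing the position and structure of underground targets. In real environments, however, interference from noise and clutter leaves the extracted hyperbolas structurally incomplete, fragmented, or abnormally shaped, which hinders subsequent operations such as data analysis and 3D modeling. To address this, a hyperbola extraction method based on multi-label hierarchical clustering (MHCE) is proposed. First, information entropy is used to evaluate the stability of each pixel neighborhood, and an entropy-based distance metric is constructed for hierarchical clustering. Then, multi-label clustering is performed on the adjacency space obtained after clustering to reduce the influence of clutter and noise on hyperbola extraction. Finally, hyperbolas are extracted by combining the fitted shape and the texture direction of the multi-label clustering results. Experiments show that the method is robust for hyperbolas in real GPR images and can obtain normalized hyperbola shape and position parameters.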
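The entropy step above can be illustrated with a small sketch that computes the Shannon entropy of the gray levels in a pixel neighborhood. The square window, the `radius` parameter, and the toy images are assumptions made for illustration; the abstract does not specify how MHCE defines the neighborhood or how it turns entropy into a distance metric.

```python
from collections import Counter
from math import log2

def neighborhood_entropy(image, row, col, radius=1):
    """Shannon entropy (in bits) of the gray levels in the window centred on (row, col).

    The window is clipped at the image border. Low entropy suggests a stable,
    homogeneous neighborhood; high entropy suggests noise or clutter.
    """
    values = [
        image[r][c]
        for r in range(max(0, row - radius), min(len(image), row + radius + 1))
        for c in range(max(0, col - radius), min(len(image[0]), col + radius + 1))
    ]
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())

# Hypothetical toy images: a flat patch and a checkerboard patch.
flat = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]
checker = [[0, 1], [1, 0]]
flat_entropy = neighborhood_entropy(flat, 1, 1)        # one gray level only
checker_entropy = neighborhood_entropy(checker, 0, 0)  # two equally likely levels
```

A homogeneous window has zero entropy, whereas a window split evenly between two gray levels has an entropy of one bit.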

11.
In this paper, we evaluate the effectiveness of a semantic smoothing technique for organizing folksonomy tags. Folksonomy tags have no explicit relations and vary widely because they form an uncontrolled vocabulary. We discriminate so-called subjective tags such as “cool” and “fun” from other folksonomy tags without any extra knowledge beyond the folksonomy triples, and we use the level of tag generalization to arrange the objective tags into a hierarchy. We verify that the entropy of folksonomy tags is an effective measure for discriminating subjective folksonomy tags. Our hierarchical tag allocation method guarantees the number of child nodes and increases the number of available paths to a target node compared with an existing tree allocation method for folksonomy tags.

12.
Topic-based ranking in Folksonomy via probabilistic model   (cited 1 time in total; self-citations: 0; citations by others: 1)
Social tagging is an increasingly popular way to describe and classify documents on the web. However, the quality of tags varies considerably because they are authored freely, so how to rate tags becomes an important issue. Most social tagging systems simply order tags by input sequence, which conveys little about their importance and relevance. This limits applications of tags such as information search and tag recommendation. In this paper, we focus on finding the authority score of tags in the whole tag space conditional on topics, and we put forward a topic-sensitive tag ranking (TSTR) approach that ranks tags automatically according to their topic relevance. We first extract topics from the folksonomy using a probabilistic model and then construct a transition probability graph. Finally, we perform a random walk over the topic level on the graph to obtain topic rank scores for the tags. Experimental results show that the proposed tag ranking method is both effective and efficient. We also apply tag ranking to tag recommendation, which demonstrates that the proposed approach boosts the performance of social-tagging-related applications.
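The random-walk step can be sketched as a topic-biased power iteration over a tag transition graph. This is a hedged illustration in the style of personalized PageRank, not the TSTR algorithm itself: the update rule, the damping factor, the function name `topic_rank`, and the toy graph are all assumptions, and the probabilistic topic extraction is replaced by a hand-written topic prior.

```python
def topic_rank(transition, topic_prior, damping=0.85, iterations=100):
    """Power iteration for rank = damping * P^T rank + (1 - damping) * prior.

    transition[src][dst] is the probability of walking from tag src to tag dst
    (each row is assumed to sum to 1); topic_prior is a distribution over tags
    expressing how strongly each tag belongs to the topic of interest.
    """
    tags = list(topic_prior)
    rank = dict(topic_prior)
    for _ in range(iterations):
        # Restart mass is distributed according to the topic prior.
        new_rank = {t: (1.0 - damping) * topic_prior[t] for t in tags}
        for src in tags:
            for dst, prob in transition.get(src, {}).items():
                new_rank[dst] += damping * rank[src] * prob
        rank = new_rank
    return rank

# Hypothetical three-tag graph and a prior for a "programming" topic.
transition = {
    "python": {"code": 1.0},
    "code": {"python": 0.5, "cool": 0.5},
    "cool": {"code": 1.0},
}
programming_prior = {"python": 0.5, "code": 0.5, "cool": 0.0}
scores = topic_rank(transition, programming_prior)
ranking = sorted(scores, key=scores.get, reverse=True)
```

Because the restart mass goes only to tags favoured by the topic prior, the subjective tag "cool" ends up ranked below the two on-topic tags even though it is connected to them.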

13.
A folksonomy consists of three basic entities: users, tags and resources. This kind of social tagging system is a good way to index information, facilitate searches and navigate resources. The main objective of this paper is to present a novel method for improving the quality of tag recommendation. A statistical analysis shows that the total number of tags used by a user changes over time in a social tagging system. This paper therefore introduces the concept of user tagging status, which is one of the growing status, the mature status and the dormant status. An algorithm is then presented that determines which of the three statuses a user is in at a given point in time. Finally, three corresponding strategies are developed to compute the tag probability distribution based on a statistical language model, in order to recommend the tags most likely to be used by each user. Experimental results show that the proposed method achieves higher tag recommendation accuracy than the compared methods.

14.
Folksonomy, considered a core component of the Web 2.0 user-participation architecture, is a classification system formed by users' tags on web resources. Recently, various image retrieval approaches that exploit folksonomy have been proposed to improve image search results. However, characteristics of tags such as semantic ambiguity and the lack of a controlled vocabulary limit their effectiveness for image retrieval. In particular, tags associated with images in a random order provide no information about the relevance between a tag and an image. In this paper, we propose a novel image tag ranking system called i-TagRanker, which exploits the semantic relationships between tags to re-order them according to their relevance to an image. The proposed system consists of two phases: 1) a tag propagation phase and 2) a tag ranking phase. In the tag propagation phase, we first collect the most relevant tags from similar images and then propagate them to an untagged image. In the tag ranking phase, tags are ranked according to their semantic relevance to the image. Experimental results on a Flickr photo collection of over 30,000 images show the effectiveness of the proposed system.

15.
李劲, 张华, 吴浩雄, 向军, 辜希武. Journal of Computer Applications (《计算机应用》), 2012, 32(5): 1335-1339
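Social annotation is a folksonomy created by users for web resources, and it contains rich semantic information, so applying social annotations to information retrieval helps improve retrieval quality. An improved text classification algorithm based on social annotations is studied to improve web page classification. Because social annotation is a folksonomy, annotations are produced quite arbitrarily and vary greatly in quality. The semantic similarity between documents and the semantic similarity between annotations are therefore first used to evaluate annotation quality quantitatively. On this basis, annotations are filtered by quality, and the relatively good annotations are used to extend the document vector space model, so that each document is represented as an extended vector composed of document words and document annotation information. A support vector machine classifier is used for the classification experiments. The results show that evaluating annotation quality, filtering out poor annotations, and representing documents by combining their content with their annotations improves classification performance: compared with traditional content-based classification algorithms, the F1 measure of the classification results improves by 6.2%.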

16.
In this paper, we propose a novel approach to tag recommendation based on a topic ontology. The approach intelligently generates tag suggestions for blogs. We construct the topic ontology by enriching the set of categories in an existing small ontology, the Open Directory Project. To construct it, a set of topics and their associated semantic relationships are identified automatically from corpus-based external knowledge resources such as Wikipedia and WordNet. The construction has two stages: concept acquisition and semantic relation extraction. In the first stage, a topic-mapping algorithm acquires concepts from the semantics of Wikipedia, and a semantic similarity clustering algorithm computes a semantic similarity measure to group similar concepts. In the second stage, a semantic relation extraction algorithm derives the semantic relations among the extracted topics from the lexical patterns between synsets in WordNet. A software prototype implements the topic ontology construction process. The Jena API framework is used to organize the extracted semantic concepts and their relationships as a knowledge representation in the Web Ontology Language, and the Protégé tool provides the platform to visualize the automatically constructed topic ontology. Using the constructed topic ontology, we can generate and suggest the most suitable tags for a new resource. Applying the topic ontology with a spreading activation algorithm supports efficient recommendation in practice and can recommend the most popular tags for a specific resource. The spreading activation algorithm assigns interest scores to the existing extracted blog content and tags. The weight of each tag is computed from the activation score, which is determined by the similarity between the topics in the constructed topic ontology and the content of the existing blogs. High-quality tags, those with the highest activation scores, are recommended to users. Finally, we evaluate our tag recommendation approach experimentally on a large set of real-world data sets. The experiments compare the proposed topic ontology with spreading activation against the existing AutoTag mechanism, and we discuss the improvement in precision and recall of the recommended tags on the Delicious and BibSonomy data sets. The experiments show that tag recommendation using the topic ontology enriches the folksonomy. Thus, we report the results of an experiment meant to improve the performance and quality of the tag recommendation approach.

17.
The advent of the Internet has led to significant growth in the amount of information available, resulting in information overload, i.e. individuals have too much information to make a decision. To resolve this problem, collaborative tagging systems form a categorization called a folksonomy to organize web resources. A folksonomy aggregates the results of personal free tagging of information and objects to form a categorization structure that utilizes the collective intelligence of crowds. Folksonomies are more appropriate for organizing huge amounts of information on the Web than traditional taxonomies established by expert cataloguers. However, the attributes of collaborative tagging systems and their folksonomies make them impractical for organizing resources in personal environments. This work designs a desktop collaborative tagging (DCT) system that enables collaborative workers to tag their documents, and it proposes an application in patent analysis based on the DCT system. The folksonomy in DCT is built by aggregating personal tagging results and is represented by a concept space. Concept spaces provide synonym control, tag recommendation and relevant search. Additionally, to protect the privacy of authors and to decrease the transmission cost, relations between tagged and untagged documents are constructed by extracting document features rather than adopting the full text. Experimental results reveal that the adoption rate of recommended tags for new documents increases by 10% after users have tagged five or six documents. Furthermore, DCT can recommend tags with higher adoption rates when given new documents with topics similar to previously tagged ones. The relevant search in DCT is observed to be superior to keyword search when frequently used tags are adopted as queries. The average precision, recall, and F-measure of DCT are 12.12%, 23.08%, and 26.92% higher than those of keyword searching. DCT allows a multi-faceted categorization of resources for collaborative workers and recommends tags to make categorizing resources easier. Additionally, the DCT system provides relevance searching, which is more effective than traditional keyword searching for searching personal resources.
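For readers unfamiliar with the three metrics reported above, a minimal sketch of how precision, recall, and F-measure are computed for a single query follows. The document identifiers are hypothetical and unrelated to the DCT experiments, and the balanced F1 variant is assumed since the abstract does not say which F-measure was used.

```python
def precision_recall_f1(retrieved, relevant):
    """Return (precision, recall, F1) for one query.

    precision = hits / number retrieved, recall = hits / number relevant,
    F1 = harmonic mean of precision and recall (0.0 when there are no hits).
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1

# Hypothetical query: four documents returned, three documents truly relevant.
precision, recall, f1 = precision_recall_f1(
    retrieved=["d1", "d2", "d3", "d4"],
    relevant=["d1", "d2", "d5"],
)
```

Here two of the four returned documents are relevant, so precision is 2/4, recall is 2/3, and F1 is their harmonic mean, 4/7.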

18.
In this paper we present the contextual tag cloud system: a novel application that helps users explore a large-scale RDF dataset. Unlike folksonomy tags used in most traditional tag clouds, the tags in our system are ontological terms (classes and properties), and a user can construct a context with a set of tags that defines a subset of instances. Then in the contextual tag cloud, the font size of each tag depends on the number of instances that are associated with that tag and all tags in the context. Each contextual tag cloud serves as a summary of the distribution of relevant data, and by changing the context, the user can quickly gain an understanding of patterns in the data. Furthermore, the user can choose to include RDFS taxonomic and/or domain/range entailment in the calculations of tag sizes, thereby understanding the impact of semantics on the data. In this paper, we describe how the system can be used as a query building assistant, a data explorer for casual users, or a diagnosis tool for data providers. To resolve the key challenge of how to scale to Linked Data, we combine a scalable preprocessing approach with a specially constructed inverted index, use three approaches to prune unnecessary counts for faster online computations, and design a paging and streaming interface. Together, these techniques enable a responsive system that in particular holds a dataset with more than 1.4 billion triples and over 380,000 tags. Via experimentation, we show how much our design choices benefit the responsiveness of our system.
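The counting rule behind the tag sizes (instances associated with a tag and with every tag in the context) can be sketched with a plain inverted index. This is an illustrative toy under stated assumptions: the function names and the instance data are hypothetical, and none of the system's preprocessing, pruning, entailment, or paging techniques are reproduced.

```python
def build_inverted_index(instance_tags):
    """Map each tag to the set of instances that carry it."""
    index = {}
    for instance, tags in instance_tags.items():
        for tag in tags:
            index.setdefault(tag, set()).add(instance)
    return index

def contextual_counts(index, context):
    """For each tag, count the instances carrying that tag and every context tag.

    With an empty context, every instance is selected. Tags whose count would
    be zero are omitted, since they would not appear in the cloud.
    """
    if context:
        selected = set.intersection(*(index.get(t, set()) for t in context))
    else:
        selected = set().union(*index.values())
    return {tag: len(members & selected)
            for tag, members in index.items()
            if members & selected}

# Hypothetical instances tagged with ontological classes.
instance_tags = {
    "alice": {"Person", "Author"},
    "bob": {"Person"},
    "paper1": {"Document"},
}
index = build_inverted_index(instance_tags)
all_counts = contextual_counts(index, [])
person_counts = contextual_counts(index, ["Person"])
```

Adding "Person" to the context removes "Document" from the cloud and leaves "Author" with a count of one, which is the kind of shift in distribution the cloud is meant to reveal.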

19.
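With the development of Internet technology, personalized tag recommendation systems play an important role in filtering massive amounts of information and resources. On the Sina Weibo platform, users can freely add tags to themselves to indicate their interests, and they can also use tags to search for users with similar interests. Since most Sina Weibo users have added no tags or only a few, a microblog tag recommendation algorithm based on the RBLDA model and interaction relations is proposed. It first uses the RBLDA model to generate an initial tag list for each user and then combines this with the interaction graph formed by the users' interaction relations to predict user tags. Experiments on a real Sina Weibo dataset show that the scheme achieves good results compared with traditional tag recommendation algorithms.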

20.
Nowadays social tagging has become a popular way to annotate, search, navigate and discover online resources, which in turn has produced a vast amount of user-generated metadata. This paper addresses the problem of recommending suitable tags during folksonomy development from a graph-based perspective. The proposed approach adapts the Katz measure, a path-ensemble based proximity measure, for use in social tagging systems. We model a folksonomy as a weighted, undirected tripartite graph. We then apply the Katz measure to this graph and exploit it to provide tag recommendations for individual users. We evaluate our method on two real-world folksonomies collected from CiteULike and Last.fm. The experimental results demonstrate that the proposed method improves recommendation performance and is effective for both active taggers and cold-start taggers compared to existing algorithms.
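A rough sketch of the Katz measure on a small user-tag-resource graph follows. The Katz score between two nodes sums the weighted walks of every length l between them, each damped by beta^l; the version below truncates the sum at `max_length` instead of using the closed matrix form (which needs beta to be below the reciprocal of the adjacency matrix's largest eigenvalue). The node naming scheme, edge weights, and parameter values are assumptions for illustration, not the paper's adaptation.

```python
def add_edge(adjacency, a, b, weight=1.0):
    """Insert an undirected weighted edge into a dict-of-dicts adjacency."""
    adjacency.setdefault(a, {})[b] = weight
    adjacency.setdefault(b, {})[a] = weight

def katz_scores(adjacency, source, beta=0.1, max_length=3):
    """Truncated Katz proximity from source to every reachable node.

    score(v) = sum over l = 1..max_length of beta**l * (weighted walks of length l).
    """
    scores = {}
    frontier = {source: 1.0}  # weighted walk counts of the current length
    for length in range(1, max_length + 1):
        next_frontier = {}
        for node, walks in frontier.items():
            for neighbor, weight in adjacency.get(node, {}).items():
                next_frontier[neighbor] = next_frontier.get(neighbor, 0.0) + walks * weight
        for node, walks in next_frontier.items():
            scores[node] = scores.get(node, 0.0) + (beta ** length) * walks
        frontier = next_frontier
    return scores

# Hypothetical folksonomy: prefixes u:, t:, r: mark users, tags, resources.
graph = {}
for a, b in [("u:ann", "t:jazz"), ("u:ann", "r:song1"), ("t:jazz", "r:song1"),
             ("t:blues", "r:song1"), ("u:bob", "t:blues")]:
    add_edge(graph, a, b)

scores = katz_scores(graph, "u:ann", beta=0.1, max_length=3)
ranked_tags = sorted((n for n in scores if n.startswith("t:")),
                     key=scores.get, reverse=True)
```

The tag the user already applied ranks first, and "t:blues", which the user never used, still receives a nonzero score through the shared resource, which is how a path-based measure can surface candidate tags.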
