Similar literature
 Found 20 similar documents (search time: 62 ms)
1.
Ontology-driven web-based semantic similarity   Cited by: 1 (self-citations: 0, citations by others: 1)
Estimation of the degree of semantic similarity/distance between concepts is a very common problem in research areas such as natural language processing, knowledge acquisition, information retrieval and data mining. In the past, many similarity measures have been proposed, exploiting either explicit knowledge (such as the structure of a taxonomy) or implicit knowledge (such as information distribution). In the former case, taxonomies and/or ontologies are used to introduce additional semantics; in the latter case, frequencies of term appearances in a corpus are considered. Classical measures based on those premises suffer from some problems: in the first case, their excessive dependency on the taxonomical/ontological structure; in the second case, the lack of semantics of a purely statistical analysis of occurrences and/or the ambiguity of estimating the statistical distribution of concepts from term appearances. Measures based on the Information Content (IC) of taxonomical concepts combine both approaches. However, they depend heavily on a corpus that has been properly pre-tagged and disambiguated according to the ontological entities in order to compute accurate concept appearance probabilities. This limits the applicability of those measures to other ontologies (such as specific domain ontologies) and to massive corpora (such as the Web). In this paper, several of the presented issues are analyzed. Modifications of classical similarity measures are also proposed. They are based on a contextualized and scalable version of IC computation on the Web that exploits taxonomical knowledge. The goal is to avoid the measures' dependency on corpus pre-processing while achieving reliable results and minimizing language ambiguity. Our proposals are able to outperform classical approaches when using the Web for estimating concept probabilities.
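As a rough illustration of the corpus-based IC idea described in this abstract, the sketch below computes IC(c) = -log2 p(c), with p(c) estimated as a relative frequency. The function name and the counts in the usage note are hypothetical; the abstract does not say how the authors obtain or contextualize their Web counts, so this is only the classical definition, not their method.

```python
import math


def information_content(concept_count: int, total_count: int) -> float:
    """Classical IC: IC(c) = -log2 p(c), where p(c) = concept_count / total_count.

    Both counts are assumed inputs (e.g. corpus occurrences or search hit
    counts); how they are gathered is outside this sketch.
    """
    if total_count <= 0 or not 0 < concept_count <= total_count:
        raise ValueError("counts must satisfy 0 < concept_count <= total_count")
    return -math.log2(concept_count / total_count)
```

For example, a concept observed once in eight occurrences gets an IC of 3 bits, while a concept that covers every occurrence gets an IC of 0, which matches the intuition that very general concepts carry little information.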

2.
Estimation of the semantic likeness between words is of great importance in many applications dealing with textual data, such as natural language processing, knowledge acquisition and information retrieval. Semantic similarity measures exploit knowledge sources as the basis for these estimations. In recent years, interest in ontologies has grown thanks to global initiatives such as the Semantic Web, because they offer a structured knowledge representation. Thanks to the possibilities that ontologies offer for the semantic interpretation of terms, many ontology-based similarity measures have been developed. Several families of measures can be identified according to the principle on which they base the similarity assessment and the way in which ontologies are exploited or complemented with other sources. In this paper, we survey and classify most of the ontology-based approaches developed, in order to evaluate their advantages and limitations and to compare their expected performance from both theoretical and practical points of view. We also present a new ontology-based measure relying on the exploitation of taxonomical features. The evaluation and comparison of our approach's results against those reported by related works under a common framework suggest that our measure provides high accuracy without some of the limitations observed in other works.

3.
Semantic-oriented service matching is one of the challenges in automatic Web service discovery. Service users may search for Web services using keywords and receive the matching services in terms of their functional profiles. A number of approaches to computing the semantic similarity between words have been developed to enhance the precision of matchmaking; they can be classified into ontology-based and corpus-based approaches. The ontology-based approaches commonly use the differentiated concept information provided by a large ontology to measure lexical similarity with word sense disambiguation. Nevertheless, most ontologies are domain-specific and limited in lexical coverage, which limits their applicability. Corpus-based approaches, on the other hand, rely on the distributional statistics of context to represent each word as a vector and measure the distance between word vectors. However, polysemy may lead to low computational accuracy. In this paper, in order to augment the semantic information content of word vectors, we propose a multiple semantic fusion (MSF) model to generate a sense-specific vector for each word. In this model, various semantic properties of the general-purpose ontology WordNet are integrated, by means of vector combination strategies, to fine-tune the distributed word representations learned from a corpus. The retrofitted word vectors are modeled as semantic vectors for estimating semantic similarity. The MSF model-based similarity measure is validated against other similarity measures on multiple benchmark datasets. Experimental results of the word similarity evaluation indicate that our computational method obtains a higher correlation coefficient with human judgment in most cases. Moreover, the proposed similarity measure is shown to improve on the performance of Web service matchmaking based on a single semantic resource. Accordingly, our findings provide a new method and perspective for understanding and representing lexical semantics.

4.
Knowledge management in biomedical libraries: A semantic web approach   Cited by: 1 (self-citations: 0, citations by others: 1)
In recent years, technological advances in high-throughput techniques and efficient data gathering methods, coupled with a world-wide effort in computational biology, have resulted in an enormous amount of life science data available in repositories devoted to biomedical literature. These repositories do not support effective and accurate search. Using semantic technologies as the key to interoperation enables biomedical literature to be searched and processed more efficiently. However, emerging semantic applications take for granted specific knowledge that biomedical researchers may not have. This paper presents design principles for easy-to-use biomedical semantic applications by means of ontology-based annotations and faceted search. The proposed approach is backed by a usable prototype that shows the benefits of adding these principles to a biomedical digital library, where identifying and searching information are critical tasks for users who are not Semantic Web experts.

5.
The information content (IC) of a concept provides an estimation of its degree of generality/concreteness, a dimension which enables a better understanding of a concept's semantics. As a result, IC has been successfully applied to the automatic assessment of the semantic similarity between concepts. In the past, IC has been estimated from the probability of appearance of concepts in corpora. However, the applicability and scalability of this method are hampered by corpus dependency and data sparseness. More recently, some authors have proposed IC-based measures that use taxonomical features extracted from an ontology for a particular concept, obtaining promising results. In this paper, we analyse these ontology-based approaches to IC computation and propose several improvements aimed at better capturing the semantic evidence modelled in the ontology for the particular concept. Our approach has been evaluated and compared with related works (both corpus-based and ontology-based ones) when applied to the task of semantic similarity estimation. Results obtained for a widely used benchmark show that our method enables similarity estimations which are better correlated with human judgements than those of related works.
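To make the idea of ontology-based (intrinsic) IC concrete, here is a minimal sketch of one well-known formulation from the literature, in which a concept's IC decreases with its number of hyponyms: IC(c) = 1 - log(hypo(c) + 1) / log(N), where N is the number of concepts in the taxonomy. The abstract does not state which formula its authors analyse or improve, so both the formula choice and the toy taxonomy below are assumptions made for illustration.

```python
import math

# Hypothetical toy taxonomy: each concept maps to its direct children.
TOY_TAXONOMY = {
    "entity": ["animal", "plant"],
    "animal": ["dog", "cat"],
    "plant": [],
    "dog": [],
    "cat": [],
}


def hyponym_count(concept: str, children: dict) -> int:
    """Number of descendants of `concept` (assumes a tree-shaped taxonomy)."""
    return sum(1 + hyponym_count(child, children) for child in children[concept])


def intrinsic_ic(concept: str, children: dict) -> float:
    """Intrinsic IC in [0, 1]: leaves score 1, the root scores 0."""
    total_concepts = len(children)
    hyponyms = hyponym_count(concept, children)
    return 1.0 - math.log(hyponyms + 1) / math.log(total_concepts)
```

In the toy taxonomy the leaf "dog" receives the maximum IC, the root "entity" receives 0, and the intermediate concept "animal" falls in between, without any corpus counts being needed.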

6.
In recent studies, ontology-related concepts have been introduced into the FIPA ACL content language to convey information for agent communication. However, these works have only applied ontology-based knowledge representation to communication messages and then demonstrated the advantage of this association. In fact, although an ontology can represent the semantic implications needed for decidable reasoning support, it has no mechanism for defining complex rule-based representations to support inference. The motivation of this study is to address this issue by developing a semantic-based infrastructure that integrates Semantic Web technologies into ACL message contents. This semantic-based infrastructure defines two different semantic frameworks: the three-tier knowledge representation framework for message content and the Multi-layer Ontology Architecture for the content language. The former is developed on the basis of the Semantic Web stack to support ontology-based reasoning and rule-based inference. The latter is adopted to develop a Lightweight Ontology-based Content Language (LOCL) that describes agent communication messages in an unambiguous and computer-interpretable way. The Jena reasoner is used in an application scenario that exploits agent communication with LOCL as the content language, OWL as the ontology language, and SWRL as the rule language to demonstrate the feasibility of the proposed infrastructure.

7.
In the past decade, existing and new knowledge and datasets have been encoded in different ontologies for semantic web and biomedical research. Ontologies are often very large in terms of the number of concepts and relationships, which makes the analysis of ontologies and of the knowledge graphs they represent computationally expensive and time-consuming. As the ontologies of various semantic web and biomedical applications usually show explicit hierarchical structures, it is interesting to explore the trade-offs between ontological scales and the preservation/precision of results when we analyze ontologies. This paper presents the first effort to examine this idea by studying the relationship between scaling biomedical ontologies at different levels and the resulting semantic similarity values. We evaluate the semantic similarity between three gene ontology slims (plant, yeast, and candida, among which the latter two belong to the same kingdom, fungi) using four popular measures commonly applied to biomedical ontologies (Resnik, Lin, Jiang-Conrath, and SimRel). The results of this study demonstrate that, with proper selection of scaling levels and similarity measures, we can significantly reduce the size of ontologies without losing substantial detail. In particular, the performance of Jiang-Conrath and Lin is more reliable and stable than that of the other two measures in this experiment, as shown by 1) their consistently indicating that yeast and candida are more similar to each other (as compared to plant) at different scales, and 2) small deviations of the similarity values after excluding a majority of nodes from several lower scales. This study provides a deeper understanding of the application of semantic similarity to biomedical ontologies, and sheds light on how to choose appropriate semantic similarity measures for biomedical engineering.
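For readers unfamiliar with the four measures named in this abstract, the sketch below gives their standard textbook forms in terms of precomputed IC values: the IC of each concept and the IC of their most informative common ancestor (here called `ic_lcs`). The function and parameter names are hypothetical, the IC values are assumed to use natural logarithms (needed for SimRel, which converts IC back to a probability), and the abstract itself does not specify which variants or normalizations the authors used.

```python
import math


def resnik(ic_lcs: float) -> float:
    """Resnik similarity: the IC of the most informative common ancestor."""
    return ic_lcs


def lin(ic_a: float, ic_b: float, ic_lcs: float) -> float:
    """Lin similarity: 2 * IC(lcs) / (IC(a) + IC(b)), in [0, 1]."""
    denominator = ic_a + ic_b
    # Guard for the degenerate case where both concepts have zero IC.
    return 2.0 * ic_lcs / denominator if denominator > 0 else 0.0


def jiang_conrath_distance(ic_a: float, ic_b: float, ic_lcs: float) -> float:
    """Jiang-Conrath distance: IC(a) + IC(b) - 2 * IC(lcs); 0 means identical."""
    return ic_a + ic_b - 2.0 * ic_lcs


def sim_rel(ic_a: float, ic_b: float, ic_lcs: float) -> float:
    """SimRel: Lin similarity weighted by (1 - p(lcs)), with p = exp(-IC)."""
    return lin(ic_a, ic_b, ic_lcs) * (1.0 - math.exp(-ic_lcs))
```

Note that Jiang-Conrath is defined as a distance, so it has to be converted (for example by inversion or normalization) before being compared with the three similarity scores; which conversion is appropriate depends on the study.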

8.
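Research on concept semantic similarity computation based on domain ontology   Cited by: 9 (self-citations: 4, citations by others: 9)
Three traditional models for computing the semantic similarity of concepts with reference to a domain ontology are studied. Considering the strengths and weaknesses of these three models and the properties specific to domain ontologies, an improved model for computing concept semantic similarity based on a domain ontology is proposed. Experimental results show that the model, by quantitatively analyzing the similarity between the concepts and properties described by the ontology vocabulary, can guide concept-set expansion and result ranking in semantic queries over domain knowledge ontologies, and provides an effective quantification of the semantic relations between concepts.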

9.

Semantic similarity assessment between concepts is an important task in many language-related applications. In the past, many approaches to assessing the similarity of concepts have been proposed that use a single knowledge source. In this paper, some limitations of the existing similarity measures are identified. To tackle these problems, we carry out an extensive study of the semantic similarity of concepts, from which a unified framework for semantic similarity computation is derived. Based on our framework, we give some generic and flexible approaches to semantic similarity measurement that result from instantiations of the framework. In particular, by introducing multiple knowledge sources, we obtain some new similarity measures for cases that existing methods cannot deal with. The evaluation, based on eight benchmarks, three widely used ones (i.e., the M&C, R&G, and WordSim-353 benchmarks) and five developed by ourselves (i.e., the Jiang-1, Jiang-2, Jiang-3, Jiang-4, and Jiang-5 benchmarks), supports these intuitions with respect to human judgements. Overall, some of the methods proposed in this paper correlate well with human judgments (in terms of both Pearson and Spearman correlation) and constitute effective ways of determining the semantic similarity between concepts.


10.
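The linked data that keeps emerging on the Semantic Web can provide large-scale data resources for social network analysis. In particular, it can be used to explore social community structures within specific domains. Using an ontology-based knowledge structure, domain-specific properties are discovered from the linked data in a domain and combined with the proposed distance computation and clustering methods. This improves the estimation of relatedness between people in the domain and allows the clustering to be customized, so that the social community structures contained in the domain can be discovered from the linked data. Tests on linked data from real domains show that the method can discover reliable and valuable social communities in different domains (music, film).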

11.
Ontology languages for the Semantic Web   Cited by: 1 (self-citations: 0, citations by others: 1)
Ontologies have proven to be an essential element in many applications. They are used in agent systems, knowledge management systems, and e-commerce platforms. They can also generate natural language, integrate intelligent information, provide semantic-based access to the Internet, and extract information from texts, in addition to being used in many other applications to explicitly declare the knowledge embedded in them. However, not only are ontologies useful for applications in which knowledge plays a key role, but they can also trigger a major change in current Web contents. This change is leading to the third generation of the Web, known as the Semantic Web, which has been defined as the conceptual structuring of the Web in an explicit machine-readable way. New ontology-based applications and knowledge architectures are being developed for this new Web. A common claim for all of these approaches is the need for languages to represent the semantic information that this Web requires, solving heterogeneous data exchange in this heterogeneous environment. Our goal is to help developers find the most suitable language for their representation needs.

12.
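Research on object-oriented ontology based on multi-agent systems   Cited by: 1 (self-citations: 0, citations by others: 1)
Information exchange and behavioral coordination among agents are necessary conditions for jointly completing a delegated task. This paper puts forward the technical requirement that, in a multi-agent system, each agent must build its own domain model, that is, use an ontology to support semantic interaction at run time. To this end, the paper describes and builds the ontology using an object-oriented knowledge representation method, and on this basis forms a domain operation algebra system and an agent service description language. A simulated open-purchasing case shows that, in a complete contextual semantic interaction, the service provider needs to express the methods and processes of the services it offers in the agent service description language, while the service consumer must understand the agent service description language in order to obtain a specific service.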

13.
Complex queries are widely used in current Web applications. They express highly specific information needs, but simply aggregating the meanings of primitive visual concepts does not perform well. To facilitate image search for complex queries, we propose a new image reranking scheme based on concept relevance estimation, which consists of Concept-Query and Concept-Image probabilistic models. Each model comprises visual, web and text relevance estimation. Our method computes a weighted sum of the underlying relevance scores to obtain a new ranking list. To take the Web semantic context into account, we incorporate concepts by leveraging lexical and corpus-dependent knowledge, such as WordNet and Wikipedia, together with co-occurrence statistics of tags in our Flickr corpus. The experimental results show that our scheme performs significantly better than the other existing state-of-the-art approaches.
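The weighted-sum reranking step mentioned in this abstract can be sketched as follows. The score names ("visual", "web", "text") follow the abstract, but the data layout, the function name, and the weight values in the usage note are hypothetical; the abstract does not say how the weights are chosen or how the individual relevance scores are normalized.

```python
def rerank(candidates, weights):
    """Sort candidates by a weighted sum of their relevance scores.

    candidates: list of (image_id, scores) pairs, where scores is a dict
                with one entry per key in `weights` (assumed layout).
    weights:    dict mapping score name to its weight.
    Returns a new list, highest combined score first.
    """
    def combined(scores):
        return sum(weights[name] * scores[name] for name in weights)

    return sorted(candidates, key=lambda item: combined(item[1]), reverse=True)
```

With weights such as `{"visual": 0.5, "web": 0.2, "text": 0.3}`, an image with a strong visual score but weak text score can overtake one with the opposite profile; changing the weights changes the resulting order, which is why their selection matters in practice.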

14.
Before undertaking new biomedical research, identifying concepts that have already been patented is essential. A traditional keyword-based search on patent databases may not be sufficient to retrieve all the relevant information, especially for the biomedical domain. This paper presents BioPatentMiner, a system that facilitates information retrieval and knowledge discovery from biomedical patents. The system first identifies biological terms and relations from the patents and then integrates the information from the patents with knowledge from biomedical ontologies to create a semantic Web. Besides keyword search and queries linking the properties specified by one or more RDF triples, the system can discover semantic associations between the Web resources. The system also determines the importance of the resources to rank the results of a search and prevent information overload while determining the semantic associations.

15.
As modern search engines are approaching the ability to deal with queries expressed in natural language, full support of natural language interfaces seems to be the next step in the development of future systems. The vision is that of users being able to tell a computer what they would like to find, using any number of sentences and as many details as needed. In this article we describe our effort to move towards this future using currently available technology. The Semantic Web framework was chosen as the best means to achieve this goal. We present our approach to building a complete Semantic Web Search Using Natural Language (SWSNL) system. We cover the complete process, which includes preprocessing, semantic analysis, semantic interpretation, and executing a SPARQL query to retrieve the results. We perform an end-to-end evaluation on a domain dealing with accommodation options. The domain data come from an existing accommodation portal, and we use a corpus of queries obtained through a Facebook campaign. In our paper we work with written texts in the Czech language. In addition, the Natural Language Understanding (NLU) module is evaluated on another domain (public transportation) and language (English). We expect that our findings will be valuable for the research community, as they are strongly related to issues found in real-world scenarios. Among other problems, we struggled with inconsistencies in the actual Web data and with the performance of the Semantic Web engines on a decently sized knowledge base.

16.
With the development of Semantic Web technology, the use of ontologies to store and retrieve information covering several domains has increased. However, very few ontologies are able to cope with the ever-growing need for frequently updated semantic information or with specific user requirements in specialized domains. As a result, a critical issue is the unavailability of relational information between concepts, also referred to as missing background knowledge. One solution to this issue relies on the manual enrichment of ontologies by domain experts, which is, however, a time-consuming and costly process; hence the need for dynamic ontology enrichment. In this paper we present an automatic coupled statistical/semantic framework for dynamically enriching large-scale generic ontologies from the World Wide Web. Using the massive amount of information encoded in texts on the Web as a corpus, missing background knowledge can be discovered through a combination of semantic relatedness measures and pattern acquisition techniques, and subsequently exploited. The benefits of our approach are: (i) it proposes the dynamic enrichment of large-scale generic ontologies with missing background knowledge, thus enabling the reuse of such knowledge; and (ii) it addresses the cost of manual ontological enrichment by domain experts. Experimental results in a precision-based evaluation setting demonstrate the effectiveness of the proposed techniques.

17.
Resolving semantic heterogeneity is one of the major research challenges in many fields of study, such as natural language processing, search engine development, document clustering, geospatial information retrieval, and knowledge discovery. While semantic heterogeneity is often considered an obstacle to realizing full interoperability among diverse datasets, proper quantification of semantic similarity, that is, measuring the extent of association between two qualitative concepts, is another challenge. The proposed work addresses this issue for any geospatial application where modelling the spatial land-cover distribution is crucial. Most of these applications, such as prediction, change detection, and land-cover classification, often require examining the land-cover distribution of the terrain. This paper presents an ontology-based approach to measuring the semantic similarity between spatial land-cover classes. As land-cover distribution is qualitative information about a terrain, it is challenging to measure the similarity between land-cover classes pragmatically. Here, an ontology is considered as the concept hierarchy of different land-cover classes, built using domain experts' knowledge. This work can be considered the spatial extension of our earlier work presented in [1]. The similarity metric proposed in [1] is utilized here for spatial concepts. A case study with a real land-cover ontology is presented to quantify the semantic similarity between every pair of land-covers with the semantic hierarchy based similarity measurement (SHSM) scheme [1]. This work may also facilitate the quantification of semantic knowledge of the terrain for other spatial analyses.

18.
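Ontology knowledge representation of product data models   Cited by: 7 (self-citations: 3, citations by others: 4)
Because data modeling cannot adequately meet the needs of product knowledge integration and sharing, it is proposed that product data modeling should introduce an ontology-based mechanism for expressing formal semantic information. Based on an analysis of EXPRESS semantics, the description logic language ALCNRP(D) is constructed to express the ontological semantic knowledge in an EXPRESS SCHEMA, establishing a basis for description and reasoning to support knowledge exchange and sharing across the whole product life cycle.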

19.
Sentence similarity based on semantic nets and corpus statistics   Cited by: 3 (self-citations: 0, citations by others: 3)
Sentence similarity measures play an increasingly important role in text-related research and applications in areas such as text mining, Web page retrieval, and dialogue systems. Existing methods for computing sentence similarity have been adopted from approaches used for long text documents. These methods process sentences in a very high-dimensional space and are consequently inefficient, require human input, and are not adaptable to some application domains. This paper focuses directly on computing the similarity between very short texts of sentence length. It presents an algorithm that takes account of semantic information and word order information implied in the sentences. The semantic similarity of two sentences is calculated using information from a structured lexical database and from corpus statistics. The use of a lexical database enables our method to model human common sense knowledge and the incorporation of corpus statistics allows our method to be adaptable to different domains. The proposed method can be used in a variety of applications that involve text knowledge representation and discovery. Experiments on two sets of selected sentence pairs demonstrate that the proposed method provides a similarity measure that shows a significant correlation to human intuition.
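The abstract says the algorithm combines semantic information with word order information but gives no formulas. The sketch below shows one plausible way to do this: each sentence is mapped to a vector of word positions over the joint word set, order similarity is 1 - ||r1 - r2|| / ||r1 + r2||, and the final score is a convex combination with a semantic score supplied by the caller. The exact-match word lookup, the function names, and the default weight `delta=0.85` are all assumptions made for illustration, and the semantic component (lexical database plus corpus statistics) is deliberately left out.

```python
import math


def order_vector(sentence_words, joint_words):
    """1-based position of each joint word in the sentence, 0 if absent."""
    positions = {}
    for index, word in enumerate(sentence_words, start=1):
        positions.setdefault(word, index)  # keep the first occurrence
    return [positions.get(word, 0) for word in joint_words]


def word_order_similarity(sentence_a: str, sentence_b: str) -> float:
    """Order similarity in [0, 1]; 1.0 means identical word order."""
    words_a = sentence_a.lower().split()
    words_b = sentence_b.lower().split()
    joint = list(dict.fromkeys(words_a + words_b))  # ordered, de-duplicated
    r_a = order_vector(words_a, joint)
    r_b = order_vector(words_b, joint)
    diff = math.sqrt(sum((a - b) ** 2 for a, b in zip(r_a, r_b)))
    total = math.sqrt(sum((a + b) ** 2 for a, b in zip(r_a, r_b)))
    return 1.0 - diff / total if total else 1.0


def combined_similarity(semantic: float, order: float, delta: float = 0.85) -> float:
    """Convex combination of a semantic score and a word order score."""
    return delta * semantic + (1.0 - delta) * order
```

The sentences "the dog bites the man" and "the man bites the dog" contain the same words, so a bag-of-words semantic score would treat them as identical; the word order term is what lets the combined score fall below 1 for such pairs.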

20.
Engineering material selection depends heavily on domain knowledge. Given the large number and wide variety of engineering materials, it is necessary to develop an open, shared, and scalable knowledge framework for implementing domain-oriented and knowledge-based material selection. In this paper, the fundamental concepts and relationships involved in all aspects of material selection are analyzed in detail. A novel ontology-based knowledge framework is presented. Ontology-based Semantic Web technology is introduced into the semantic representation of material selection knowledge. The implicit material selection knowledge is represented as a set of labeled instances and RDF instance graphs in terms of the concept model, which provides a formal approach to organizing the captured material selection knowledge. A knowledge retrieval and reasoning approach is proposed that integrates ontology concepts, instances, knowledge rules, and semantic queries encoded in the Semantic Query-enhanced Web Rule Language (SQWRL). The presented knowledge framework can provide powerful knowledge services for material selection. Finally, based on this knowledge framework, a case study on constructing a mold material selection knowledge system is provided. This work is a new attempt to build an open and shared knowledge framework for engineering material selection.
