Similar literature
 20 similar documents found (search time: 15 ms)
1.
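In service-oriented architecture, distributed network computing depends on effectively solving the service interaction problem. Service interfaces must therefore be described in a machine-understandable way, which provides low-level support for dynamic service discovery and composition. Semantic service annotation meets this need: it represents service elements with machine-understandable metadata from a shared domain ontology. This paper decomposes the semantic annotation process into two stages, domain annotation and concept annotation, focuses on the domain annotation problem, and proposes a machine-learning-based domain annotation algorithm. Annotation experiments on real services verify the effectiveness of the algorithm.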

2.
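Traditionally, users retrieve information on the Web passively, by browsing hyperlinked pages. This paper proposes an ontology-based active metadata mining system and applies it to the fruit domain; the system shows demonstrable effect in active search, metadata generation, and ontology-supported semantic description of data. Information search thus moves from passive acquisition to active, automatic search by computer, and from keyword-based querying to retrieving semantically described data with the help of an ontology, which improves both query efficiency and query accuracy. This is also a hot topic in current information retrieval research. Experiments show that active metadata mining instances can achieve semantic expansion, such as synonyms, near-synonyms, and hypernym/hyponym relations. They also verify the effect that ontologies have on semantic intelligent retrieval.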

3.
The TV-Anytime standard describes the structures of categories of digital TV program metadata, as well as user profile metadata for TV programs. We describe a natural language (NL) model that lets users interact with the TV-Anytime metadata and preview TV programs from their mobile devices. The language fully utilises the TV-Anytime metadata specifications (upper ontologies), as well as domain-specific ontologies. The interaction model does not use clarification dialogues; instead, it uses the user profiles as well as TV-Anytime metadata information and ontologies to rank the possible responses in case of ambiguities. We describe implementations of the model that run on a PDA and on a mobile phone, and manage the metadata on a remote TV-Anytime-compatible TV set. We present user evaluations of the approach. Finally, we propose a generalised implementation framework that can be used to easily provide NL interfaces for mobile devices for different applications and ontologies.

4.
5.
In this paper we present a new approach for building metadata schemas by integrating existing ontologies and structured vocabularies (thesauri). This integration is based on the specification of inclusion relationships between thesaurus terms and ontology concepts, and it results in application-specific metadata schemas that incorporate the structural views of ontologies and the deep classification schemes provided by thesauri. We also show how the result of this integration can be used for RDF schema creation and metadata querying. In our context, (metadata) queries exploit the inclusion semantics of term relationships, which introduces some recursion. We present a fairly simple database-oriented solution for querying such metadata, which avoids a (recursive) tree traversal and is based on a linear encoding of thesaurus hierarchies. Published online: 22 September 2000
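
The abstract does not say which linear encoding the authors use. As a minimal sketch of the general idea, the example below assumes a nested-interval (preorder/postorder) numbering, one common way to answer "all narrower terms" queries with a range comparison instead of a recursive traversal. The thesaurus contents and the function names `encode_hierarchy` and `narrower_terms` are hypothetical, and the sketch assumes each term has a single broader term.

```python
from typing import Dict, List, Tuple


def encode_hierarchy(children: Dict[str, List[str]], root: str) -> Dict[str, Tuple[int, int]]:
    """Assign each term a (left, right) interval in one depth-first pass.

    Assumption: the hierarchy is a tree (every term has one broader term).
    """
    left: Dict[str, int] = {}
    intervals: Dict[str, Tuple[int, int]] = {}
    counter = 0
    stack = [(root, False)]  # (term, True once its subtree has been visited)
    while stack:
        term, leaving = stack.pop()
        if leaving:
            intervals[term] = (left[term], counter)
        else:
            left[term] = counter
            stack.append((term, True))
            for child in reversed(children.get(term, [])):
                stack.append((child, False))
        counter += 1
    return intervals


def narrower_terms(intervals: Dict[str, Tuple[int, int]], term: str) -> List[str]:
    """Return all transitive narrower terms using only interval comparisons."""
    lo, hi = intervals[term]
    return sorted(t for t, (l, r) in intervals.items() if lo < l and r < hi)


# Hypothetical thesaurus: broader term -> narrower terms
thesaurus = {
    "fruit": ["citrus", "pome"],
    "citrus": ["orange", "lemon"],
    "pome": ["apple"],
}
intervals = encode_hierarchy(thesaurus, "fruit")
citrus_terms = narrower_terms(intervals, "citrus")
```

In a relational database the same test becomes a non-recursive predicate of the form `child.left > parent.left AND child.right < parent.right`, which is why such encodings suit database-oriented querying.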

6.
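Research on a Metadata Standard for Mining Environment Safety   Cited by: 1 (self-citations: 0, citations by others: 1)
王建, 郝彦彬. 《计算机工程》 (Computer Engineering), 2010, 36(2): 272-274
To address the lack of a unified metadata standard for mining information, this paper draws on related metadata standards and the characteristics of mining environment safety data, and uses the 5W1H method to determine the core metadata element set for mining environment safety. A metadata framework is derived with reference to the U.S. Federal Geographic Data Committee (FGDC) metadata standard, and an algorithm is studied that uses model management to convert between metadata described under different schemas. An application example shows that the algorithm helps organize and manage mining environment safety data properly.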

7.
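Existing ontology mapping methods are inefficient and ineffective on large-scale mapping tasks. To address this, the paper proposes an ontology mapping algorithm based on data fields. The algorithm first uses an efficient similarity measure to establish the initial relevance of each element in one ontology to the other ontology. It then uses the data-field potential function to bring in the influence of surrounding ontology elements on the current element, corrects the initial relevance, and determines the related sub-ontologies between the two ontologies. Finally, it applies targeted methods to map these related sub-ontologies more effectively. Experimental results show that the algorithm improves the quality of the mapping results while maintaining high mapping efficiency.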
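
The abstract does not give the potential function. As a rough sketch of the correction step only, the example below assumes the Gaussian-style potential commonly used in the data-field literature, exp(-(d/σ)²), with d taken as the hop distance between elements in the ontology graph. The graph, the initial relevance values, the parameter `sigma`, and the function names are all hypothetical.

```python
import math
from collections import deque
from typing import Dict, List


def hop_distances(graph: Dict[str, List[str]], source: str) -> Dict[str, int]:
    """Breadth-first search: hop distance from source to every reachable element."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, []):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def corrected_relevance(
    graph: Dict[str, List[str]], initial: Dict[str, float], sigma: float = 1.0
) -> Dict[str, float]:
    """Sum each element's initial relevance with the decayed relevance of its surroundings.

    Assumed potential: initial[other] * exp(-(d / sigma) ** 2), d = hop distance.
    """
    corrected = {}
    for node in graph:
        dist = hop_distances(graph, node)
        corrected[node] = sum(
            initial.get(other, 0.0) * math.exp(-((d / sigma) ** 2))
            for other, d in dist.items()
        )
    return corrected


# Hypothetical ontology fragment (undirected adjacency) and initial relevance
graph = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
initial = {"A": 1.0, "B": 0.0, "C": 0.0}
corrected = corrected_relevance(graph, initial)
```

Here element B, which had no initial relevance of its own, gains relevance from its neighbour A, and the gain falls off quickly with distance. Elements whose corrected relevance exceeds some threshold would then form the related sub-ontology.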

8.
Provenance information in eScience is metadata that's critical to effectively manage the exponentially increasing volumes of scientific data from industrial-scale experiment protocols. Semantic provenance, based on domain-specific provenance ontologies, lets software applications unambiguously interpret data in the correct context. The semantic provenance framework for eScience data comprises expressive provenance information and domain-specific provenance ontologies and applies this information to data management. The authors' "two degrees of separation" approach advocates the creation of high-quality provenance information using specialized services. In contrast to workflow engines generating provenance information as a core functionality, the specialized provenance services are integrated into a scientific workflow on demand. This article describes an implementation of the semantic provenance framework for glycoproteomics.

9.
10.
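This paper analyzes XML Schema and DAML documents, identifies the similarity in their structure, and proposes an intermediate data model that links WSDL files and DAML ontology description files. Mapping WSDL files in XML Schema format and ontology files described in DAML onto this common data model allows the two to be compared and matched, which supports automated semantic annotation. Experimental results show that the method can automatically add semantic information to Web service description files.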

11.
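To address the high integration cost and lack of semantic information in existing data integration methods, this paper studies lightweight data integration with semantic information. It reviews the relevant theory of ontologies and metadata, presents a semantics-based lightweight data integration method, and analyzes in detail its two main processes: ontology identification with metadata extraction, and ontology mapping with metadata integration based on that mapping. A case analysis is also given. The results show that the method is feasible: replacing data integration with metadata integration avoids moving and storing large volumes of data, effectively lowers integration cost, and makes the integration process lightweight, while the incorporated semantic information better supports upper-layer applications.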

12.
By analyzing a subject field (the classification of conference halls and the results of their elementary functional analysis), this paper shows that performing the assigned functions and achieving conference-service output features requires a large amount of information, a developed structure of initial data and metadata, and a choice of methods for creating metadata at different levels of ontologies. A universal approach to forming a metadata ontology is suggested, which can be used to develop the concept of equipping media-industry enterprises.

13.
With the size digital collections are currently reaching, retrieving the best match of a document from large collections by comparing hundreds of tags is a task that involves considerable algorithmic complexity, even more so if the number of tags in the collection is not fixed. For these cases, similarity search appears to be the best retrieval method, but there is a lack of techniques suited to these conditions. This work presents a combination of machine learning algorithms that finds the object most similar to a given one in a set of pre-processed objects, based only on their metadata tags. The algorithm represents objects as character frequency curves and is capable of finding relationships between objects without an apparent association. It can also be parallelized using MapReduce strategies to perform the search. This method can be applied to a wide variety of documents with metadata tags. The case study used in this work to demonstrate the similarity search technique is a collection of image objects in JavaScript Object Notation (JSON) containing metadata tags.
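
The abstract does not describe how the character frequency curves are built or which learning algorithms are combined. As an illustration of the representation only, the sketch below assumes a plain character-count profile of each serialized JSON object and swaps in cosine similarity for the comparison step. The sample objects and the function names are hypothetical.

```python
import json
import math
from collections import Counter
from typing import Any, List


def char_profile(obj: Any) -> Counter:
    """Count alphanumeric characters in the serialized object (tags and values)."""
    text = json.dumps(obj, sort_keys=True).lower()
    return Counter(ch for ch in text if ch.isalnum())


def cosine(p: Counter, q: Counter) -> float:
    """Cosine similarity between two character-frequency profiles."""
    dot = sum(p[ch] * q[ch] for ch in p.keys() & q.keys())
    norm = math.sqrt(sum(v * v for v in p.values())) * math.sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0


def most_similar(query: Any, collection: List[Any]) -> int:
    """Return the index of the collection object whose profile is closest to the query."""
    query_profile = char_profile(query)
    return max(
        range(len(collection)),
        key=lambda i: cosine(query_profile, char_profile(collection[i])),
    )


# Hypothetical image metadata objects
collection = [
    {"title": "sunset", "tags": ["beach", "sea"], "format": "jpeg"},
    {"title": "invoice scan", "tags": ["tax", "2019"], "format": "png"},
]
best = most_similar({"title": "sunset at sea", "tags": ["beach"]}, collection)
```

Because each object reduces to a fixed-length profile regardless of how many tags it carries, the profiles can be computed once in a pre-processing pass and compared independently, which is what makes a MapReduce-style split of the search straightforward.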

14.
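In a service grid, distributed network computing depends on effectively solving the service interaction problem under OGSA. Service interfaces must therefore be described in a machine-understandable way, which provides low-level support for the dynamic discovery and composition of grid services. Semantic service annotation meets this need by annotating service resource descriptions with machine-understandable metadata from a shared domain ontology. This paper proposes an effective method for automatic semantic annotation of service resources. The method decomposes the annotation process into two stages, domain annotation and concept annotation. Focusing on the domain annotation problem, it proposes a machine-learning-based domain annotation algorithm, whose effectiveness is verified by annotation experiments on real service resources.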

15.
16.
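For a CAS system, this paper proposes integrating and analyzing the storage metadata and content metadata of multimedia objects, and then classifying and storing the objects according to their attribute values. For user convenience, Inotify is used to monitor the file system in real time and to extract each object's metadata automatically. The metadata is saved both in standard XML files and in a MySQL database, and every attribute is clearly presented in the CAS system. Integrating and analyzing the automatically extracted metadata can greatly help users search and manage multimedia data more efficiently.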

17.
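Ontology construction means building a new ontology semi-automatically from various data sources, either from scratch or by extending and adapting existing ontologies. Most existing construction methods extract a large number of concept terms from large volumes of domain text and a background corpus, and then select domain concepts from them to build an ontology. The Cluster-Merge algorithm first clusters the domain documents with the k-means clustering algorithm, then builds ontologies from the clustering result, and finally merges the ontologies according to ontology similarity to obtain the final output ontology. Experiments show that ontologies produced by the Cluster-Merge algorithm can improve recall and precision.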

18.
Personalization is increasingly vital, especially for enterprises that need to reach their customers. The key challenge in supporting personalization is the need for rich metadata, such as metadata about structural relationships, subject/concept relations between documents, and cognitive metadata about documents (e.g. the difficulty of a document). Manual annotation of large knowledge bases with such rich metadata is not scalable. Automatic mining of cognitive metadata is also challenging, since it is very difficult to automatically understand the underlying intellectual knowledge in a document. Moreover, Web content is increasingly multilingual, since a growing amount of the data generated on the Web is non-English. Current metadata extraction systems are generally based on English content and need to be fundamentally rethought to adapt to the changing dynamics of the Web. To alleviate these problems, we introduce a novel automatic metadata extraction framework, which is based on a novel fuzzy method for automatic cognitive metadata generation and uses different document parsing algorithms to extract rich metadata from multilingual enterprise content using the newly developed DocBook, Resource Type and Topic ontologies. Since the metadata generation process is based upon DocBook-structured enterprise content, our framework focuses on enterprise documents and content that is loosely based on DocBook-style formatting. DocBook is a common documentation format for formally producing corporate data, and it is adopted by many enterprises. The proposed framework is illustrated and evaluated on English, German and French versions of the Symantec Norton 360 knowledge bases. The user study showed that the proposed fuzzy-based method generates reasonably accurate values, with an average precision of 89.39% on the metadata values of document difficulty, document interactivity level and document interactivity type.
The proposed fuzzy inference system achieves improved results compared to a rule-based reasoner for difficulty metadata extraction (∼11% enhancement). In addition, user-perceived metadata quality scores (mean of 5.57 out of 6) were found to be high, and automated metadata analysis showed that the extracted metadata is of high quality and suitable for personalized information retrieval.

19.
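Research on Ontology-Based ETL Design   Cited by: 1 (self-citations: 0, citations by others: 1)
吴飞, 邢桂芬, 邢玉萍. 《计算机工程与设计》 (Computer Engineering and Design), 2007, 28(7): 1517-1519, 1571
An ontology-based ETL design method is proposed. By building a local ontology for each data source, a global ontology for the target data warehouse, and mappings between these ontologies, the method derives the mappings between each data source and the target, expressed in OWL. Ontology metadata guides the data extraction, transformation, and loading process, which resolves semantic heterogeneity in the ETL of the data sources and achieves integration of enterprise data at the semantic level.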

20.
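ReDE: A Regular-Expression-Based Method for Biological Data Extraction   Cited by: 4 (self-citations: 0, citations by others: 4)
Extracting data from heterogeneous biological data sources and building query and analysis platforms is a current research focus, and the extraction process involves a large amount of interdependent metadata. Making full use of these dependencies can reduce the maintenance workload. Based on regular expressions (RE), this paper proposes the ReDE extraction method: by building a parse tree around RE groups, it designs an RE-based algorithm for generating relational database schemas and a general extraction and assembly algorithm. Its distinguishing feature is that the RE is the only metadata, which makes it easy to manage and maintain. The method lays the foundation for a biological database design aid and a highly automated extraction tool, and it has been used to build the first integrated online biological information data warehouse in China.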
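
The abstract gives no detail on the parse tree or the two algorithms. As a much simplified sketch of the central idea, that the regular expression alone drives both the schema and the extraction, the example below derives one flat table from the named groups of a pattern and loads the matches into SQLite. The record layout, the pattern `RECORD_RE`, the table name, and the function names are all hypothetical and are not taken from ReDE.

```python
import re
import sqlite3
from typing import List, Tuple

# Hypothetical flat-file record: an "ID" line followed by a "DE" line.
RECORD_RE = re.compile(
    r"^ID\s+(?P<entry_id>\w+)\s*\nDE\s+(?P<description>.+)$",
    re.MULTILINE,
)


def schema_from_regex(pattern: "re.Pattern[str]", table: str) -> Tuple[str, List[str]]:
    """Derive a CREATE TABLE statement from the pattern's named groups."""
    columns = sorted(pattern.groupindex, key=pattern.groupindex.get)
    column_defs = ", ".join(name + " TEXT" for name in columns)
    return "CREATE TABLE " + table + " (" + column_defs + ")", columns


def extract(pattern: "re.Pattern[str]", text: str, table: str, conn: sqlite3.Connection) -> int:
    """Create the table, insert one row per match, and return the row count."""
    ddl, columns = schema_from_regex(pattern, table)
    conn.execute(ddl)
    rows = [tuple(m.group(name) for name in columns) for m in pattern.finditer(text)]
    placeholders = ", ".join("?" for _ in columns)
    conn.executemany("INSERT INTO " + table + " VALUES (" + placeholders + ")", rows)
    return len(rows)


SAMPLE = (
    "ID   P12345\n"
    "DE   Hypothetical kinase\n"
    "ID   Q67890\n"
    "DE   Hypothetical transporter\n"
)
conn = sqlite3.connect(":memory:")
count = extract(RECORD_RE, SAMPLE, "entries", conn)
rows = conn.execute(
    "SELECT entry_id, description FROM entries ORDER BY entry_id"
).fetchall()
```

When the source format changes, only the pattern needs editing; the table definition and the insert statement follow from it, which reflects the maintenance benefit the abstract attributes to treating the RE as the only metadata.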
