Similar literature
1.
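Research on an ontology-based information retrieval architecture in a grid environment   Total citations: 2, self-citations: 0, citations by others: 2
To optimize the processing of ontology-based information retrieval and improve the reliability of application systems, an ontology-based information retrieval architecture model for grid environments is proposed. Globus and the OGSA-DAI tools are used to manage computing and data resources, which integrates idle resources and improves resource utilization; data access is also exposed as a service, which unifies the interface access types. A workflow model manages the execution of business processes, enabling distributed deployment of data and parallel execution of business services. This can, to some extent, alleviate the low retrieval efficiency caused by huge information volumes and complex process algorithms, and it improves the system's fault tolerance.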

2.
In this paper, we present an ontology-based information extraction and retrieval system and its application in the soccer domain. In general, we deal with three issues in semantic search, namely, usability, scalability and retrieval performance. We propose a keyword-based semantic retrieval approach. The performance of the system is improved considerably using domain-specific information extraction, inferencing and rules. Scalability is achieved by adapting a semantic indexing approach and representing the whole world as small independent models. The system is implemented using the state-of-the-art technologies in Semantic Web and its performance is evaluated against traditional systems as well as the query expansion methods. Furthermore, a detailed evaluation is provided to observe the performance gain due to domain-specific information extraction and inferencing. Finally, we show how we use semantic indexing to solve simple structural ambiguities.

3.
We describe the current status of an ongoing research effort to develop a geographic information system based on quadtrees. Quadtree encodings were constructed for area, point and line features for a small area in Northern California. The encoding used was a variant of the linear quadtree. The implementation used a B-tree to organize the list of leaves and allow management of trees too large to fit in core memory. Several database query functions have been implemented, including set operations, region property computations, map editing functions and map subset and windowing functions. A user of the system may access the database via an English-like query language.
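As a rough illustration of the linear quadtree idea mentioned in this abstract, the sketch below stores leaves in a list sorted by a Morton (Z-order) key and finds a cell by binary search. It is not the authors' encoding: the abstract only says a variant of the linear quadtree was used, so the fixed-depth cells, the function names `morton_key`, `build_index` and `lookup`, and the sorted list standing in for the B-tree are all assumptions made for this example.

```python
import bisect


def morton_key(x, y, depth):
    """Interleave the low `depth` bits of x and y into one locational key."""
    key = 0
    for level in range(depth):
        key |= ((x >> level) & 1) << (2 * level)      # x bit goes to an even position
        key |= ((y >> level) & 1) << (2 * level + 1)  # y bit goes to an odd position
    return key


def build_index(cells, depth):
    """Return the leaves as a list of (key, label) pairs sorted by key.

    `cells` maps (x, y) grid coordinates to a label. A sorted list is a
    hypothetical in-memory stand-in for the B-tree of leaves.
    """
    return sorted((morton_key(x, y, depth), label) for (x, y), label in cells.items())


def lookup(index, x, y, depth):
    """Binary-search the sorted leaf list for the cell at (x, y)."""
    key = morton_key(x, y, depth)
    i = bisect.bisect_left(index, (key,))
    if i < len(index) and index[i][0] == key:
        return index[i][1]
    return None


# Hypothetical 8 x 8 grid (depth 3) with two labelled cells.
index = build_index({(3, 5): "lake", (0, 0): "road"}, depth=3)
print(lookup(index, 3, 5, depth=3))
```

Because cells that are close in space tend to have nearby keys, an ordered structure such as a B-tree can keep most of the leaf list on disk and page in only the key ranges a query touches.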

4.
Towards an ontology-based retrieval of UML Class Diagrams   Total citations: 1, self-citations: 0, citations by others: 1

Context

Software Reuse has always been an important area for software companies seeking to increase their productivity and the quality of their products, but code reuse is not the only answer. Nowadays, proposed reuse techniques also cover software designs and even software specifications. Therefore, this research focuses on software design, specifically on UML Class Diagrams. A semantic technology has been applied to facilitate the retrieval process for effective reuse.

Objective

This research proposes an ontology-based retrieval technique by semantic similarity in order to support effective retrieval process for UML Class Diagrams. Since UML Class Diagrams are a de facto standard in the design stages of a Software Development Process, a good technique is needed to reuse them, i.e. reusing during the design stage instead of just the coding stages.

Method

An application ontology modeled using UML specifications was designed to compare UML Class Diagram element types. To measure their similarity, a survey was conducted amongst UML experts. Query expansion was improved by a domain ontology supporting the retrieval phase. The computation of minimal distances in ontologies was solved using a shortest path algorithm.

Results

The case study shows the importance of the domain ontology in the UML Class Diagram retrieval process, as well as the importance of an element type expansion method, such as an application ontology. By analyzing the results, a correlation between the query complexity and the retrieved elements has been identified. Finally, a positive Return on Investment (ROI) was estimated using Poulin’s Model.

Conclusion

Because Software Reuse need not be limited to the coding stage, approaches to reuse at the design stage must be developed, i.e. UML Class Diagram reuse. This approach proposes a technique for UML Class Diagram retrieval, which is one important step towards reuse. Semantic technology combined with information retrieval improves the retrieval results.

5.
6.
Exploiting syntactic analysis of queries for information retrieval   Total citations: 1, self-citations: 0, citations by others: 1
Up to now, the results of applying sophisticated NL techniques to information retrieval (IR) have been mostly disappointing. Our research aims at investigating in detail the role of syntactic analysis in IR and at finding answers to the question why it works better for some queries and worse for others. The final goal is a hybrid algorithm that selectively applies syntactic analysis to certain classes of queries while relying on standard statistical techniques otherwise.

7.
In recent years, spatial data infrastructures (SDIs) have gained great popularity as a solution to facilitate interoperable access to geospatial data offered by different agencies. In order to enhance the data retrieval process, current infrastructures usually offer a catalog service. Nevertheless, such catalog services still have important limitations that make it difficult for users to find the geospatial data that they are interested in. Some current catalog drawbacks include the use of a single record to describe all the feature types offered by a service, the lack of formal means to describe the semantics of the underlying data, and the lack of an effective ranking metric to organize the results retrieved from a query. Aiming to overcome these limitations, this article proposes SESDI (Semantically-Enabled Spatial Data Infrastructures), which is a framework that reuses techniques of classic information retrieval to improve geographic data retrieval in an SDI. Moreover, the framework proposes several ranking metrics to solve spatial, semantic, temporal and multidimensional queries.

8.
Information Systems, 2005, 30(4): 277-298
An important problem in unstructured peer-to-peer (P2P) networks is the efficient content-based retrieval of documents shared by other peers. However, existing searching mechanisms do not scale well because they are either based on the idea of flooding the network with queries or because they require some form of global knowledge. We propose the Intelligent Search Mechanism (ISM), which is an efficient, scalable yet simple mechanism for improving information retrieval in P2P systems. Our mechanism is efficient since it is bounded by the number of neighbors, and scalable because no global knowledge is required to be maintained. ISM consists of four components: a Profiling Structure which logs queryhit messages coming from neighbors, a Query Similarity function which calculates the similarity of past queries to a new query, RelevanceRank which is an online neighbor ranking function, and a Search Mechanism which forwards queries to selected neighbors. We deploy and compare ISM with a number of other distributed search techniques over static and dynamic environments. Our experiments are performed with real data over Peerware, our middleware simulation infrastructure which is deployed on 75 workstations. Our results indicate that ISM outperforms its competitors and that in some cases it manages to achieve 100% recall rate while using only half of the network resources required by its competitors. Further, its performance is also superior with respect to the total query response time, and our algorithm exhibits a learning behavior as nodes acquire more knowledge. Finally, ISM works well in dynamic network topologies and in environments with replicated data sources.

9.
In this paper, we propose CYBER, a CommunitY Based sEaRch engine, for information retrieval utilizing community feedback information in a DHT network. In CYBER, each user is associated with a set of user profiles that capture his/her interests. Likewise, a document is associated with a set of profiles, one for each indexed term. A document profile is updated by users who query on the term and consider the document as a relevant answer. Thus, the profile acts as a consolidation of users' feedback from the same community, and reflects their interests. In this way, as one user finds a document to be relevant, another user in the same community issuing a similar query will benefit from the feedback provided by the earlier user. Hence, the search quality in terms of both precision and recall is improved. Moreover, we further improve the effectiveness of CYBER by introducing an index tuning technique. By choosing the indexing terms more carefully, community-based relevance feedback is utilized in both building/refining indices and re-evaluating queries. We first propose a naive scheme, CYBER+, which involves an index tuning technique based on past queries only, and then re-evaluates queries in a separate step. We then propose a more complex scheme, CYBER++, which refines its index based on both past queries and relevance feedback. As the index is built with more selective and accurate terms, the search performance is further improved. We conduct a comprehensive experimental study and the results show the effectiveness of our schemes.

10.
11.
Mapping medical concepts from a terminology system to the concepts in the narrative text of a medical document is necessary to provide semantically accurate information for further processing steps. The MetaMap Transfer (MMTx) program is a semantic annotation system that generates a rough mapping of concepts from the Unified Medical Language System (UMLS) Metathesaurus to free medical text, but this mapping still contains erroneous and ambiguous bits of information. Since manually correcting the mapping is an extremely cumbersome and time-consuming task, we have developed the MapFace editor. The editor provides a convenient way of navigating the annotated information gained from the MMTx output, and enables users to correct this information on both a conceptual and a syntactical level, and thus it greatly facilitates the handling of the MMTx program. Additionally, the editor provides enhanced visualization features to support the correct interpretation of medical concepts within the text. We paid special attention to ensure that the MapFace editor is an intuitive and convenient tool to work with. Therefore, we recently conducted a usability study in order to create a well founded background serving as a starting point for further improvement of the editor's usability.

12.
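Most digital information on the Internet is associated with places and locations on Earth, and information retrieval queries frequently contain geographic information. Traditional keyword-matching methods do not consider the spatial relationships involved in retrieval and cannot meet this kind of retrieval need. Geographic information retrieval obtains, from a document collection, the geographic knowledge documents whose spatial semantics match a given geographic scope, and it has become a hot research direction in the information retrieval and GIS fields in China and abroad. A basic system framework for geographic information retrieval is proposed. Based on this framework, the key technologies are classified and summarized, including geographic knowledge bases, geographic information extraction, geographic information retrieval models, hybrid indexing, and retrieval visualization. Building on an in-depth comparative analysis of existing techniques, future research work and challenges in the field are pointed out, and a large number of references are provided.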

13.
A variety of legal documents are increasingly being made available in electronic format. Automatic Information Search and Retrieval algorithms play a key role in enabling efficient access to such digitized documents. Although keyword-based search is the traditional method used for text retrieval, it performs poorly when literal term matching is done for query processing, due to synonymy and ambiguity of words. To overcome these drawbacks, an ontological framework to enhance the user’s query for retrieval of truly relevant legal judgments has been proposed in this paper. Ontologies ensure efficient retrieval by enabling inferences based on domain knowledge, which is gathered during the construction of the knowledge base. Empirical results demonstrate that ontology-based searches generate significantly better results than traditional search methods.

14.
Geographical Information Systems (GIS) are becoming useful tools in making strategic decisions in a variety of government and business activities in areas such as housing, healthcare, land use, natural resources, environmental monitoring, public health, transportation, retail, and routing. This usefulness emanates from the capability of GIS to present a large amount of data in a short period of time on a map, using a geographical coordinate system. In most cases, spatial datasets required for GIS mapping are already available free from many governmental agencies. GIS use more of computing technology than geographical concepts; however, the capabilities of GIS software have not reached the level of simplicity encountered in most software used on a daily basis. Most organizations perform GIS analysis on their data without getting involved with the mapping technology. A typical GIS analyst faces various challenges while incorporating a non-spatial dataset into a spatial dataset in order to present the resulting dataset on a geographical map. In this paper, we present some data manipulation complexities that are encountered while using GIS software to provide spatial twists to a large user dataset. We also provide ways to facilitate the data manipulation process through a practical example of asthma epidemiology. The solutions will be beneficial to many GIS users in a variety of industries.

15.
Performance measures are frequently used to evaluate user friendliness of a system. An equally important, but often overlooked factor is the users' attitudes towards a system. A prototype interface for information retrieval was developed for presenting engineering manuals online. It was tested on a representative sample of the intended end user community. We found that subjects' expectations were based on their experience with printed materials and other computer systems. Familiar search mechanisms (e.g., table of contents, index) were important for getting them started, even though they switched to other mechanisms as they gained more experience with the system. The fact that the index was more detailed than the one in the printed manual was seen by the subjects as critical for speedy and efficient information retrieval. Keyword search of the database was generally the preferred retrieval mechanism. However, some users preferred the index. The ‘Table of Contents’, which was a tree-structured, menu-based system, was found to be of limited use in the electronic medium, in contrast to the printed manual.

16.
A methodology is developed for the prediction of river discharge and surface water quality (indexed by nitrogen loading) of a predominantly rural catchment using simple models in an integrated Geographical Information System (GIS). River discharge is predicted using the Soil Conservation Service (SCS) runoff Curve Number model, and surface water quality by the export coefficient model. The main input variable to these models is information on land use, along with ancillary information such as soils. Land use is an important parameter that affects both discharge and water quality, and it can be derived from classification of remotely sensed images. Unlike conventional models, the models employed here do not require large amounts of data on several hydro-meteorological variables. The models are applied to a rural catchment in eastern England where major land-use changes have occurred in the recent past. Historical land-use data are derived from a variety of sources including maps, aerial photographs and remotely sensed satellite images for various dates ranging from 1931 to 1989. A GIS is a valuable means to enable large amounts of spatial data to be integrated, and to facilitate data manipulation for the specific application of the models. Results are validated using observed runoff and water quality records, and it is shown that the model predictions are of acceptable accuracy. This study demonstrates an application of a GIS to employ simple models to predict river discharge and water quality.
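The two models named in this abstract are simple enough to sketch in a few lines. The code below uses the standard metric form of the SCS Curve Number equation, with potential retention S = 25400/CN - 254 (mm) and the conventional initial abstraction of 0.2 S, and a simplified export coefficient sum (load = coefficient × area, summed over land-use classes). The abstract does not give the parameter values or the exact model variants used in the study, so the function names, the 0.2 ratio, and all numbers in the usage lines are assumptions for illustration only.

```python
def scs_runoff_mm(rainfall_mm, curve_number, ia_ratio=0.2):
    """Direct runoff depth (mm) from the SCS Curve Number equation.

    ia_ratio is the initial abstraction as a fraction of retention S;
    0.2 is the conventional default and is assumed here.
    """
    if not 0 < curve_number <= 100:
        raise ValueError("curve_number must be in (0, 100]")
    retention = 25400.0 / curve_number - 254.0  # potential maximum retention S, mm
    initial_abstraction = ia_ratio * retention
    if rainfall_mm <= initial_abstraction:
        return 0.0  # all rainfall is absorbed before runoff starts
    excess = rainfall_mm - initial_abstraction
    return excess ** 2 / (excess + retention)


def export_load_kg(areas_ha, coefficients_kg_per_ha):
    """Simplified export coefficient model: sum of coefficient x area per land use."""
    return sum(coefficients_kg_per_ha[use] * area for use, area in areas_ha.items())


# Hypothetical inputs: a 50 mm storm on a catchment with CN = 80, and two
# land-use classes with made-up nitrogen export coefficients.
print(round(scs_runoff_mm(50.0, 80), 2))
print(export_load_kg({"arable": 100.0, "grass": 50.0}, {"arable": 20.0, "grass": 5.0}))
```

Both functions take land use as their main input (through the curve number and the per-class coefficients), which is why a land-use layer derived in a GIS is enough to drive them.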

18.
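Research on algorithms in geographic information systems   Total citations: 2, self-citations: 6, citations by others: 2
The geographic information system we developed can mark railway, highway, waterway, and air transportation networks according to the query object, and the map can be zoomed to any scale. Its core function is that, for any two selected cities, the best path between them can be found using Dijkstra's algorithm. This paper describes in detail the construction of the mathematical model and its algorithm.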
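The best-path query between two cities can be sketched with a textbook version of Dijkstra's algorithm on a weighted graph. This is a minimal illustration and not the paper's implementation: the adjacency-dictionary representation, the function name `shortest_path`, and the four-city network with its distances are all hypothetical.

```python
import heapq


def shortest_path(graph, source, target):
    """Return (distance, path) from source to target, or (inf, []) if unreachable.

    `graph` maps each node to a dict of {neighbor: non-negative edge weight}.
    """
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    visited = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue  # stale heap entry for an already settled node
        visited.add(node)
        if node == target:
            break  # the target's distance is final once it is popped
        for neighbor, weight in graph.get(node, {}).items():
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                prev[neighbor] = node
                heapq.heappush(heap, (candidate, neighbor))
    if target not in dist:
        return float("inf"), []
    # Walk the predecessor links back from the target to rebuild the route.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return dist[target], path


# Hypothetical transport network; weights could be distance, time, or cost.
EXAMPLE_NETWORK = {
    "A": {"B": 4, "C": 1},
    "B": {"D": 3},
    "C": {"B": 2, "D": 7},
    "D": {},
}
print(shortest_path(EXAMPLE_NETWORK, "A", "D"))
```

Separate graphs (or separate edge weights) for rail, road, water, and air links would let the same routine answer the query for each transport network.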

19.
Technology in the field of digital media generates huge amounts of nontextual information, audio, video, and images, along with more familiar textual information. The potential for exchange and retrieval of information is vast and daunting. The key problem in achieving efficient and user-friendly retrieval is the development of a search mechanism to guarantee delivery of minimal irrelevant information (high precision) while insuring relevant information is not overlooked (high recall). The traditional solution employs keyword-based search. The only documents retrieved are those containing user-specified keywords. But many documents convey desired semantic information without containing these keywords. This limitation is frequently addressed through query expansion mechanisms based on the statistical co-occurrence of terms. Recall is increased, but at the expense of deteriorating precision. One can overcome this problem by indexing documents according to context and meaning rather than keywords, although this requires a method of converting words to meanings and the creation of a meaning-based index structure. We have solved the problem of an index structure through the design and implementation of a concept-based model using domain-dependent ontologies. An ontology is a collection of concepts and their interrelationships that provide an abstract view of an application domain. With regard to converting words to meaning, the key issue is to identify appropriate concepts that both describe and identify documents as well as language employed in user requests. This paper describes an automatic mechanism for selecting these concepts. An important novelty is a scalable disambiguation algorithm that prunes irrelevant concepts and allows relevant ones to associate with documents and participate in query generation. We also propose an automatic query expansion mechanism that deals with user requests expressed in natural language. 
This mechanism generates database queries with appropriate and relevant expansion through knowledge encoded in ontology form. Focusing on audio data, we have constructed a demonstration prototype. We have experimentally and analytically shown that our model, compared to keyword search, achieves a significantly higher degree of precision and recall. The techniques employed can be applied to the problem of information selection in all media types.
Received: 7 October 2002, Accepted: 20 May 2003, Published online: 30 September 2003. Edited by: E. Lochovsky. This research has been funded [or funded in part] by the Integrated Media Systems Center, a National Science Foundation Engineering Research Center, Cooperative Agreement No. EEC-9529152.

20.