Similar literature
 Found 20 similar documents; search took 718 ms
1.
More people than ever before have access to information through the World Wide Web; both the volume of information and the number of users continue to expand. Traditional keyword-based search methods are not effective, producing large lists of documents, many of which are unrelated to users’ needs. One way to improve information retrieval is to associate meaning with users’ queries by using ontologies, knowledge bases that encode a set of concepts about one domain and their relationships. It is common to encode a knowledge base with a single ontology, but a document collection can deal with different domains, each organized into an ontology. This work presents a novel way to represent and organize knowledge from distinct domains, using multiple ontologies that can be related. The model allows the ontologies, as well as the relationships between concepts from distinct ontologies, to be represented independently. Additionally, fuzzy set theory techniques are employed to deal with knowledge subjectivity and uncertainty. This approach to organizing knowledge and an associated query expansion method are integrated into a fuzzy model for information retrieval based on multi-related ontologies. The performance of a search engine using this model is compared with another fuzzy-based approach for information retrieval, and with the Apache Lucene search engine. Experimental results show that this model improves precision and recall measures.

2.
A Knowledge-Based Approach to Effective Document Retrieval   (Total citations: 3, self-citations: 0, citations by others: 3)
This paper presents a knowledge-based approach to effective document retrieval. This approach is based on a dual document model that consists of a document type hierarchy and a folder organization. A predicate-based document query language is proposed to enable users to precisely and accurately specify the search criteria and their knowledge about the documents to be retrieved. A guided search tool is developed as an intelligent natural language oriented user interface to assist users in formulating queries. Supported by an intelligent question generator, an inference engine, a question base, and a predicate-based query composer, the guided search collects the most important information known to the user to retrieve the documents that satisfy the user's particular interests. A knowledge-based query processing and search engine is devised as the core component in this approach. Algorithms are developed for the search engine to effectively and efficiently retrieve the documents that match the query.

3.
4.
Query recommendation helps users to describe their information needs more clearly so that search engines can return appropriate answers and meet their needs. State-of-the-art research shows that using information about users’ behavior helps to improve query recommendation performance. Instead of finding the most similar terms that previous users queried, we focus on how to detect users’ actual information needs based on their search behaviors. The key idea of this paper is that although the clicked documents are not always relevant to users’ queries, the snippets that lead them to click most probably meet their information needs. Based on an analysis of large-scale practical search behavior log data, two snippet click behavior models are constructed and corresponding query recommendation algorithms are proposed. Experimental results based on click-through data from two widely used commercial search engines show that the proposed algorithms outperform the practical recommendation methods of these two search engines. To the best of our knowledge, this is the first time that snippet click models have been proposed for the query recommendation task.

5.
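The effectiveness of information retrieval depends largely on whether users can enter appropriate queries to describe their information needs. Many queries are short and ambiguous, and some even contain noise. Query recommendation techniques can help users refine their queries and describe their information needs accurately. To obtain high-quality query recommendations, a random walk method is applied to a large-scale "query-link" bipartite graph to generate a candidate set. Snippet click information is then used to re-rank the candidate list so that queries reflecting the user's intent are ranked higher. Finally, a learning-based algorithm filters out noise that may be present among the recommended queries. Experiments on real user behavior data show that the method achieves good results.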
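As an illustration of the candidate-generation step described in this abstract, the following is a minimal sketch of a random walk on a query-URL bipartite click graph. It is not the authors' implementation: the function name `random_walk_candidates`, the toy click counts, and the choice of a plain forward walk with click-proportional transition probabilities are all assumptions made for the example, and click counts are assumed to be positive.

```python
from collections import defaultdict


def random_walk_candidates(clicks, seed, steps=2, top_k=3):
    """Rank candidate queries by a forward random walk on a click graph.

    clicks: dict mapping (query, url) -> positive click count (toy data).
    One step moves query -> url -> query, with transition probabilities
    proportional to click counts. Hypothetical helper, for illustration only.
    """
    q_to_u = defaultdict(dict)
    u_to_q = defaultdict(dict)
    for (q, u), count in clicks.items():
        q_to_u[q][u] = count
        u_to_q[u][q] = count

    dist = {seed: 1.0}  # all probability mass starts on the seed query
    for _ in range(steps):
        # Half-step 1: spread mass from queries to the URLs clicked for them.
        url_mass = defaultdict(float)
        for q, p in dist.items():
            total = sum(q_to_u[q].values())
            for u, count in q_to_u[q].items():
                url_mass[u] += p * count / total
        # Half-step 2: spread mass from URLs back to the queries that led to them.
        query_mass = defaultdict(float)
        for u, p in url_mass.items():
            total = sum(u_to_q[u].values())
            for q, count in u_to_q[u].items():
                query_mass[q] += p * count / total
        dist = dict(query_mass)

    # Drop the seed itself and sort by probability, breaking ties by name.
    ranked = sorted(
        ((q, p) for q, p in dist.items() if q != seed),
        key=lambda item: (-item[1], item[0]),
    )
    return ranked[:top_k]


toy_clicks = {
    ("jaguar", "a.com"): 2,
    ("jaguar car", "a.com"): 2,
    ("jaguar", "b.com"): 2,
    ("jaguar animal", "b.com"): 6,
}
candidates = random_walk_candidates(toy_clicks, "jaguar", steps=1)
```

In the abstract, the candidates produced this way are subsequently re-ranked with snippet click information and filtered by a learned model; those two stages are not sketched here.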

6.
One of the key difficulties for users in information retrieval is formulating appropriate queries to submit to the search engine. In this paper, we propose an approach that enriches the user’s queries with additional context. We use a language model to build the query context, which is composed of the queries most similar to the query being expanded and their top-ranked documents. We then apply a query expansion approach based on the query context and Latent Semantic Analysis. Using a web test collection, we tested our approach on short and long queries. We varied the number of recommended queries and the number of expansion terms to determine appropriate parameters for the proposed approach. Experimental results show that the proposed approach improves the effectiveness of the information retrieval system by 19.23% for short queries and 52.94% for long queries, relative to retrieval with the users’ original queries.

7.
8.
A “softening” of the hard Boolean scheme for information retrieval is presented. In this approach, information retrieval is seen as a multicriteria decision-making activity in which the criteria to be satisfied by the potential solutions, i.e., the archived documents, are the requirements expressed in the query. The retrieval function is then an overall decision function evaluating the degree to which each potential solution satisfies a query consisting of information requirements aggregated by operators. Linguistic quantifiers and a connector dealing with primary and optional criteria are defined and introduced in the query language in order to specify the aggregation criteria of the single query requirements. These criteria make it possible for users to express queries in a simple and self-explanatory manner. In particular, linguistic quantifiers are defined which capture the intrinsic vagueness of information needs. © 1995 John Wiley & Sons, Inc.
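To make the idea of aggregating query requirements with a linguistic quantifier concrete, here is a minimal sketch that uses quantifier-guided ordered weighted averaging (OWA), one common way to soften Boolean AND/OR. The abstract does not give the paper's exact definitions, so the piecewise-linear shape of `most`, the function name `owa_score`, and the sample satisfaction degrees are assumptions.

```python
def most(r):
    """Assumed membership function for the linguistic quantifier 'most'.

    r is the proportion of satisfied requirements, in [0, 1].
    """
    if r <= 0.3:
        return 0.0
    if r >= 0.8:
        return 1.0
    return (r - 0.3) / 0.5


def owa_score(degrees, quantifier):
    """Aggregate per-requirement satisfaction degrees with OWA weights.

    The i-th weight is quantifier(i/n) - quantifier((i-1)/n) and is applied
    to the i-th largest degree. Returns 0 for an empty list of degrees.
    """
    n = len(degrees)
    ordered = sorted(degrees, reverse=True)
    weights = [quantifier(i / n) - quantifier((i - 1) / n) for i in range(1, n + 1)]
    return sum(w * d for w, d in zip(weights, ordered))


# Degrees to which one document satisfies three query requirements (toy values).
degrees = [0.9, 0.2, 0.6]
score_all = owa_score(degrees, lambda r: 1.0 if r >= 1.0 else 0.0)  # behaves like AND (min)
score_any = owa_score(degrees, lambda r: 1.0 if r > 0.0 else 0.0)   # behaves like OR (max)
score_most = owa_score(degrees, most)                               # lies between the two
```

With the crisp quantifiers "all" and "any" the aggregation reduces to the minimum and maximum of the degrees, so the hard Boolean scheme appears as a special case and "most" falls between the two extremes.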

9.
For querying structured and semistructured data, data retrieval and document retrieval are two valuable and complementary techniques that have not yet been fully integrated. In this paper, we introduce integrated information retrieval (IIR), an XML-based retrieval approach that closes this gap. We introduce the syntax and semantics of an extension of the XQuery language called XQuery/IR. The extended language realizes IIR and thereby allows users to formulate new kinds of queries by nesting ranked document retrieval and precise data retrieval queries. Furthermore, we detail index structures and efficient query processing approaches for implementing XQuery/IR. Based on a new identification scheme for nodes in node-labeled tree structures, the extended index structures require only a fraction of the space of comparable index structures that only support data retrieval.

10.
Traditional database query languages are based on set theory and crisp first-order logic. However, many applications require retrieval-like queries that return result objects associated with a degree of relevance to the query. Historically, retrieval systems estimate relevance by exploiting hidden object semantics, whereas query processing in database systems relies on matching select-conditions with attribute values. Thus, different mechanisms were developed for database and information retrieval systems. As a consequence, there is a lack of support for queries involving both retrieval and database search terms. In this work, we introduce the quantum query language (QQL). Its underlying unifying theory is based on the mathematical formalism of quantum mechanics and quantum logic. Van Rijsbergen has already discussed the strong relation between the formalism of quantum mechanics and information retrieval. In this work, we relate concepts from database query processing to concepts from quantum mechanics and logic. As a result, we obtain a common theory that allows us to seamlessly incorporate retrieval search into traditional database query processing.

11.
When performing queries in web search engines, users often face difficulties choosing appropriate query terms. Search engines therefore usually suggest a list of expanded versions of the user query to disambiguate it or to resolve potential term mismatches. However, it has been shown that users find it difficult to choose an expanded query from such a list. In this paper, we describe the adoption of set-based text visualization techniques to visualize how query expansions enrich the result space of a given user query and how the result sets relate to each other. Our system uses a linguistic approach to expand queries and topic modeling to extract the most informative terms from the results of these queries. In a user study, we compare a common text list of query expansion suggestions to three set-based text visualization techniques adopted for visualizing expanded query results (namely, Compact Euler Diagrams, Parallel Tag Clouds, and a List View) to resolve ambiguous queries using interactive query expansion. Our results show that text visualization techniques do not increase retrieval efficiency, precision, or recall. Overall, users rate Parallel Tag Clouds visualizing key terms of the expanded query space lowest. Based on the results, we derive recommendations for visualizations of query expansion results and for text visualization techniques in general, and we discuss alternative use cases of set-based text visualization techniques in the context of web search.

12.
The exponential growth of information on the Web has introduced new challenges for building effective search engines. A major problem of web search is that search queries are usually short and ambiguous, and thus are insufficient for specifying precise user needs. To alleviate this problem, some search engines suggest terms that are semantically related to the submitted queries so that users can choose from the suggestions the ones that reflect their information needs. In this paper, we introduce an effective approach that captures the user's conceptual preferences in order to provide personalized query suggestions. We achieve this goal with two new strategies. First, we develop online techniques that extract concepts from the web snippets of the search results returned for a query and use the concepts to identify related queries for that query. Second, we propose a new two-phase personalized agglomerative clustering algorithm that is able to generate personalized query clusters. To the best of the authors' knowledge, no previous work has addressed personalization for query suggestions. To evaluate the effectiveness of our technique, a Google middleware was developed to collect clickthrough data for experimental evaluation. Experimental results show that our approach has better precision and recall than the existing query clustering methods.

13.
In this paper, we present a new method for fuzzy query processing in relational database systems based on automatic clustering techniques and weighting concepts. The proposed method allows the query conditions and the weights of query items of users' fuzzy SQL queries to be described by linguistic terms represented by fuzzy numbers. Because the proposed fuzzy query processing method allows users to construct their fuzzy queries more conveniently, existing relational database systems become more intelligent and more flexible for users.
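The sketch below illustrates, under simplifying assumptions, how linguistic terms represented by fuzzy numbers can score rows against weighted query conditions. It is not the method of the paper: the triangular membership functions, the crisp numeric weights (the abstract describes weights that are themselves linguistic terms), the weighted-average aggregation, and all names and sample rows are hypothetical, and the automatic clustering step is omitted.

```python
def triangular(a, b, c):
    """Return the membership function of a triangular fuzzy number (a < b < c)."""
    def mu(x):
        if x <= a or x >= c:
            return 0.0
        if x <= b:
            return (x - a) / (b - a)  # rising edge, reaches 1.0 at x == b
        return (c - x) / (c - b)      # falling edge
    return mu


def rank_rows(rows, conditions):
    """Score each row by the weighted average of its membership degrees.

    conditions: list of (attribute, membership_function, weight) triples.
    Returns (degree, row) pairs sorted by decreasing degree.
    """
    total_weight = sum(w for _, _, w in conditions)
    scored = []
    for row in rows:
        degree = sum(w * mu(row[attr]) for attr, mu, w in conditions) / total_weight
        scored.append((degree, row))
    return sorted(scored, key=lambda pair: -pair[0])


# Assumed linguistic terms for a toy employee table.
young = triangular(20, 30, 40)
high_salary = triangular(40000, 60000, 80000)

rows = [
    {"name": "A", "age": 30, "salary": 50000},
    {"name": "B", "age": 35, "salary": 60000},
]
# Roughly: SELECT * FROM employees WHERE age IS young (weight 0.3)
#          AND salary IS high (weight 0.7)
ranking = rank_rows(rows, [("age", young, 0.3), ("salary", high_salary, 0.7)])
```

Here row B ranks above row A because the more heavily weighted salary condition is fully satisfied for B, even though A matches "young" better.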

14.
We introduce the task of mapping search engine queries to DBpedia, a major linking hub in the Linking Open Data cloud. We propose and compare various methods for addressing this task, using a mixture of information retrieval and machine learning techniques. Specifically, we present a supervised machine learning-based method to determine which concepts are intended by a user issuing a query. The concepts are obtained from an ontology and may be used to provide contextual information, related concepts, or navigational suggestions to the user submitting the query. Our approach first ranks candidate concepts using a language modeling framework for information retrieval. We then extract query, concept, and search-history feature vectors for these concepts. Using manual annotations, we train a machine learning algorithm that learns how to select concepts from the candidates given an input query. Simply performing a lexical match between the queries and concepts is found to perform poorly, as does using retrieval alone, i.e., omitting the concept selection stage. Our proposed method significantly improves upon these baselines, and we find that support vector machines achieve the best performance among the machine learning algorithms evaluated.

15.
Information search and retrieval from a remote database (e.g., a cloud server) involves a multitude of privacy issues. Submitted search terms and their frequencies, returned responses and the order of their relevance, and retrieved data items may contain sensitive information about the users. In this paper, we propose an efficient multi-keyword search scheme that ensures users’ privacy against both external adversaries, including other authorized users, and the cloud server itself. The proposed scheme uses cryptographic techniques as well as query and response randomization. Provided that the security and randomization parameters are appropriately chosen, both search terms in queries and returned responses are protected against privacy violations. The scheme implements strict security and privacy requirements that essentially disallow linking queries featuring identical search terms. We also incorporate an effective ranking capability in the scheme that enables users to retrieve only the top matching results. Our comprehensive analytical study and extensive experiments using both real and synthetic datasets demonstrate that the proposed scheme is privacy-preserving, effective, and highly efficient.

16.
In the InfoBeacons system, a peer-to-peer network of beacons cooperates to route queries to the best information sources. Many internet sources are unwilling to provide more cooperation than simple searching to aid in the query routing. We adapt techniques from information retrieval to deal with this lack of cooperation. In particular, beacons determine how to route queries based on information cached from sources’ responses to queries. In this paper, we examine alternative architectures for routing queries between beacons and to data sources. We also examine how to improve the routing by probing sources in an informed way to learn about their content. Results of experiments using a beacon network to search 2,500 information sources demonstrate the effectiveness of our system; for example, our techniques require contacting up to 71 percent fewer sources than existing peer-to-peer random walk techniques.

17.
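Li Qiushi, Wang Qiuyue, Wang Shan. Journal of Software (《软件学报》), 2012, 23(8): 2002-2017
Compared with plain-text document collections, semistructured XML document collections annotated with semantic tags help information retrieval systems better understand the documents to be retrieved. Likewise, structured queries, such as SQL, XQuery, and XPath, express the user's query intent more clearly than pure keyword queries. Both help information retrieval systems achieve better retrieval precision. Keyword queries nevertheless remain widely used because of their simplicity and ease of use. This paper proposes the XNodeRelation algorithm to automatically infer the structural information (condition/target node types) of keyword queries. Compared with existing inference algorithms, it combines the schema and statistical information of the XML document collection with the contexts in which the query keywords occur and the relationships among them to infer the user's query intent. Extensive experiments verify the effectiveness of the algorithm.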

18.
This paper introduces a new approach to realizing video databases. The approach consists of a VideoText data model based on free-text annotations associated with logical video segments and a corresponding query language. Traditional database techniques are inadequate for exploiting queries on unstructured data such as video, supporting temporal queries, and ranking query results according to their relevance to the query. In this paper, we propose to use information retrieval techniques to provide such features and to extend the query language to accommodate interval queries that are particularly suited to video data. Algorithms are provided to show how user queries are evaluated. Finally, a generic and modular video database architecture based on the VideoText data model is described.

19.
One of the useful tools offered by existing web search engines is query suggestion (QS), which assists users in formulating keyword queries by suggesting keywords that are unfamiliar to users, offering alternative queries that deviate from the original ones, and even correcting spelling errors. The design goal of QS is to enrich the web search experience of users and avoid the frustrating process of choosing controlled keywords to specify their special information needs, which relieves them of the burden of creating web queries. Unfortunately, the algorithms and design methodologies of the QS module developed by Google, the most popular web search engine these days, are not publicly available, which means that software developers cannot duplicate them to build the tool for specifically designed software systems for enterprise search, desktop search, or vertical search, to name a few. Keywords suggested by Yahoo! and Bing, two other well-known web search engines, however, are mostly popular, currently searched words, which might not meet the specific information needs of the users. These problems can be solved by WebQS, our proposed web QS approach, which provides the same mechanism offered by Google, Yahoo!, and Bing to support users in formulating keyword queries that improve the precision and recall of search results. WebQS relies on frequency of occurrence, keyword similarity measures, and modification patterns of queries in user query logs, which capture information on millions of searches conducted by millions of users, to suggest useful queries/query keywords during the user query construction process and achieve the design goal of QS. Experimental results show that WebQS performs as well as Yahoo! and Bing in terms of effectiveness and efficiency and is comparable to Google in terms of query suggestion time.
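As a rough illustration of suggesting queries from log frequency and keyword similarity, two of the signals named in this abstract, here is a small sketch. It does not reproduce WebQS: the prefix-matching candidate selection, the Jaccard word-overlap similarity, the mixing parameter `alpha`, and the toy log are assumptions, and modification patterns are not modeled.

```python
from collections import Counter


def jaccard(a, b):
    """Word-level Jaccard similarity between two query strings."""
    words_a, words_b = set(a.split()), set(b.split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


def suggest(partial_query, query_log, top_k=3, alpha=0.5):
    """Suggest logged queries that extend partial_query.

    Score = alpha * normalized log frequency
          + (1 - alpha) * Jaccard similarity to the partial query.
    alpha is a hypothetical mixing weight, not a value from the paper.
    """
    freq = Counter(q.lower().strip() for q in query_log)
    max_freq = max(freq.values(), default=1)
    partial = partial_query.lower().strip()

    scored = []
    for query, count in freq.items():
        # Keep only logged queries that extend what the user has typed so far.
        if query == partial or not query.startswith(partial):
            continue
        score = alpha * count / max_freq + (1 - alpha) * jaccard(partial, query)
        scored.append((score, query))

    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [query for _, query in scored[:top_k]]


toy_log = (
    ["python list sort"] * 3
    + ["python list"] * 2
    + ["python lambda", "java list"]
)
suggestions = suggest("python", toy_log)
```

Raising `alpha` favors popular queries, which is the behavior the abstract attributes to suggestions that are "mostly popular, currently searched words"; lowering it favors queries closer to what the user typed.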

20.
Henninger, S. IEEE Software, 1994, 11(5): 48-59
Component libraries are the dominant paradigm for software reuse, but they suffer from a lack of tools that support the problem-solving process of locating relevant components. Most retrieval tools assume that retrieval is a simple matter of matching well-formed queries to a repository. But forming queries can be difficult. A designer's understanding of the problem evolves while searching for a component, and large repositories often use an esoteric vocabulary. CodeFinder is a retrieval system that combines retrieval by reformulation (which supports incremental query construction) and spreading activation (which retrieves items related to the query) to help users find information. I designed it to investigate the hypothesis that this design makes for a more effective retrieval system. My study confirmed that it was more helpful than other query systems to users seeking relevant information with ill-defined tasks and vocabulary mismatches. The study supports the hypothesis that combining techniques effectively satisfies the kind of information needs typically encountered in software design.
