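The ever-growing volume of information on the web urgently calls for suitable retrieval tools. Retrieval of domain-specific information in particular needs a search engine that reflects the characteristics of the specialist vocabulary. Building on a study of core search engine technologies, this paper proposes a design for a petrochemical information search engine. A web robot module was developed to fetch massive numbers of web pages automatically. An algorithm combining shortest-path segmentation with forward maximum matching was adopted for automatic Chinese word segmentation. An indexing module was developed to support both batch and incremental indexing of web pages, and a retrieval module was developed that offers Boolean queries and automatic summary generation. Through system integration, a search engine reflecting the characteristics of the petrochemical domain has been preliminarily established.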
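
The forward maximum matching step named in this abstract can be illustrated with a minimal sketch. It covers only the greedy longest-match part, not the shortest-path component the paper combines it with, and the function name `forward_max_match` and the toy lexicon are assumptions made for illustration, not taken from the paper.

```python
def forward_max_match(text, dictionary, max_len=None):
    """Greedy segmentation: at each position take the longest dictionary word."""
    if max_len is None:
        # Longest word in the lexicon bounds the window size.
        max_len = max((len(word) for word in dictionary), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until a dictionary hit.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted as a fallback token.
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens


# Hypothetical toy lexicon; a real system would load a domain dictionary.
LEXICON = {"石油", "化工", "石油化工", "信息", "搜索", "引擎", "搜索引擎"}

if __name__ == "__main__":
    print(forward_max_match("石油化工信息搜索引擎", LEXICON))
```

Because the longest candidate is tried first, the domain term 石油化工 is kept whole rather than split into 石油 and 化工, which is the behavior a domain-specific lexicon is meant to produce.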

A typical spoken content retrieval solution integrates multiple technologies that belong to the areas of automatic speech recognition and information retrieval. Due to the rich set of challenges – many of them language specific – as well as widespread impact, numerous research sites in the world are actively engaged in this research area. This special issue highlights some of the recent advances in spoken content retrieval.

张继燕, 欧莹元. 《软件》 (Software), 2013, 34(5): 155-156
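Starting from an analysis of the goals of the Information Management and Information Systems major, this paper establishes the place of the course "Information Storage and Retrieval" within that major. It then describes the course's interdisciplinary character, reviews the main textbooks currently used at universities, and selects the textbook best suited to the major. Based on the selected textbook, it sets out the course's teaching content and its teaching approaches and methods.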

A mass of heterogeneous, distributed, and dynamic information on the World Wide Web (the Web) has resulted in "information overload". Providing users with an effective information retrieval service on the Web is an important and urgent research issue. Web search engines attempt to solve this problem, yet their results are far from satisfactory. In this paper, a distributed and cooperative strategy for information retrieval on the Web is proposed as a substitute for the centralized mode adopted by current search engines. A new information retrieval system model, IRSM, is then presented, which supports the retrieval of metadata about web documents and uses the Z39.50 standard protocol to unify the heterogeneous interfaces of different systems. Based on that, a distributed and cooperative information retrieval framework, called DCIRF, is designed to help users retrieve information on the Web quickly and effectively.

Considering developments in the literature on measuring Internet self-efficacy, a short scale was developed with a focus on web searching across all domains. The Information Retrieval On the Web Self-Efficacy scale (IROWSE) was derived from the General Self-Efficacy Scale [Schwarzer, R. 1994. “Optimism, Vulnerability, and Self-Beliefs as Health Related Cognitions: A Systematic Overview.” Psychology and Health: An International 9: 161–180] and measures the value attributed by an individual to her/his own capacity to organise and execute information searches on the web. In study 1 (N = 228), we aimed to ensure reliability, explore factorial structure, and check for criterion-related validity of a French form. In study 2 (N = 534), we aimed to validate an English version among US and international (non-US) samples. From an internal validity point of view, both IROWSE versions turned out satisfactory with a one-factor model of eight items. As expected, the scales were not confused with self-esteem as a trait (study 1), self-reported Internet search skills (study 2) or general attitudes towards the Internet, and stemmed from direct experience with the Internet (studies 1 and 2). Overall, slight differences between samples would indicate the cultural sensitivity of the IROWSE measure, which encourages studies with a comparative approach. Resorting to the IROWSE measure might enhance the understanding of Internet practices, information retrieval behaviours, and search performance, since self-efficacy would thus be assessed at a more domain-specific level.

During a two-day strategic workshop in February 2018, 22 information retrieval researchers met to discuss the future challenges and opportunities within the field. The outcome is a list of potential research directions, project ideas, and challenges. This report describes the major conclusions we obtained during the workshop. A key result is that we need to open our minds and embrace a broader IR field by rethinking the definitions of information, retrieval, user, system, and evaluation in IR. By providing detailed discussions on these topics, this report is expected to inspire IR researchers in both academia and industry, and to help the future growth of the IR research community.

The retrieval facilities of most peer-to-peer (P2P) systems are limited to queries based on a unique identifier or a small set of keywords. The techniques used for this purpose are hardly applicable for content based image retrieval (CBIR) in a P2P network. Furthermore, we will argue that the curse of dimensionality and the high communication overhead prevent the adaptation of multidimensional search trees or fast sequential scan techniques for P2P CBIR. In the present paper we will propose two compact data representations that can be distributed in a P2P network and used as the basis for a source selection. This allows for communicating with only a small fraction of all peers during query processing without deteriorating the result quality significantly. We will also present experimental results confirming our approach.

The main idea of content-based image retrieval (CBIR) is to search on an image’s visual content directly. Typically, features (e.g., color, shape, texture) are extracted from each image and organized into a feature vector. Retrieval is performed by image example where a query image is given as input by the user and an appropriate metric is used to find the best matches in the corresponding feature space. We attempt to bypass the feature selection step (and the metric in the corresponding feature space) by following what we believe is the logical continuation of the CBIR idea of searching visual content directly. It is based on the observation that, since ultimately, the entire visual content of an image is encoded into its raw data (i.e., the raw pixel values), in theory, it should be possible to determine image similarity based on the raw data alone. The main advantage of this approach is its simplicity in that explicit selection, extraction, and weighting of features is not needed. This work is an investigation into an image dissimilarity measure following from the theoretical foundation of the recently proposed normalized information distance (NID) [M. Li, X. Chen, X. Li, B. Ma, P. Vitányi, The similarity metric, in: Proceedings of the 14th ACM-SIAM Symposium on Discrete Algorithms, 2003, pp. 863–872]. Approximations of the Kolmogorov complexity of an image are created by using different compression methods. Using those approximations, the NID between images is calculated and used as a metric for CBIR. The compression-based approximations to Kolmogorov complexity are shown to be valid by proving that they create statistically significant dissimilarity measures by testing them against a null hypothesis of random retrieval. Furthermore, when compared against several feature-based methods, the NID approach performed surprisingly well.
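
A compression-based approximation of the NID can be sketched as follows. This is a minimal illustration using the commonly used normalized compression distance form, NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is a compressed length. The choice of `zlib` as the compressor, the helper names, and the synthetic byte strings standing in for raw pixel data are all assumptions, not the compressors or data used in the paper.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Length of the zlib-compressed data, a crude stand-in for Kolmogorov complexity."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Synthetic stand-ins for raw pixel buffers (assumption: real use would pass image bytes).
rng = random.Random(0)
regular = bytes(range(256)) * 16                          # highly regular "image"
noise = bytes(rng.getrandbits(8) for _ in range(4096))    # incompressible "image"

if __name__ == "__main__":
    print("regular vs regular:", ncd(regular, regular))
    print("regular vs noise:  ", ncd(regular, noise))
```

A buffer compared with itself yields a distance near 0, while unrelated buffers yield a distance near 1. Ranking a collection by `ncd(query, candidate)` would give a retrieval order in the spirit of the approach, though the quality depends heavily on how well the compressor captures two-dimensional image structure.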

The phenomenal growth of online Flash movies in recent years has made Flash one of the most prevalent media formats on the Web. The retrieval and management issues of Flash, vital to the utilization of the enormous Flash resource, are unfortunately overlooked by the research community. This paper presents the first piece of work (to the best of our knowledge) in this domain by suggesting an integrated framework for the retrieval of Flash movies based on their content characteristics as well as contextual information. The proposed approach consists of two major components: (1) a content-based retrieval component, which explores the characteristics of Flash movie content at compositional and semantic levels; and (2) a context-based retrieval component, which explores the contextual information including the texts and hyperlinks surrounding the movies. An experimental Flash search engine system has been implemented to demonstrate the feasibility of the suggested framework. The work described in this paper was supported substantially by a grant (Project No. 7001457), and partially by another grant (Project No. 7001564), both from CityU of Hong Kong.

A technique to retrieve images by region matching using a combined feature index based on color, shape, and location is presented within the framework of MPEG-7. Dominant regions within each image are indexed using integrated color, shape, and location features. Various combinations of regions are also indexed. The resulting indices and related metadata are stored in a Hash structure, where similar images tend to form clusters. The retrieval process is non-cascading and images can be retrieved based on color, shape or location and also based on a combined color–shape–location index. Results obtained show that retrieval effectiveness increases in non-cascaded region-based querying by combined index.

The issue of whether or not word sense disambiguation (WSD) can improve information retrieval (IR) results has been intensely debated over the years, with many inconclusive or contradictory results and a majority of skeptical opinions. All three classes of WSD methods (supervised, unsupervised, and knowledge-based) have been considered in the literature with respect to IR. We survey the unsupervised approach which, although relatively rarely used, has provided positive results at a large scale. Unsupervised WSD has already proven its utility in IR, and it is our belief that it still holds promise for this field. The two main existing types of unsupervised methods for IR, which are of completely different natures, are presented within the scientific context in which they were born, and are compared. Regardless of the gap in time between these central approaches, we are of the opinion that the unsupervised solution to the discussed problem remains the most significant for IR applications. By surveying what we consider the most promising existing approach to the use of WSD in IR, and by discussing its possible extensions, we hope to stimulate continuation of this line of research, possibly at an even more successful level.

BankXX: Supporting legal arguments through heuristic retrieval
The BankXX system models the process of perusing and gathering information for argument as a heuristic best-first search for relevant cases, theories, and other domain-specific information. As BankXX searches its heterogeneous and highly interconnected network of domain knowledge, information is incrementally analyzed and amalgamated into a dozen desirable ingredients for argument (called argument pieces), such as citations to cases, applications of legal theories, and references to prototypical factual scenarios. At the conclusion of the search, BankXX outputs the set of argument pieces filled with harvested material relevant to the input problem situation. This research explores the appropriateness of the search paradigm as a framework for harvesting and mining information needed to make legal arguments. In this article, we describe how legal research fits the heuristic search framework and detail how this model is used in BankXX. We describe the BankXX program with emphasis on its representation of legal knowledge and legal argument. We describe the heuristic search mechanism and evaluation functions that drive the program. We give an extended example of the processing of BankXX on the facts of an actual legal case in BankXX's application domain: the good faith question of Chapter 13 personal bankruptcy law. We discuss closely related research on legal knowledge representation and retrieval and the use of search for case retrieval or tasks related to argument creation. Finally, we review what we believe are the contributions of this research to the understanding of the diverse disciplines it addresses. This research was supported in part by grant No. 90-0359 from the Air Force Office of Sponsored Research and NSF grant No. EEC-9209623 State/University/Industry Cooperative Research on Intelligent Information Retrieval.
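
The heuristic best-first search described in this abstract can be sketched generically. This is not BankXX's implementation: the graph, the node names, the scores, and the `budget` cutoff are hypothetical stand-ins for its network of domain knowledge and its evaluation functions.

```python
import heapq


def best_first_search(graph, start, score, budget):
    """Visit up to `budget` nodes, always expanding the highest-scoring open node."""
    counter = 0  # insertion counter used only to break ties between equal scores
    # heapq is a min-heap, so scores are negated to pop the best node first.
    frontier = [(-score(start), counter, start)]
    seen = {start}
    visited = []
    while frontier and len(visited) < budget:
        _, _, node = heapq.heappop(frontier)
        visited.append(node)
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                counter += 1
                heapq.heappush(frontier, (-score(neighbor), counter, neighbor))
    return visited


# Hypothetical toy network: a problem situation linked to cases and a legal theory.
GRAPH = {
    "problem": ["case_A", "theory_T"],
    "case_A": ["case_B"],
    "theory_T": ["case_C"],
}
# Hypothetical relevance scores standing in for an evaluation function.
SCORES = {"problem": 0.0, "case_A": 0.9, "theory_T": 0.5, "case_B": 0.2, "case_C": 0.8}

if __name__ == "__main__":
    print(best_first_search(GRAPH, "problem", SCORES.get, budget=4))
```

With a budget of four nodes the search reaches `case_C` through `theory_T` before it ever expands the low-scoring `case_B`, which illustrates how an evaluation function steers harvesting toward the most promising material under limited resources.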

This study aims at developing an evaluation framework of strategic information systems (SIS) and evaluating SIS planning and implementation by using a cognitive approach called the repertory grid technique. The findings are based on in-depth interviews with chief information officers (CIOs) involved with SIS developments in their organisations. This exploratory study builds on a cognitive methodology and enables us to develop the evaluation framework of SIS within the CIO's mind. From a practical viewpoint, we evaluated the effectiveness of the essential activities in SIS planning and implementation. Results showed that the activities of analysing industry and environment, analysing information system weaknesses and strengths, formulating SIS strategy, identifying SIS initiatives, prioritising and allocating resources for SIS, documenting SIS, and liaising with the top management team are well performed.
