Similar literature
 20 similar documents found (search time: 31 ms)
1.
Although search engines are essential tools for finding information on the World Wide Web, the effective use of search engines for information retrieval (IR) is a crucial challenge for any Internet user. Based on the user-focused approach, this study investigates individual information retrieval behaviors using information processing theory. The results show that experience with search engines significantly affects users’ attitudes toward search engines for information retrieval, the query-based service is more popular than the directory-based service, users are not completely satisfied with the precision of retrieved information and the response time of search engines, and users’ motivation is a key factor that predicts their intention to use search engines for information retrieval. Furthermore, this study proposes a conceptual model for investigating individual attitudes toward search engines for information retrieval.

2.
The accuracy of searches for visual data elements, as well as other types of information, depends on the terms used by the user in the input query to retrieve the relevant results and to reduce the irrelevant ones. Most of the results that are returned are relevant to the query terms, but not to their meaning. For example, certain types of web content hold hidden information that traditional search engines are unable to retrieve. Searching for the mathematical construct 1/x using Google will not retrieve documents that contain the mathematically equivalent expression (i.e. x⁻¹), because conventional search engines fall short of providing math-search capabilities. One of these capabilities is the ability to detect the mathematical equivalence between users’ queries and math content. In addition, users sometimes need to use slang terms, either to retrieve slang-based visual data (e.g. social media content) or because they do not know how to write in the classical form. To solve this problem, this paper proposes an AI-based system for analysing multilingual slang web content so as to allow a user to retrieve slang web content that is relevant to the user’s query. The proposed system presents an approach for visual data analytics, and it also enables users to analyse hundreds of potential search results/web pages by starting an informed, friendly dialogue and presenting innovative answers.

3.
Applied Soft Computing, 2007, 7(1): 398-410
Personalized search engines are important tools for finding web documents for specific users, because they are able to provide the location of information on the WWW as accurately as possible, using efficient methods of data mining and knowledge discovery. The types and features of traditional search engines are various, including support for different functionality and ranking methods. New search engines that use link structures have produced improved search results which can overcome the limitations of conventional text-based search engines. Going a step further, this paper presents a system that provides users with personalized results derived from a search engine that uses link structures. The fuzzy document retrieval system (constructed from a fuzzy concept network based on the user's profile) personalizes the results yielded from link-based search engines with the preferences of the specific user. A preliminary experiment with six subjects indicates that the developed system is capable of searching not only relevant but also personalized web pages, depending on the preferences of the user.

4.
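Research on intelligent Web text retrieval based on XML and ANN   Total citations: 1, self-citations: 1, citations by others: 0
Traditional Web information retrieval technologies such as search engines have shortcomings. On the one hand, they only locate information and cannot discover the knowledge hidden behind the data; on the other hand, their collection software gathers data without human intervention and with little intelligence, which leads to low information utilization. To address these problems of traditional Web search engines, this paper combines the strengths of Web text mining, XML, and BP neural networks in data processing and proposes a Web text retrieval model with a degree of intelligence, so as to achieve higher information utilization.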

5.
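The effectiveness of information retrieval depends largely on whether users can enter an appropriate query to describe their information need. Many queries are short and ambiguous, and some even contain noise. Query recommendation can help users refine their queries and describe their information needs accurately. To obtain high-quality query recommendations, a candidate set is generated by a random walk on a large-scale "query-link" bipartite graph. The candidate list is then re-ranked using snippet click information, so that queries reflecting the user's intent are placed at higher positions. Finally, a learning-based algorithm filters out noise that may remain in the recommended queries. Experiments on real user behavior data show that the method achieves good results.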
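
The candidate-generation step in this abstract (a random walk on a query-link bipartite graph) can be illustrated with a minimal sketch. The abstract does not give the walk length, the edge weights, or the data layout, so everything here is an assumption: the function name `two_step_walk`, the `clicks` dictionary of click counts, and the choice of a single query -> link -> query step are all hypothetical.

```python
from collections import defaultdict


def two_step_walk(clicks, start_query):
    """Score candidate queries reachable by one query -> link -> query step.

    `clicks` maps (query, link) pairs to click counts. This layout is a
    hypothetical stand-in for a real click log, not the paper's format.
    """
    by_query = defaultdict(dict)  # query -> {link: count}
    by_link = defaultdict(dict)   # link  -> {query: count}
    for (query, link), count in clicks.items():
        by_query[query][link] = count
        by_link[link][query] = count

    scores = defaultdict(float)
    out_total = sum(by_query[start_query].values())
    for link, count in by_query[start_query].items():
        p_link = count / out_total                    # step 1: query -> link
        link_total = sum(by_link[link].values())
        for other, c in by_link[link].items():
            scores[other] += p_link * c / link_total  # step 2: link -> query
    scores.pop(start_query, None)                     # never suggest the query itself
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Queries that share heavily clicked links with the starting query receive the highest scores. The re-ranking by snippet clicks and the learned noise filter described in the abstract are not covered by this sketch.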

6.
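Analysis of search engine user behavior based on log mining   Total citations: 1, self-citations: 0, citations by others: 1
With the large-scale growth in the number of Web search users, Web user behavior analysis has become an important foundation for architecture analysis, performance optimization, and system maintenance of Web information retrieval systems, and is one of the important research areas in Web information retrieval and knowledge mining. To better understand the search behavior of Web users, this paper analyzes user behavior based on 756 million real Web user behavior log entries. We mainly examine query length, query reformulation rate, related-search click-through rate, the distribution of first and last click positions, and the distribution of the number of clicks per query. The paper also examines differences in user behavior under different query needs, based on query sets of different types. The results offer a useful reference for search engine algorithm optimization and system improvement.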

7.
Improved relevance ranking in WebGather   Total citations: 7, self-citations: 0, citations by others: 7
The amount of information on the web is growing rapidly, and search engines that rely on keyword matching usually return too many low-quality matches. To improve search results, a challenging task for search engines is how to effectively calculate a relevance ranking for each web page. This paper discusses in what order a search engine should return the URLs it has produced in response to a user's query, so as to show more relevant pages first. Emphasis is given to the ranking functions adopted by WebGather that take link structure and user popularity factors into account. Experimental results are also presented to evaluate the proposed strategy.

8.
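General-purpose search engines suffer from topic drift during retrieval, in which query results are unrelated to the domain the keywords belong to. This paper proposes a domain-specific web page re-ranking algorithm, TSRR (Topic Sensitive Re-Ranking), which addresses topic drift from a new perspective. TSRR designs a model, independent of web page ranking, to represent the domain, then builds a web page information model, and during user retrieval combines the domain vector model with the web page information model to re-rank the search results. Evaluated on a crawled domain-specific dataset using user satisfaction and precision as criteria, the experimental results show that TSRR performs well: compared with the classical Lucene-based ranking algorithm, it improves user satisfaction by 17.3% on average and precision by 41.9% on average.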
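
The idea of combining a domain vector with a page model to re-rank results can be sketched as follows. This is not the TSRR algorithm itself, whose models the abstract does not specify. It assumes bag-of-words term vectors, cosine similarity, and a linear blend with the original score; the names `cosine`, `rerank`, and the `alpha` parameter are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rerank(results, domain_terms, alpha=0.5):
    """Re-rank (url, text, original_score) tuples toward a domain.

    `alpha` weights the engine's original score against the domain
    similarity; the linear blend is an assumption made for illustration.
    """
    domain_vec = Counter(domain_terms)
    rescored = []
    for url, text, score in results:
        page_vec = Counter(text.lower().split())
        combined = alpha * score + (1 - alpha) * cosine(domain_vec, page_vec)
        rescored.append((url, combined))
    return sorted(rescored, key=lambda r: r[1], reverse=True)
```

With `alpha=1.0` the original order is preserved, and lower values let pages that match the domain vocabulary overtake pages that only match the query keywords.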

9.
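To address the low coverage and precision of general-purpose search engines for forestry information, a design for a forestry topic crawler based on the Shark-Search algorithm is proposed. The crawling strategy, algorithm description, and implementation of the crawler are discussed in detail, and a forestry topic search engine, "搜林", was built in practice. Experimental results show that, compared with general-purpose search engines, "搜林" reduces the volume of information in search results and improves the precision of forestry topic searches.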

10.
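常浩, 陈莉. 微计算机信息 (Microcomputer Information), 2006, 22(24): 302-304
The Internet is a huge, widely distributed, and highly dynamic global information service center, and finding relevant information on it is difficult. Users typically retrieve information by giving a search engine a few short keywords, but the engine returns too many results, which makes processing them too time-consuming. This paper proposes a semantic virtual document (SVD) to represent web documents and, on that basis, implements an agglomerative hierarchical clustering algorithm that automatically clusters web documents with similar content. As a result, on the one hand, Web users are better able to judge and process the relevant results and can find the information they want on the Internet quickly and efficiently; on the other hand, the returned results strengthen web content mining in terms of knowledge representation.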

11.
Information on the web exists in the form of text, audio, image, and video objects, often referred to as multiple media objects. Vertical web search provides search over multiple media information, usually via keyword-based queries. The search results in different media formats are usually presented in separate panels/tabs; integration is mostly non-blended. Therefore, exploring results via vertical web search engines requires selecting a source and scrolling a linear ranked list of results. Relationships among the results presented in separate panels/tabs are mostly not considered. Search aggregation unifies results from several vertical web sources via blended integration, but exploration still requires scrolling a linear ranked list. Multimedia search frameworks support the exploration of results in different media formats but are more focused on retrieval issues. We propose a multiple media information search framework to address these issues, particularly in aggregated search. Our search framework provides a mechanism to explore results in non-linear ways. The search framework is realized by suggesting a framework architecture design and instantiating a search tool. The effectiveness of blended integration and browsing is measured via precision and click-through rate, respectively. Search task support in the results exploration mechanism is measured via task-based evaluation. We also validated the conformance of our framework to various search/exploration attributes discussed in the state of the art.

12.
Rank aggregation mechanisms have been used in solving problems from various domains such as bioinformatics, natural language processing, information retrieval, etc. Metasearch is one such application where a user gives a query to the metasearch engine, and the metasearch engine forwards the query to multiple individual search engines. Results or rankings returned by these individual search engines are combined using rank aggregation algorithms to produce the final result to be displayed to the user. We identify a few aspects that should be kept in mind when designing any rank aggregation algorithm for metasearch. For example, equal importance is generally given to the input rankings while performing the aggregation. However, depending on the indexed set of web pages, the features considered for ranking, the ranking functions used, etc. by the individual search engines, the individual rankings may be of different qualities. So, the aggregation algorithm should give more weight to the better rankings while giving less weight to others. Also, since the aggregation is performed while the user is waiting for a response, the operations performed in the algorithm need to be lightweight. Moreover, getting supervised data for the rank aggregation problem is often difficult. In this paper, we present an unsupervised rank aggregation algorithm that is suitable for metasearch and addresses the aspects mentioned above. We also perform a detailed experimental evaluation of the proposed algorithm on four different benchmark datasets having ground truth information. Apart from the unsupervised Kendall-Tau distance measure, several supervised evaluation measures are used for performance comparison. Experimental results demonstrate the efficacy of the proposed algorithm over baseline methods in terms of supervised evaluation metrics. Through these experiments we also show that the Kendall-Tau distance metric may not be suitable for evaluating rank aggregation algorithms for metasearch.
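
To make the two notions in this abstract concrete, the following sketch shows a weighted rank aggregation and the Kendall-Tau distance. The abstract does not describe the proposed algorithm, so a weighted Borda count is swapped in here as a standard, lightweight stand-in; the `weights` argument (one quality weight per input ranking) and both function names are assumptions.

```python
from itertools import combinations


def weighted_borda(rankings, weights):
    """Aggregate ranked lists with a weighted Borda count.

    An item at 0-based position p in a list of length n earns
    weight * (n - p) points. Ties are broken alphabetically.
    """
    scores = {}
    for ranking, weight in zip(rankings, weights):
        n = len(ranking)
        for position, item in enumerate(ranking):
            scores[item] = scores.get(item, 0.0) + weight * (n - position)
    return sorted(scores, key=lambda item: (-scores[item], item))


def kendall_tau_distance(r1, r2):
    """Count item pairs that the two rankings order differently.

    Both rankings are assumed to contain exactly the same items.
    """
    pos1 = {item: i for i, item in enumerate(r1)}
    pos2 = {item: i for i, item in enumerate(r2)}
    return sum(
        1
        for a, b in combinations(r1, 2)
        if (pos1[a] - pos1[b]) * (pos2[a] - pos2[b]) < 0
    )
```

Both functions use only dictionary lookups and one pass over the inputs (or over item pairs), which matches the abstract's requirement that aggregation be cheap enough to run while the user waits. How the per-ranking weights would be estimated without supervision is the part the paper addresses and this sketch does not.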

13.
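With the rapid development of Web technology, search engine technology that provides personalized services has attracted wide attention from users, and web page ranking is one of its key techniques. This paper uses the PageRank algorithm to improve the original Lucene web page ranking, and designs and implements a personalized search engine for mobile phone information. Experimental results show that the improved ranking algorithm improves retrieval accuracy and gives users a better search experience than Lucene's own ranking.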
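
For readers unfamiliar with PageRank, here is a minimal power-iteration sketch. It shows the standard algorithm only; how the paper combines PageRank with Lucene's text score is not stated in the abstract and is not modelled here. The `links` dictionary layout, the damping factor of 0.85, and the fixed iteration count are assumptions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Compute PageRank by power iteration.

    `links` maps each page to the list of pages it links to
    (a hypothetical in-memory stand-in for a crawled link graph).
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by dangling pages (no out-links) is spread evenly.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1 - damping) / n + damping * dangling / n for p in pages}
        for page, targets in links.items():
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank
```

The returned values sum to 1, so they could be blended with a normalized text-relevance score when ordering results; the blending weight would be a tuning choice rather than something taken from the paper.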

14.
15.
Metasearch engines offer better coverage and are more fault-tolerant and expandable than single search engines. A metasearch engine is required to post queries with and obtain retrieval results from several other Internet search engines. In this paper, we describe the use of the extensible style language (XSL) to support metasearches. We show how XSL can transform a query, expressed in XML, into different forms for different search engines. We show how the retrieval results could be transformed into a standard format so that the metasearch engine can interpret the retrieved data, filtering the irrelevant information (e.g. advertisement). The proposed structure treats the metasearch engine and the individual search engines as separate modules with a clearly defined communication structure through XSL. Thus, the system is more extensible than coding the structure and syntactic transformation processes. It allows other new search engines to be included just through plug-and-play, requiring only that the new transformation of XML for this search engine be included in the XSL.

16.
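This paper proposes a Web search engine based on a user motivation model, together with a scheme for improving the efficiency of building user behavior models. The motivation model sits between the user and the search engine and assists the user's retrieval, with the aim of improving the engine's retrieval efficiency and accuracy. Taking human behavior science as the theoretical basis and personalization techniques as the means, similar user behavior models are merged to build the user motivation model. Experiments verify that a search engine based on the user motivation model adapts to users' needs better than a general-purpose search engine.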

17.
Zhang Hongjiang, Chen Zheng, Li Mingjing, Su Zhong. World Wide Web, 2003, 6(2): 131-155
A major bottleneck in content-based image retrieval (CBIR) systems or search engines is the large gap between the low-level image features used to index images and the high-level semantic contents of images. One solution to this bottleneck is to apply relevance feedback to refine the query or similarity measures in the image search process. In this paper, we first address the key issues involved in relevance feedback of CBIR systems and present a brief overview of a set of commonly used relevance feedback algorithms. We then present a framework of relevance feedback and semantic learning in CBIR; almost all of the previously proposed methods fit well into this framework. In this framework, low-level features and keyword annotations are integrated in image retrieval and in feedback processes to improve the retrieval performance. We have also extended the framework to a content-based web image search engine in which hosting web pages are used to collect relevant annotations for images and users' feedback logs are used to refine annotations. A prototype system has been developed to evaluate our proposed schemes, and our experimental results indicate that our approach outperforms traditional CBIR systems and relevance feedback approaches.

18.
Since the Internet gained great popularity in homes and schools, a large amount of information has become available on the Web. Today, one of the primary uses of the Internet is information retrieval from search engines. The main purpose of the current study is to develop and examine an individual attitude model towards search engines as a tool for retrieving information. This model integrates individual computer experience with perceptions. In addition, it combines perception theories, such as the technology acceptance model (TAM) and motivation, in order to understand individual attitudes toward search engines. The results show that individual computer experience, quality of search systems, motivation, and perceptions of technology acceptance are all key factors that affect individuals' feelings about using search engines as an information retrieval tool.

19.
As the web grows, the massive increase in information is placing severe burdens on information retrieval and sharing. Automated search engines and directories with small editorial staffs are unable to keep up with the increasing submission of web sites. To address the problem, this paper presents Infomarker, an Internet information service system based on Open Directory and Zero-Keyword Inquiry. The Open Directory sets up a net-community in which the growing number of netizens can each organize a small portion of the web and present it to the others. By means of Zero-Keyword Inquiry, a user can get the information he is interested in without inputting any keyword of the kind often required by search engines. In Infomarker, a user can record the web addresses he likes and can put forward an information request based on his web records. The information matching engine checks the information in the Open Directory to find what fits the user's needs and adds it to the user's web address records. The key to the matching process is layered keyword mapping. Infomarker provides people with a whole new approach to getting information and shows broad prospects.

20.
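Clustering web search results using link analysis   Total citations: 3, self-citations: 0, citations by others: 3
With the rapid growth of information on the web, how to obtain high-quality web information effectively has become a hot research topic in many fields, such as databases and information retrieval. In information retrieval, web search engines are the most commonly used tools, yet current search engines are still far from satisfactory. Using link analysis, this paper proposes a new method for clustering web search results. Unlike clustering algorithms in information retrieval that rely on keywords or terms shared between texts, the method applies citation and matching analysis from the bibliographic literature and is based on the common links that two web pages share and match. It also extends the standard K-means clustering algorithm to make it better suited to handling noisy pages, and applies it to clustering web result pages. Preliminary experiments were carried out to verify its effectiveness, and the results show that clustering web search results through link analysis achieves the expected effect.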
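
A similarity measure built on shared links, as opposed to shared keywords, might look like the sketch below. The abstract does not give the paper's formula, so this is an assumption throughout: each page is represented by hypothetical `"out"` and `"in"` URL sets, overlap is measured with a cosine-style ratio, and the two overlaps are averaged with equal weight.

```python
import math


def link_similarity(page_a, page_b):
    """Similarity of two pages from their shared out-links and in-links.

    Each page is a dict with 'out' and 'in' sets of URLs (a hypothetical
    layout). Equal weighting of the two overlaps is an assumption.
    """

    def overlap(x, y):
        if not x or not y:
            return 0.0
        return len(x & y) / math.sqrt(len(x) * len(y))

    return 0.5 * overlap(page_a["out"], page_b["out"]) + 0.5 * overlap(
        page_a["in"], page_b["in"]
    )
```

A score of 1.0 means the two pages have identical link neighborhoods and 0.0 means they share none. A K-means variant, such as the extended one the abstract mentions, could use this value in place of a text-based similarity when assigning result pages to clusters.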
