Similar literature
1.
Machine Learning for User Modeling   Total citations: 25, self-citations: 0, citations by others: 25
At first blush, user modeling appears to be a prime candidate for straightforward application of standard machine learning techniques. Observations of the user's behavior can provide training examples that a machine learning system can use to form a model designed to predict future actions. However, user modeling poses a number of challenges for machine learning that have hindered its application, including: the need for large data sets; the need for labeled data; concept drift; and computational complexity. This paper examines each of these issues and reviews approaches to resolving them.

2.
Fuzzy User Modeling for Information Retrieval on the World Wide Web   Total citations: 4, self-citations: 1, citations by others: 4
Information retrieval from the World Wide Web through the use of search engines is known to be unable to capture effectively the information needs of users. The approach taken in this paper is to add intelligence to information retrieval from the World Wide Web by modeling users to improve the interaction between the user and information retrieval systems; in other words, to improve the performance of the user in retrieving information from the information source. To effect such an improvement, it is necessary that any retrieval system should somehow make inferences concerning the information the user might want. The system can then aid the user, for instance by giving suggestions or by adapting any query based on predictions furnished by the model. By combining user modeling and fuzzy logic, a prototype system has been developed, the Fuzzy Modeling Query Assistant (FMQA), which modifies a user's query based on a fuzzy user model. The FMQA was tested via a user study which clearly indicated that, for the limited domain chosen, the modified queries are better than those that are left unmodified. Received 10 November 1998 / Revised 14 June 2000 / Accepted in revised form 25 September 2000

3.
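Mining the Hyperlink Structure of Web Sites   Total citations: 11, self-citations: 0, citations by others: 11
The WWW is a global information system made up of thousands upon thousands of Web sites distributed around the world, and each Web site is in turn an information (sub)system made up of many Web pages. Because a document author can use hyperlinks to connect a document to any known Web page, and because the information resources on a Web site are usually contributed by many people, the hyperlinks within a Web site tend to be highly varied and can carry many different meanings and purposes. This article analyzes the characteristics and patterns of hyperlink use in the WWW, proposes a method for classifying hyperlink types and mining site structure, and offers a preliminary discussion of its applications in information gathering and querying.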

4.
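Design and Development of a Web-Oriented Information Gathering Tool   Total citations: 8, self-citations: 1, citations by others: 8
With the development of the Internet and the growing wealth of online information, traditional information processing has extended into the Internet domain. Processing information on the Internet often requires downloading Web pages scattered across the Internet to a local machine for further processing; this is the core function of the Web page gathering tool discussed here. Building on the combined use of the link relationships between Web pages and page content, the page gathering system adds a multi-level page filtering module so that it can gather Web pages within a specific domain. It can also gather pages in parallel across multiple machines to improve efficiency. It stores metadata about the gathering process in a large-scale database and compresses the gathered pages, which lets it support the collection of massive amounts of data. A dynamic update mechanism keeps the locally downloaded page information up to date.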

5.
Abstract

The World Wide Web offers a large and ever-expanding number of resources in the fields of psychology and sociology. Growth of the Web has been so fast that it has been difficult to keep track of all the new resources that are rapidly becoming available. With so many new Web sites cropping up, there is a growing need to sort through these resources and determine which sites are the most valuable for research in these subject areas. This study first discusses some criteria for assessing the quality of Web sites, and then applies these criteria to the Web sites currently available in an effort to determine which sites are the best for these two fields.

6.
The information accessible through the Internet is increasing explosively as the Web becomes more and more widespread. In this situation, the Web is an indispensable information resource for both information gathering and information searching. Though traditional information retrieval techniques have been applied to information gathering and searching on the Web, they are insufficient for this new form of information source. Fortunately, some AI techniques are straightforwardly applicable to such tasks on the Web, and many researchers are trying this approach. In this paper, we attempt to describe the current state of information gathering and searching technologies on the Web, and the application of AI techniques in these fields. Then we point out limitations of these traditional and AI approaches and introduce two approaches for overcoming them: navigation planning and the Mondou search engine. The navigation planning system tries to collect systematic knowledge, rather than Web pages, which are only pieces of knowledge. The Mondou search engine copes with the problems of query expansion/modification based on the techniques of text/web mining and information visualization.

Seiji Yamada, Dr. Eng.: He received the B.S., M.S. and Ph.D. degrees in control engineering and artificial intelligence from Osaka University, Osaka, Japan, in 1984, 1986 and 1989, respectively. From 1989 to 1991, he served as a Research Associate in the Department of Control Engineering at Osaka University. From 1991 to 1996, he served as a Lecturer in the Institute of Scientific and Industrial Research at Osaka University. In 1996, he joined the Department of Computational Intelligence and Systems Science at Tokyo Institute of Technology, Yokohama, Japan, as an Associate Professor. His research interests include artificial intelligence, planning, machine learning for robotics, intelligent information retrieval in the WWW, and human-computer interaction. He is a member of AAAI, IEEE, JSAI, RSJ and IEICE.

Hiroyuki Kawano, Dr. Eng.: He is an Associate Professor at the Department of Systems Science, Graduate School of Informatics, Kyoto University, Japan. He obtained his B.Eng. and M.Eng. degrees in Applied Mathematics and Physics, and his Dr.Eng. degree in Applied Systems Science from Kyoto University. His research interests are in advanced database technologies, such as data mining, data warehousing, knowledge discovery and the web search engine Mondou. He has served on the program committees of several conferences in the area of database systems, and on technical committees on advanced information systems.

7.
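Based on an analysis of the main problems of the current Web, this paper introduces the Semantic Web, discusses its advantages and characteristics, and looks ahead to its prospects for development.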

8.
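To find the most authoritative Web information resources on a given topic, an algorithm for computing the authority value of Web pages is proposed. The algorithm improves on the HITS [1] algorithm: it requires no keywords from the user, obtains a set of example pages on the topic by expanding the links of Web example pages, and measures the authority of a Web page by the number of times it is referenced by hyperlinks.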
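
The in-link counting idea in the entry above can be illustrated with a minimal Python sketch. This is not the authors' algorithm, whose link-expansion step and other details are not given in the abstract; the function name `authority_by_inlinks` and the sample link graph are hypothetical, and ignoring self-links and duplicate links is an assumption made here.

```python
from collections import Counter


def authority_by_inlinks(links):
    """Rank pages by how many distinct pages link to them.

    `links` maps a source page to the pages it links to.
    Self-links and duplicate links from the same source are ignored
    (an assumption of this sketch, not stated in the abstract).
    """
    counts = Counter()
    for source, targets in links.items():
        for target in set(targets):
            if target != source:
                counts[target] += 1
    # Sort by in-link count (descending), then by name for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical example-page set; the page names are placeholders.
example_links = {
    "a.html": ["c.html", "d.html"],
    "b.html": ["c.html", "c.html"],
    "c.html": ["c.html", "d.html"],
}
ranking = authority_by_inlinks(example_links)
```

Counting raw in-links is simpler than the iterative hub and authority scores of HITS, which is presumably part of its appeal, but it treats every linking page as equally trustworthy.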

9.
The World Wide Web (WWW) has become the biggest information source for students while solving information problems for school projects. Since anyone can post anything on the WWW, information is often unreliable or incomplete, and it is important to evaluate sources and information before using them. Earlier research has shown that students have difficulties with evaluating sources and information. This study investigates the criteria secondary education students use while searching the Web for information. Twenty-three students solved two information problems while thinking aloud. After completing the tasks they were interviewed in groups on their use of criteria. Results show that students do not evaluate results, sources and information very often. The criteria students mention when asked which criteria are important for evaluating information are not always the same criteria they mention while solving the information problems. They mentioned more criteria but also admitted not always using these criteria while searching the Web.

10.
ABSTRACT

This article presents the results of a study of World Wide Web (Web) Sites for 133 academic libraries serving medium-sized universities. Each library Web site was examined, and all of the features were recorded. The study identified 31 core components which were present in over 50 percent of the libraries surveyed. The study indicated that: the navigational and design aspects of library Web sites should be improved; materials should be placed on the sites only if they will be accessed and utilized by the user community; and libraries could profit by making greater use of online tutorials and virtual tours to supplement regular bibliographic instruction.

11.
12.
Integrating a large number of Web information sources may significantly increase the utility of the World-Wide Web. A promising solution to the integration is through the use of a Web Information mediator that provides seamless, transparent access for the clients. Information mediators need wrappers to access a Web source as a structured database, but building wrappers by hand is impractical. Previous work on wrapper induction is too restrictive to handle a large number of Web pages that contain tuples with missing attributes, multiple values, variant attribute permutations, exceptions and typos. This paper presents SoftMealy, a novel wrapper representation formalism. This representation is based on a finite-state transducer (FST) and contextual rules. This approach can wrap a wide range of semistructured Web pages because FSTs can encode each different attribute permutation as a path. A SoftMealy wrapper can be induced from a handful of labeled examples using our generalization algorithm. We have implemented this approach in a prototype system and tested it on real Web pages. The performance statistics show that the sizes of the induced wrappers as well as the required training effort are linear with regard to the structural variance of the test pages. Our experiment also shows that the induced wrappers can generalize over unseen pages.
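
To make concrete how a finite-state transducer can encode each attribute permutation as a path, here is a toy Python sketch. It is not SoftMealy: the rules are written by hand rather than induced from labeled examples, the contextual rules are reduced to literal label tokens, and the names `extract`, `STATES` and `transitions` as well as the sample records are hypothetical.

```python
def extract(tokens, transitions, start="start"):
    """Run a toy finite-state transducer over `tokens`.

    `transitions` maps (state, token_class) to (next_state, attribute);
    `attribute` is None when the token is a separator and emits nothing.
    """
    state, record = start, {}
    for token in tokens:
        token_class = token if token.endswith(":") else "TEXT"
        if (state, token_class) not in transitions:
            continue  # skip tokens the transducer has no rule for
        state, attribute = transitions[(state, token_class)]
        if attribute is not None:
            record.setdefault(attribute, []).append(token)
    return {attr: " ".join(words) for attr, words in record.items()}


# Hypothetical hand-written rules for two attributes.
STATES = {"Name:": "name", "Email:": "email"}
transitions = {}
for state in ["start", *STATES.values()]:
    for label, target in STATES.items():
        transitions[(state, label)] = (target, None)  # separator: change state
for attr in STATES.values():
    transitions[(attr, "TEXT")] = (attr, attr)  # content: emit under attr

first = extract("Name: Ada Lovelace Email: ada@example.org".split(), transitions)
second = extract("Email: bob@example.org Name: Bob".split(), transitions)
third = extract("Name: Carol".split(), transitions)
```

Because every state can move to every attribute state, the same rule table handles both attribute orders and a record with a missing attribute without a separate rule per layout.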

13.
Although it has become very common to use World Wide Web‐based information in many educational settings, there has been little research on how to better search and organize Web‐based information. This paper discusses the shortcomings of Web search engines and Web browsers as learning environments and describes an alternative Web search environment that combines a concept mapping and a data mining technique to address their drawbacks.

14.
The World Wide Web as Enabling Technology for CSCW: The Case of BSCW   Total citations: 5, self-citations: 0, citations by others: 5
Despite the growth of interest in the field of CSCW, and the increasingly large number of systems which have been developed, it is still the case that few systems have been adopted for widespread use. This is particularly true for widely-dispersed, cross-organisational working groups where problems of heterogeneity in computing hardware and software environments inhibit the deployment of CSCW technologies. With a lightweight and extensible client-server architecture, client implementations for all popular computing platforms, and an existing user base numbered in millions, the World Wide Web offers great potential in solving some of these problems to provide an enabling technology for CSCW applications. We illustrate this potential using our work with the BSCW shared workspace system, an extension to the Web architecture which provides basic facilities for collaborative information sharing from unmodified Web browsers. We conclude that despite limitations in the range of applications which can be directly supported, building on the strengths of the Web can give significant benefits in easing the development and deployment of CSCW applications.

15.
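Implementing WebGIS for Electric Power Enterprises   Total citations: 1, self-citations: 0, citations by others: 1
A geographic information system (GIS) can process spatial data in various ways according to geographic coordinates or spatial location, manage data effectively, and study spatial entities and the relationships among them. A WWW-based GIS running on the intranet built by an electric power enterprise can deliver information promptly and accurately and support decision-making. The key problem currently facing WebGIS for power transmission and distribution is how to build, on top of an existing client/server (C/S) MIS, a platform-independent software system that runs on an open network based on the TCP/IP protocols. To address this problem, this paper discusses three strategies for implementing a WWW-based GIS for electric power enterprises: a server-side strategy, which lets users submit requests for data and analysis results to the Web server; a client-side strategy, which lets users perform certain data processing and analysis on their local machines; and a hybrid strategy, which combines the two. The paper compares the advantages and disadvantages of these strategies and analyzes which strategy suits each functional module of an electric power enterprise WebGIS.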

16.
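Machine Learning and Web Information Processing   Total citations: 2, self-citations: 0, citations by others: 2
Machine learning plays an important role in Web information processing. GHunt is an intelligent Web information acquisition and processing system that employs several machine learning techniques. First, the system supports distributed, parallel searching of Web information and content filtering. Second, it uses machine learning techniques, including text classification, clustering, and text concept extraction, to understand textual information at the conceptual level. Third, it manages textual information effectively and uniformly on the basis of a conceptual semantic space. Finally, it provides efficient text retrieval based on conceptual semantics, as well as personalized topic organization and information push services. The paper focuses on the machine learning techniques used in the system.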

17.
Unsupervised Information Extraction (UIE) is the task of extracting knowledge from text without the use of hand-labeled training examples. Because UIE systems do not require human intervention, they can recursively discover new relations, attributes, and instances in a scalable manner. When applied to massive corpora such as the Web, UIE systems present an approach to a primary challenge in artificial intelligence: the automatic accumulation of massive bodies of knowledge. A fundamental problem for a UIE system is assessing the probability that its extracted information is correct. In massive corpora such as the Web, the same extraction is found repeatedly in different documents. How does this redundancy impact the probability of correctness? We present a combinatorial “balls-and-urns” model, called Urns, that computes the impact of sample size, redundancy, and corroboration from multiple distinct extraction rules on the probability that an extraction is correct. We describe methods for estimating Urns's parameters in practice and demonstrate experimentally that for UIE the model's log likelihoods are 15 times better, on average, than those obtained by methods used in previous work. We illustrate the generality of the redundancy model by detailing multiple applications beyond UIE in which Urns has been effective. We also provide a theoretical foundation for Urns's performance, including a theorem showing that PAC Learnability in Urns is guaranteed without hand-labeled data, under certain assumptions.

18.
With the explosive growth of the World Wide Web, it is becoming increasingly difficult for users to collect and analyze Web pages that are relevant to a particular topic. To address this problem we are developing WTMS, a system for Web topic management. In this paper we explain how the WTMS crawler efficiently collects Web pages for a topic. We also introduce the user interface of the system that integrates several techniques for analyzing the collection. Moreover, we present the various views of the interface that allow navigation through the information space. We highlight several examples to show how the system enables the user to gain useful insights about the collection.

19.
Abstract

The World Wide Web represents the final step in the evolution of the Internet as a tool worthy for practical applications in instruction. Two particular applications for the Web are discussed in light of projects which have been undertaken in the Helen Topping Architecture and Fine Arts Library at the University of Southern California. First, the World Wide Web may be used as a resource in the library. The Web is a source of content which, like all library resources, must be taught. It should be presented to users along with the same information literacy skills which must accompany any resource. Second, the Web may be used as a publishing tool where the content is created according to the particular instructional need or situation. This usage involves the technology of the Web rather than the content of the Web; this technology is the interface and access capabilities, either local or on a server, provided by the Web browser.

20.
Abstract

The internal conflicts within an organization may hinder the successful design and set-up of a World Wide Web site. This article looks at the three most common routes that an organization may take to get on the Web, and presents guidelines for how to overcome the organizational politics that get in the way.
