Machine Learning for User Modeling   Cited by: 25 (self-citations: 0, citations by others: 25)
At first blush, user modeling appears to be a prime candidate for straightforward application of standard machine learning techniques. Observations of the user's behavior can provide training examples that a machine learning system can use to form a model designed to predict future actions. However, user modeling poses a number of challenges that have hindered the application of machine learning, including the need for large data sets, the need for labeled data, concept drift, and computational complexity. This paper examines each of these issues and reviews approaches to resolving them.

Fuzzy User Modeling for Information Retrieval on the World Wide Web   Cited by: 4 (self-citations: 1, citations by others: 4)
Search engines are known to capture the information needs of World Wide Web users poorly. The approach taken in this paper is to add intelligence to information retrieval from the World Wide Web by modeling users, so as to improve the interaction between the user and information retrieval systems; in other words, to improve the user's performance in retrieving information from the information source. To effect such an improvement, a retrieval system must make inferences about the information the user might want. The system can then aid the user, for instance by giving suggestions or by adapting a query based on predictions furnished by the model. By combining user modeling and fuzzy logic, a prototype system has been developed, the Fuzzy Modeling Query Assistant (FMQA), which modifies a user's query based on a fuzzy user model. The FMQA was tested in a user study which clearly indicated that, for the limited domain chosen, the modified queries are better than those left unmodified. Received 10 November 1998 / Revised 14 June 2000 / Accepted in revised form 25 September 2000

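Mining the Hyperlink Structure of Web Sites   Cited by: 11 (self-citations: 0, citations by others: 11)
The WWW is a global information system made up of thousands upon thousands of Web sites distributed around the world, and each Web site is in turn an information (sub)system made up of many Web pages. Because a document author can use hyperlinks to connect a document to any known Web page, and because the information resources on a Web site are usually provided jointly by many people, the hyperlinks within a Web site are typically highly varied and can carry many different meanings and purposes. This paper analyzes the usage characteristics and patterns of hyperlinks in the WWW, proposes a method for classifying hyperlink types and mining site structure, and gives a preliminary discussion of its applications in information gathering and querying.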

WWW Browsing Navigation and Structure Optimization Techniques   Cited by: 1 (self-citations: 0, citations by others: 1)
In this paper, we introduce some typical WWW navigation systems and Web site optimization systems, analyze the properties of navigation and optimization techniques, present some key problems and techniques that deserve special attention, and discuss future work.

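Research on Knowledge Mining Methods for the World Wide Web   Cited by: 11 (self-citations: 0, citations by others: 11)
1. Introduction. The emergence of the World Wide Web has given computers access to massive information resources, yet little of this information exists in a structure that computers can understand, because Web pages were designed to be read by people rather than by computers. As a result, the presence of complex text structures, images, sound, and other kinds of information makes the Web a rich and varied medium while also creating obstacles to further computer processing of Web information.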

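Design and Development of a Web-Oriented Information Gathering Tool   Cited by: 8 (self-citations: 1, citations by others: 8)
With the development of the Internet and the growing wealth of online information, traditional information processing has extended into the Internet domain. Processing information on the Internet often requires downloading Web pages scattered across the network to a local machine for further processing; this is the core function of the Web page gathering tool discussed here. Building on the combined use of link relationships between Web pages and page content, the system adds a multi-level page filtering module so that it can gather Web pages within a specific domain. It can also use parallel gathering across multiple machines to improve gathering efficiency. It stores gathering metadata in a large-scale database and compresses the gathered pages, which allows it to support the collection of massive amounts of data. A dynamic update mechanism keeps the locally downloaded page information up to date.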


The World Wide Web offers a large and ever-expanding number of resources in the fields of psychology and sociology. Growth of the Web has been so fast that it has been difficult to keep track of all the new resources that are rapidly becoming available. With so many new Web sites cropping up, there is a growing need to sort through these resources and determine which sites are the most valuable for research in these subject areas. This study first discusses some criteria for assessing the quality of Web sites, and then applies these criteria to the Web sites currently available in an effort to determine which sites are the best for these two fields.

The information accessible through the Internet is increasing explosively as the Web becomes more and more widespread. In this situation, the Web is an indispensable information resource for both information gathering and information searching. Though traditional information retrieval techniques have been applied to information gathering and searching on the Web, they are insufficient for this new form of information source. Fortunately, some AI techniques can be applied straightforwardly to such tasks on the Web, and many researchers are trying this approach. In this paper, we describe the current state of information gathering and searching technologies on the Web and the application of AI techniques in these fields. We then point out limitations of these traditional and AI approaches and introduce two approaches for overcoming them: navigation planning and the Mondou search engine. The navigation planning system tries to collect systematic knowledge, rather than Web pages, which are only pieces of knowledge. The Mondou search engine copes with the problems of query expansion and modification based on techniques of text/web mining and information visualization.
Seiji Yamada, Dr. Eng.: He received the B.S., M.S. and Ph.D. degrees in control engineering and artificial intelligence from Osaka University, Osaka, Japan, in 1984, 1986 and 1989, respectively. From 1989 to 1991, he served as a Research Associate in the Department of Control Engineering at Osaka University. From 1991 to 1996, he served as a Lecturer in the Institute of Scientific and Industrial Research at Osaka University. In 1996, he joined the Department of Computational Intelligence and Systems Science at Tokyo Institute of Technology, Yokohama, Japan, as an Associate Professor. His research interests include artificial intelligence, planning, machine learning for robotics, intelligent information retrieval in the WWW, and human computer interaction. He is a member of AAAI, IEEE, JSAI, RSJ and IEICE.
Hiroyuki Kawano, Dr. Eng.: He is an Associate Professor at the Department of Systems Science, Graduate School of Informatics, Kyoto University, Japan. He obtained his B.Eng. and M.Eng. degrees in Applied Mathematics and Physics, and his Dr.Eng. degree in Applied Systems Science from Kyoto University. His research interests are in advanced database technologies, such as data mining, data warehousing, knowledge discovery and the web search engine Mondou. He has served on the program committees of several conferences in the area of database systems, and on technical committees for advanced information systems.

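Based on an analysis of the main problems of the current Web, this paper introduces the Semantic Web, discusses its advantages and characteristics, and looks ahead to its development prospects.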

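To find the most authoritative Web information resources on a given topic, this paper proposes an algorithm for computing the authority values of Web pages. The algorithm improves on the HITS [1] algorithm: it does not require the user to supply keywords, it obtains a set of example pages on the topic by expanding the links of Web example pages, and it measures a page's authority by the number of times the page is referenced by hyperlinks.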
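
The abstract gives no implementation details, so the following is only a minimal sketch of the idea it describes: grow a page set from a few example pages by following links, then score each page by how many pages in the set link to it. The function names, the `out_links` dictionary, and the single expansion round are illustrative assumptions, not the authors' algorithm.

```python
from collections import Counter

def expand_example_set(seed_pages, out_links, rounds=1):
    """Grow the seed set by following hyperlinks for a fixed number of rounds."""
    pages = set(seed_pages)
    for _ in range(rounds):
        frontier = set()
        for page in pages:
            frontier.update(out_links.get(page, ()))
        pages |= frontier
    return pages

def authority_by_inlinks(pages, out_links):
    """Score each page by how many distinct pages in the set link to it."""
    counts = Counter({page: 0 for page in pages})
    for source in pages:
        # set() ignores repeated links from the same source page
        for target in set(out_links.get(source, ())):
            if target in pages and target != source:
                counts[target] += 1
    return counts

# Hypothetical link graph: page -> pages it links to.
out_links = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}
pages = expand_example_set(["a"], out_links, rounds=1)
scores = authority_by_inlinks(pages, out_links)
```

In this toy graph the page `d` is never reached from the seed, so its link to `c` does not count; whether links from outside the expanded set should count is a design choice the abstract does not settle.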

The World Wide Web (WWW) has become the biggest information source for students solving information problems for school projects. Since anyone can post anything on the WWW, information is often unreliable or incomplete, and it is important to evaluate sources and information before using them. Earlier research has shown that students have difficulties with evaluating sources and information. This study investigates the criteria that secondary education students use while searching the Web for information. Twenty-three students solved two information problems while thinking aloud. After completing the tasks they were interviewed in groups on their use of criteria. Results show that students do not evaluate results, sources and information very often. The criteria students mention when asked which criteria are important for evaluating information are not always the same criteria they mention while solving the information problems. In the interviews they mentioned more criteria, but also admitted not always using these criteria while searching the Web.

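Wang Tong, He Pilian. Computer Engineering (《计算机工程》), 2008, 34(6): 182-184
This paper proposes introducing biometric information technology to solve the user identification problem in Web mining, together with an iris recognition method based on hidden Markov models. The method requires only the orientation field of the iris as its input parameter and is insensitive to noise and distortion in iris images, which makes it robust. By identifying users accurately, it overcomes the statelessness of the existing Web architecture, allows Web log data to be sliced along the "user dimension," and lets the mined results meet the needs of personalized use.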


This article presents the results of a study of World Wide Web (Web) sites for 133 academic libraries serving medium-sized universities. Each library Web site was examined, and all of the features were recorded. The study identified 31 core components which were present in over 50 percent of the libraries surveyed. The study indicated that: the navigational and design aspects of library Web sites should be improved; materials should be placed on the sites only if they will be accessed and utilized by the user community; and libraries could profit by making greater use of online tutorials and virtual tours to supplement regular bibliographic instruction.

Integrating a large number of Web information sources may significantly increase the utility of the World Wide Web. A promising solution to the integration is a Web information mediator that provides seamless, transparent access for clients. Information mediators need wrappers to access a Web source as a structured database, but building wrappers by hand is impractical. Previous work on wrapper induction is too restrictive to handle the many Web pages that contain tuples with missing attributes, multiple values, variant attribute permutations, exceptions and typos. This paper presents SoftMealy, a novel wrapper representation formalism based on a finite-state transducer (FST) and contextual rules. This approach can wrap a wide range of semistructured Web pages because FSTs can encode each different attribute permutation as a path. A SoftMealy wrapper can be induced from a handful of labeled examples using our generalization algorithm. We have implemented this approach in a prototype system and tested it on real Web pages. The performance statistics show that the sizes of the induced wrappers and the required training effort are linear with regard to the structural variance of the test pages. Our experiment also shows that the induced wrappers can generalize over unseen pages.
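
To illustrate how a finite-state transducer can encode each attribute ordering as a path, here is a toy sketch. It is not SoftMealy itself: the transition table is written by hand rather than induced from labeled examples, exact-match label tokens stand in for the paper's contextual rules, and the attribute names `name` and `email` are hypothetical.

```python
# Hypothetical transition table: (current_state, trigger_token) -> next_state.
# Each allowed ordering of attributes corresponds to one path through the states.
TRANSITIONS = {
    ("start", "Name:"): "name",
    ("start", "Email:"): "email",
    ("name", "Email:"): "email",
    ("email", "Name:"): "name",
}

def extract_tuple(text):
    """Walk the token stream and emit each token under the current attribute state."""
    state = "start"
    record = {}
    for token in text.split():
        next_state = TRANSITIONS.get((state, token))
        if next_state is not None:
            state = next_state               # a trigger token moves to a new state
            record.setdefault(state, [])
        elif state != "start":
            record[state].append(token)      # output the token for the current attribute
    return {attr: " ".join(tokens) for attr, tokens in record.items()}

# Same attributes in either order, plus a tuple with a missing attribute.
full = extract_tuple("Name: Ann Lee Email: ann@example.org")
swapped = extract_tuple("Email: bob@example.org Name: Bob")
partial = extract_tuple("Name: Cy")
```

Because the table contains a path for each ordering, the swapped and partial inputs are handled without separate rules, which is the property the abstract attributes to the FST representation.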

Although it has become very common to use World Wide Web-based information in many educational settings, there has been little research on how to better search and organize Web-based information. This paper discusses the shortcomings of Web search engines and Web browsers as learning environments and describes an alternative Web search environment that combines a concept mapping and a data mining technique to address their drawbacks.

The World Wide Web as Enabling Technology for CSCW: The Case of BSCW   Cited by: 5 (self-citations: 0, citations by others: 5)
Despite the growth of interest in the field of CSCW, and the increasingly large number of systems which have been developed, it is still the case that few systems have been adopted for widespread use. This is particularly true for widely-dispersed, cross-organisational working groups where problems of heterogeneity in computing hardware and software environments inhibit the deployment of CSCW technologies. With a lightweight and extensible client-server architecture, client implementations for all popular computing platforms, and an existing user base numbered in millions, the World Wide Web offers great potential in solving some of these problems to provide an enabling technology for CSCW applications. We illustrate this potential using our work with the BSCW shared workspace system, an extension to the Web architecture which provides basic facilities for collaborative information sharing from unmodified Web browsers. We conclude that despite limitations in the range of applications which can be directly supported, building on the strengths of the Web can give significant benefits in easing the development and deployment of CSCW applications.

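Implementing WebGIS for Electric Power Enterprises   Cited by: 1 (self-citations: 0, citations by others: 1)
A geographic information system (GIS) can process spatial data by geographic coordinates or spatial location, manage data effectively, and support the study of spatial entities and their relationships. A WWW-based GIS running on the internal network (Intranet) of an electric power enterprise can deliver information promptly and accurately and assist decision making. The key problem for power transmission and distribution WebGIS today is to build, on top of an existing client/server (C/S) MIS, a platform-independent software system that runs on an open, TCP/IP-based network. To address this problem, this paper discusses three strategies for implementing an electric power enterprise GIS with WWW technology:
- Server-side strategy: users submit requests for data and analysis results to the Web server.
- Client-side strategy: users perform some data processing and analysis on their local machines.
- Hybrid strategy: a combination of the server-side and client-side strategies.
The paper compares the advantages and disadvantages of these strategies and analyzes which strategy suits each functional module of an electric power enterprise WebGIS.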

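Distributed WWW Information Gathering Techniques   Cited by: 14 (self-citations: 0, citations by others: 14)
This paper discusses distributed information gathering techniques for WWW search engines, introduces the concept of an optimal partition of robot working ranges, and presents a practical method for estimating information gathering cost along with a concrete algorithm for achieving the optimal partition.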

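A Survey of Link Structure Analysis of the World Wide Web and Its Applications   Cited by: 47 (self-citations: 0, citations by others: 47)
Wang Xiaoyu, Zhou Aoying. Journal of Software (《软件学报》), 2003, 14(10): 1768-1780
The World Wide Web has grown rapidly to roughly 8 billion pages and 56 billion hyperlinks. Moreover, global planning of how the Web is created is clearly impossible. Both facts pose challenges for Web research. On the other hand, pages connected by hyperlinks on the Internet offer a very rich information resource for everyday and business use, provided that effective ways of understanding the Web are available. Link structure analysis plays an increasingly important role in many areas of Web research. This paper gives a comprehensive account of recent research progress and applications in Web link analysis, surveying its use in Web information search, the discovery of latent Web communities, and Web modeling.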
