Similar literature
 Found 20 similar articles
1.
The advent of the World Wide Web has made an enormous amount of information available to everyone, and the widespread use of digital equipment enables end-users (peers) to produce their own digital content. This vast amount of information requires scalable data management systems. Peer-to-peer (P2P) systems have so far been well established in several application areas, with file-sharing being the most prominent. The next challenge that needs to be addressed is (more complex) data sharing, management and query processing, thus facilitating the delivery of a wide spectrum of novel data-centric applications to the end-user, while providing high Quality-of-Service. In this paper, we propose a self-organizing P2P system that is capable of identifying peers with similar content and intentionally assigning them to the same super-peer. During content retrieval, fewer super-peers need to be contacted, and therefore efficient similarity search is supported, in terms of reduced network traffic and fewer contacted peers. Our approach increases the responsiveness and reliability of a P2P system, and we demonstrate its advantages using large-scale simulations.

2.
In the ubiquitous Web environment, which exploits real-time image processing methods, easy access to a variety of digital content such as movies, e-books, and digital songs significantly enhances people’s quality of life. However, most digital content sites do not provide a concrete mechanism to prevent minors from accessing harmful content, even though a few mechanisms are available on the market to screen minors from inappropriate content. This paper proposes a fundamental approach that confirms the age of a user from the user's digital signature and X.509 certificate when the user attempts to access specific digital content. Its performance is verified by an implementation of the approach.

3.
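A search result ranking algorithm based on user tagging   Total citations: 1, self-citations: 0, citations by others: 1
With the rapid development of computer networks, the information on the network has become increasingly vast and complex. How to help people obtain the information they need from massive network data accurately and quickly is the primary problem that search engines currently face, and a variety of search ranking algorithms have emerged in response. At present, however, web page information is expressed in very simple forms, and the way users describe their queries is simpler still, which makes it very difficult to judge the relevance of page content to a user query. The paper first classifies and summarizes existing search engine ranking algorithms and analyzes their strengths and weaknesses. It then proposes a new method based on semantic tags derived from user feedback, and finally compares its results against Google search results using several evaluation methods. Experimental results show that the rankings produced by this method are closer to user needs than Google's rankings.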

4.
Minsoo Lee, Stanley Y. W. Su, Herman Lam. World Wide Web, 2001, 4(1-2): 121-140
Although the Internet and the World Wide Web technologies have gained a tremendous amount of popularity among people and organizations, the network that these technologies created is not much more than a multimedia data network. It provides tools and services for people to browse and search for data, but it does not provide the facilities for automatically delivering the information relevant to decision-making to the right people or applications at the right time. Nor does it provide the means for users to enter and share knowledge that would be useful for making the right decisions. In this work, we introduce the concept of a Web-based knowledge network, which allows users and organizations to publish not only their multimedia data but also their knowledge, in terms of events, parameterized event filters, customizable rules and triggers that are associated with their data and application systems. Operations on the data and application systems may post events over the Internet to trigger the processing of rules defined by both information providers and consumers. The knowledge network is constructed from a number of replicable software components, which can be installed at various network sites. Together with the existing Web servers, they form a network of knowledge Web servers.

5.
The Web is a universal repository of human knowledge and culture which has allowed sharing of ideas and information on an unprecedented scale. It can also be considered a universal digital library interconnecting digital libraries in multiple domains and languages. Besides the advance of information technology, the global economy has also accelerated the development of inter-organizational information systems. Managing knowledge obtained in multilingual information systems from multiple geographical regions is an essential component of contemporary inter-organizational information systems. An organization cannot claim to be a global organization unless it is capable of overcoming the cultural and language barriers in its knowledge management. Cross-lingual semantic interoperability is a challenge in multilingual knowledge management systems. The dictionary is a tool that is widely used in commercial systems to cross the language barrier. However, the terms available in a dictionary are always limited. As language evolves, new words are created from time to time. For example, there are new technical terms and named entities such as RFID and Baidu. To solve the problem of cross-lingual semantic interoperability, an associative constraint network approach is investigated to construct an automatic cross-lingual thesaurus. In this work, we have investigated the backmarking algorithm and the forward evaluation algorithm to resolve the constraint satisfaction problem represented by the associative constraint network. Experiments show that the forward evaluation algorithm outperforms the backmarking algorithm in terms of precision and recall, but the backmarking algorithm is more efficient than the forward evaluation algorithm. We have also benchmarked against our earlier technique, the Hopfield network, and showed that the associative constraint network (with either backmarking or forward evaluation) outperforms it in precision, recall, and efficiency.

6.
The main problem we explore in this paper involves predicting the performance of Web-resource downloads from unknown Web servers, based on knowledge about client-to-unknown-server network paths and performance measurements carried out on a set of known Web servers. We propose unknown-server-to-known-server topology-aware distance metrics based on knowledge of the network paths to both unknown and known servers at the autonomous-system level of Internet organization. The throughput value we want to predict for an unknown server is approximated by the value achievable for the known server (called the best one) with the smallest unknown-server-to-known-server distance. The best server is selected using the nearest neighbor algorithm. The usefulness of this method for Web-performance prediction has been confirmed in real-life experiments. The results of the work allowed us to formulate positive recommendations for applying this approach to the efficient retrieval of Web resources in replicated content systems, file mirrors, content delivery networks, and digital libraries.
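
The nearest-neighbor selection described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' method: the abstract does not define the distance metric, so `as_path_distance` below is a hypothetical stand-in that counts the autonomous-system hops not shared by two paths, and the server names, paths, and throughput figures are invented for illustration.

```python
from typing import Dict, List, Tuple


def as_path_distance(path_a: List[int], path_b: List[int]) -> int:
    """Hypothetical metric: AS hops that lie outside the common prefix of two paths."""
    common = 0
    for hop_a, hop_b in zip(path_a, path_b):
        if hop_a != hop_b:
            break
        common += 1
    return (len(path_a) - common) + (len(path_b) - common)


def predict_throughput(
    unknown_path: List[int],
    known: Dict[str, Tuple[List[int], float]],
) -> Tuple[str, float]:
    """Return the nearest known server and its measured throughput as the prediction.

    `known` maps a server name to (AS path from the client, measured throughput).
    """
    best = min(known, key=lambda name: as_path_distance(unknown_path, known[name][0]))
    return best, known[best][1]


# Illustrative data only: AS numbers and throughputs are made up.
known_servers = {
    "mirror-a": ([1, 2, 3, 4], 100.0),
    "mirror-b": ([1, 2, 5], 50.0),
}
best_server, predicted = predict_throughput([1, 2, 5, 6], known_servers)
```

Here the unknown server shares a longer path prefix with `mirror-b`, so that server's measured throughput is used as the estimate.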

7.
Recent work on searching the Semantic Web has yielded a wide range of approaches with respect to the underlying search mechanisms, results management and presentation, and style of input. Each approach impacts upon the quality of the information retrieved and the user’s experience of the search process. However, despite the wealth of experience accumulated from evaluating Information Retrieval (IR) systems, the evaluation of Semantic Web search systems has largely been developed in isolation from mainstream IR evaluation with a far less unified approach to the design of evaluation activities. This has led to slow progress and low interest when compared to other established evaluation series, such as TREC for IR or OAEI for Ontology Matching. In this paper, we review existing approaches to IR evaluation and analyse evaluation activities for Semantic Web search systems. Through a discussion of these, we identify their weaknesses and highlight the future need for a more comprehensive evaluation framework that addresses current limitations.

8.
The Internet is estimated to grow significantly as access to Web content in some non-English languages continues to increase. However, prior research in human–computer interaction (HCI) has implicitly assumed the primary language used on the Web to be English. This assumption is not true for many non-English-speaking regions where rapidly growing on-line populations access the Web in their native languages. For example, Latin America, where the majority of people speak Spanish, will have the fastest growing population in coming decades. However, existing Spanish search engines lack search, browse, and analysis capabilities. The research reported here studied human information seeking on the non-English Web. In it we developed a Spanish business Web portal that supports searching, browsing, summarization, categorization, and visualization of Spanish business Web pages. Using 42 Spanish speakers as subjects we conducted a two-phase experiment to evaluate this portal and found that, compared with a Spanish search engine and a Spanish Web directory, it achieved significantly better user ratings on information quality, cross-regional search capability, system performance attributes, and overall satisfaction. Subjects’ verbal comments strongly favored the search and browse functionality and user interface of our portal. As the Web becomes more international, this research makes three contributions: (1) an empirical evaluation of the performance level of a Spanish search portal; (2) an examination of the information quality, cross-regional search capability and usability of search engines for the non-English Web; and (3) a better understanding of non-English Web searching.

9.
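With the rapid development of Internet-based social media and its applications, the "attention" of Internet users is becoming increasingly important to the "virtual economy" built upon it. By treating people as the content being transmitted and information resources as the objects, the Internet can be viewed as a network in which human attention is allocated and flows among information resources. Using collected and analyzed data on Internet user behavior, the authors construct an Internet-based attention transfer network and give a dynamical model describing the attention of Internet users. Experimental results show that Web 2.0 sites attract more attention than Web 1.0 sites; that a site's attention grows more slowly than its traffic, indicating "diseconomies of scale"; and that advertising networks and vertical networks attract more attention than search engines and advertising alliances.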

10.
11.
This paper proposes an effective query-translation approach that enables a cross-language information retrieval (CLIR) service to be more easily supported in digital library systems that only contain monolingual content. A query-translation engine called LiveTrans is used to process the translation requests of cross-lingual queries from connected digital library systems. To automatically extract translations not covered by standard dictionaries, the engine is developed based on a novel integration of dictionary resources and Web mining approaches, including anchor-text and search-result methods. The engine exploits a broad range of multilingual Web resources used as live bilingual corpora to alleviate translation difficulties. It is shown to be particularly effective for extracting multilingual translation equivalents of query terms containing proper names or new terminology. The obtained results show the feasibility of and great potential for creating English-Chinese CLIR services in existing digital libraries and new applications in cross-language Web searching, although difficulties still remain that need to be investigated further.

12.
Providing highly relevant page hits to the user is a major concern in Web search. To accomplish this goal, the user must first be allowed to express search intent precisely. Second, page hit rating mechanisms should be used that take the user’s intent into account. Finally, a learning mechanism is needed that captures a user’s Web search preferences, even when those preferences are changing dynamically. To address the first two issues, we propose a semantic taxonomy-based meta-search agent approach that incorporates the user’s taxonomic search intent. It also improves the relevancy of the resulting page hits by using the user’s search intent and preference-based rating. To provide a learning mechanism, we first propose a connectionist model-based user profile representation approach, which can leverage all of the features of the semantic taxonomy-based information retrieval approach. A user profile learning algorithm is also devised for our proposed user profile representation framework by significantly modifying and extending a typical neural network learning algorithm. Finally, the entire methodology, including this learning mechanism, is implemented in an agent-based system, WebSifter II. Empirical results of learning performance are also discussed.

13.
High-performance Web sites rely on Web server "farms" (hundreds of computers serving the same content) for scalability, reliability, and low-latency access to Internet content. Deploying these scalable farms typically requires the power of distributed or clustered file systems. Building Web server farms on file systems complements hierarchical proxy caching. Proxy caching replicates Web content throughout the Internet, thereby reducing latency from network delays and off-loading traffic from the primary servers. Web server farms scale resources at a single site, reducing latency from queuing delays. Both technologies are essential when building a high-performance infrastructure for content delivery. The authors present a cache consistency model and locking protocol customized for file systems that are used as scalable infrastructure for Web server farms. The protocol takes advantage of the Web's relaxed consistency semantics to reduce latencies and network overhead. The hybrid approach preserves strong consistency for concurrent write sharing, with time-based consistency and push caching for readers (Web servers). Using simulation, the authors compare the approach to the Andrew file system and the sequential consistency file system protocols that it is proposed to replace.

14.
Dynamic composition or integration remains one of the key objectives of Web services technology. This paper proposes an approach to dynamic Web services composition based on functional and non-functional attributes and individual preferences. In this approach, social networks of Web services are used to maintain interactions between Web services in order to select and compose the Web services that are most closely related to the user’s preferences. We use the concept of a Web services community in a social network of Web services to considerably reduce the search space. These communities are created with the direct involvement of Web services providers.

15.
We investigate the possibility of using Semantic Web data to improve hypertext Web search. In particular, we use relevance feedback to create a ‘virtuous cycle’ between data gathered from the Semantic Web of Linked Data and web-pages gathered from the hypertext Web. Previous approaches have generally treated search over the Semantic Web and search over the hypertext Web as entirely disparate, indexing and searching over different domains. While relevance feedback has traditionally improved information retrieval performance, it is normally used to improve rankings over a single data-set. Our novel approach is to use relevance feedback from hypertext Web results to improve Semantic Web search, and results from the Semantic Web to improve the retrieval of hypertext Web data. In both cases, an evaluation is performed based on certain kinds of informational queries (abstract concepts, people, and places) selected from a real-life query log and checked by human judges. We evaluate our work over a wide range of algorithms and options, and show that it also improves baseline performance on these queries for deployed systems, such as the Semantic Web search engine FALCON-S and Yahoo! Web search. We further show that the use of Semantic Web inference seems to hurt performance, while pseudo-relevance feedback increases performance in both cases, although not as much as actual relevance feedback. Lastly, our evaluation is the first rigorous ‘Cranfield’ evaluation of Semantic Web search.

16.
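Research and implementation of an automatic web page classification system based on the K-nearest neighbor algorithm   Total citations: 2, self-citations: 0, citations by others: 2
With the explosive growth of information on the network, finding information is becoming ever harder. Web search engines have eased this problem to some extent. However, current search engines cannot perform targeted searches for a topic specified by the user, so after searching it is necessary to judge whether each result belongs to the target topic in order to improve search accuracy. The paper proposes an automatic information classification method based on the K-nearest neighbor machine learning algorithm, which can automatically determine whether a retrieved web page belongs to the target topic, and verifies experimentally its effect on improving search accuracy.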
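
As a rough illustration of the K-nearest neighbor topic filter this abstract describes, the Python sketch below labels a page by majority vote among its `k` most similar labeled pages. The abstract does not state the features or the similarity measure, so the whitespace tokenization, the term-count vectors, the cosine similarity, and the names `vectorize`, `cosine`, and `knn_is_on_topic` are all assumptions made for this example; Chinese text would need a proper word segmenter in place of `split()`.

```python
import math
from collections import Counter
from typing import List, Tuple


def vectorize(text: str) -> Counter:
    """Bag-of-words term counts (assumes whitespace-separated tokens)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def knn_is_on_topic(page: str, labeled: List[Tuple[str, bool]], k: int = 3) -> bool:
    """Majority vote among the k labeled pages most similar to `page`."""
    query = vectorize(page)
    ranked = sorted(
        labeled, key=lambda item: cosine(query, vectorize(item[0])), reverse=True
    )
    votes = [label for _, label in ranked[:k]]
    return sum(votes) > len(votes) / 2


# Illustrative training pages: True means "belongs to the target topic".
training = [
    ("python code tutorial", True),
    ("python programming guide", True),
    ("football match score", False),
    ("football league results", False),
    ("learn python programming", True),
]
on_topic = knn_is_on_topic("python programming basics", training)
off_topic = knn_is_on_topic("football score today", training)
```

An odd `k` avoids tied votes in this two-class setting; in practice `k` and the text features would be tuned on held-out data.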

17.
Recent advances in digital libraries have been closely intertwined with advances in Internet technologies. With the advent of the Web, digital libraries have been able to reach constituencies previously unanticipated. Because of the wide deployability of Web-accessible digital libraries, the potential for privacy violations has also grown tremendously. The much touted Semantic Web, with its agent, service, and ontology technologies, is slated to take the Web to another qualitative level. Unfortunately, these advances may also open doors for privacy violations in ways never seen before. We propose a Semantic Web infrastructure, called SemWebDL, that enables the dynamic composition of disparate and autonomous digital libraries while preserving user privacy. In the proposed infrastructure, users will be able to pose more qualitative queries that may require the ad hoc collaboration of multiple digital libraries. In addition to the Semantic Web-based infrastructure, the quality of the response would rest on extraneous information in the form of a profile. We introduce the concept of communities to enable subject-based cooperation and search speedup. Further, digital library heterogeneity and autonomy are transcended by a layered Web-service-based infrastructure. Semantic Web-based digital library providers would advertise to Web services, which in turn are organized in communities accessed by users. For the purpose of privacy preservation, we devise a three-tier privacy model consisting of user privacy, Web service privacy, and digital library privacy that offers autonomy of perspectives for privacy definition and violation. We propose an approach that seamlessly interoperates with potentially conflicting privacy definitions and policies at the different levels of the Semantic Web-based infrastructure. A key aspect of the approach is the use of reputations for outsourcing Web services. A Web service's reputation is associated with its behavior with regard to privacy preservation. We developed a technique that uses attribute ontologies and information flow difference to collect, evaluate, and disseminate the reputation of Web services.

18.
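A new method for intelligent search of Internet information   Total citations: 10, self-citations: 1, citations by others: 9
A new method for intelligent search of Internet information is proposed. The method can accurately and effectively find target web pages hidden inside websites of the same kind, that is, sites with similar organizational structure and content description. To this end it adopts a new search knowledge representation that combines the features of the links between web pages with descriptions of page content features. Based on this representation and the knowledge it expresses, the method can not only perform intelligent depth-first search of the pages within a website, but also acquire more and better search knowledge by learning from its own search process and results. Preliminary experimental results show that the method has a strong ability to search deep pages when looking for target pages in websites of the same type.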

19.
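To address the problem that most files in current P2P systems are media files with only limited accompanying descriptive information, an algorithm that extends semantics through Web information mining is proposed. A multi-layer ring network structure based on a semantic skip list is also proposed to help recommend related content to users. Experiments show that, with a very small number of messages, the proposed method achieves retrieval precision close to that of traditional retrieval based on a central server, and is of practical value.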

20.
Knowledge extraction from Chinese wiki encyclopedias   Total citations: 1, self-citations: 0, citations by others: 1
