1.

A clustering-based prefetching scheme on a Web cache environment

George Athena Jaroslav 《Computers & Electrical Engineering》2008,34(4):309-323

Web prefetching is an attractive solution to reduce the network resources consumed by Web services as well as the access latencies perceived by Web users. Unlike Web caching, which exploits temporal locality, Web prefetching exploits the spatial locality of Web objects. Specifically, Web prefetching fetches objects that are likely to be accessed in the near future and stores them in advance. In this context, a sophisticated combination of these two techniques may significantly improve the performance of the Web infrastructure. Considering that several caching policies have been proposed in the past, the challenge is to extend them by using data mining techniques. In this paper, we present a clustering-based prefetching scheme where a graph-based clustering algorithm identifies clusters of “correlated” Web pages based on the users’ access patterns. This scheme can be integrated easily into a Web proxy server, improving its performance. Through a simulation environment, using a real data set, we show that the proposed integrated framework is robust and effective in improving the performance of the Web caching environment.
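The abstract does not spell out the graph-based clustering algorithm, so the following is only a minimal sketch of the general idea, under stated assumptions: pages are nodes, an edge links two pages requested consecutively at least `min_support` times, and each connected component is treated as one cluster. The function name `cluster_pages`, the threshold, and the use of connected components are all illustrative choices, not the paper's method.

```python
from collections import defaultdict

def cluster_pages(sessions, min_support=2):
    """Group pages that are frequently requested one after another.

    Assumption: connected components of a thresholded co-access graph
    stand in for the paper's (unspecified) graph-based clustering.
    """
    # Count how often two pages appear consecutively in any session.
    edge_counts = defaultdict(int)
    for session in sessions:
        for a, b in zip(session, session[1:]):
            if a != b:
                edge_counts[frozenset((a, b))] += 1

    # Keep only edges seen at least min_support times (undirected graph).
    graph = defaultdict(set)
    for edge, count in edge_counts.items():
        if count >= min_support:
            a, b = tuple(edge)
            graph[a].add(b)
            graph[b].add(a)

    # Each connected component of the filtered graph is one cluster.
    clusters, seen = [], set()
    for start in graph:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            page = stack.pop()
            if page in component:
                continue
            component.add(page)
            stack.extend(graph[page] - component)
        seen |= component
        clusters.append(component)
    return clusters

# Hypothetical user sessions (ordered page requests).
sessions = [
    ["/a", "/b", "/c"],
    ["/a", "/b", "/d"],
    ["/x", "/y"],
    ["/x", "/y", "/b"],
]
clusters = cluster_pages(sessions)
```

A proxy could then prefetch the remaining members of a cluster when one of its pages is requested; how aggressively to do so is a separate policy decision.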

2.

Using current web page structure to improve prefetching performance

Josep Domenech Jose A. Gil Julio Sahuquillo Ana Pont 《Computer Networks》2010,54(9):1404-1417

Web prefetching is a technique aimed at reducing user-perceived latencies in the World Wide Web. The spatial locality shown by user accesses makes it possible to predict future accesses from the previous ones. A prefetching engine uses these predictions to prefetch web objects before the user demands them. The existing prediction algorithms achieved acceptable performance when they were proposed, but the sharp increase in the number of embedded objects per page has reduced their effectiveness in the current web. In this paper, we show that most of the predictions made by the existing algorithms are not useful for reducing the user-perceived latency because these algorithms do not take into account the structure of current web pages, i.e., an HTML object with several embedded objects. Thus, they predict the accesses to the embedded objects in an HTML page only after reading the HTML itself. For this reason, the prediction is not made early enough to prefetch the objects and, therefore, there is no latency reduction. In this paper we present the double dependency graph (DDG) algorithm, which distinguishes between container objects (HTML) and embedded objects to create a new prediction model according to the structure of the current web. Results show that, for the same number of extra requests to the server, DDG reduces the perceived latency, on average, 40% more than the existing algorithms. Moreover, DDG distributes latency reductions more homogeneously among users.

3.

Integrating Web Prefetching and Caching Using Prediction Models (cited by: 2; self-citations: 0; citations by others: 2)

Yang Qiang Zhang Henry Hanning 《World Wide Web》2001,4(4):299-321

Web caching and prefetching have been studied in the past separately. In this paper, we present an integrated architecture for Web object caching and prefetching. Our goal is to design a prefetching system that can work with an existing Web caching system in a seamless manner. In this integrated architecture, a certain amount of caching space is reserved for prefetching. To empower the prefetching engine, a Web-object prediction model is built by mining the frequent paths from past Web log data. We show that the integrated architecture improves the performance over Web caching alone, and present our analysis on the tradeoff between the reduced latency and the potential increase in network load.

4.

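A Prefetching Model Based on Depth-First Sequential Pattern Mining

Wei Lin, Shi Lei 《计算机工程与应用》 (Computer Engineering and Applications) 2007,43(20):169-172

Sequential pattern mining can discover the user access patterns hidden in Web logs and can be used in a Web prefetching model to predict the Web objects that will be accessed next. Most current sequential pattern mining algorithms are Apriori-based breadth-first algorithms. This paper proposes a bitmap-based depth-first mining algorithm that adopts a depth-first strategy built on a trie data structure and uses bitmaps to store and compute the support of each sequence, so that frequent sequences can be mined relatively quickly. The algorithm is applied to a Web prefetching model, and experiments with integrated prefetching and caching show that it performs well.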

5.

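An Intelligent Prefetching Algorithm (cited by: 1; self-citations: 0; citations by others: 1)

Cao Xinping, Liu Meihua, Gu Zhimin, Yin Chuntian, Liu Jinhua 《计算机工程与应用》 (Computer Engineering and Applications) 2003,39(31):103-106

Network latency is one of the main problems for user QoS. It depends on many factors, such as network bandwidth, transmission delay, queuing delay, and the processing speed of clients and servers. Caching and prefetching are currently the main techniques for reducing network latency, but the hit-ratio improvement that caching can bring to a caching proxy server is limited. This paper systematically describes the basic ideas of current prefetching algorithms and divides them into four classes: popularity-based, interaction-based, access-probability-based, and data-mining-based prefetching algorithms. After analyzing and comparing them, it proposes an intelligent prefetching scheme. The scheme uses fuzzy matching to compute the probability that a user will access a page, and also controls the amount and timing of prefetching to avoid degrading network performance.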

6.

Incremental and interactive mining of web traversal patterns

Yue-Shi Lee Show-Jane Yen 《Information Sciences》2008,178(2):287-306

Web mining involves the application of data mining techniques to large amounts of web-related data in order to improve web services. Web traversal pattern mining involves discovering users’ access patterns from web server access logs. This information can provide navigation suggestions for web users indicating appropriate actions that can be taken. However, web logs keep growing continuously, and some web logs may become out of date over time. The users’ behaviors may change as web logs are updated, or when the web site structure is changed. Additionally, it can be difficult to determine a perfect minimum support threshold during the data mining process to find interesting rules. Accordingly, we must constantly adjust the minimum support threshold until satisfactory data mining results can be found. The essence of incremental data mining and interactive data mining is the ability to use previous mining results in order to reduce unnecessary processes when web logs or web site structures are updated, or when the minimum support is changed. In this paper, we propose efficient incremental and interactive data mining algorithms to discover web traversal patterns that match users’ requirements. The experimental results show that our algorithms are more efficient than other comparable approaches.

7.

An SPN-Based Integrated Model for Web Prefetching and Caching (cited by: 17; self-citations: 0; citations by others: 17)

Lei Shi Ying-Jie Han Xiao-Guang Ding Lin Wei Zhi-Min Gu 《计算机科学技术学报》 (Journal of Computer Science and Technology) 2006,21(4):482-489

The World Wide Web has become the primary means of information dissemination. Because network bandwidth is limited, users often suffer long waiting times. Web prefetching and web caching are the primary approaches to reducing the user-perceived access latency and improving the quality of service. In this paper, a Stochastic Petri Nets (SPN) based integrated web prefetching and caching model (IWPCM) is presented and its performance is evaluated. The performance metrics, namely access latency, throughput, HR (hit ratio) and BHR (byte hit ratio), are analyzed and discussed. Simulations show that, compared with the caching-only model (CM), IWPCM can further improve the throughput, HR and BHR and reduce the access latency. The performance evaluation based on the SPN model can provide a basis for implementing web prefetching and caching, and the combination of the two holds the promise of improving the QoS of web systems.
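For readers unfamiliar with the two cache metrics named in the abstract, the sketch below computes them using their usual definitions: HR is the fraction of requests served from the cache, and BHR is the fraction of requested bytes served from the cache. The function name `hit_ratios` and the trace format are illustrative assumptions; this is not the SPN model itself.

```python
def hit_ratios(requests):
    """Return (HR, BHR) for an iterable of (size_in_bytes, was_hit) pairs."""
    total = hits = total_bytes = hit_bytes = 0
    for size, was_hit in requests:
        total += 1
        total_bytes += size
        if was_hit:
            hits += 1
            hit_bytes += size
    # Guard against empty traces to avoid division by zero.
    hr = hits / total if total else 0.0
    bhr = hit_bytes / total_bytes if total_bytes else 0.0
    return hr, bhr

# Hypothetical trace: two small hits, two larger misses.
trace = [(100, True), (300, False), (100, True), (500, False)]
hr, bhr = hit_ratios(trace)
```

In this made-up trace half of the requests hit, but they carry only a fifth of the bytes, which is why the two metrics are usually reported together.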

8.

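Research on Web Log Mining Technology Based on Data Warehousing

Xi Jingke, Zhang Chen, Xie Hongxia 《计算机工程与设计》 (Computer Engineering and Design) 2007,28(24):5890-5892

Web log mining is currently a focus of Web mining research. To address the problems in Web log mining, a Web log mining scheme based on data warehouse technology is presented, and data preprocessing, data cube design, and the application of data mining techniques are discussed in some depth. Using the log of a Web site as an example, the paper describes in detail Web log data preprocessing, Web log cube design, and the implementation of the data mining algorithms, and builds a multidimensional Web log data set that can effectively solve difficult problems in Web log analysis.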

9.

A User-Aware Prefetching Mechanism for Video Streaming

Chung-Ming Huang Tz-Heng Hsu 《World Wide Web》2003,6(4):353-374

Random and unpredictable user behavior during a multimedia presentation may cause long retrieval latency in the client–server connection. To address this problem, we propose a prefetching scheme that uses association rules obtained through data mining. Data mining can provide priority information, such as support, confidence, and association rules, which can be used for prefetching continuous media. Thus, using data mining, the proposed prefetching policy can predict user behavior and evaluate segments that may be accessed in the near future. The proposed prefetching scheme was implemented and tested on synthetic data to estimate its effectiveness. Performance experiments show that the proposed prefetching scheme is effective in reducing latency, even for small cache sizes.
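As a reminder of what support and confidence measure, here is a minimal sketch using their standard definitions: the support of an itemset is the fraction of transactions containing it, and the confidence of a rule A → B is support(A ∪ B) / support(A). The segment names and session data are hypothetical, and the sketch does not reproduce the paper's prefetching policy.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= set(t) for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimated probability of consequent given antecedent."""
    joint = support(transactions, set(antecedent) | set(consequent))
    base = support(transactions, antecedent)
    return joint / base if base else 0.0

# Hypothetical viewing sessions: each lists the media segments accessed.
sessions = [
    ["s1", "s2", "s3"],
    ["s1", "s2"],
    ["s1", "s4"],
    ["s2", "s3"],
]
sup = support(sessions, {"s1", "s2"})        # 2 of 4 sessions
conf = confidence(sessions, {"s1"}, {"s2"})  # 0.5 / 0.75
```

A prefetcher built on such rules would typically fetch the consequent segment only when the rule's confidence exceeds some threshold, trading extra traffic against latency.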

10.

ART2 Clustering of Multiple Web Objects for Qualitative Web Prefetching

Chithra D. Gracia 《Applied Artificial Intelligence》2016,30(5):475-493

Web resources on the World Wide Web are growing, to a large extent because of the services and applications it provides. Because web traffic is heavy, accessing these resources incurs user-perceived latency. Although this latency can never be eliminated, it can be reduced substantially. Web prefetching is a technique that anticipates the user’s future requests and fetches the corresponding objects into the cache before an explicit request is made. Because web objects are of various types, a new algorithm is proposed that concentrates on prefetching embedded objects, including audio and video files. Further, clustering based on adaptive resonance theory 2 (ART2) is employed in order to prefetch embedded objects as clusters. For comparison, the web objects are clustered using ART2, ART1, and other statistical techniques. The clustering results confirm the superiority of ART2, and prefetching web objects in clusters is observed to produce a high hit rate.

11.

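A PPM Prediction Model Based on Web Object Popularity (cited by: 7; self-citations: 0; citations by others: 7)

Shi Lei, Zhang Yue, Pei Yunxia, Gu Zhimin 《小型微型计算机系统》 (Journal of Chinese Computer Systems) 2006,27(7):1378-1382

Web prefetching is one of the main solutions for reducing network latency and improving quality of service. Zipf's first law and second law are used to build access popularity models for Web objects in the high-frequency region and the low-frequency region, respectively, and a PPM prediction model based on Web object popularity is then proposed. Experiments show that, besides inheriting the simplicity and ease of implementation of the traditional PPM model, the model improves prediction accuracy to some extent while reducing model size, and keeps the network traffic caused by prefetching under control.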

12.

Prediction of Web Page Accesses by Proxy Server Log

Wu Yi-Hung Chen Arbee L. P. 《World Wide Web》2002,5(1):67-88

As the population of web users grows, the variety of user behaviors in accessing information also grows, which has a great impact on network utilization. Recently, many efforts have been made to analyze user behaviors on the WWW. In this paper, we represent user behaviors by sequences of consecutive web page accesses, derived from the access log of a proxy server. Moreover, the frequent sequences are discovered and organized as an index. Based on the index, we propose a scheme for predicting user requests and a proxy-based framework for prefetching web pages. We perform experiments on real data. The results show that our approach makes predictions with a high degree of accuracy and little overhead. In the experiments, the best hit ratio of the prediction reaches 75.69%, while the longest prediction takes only 2.3 ms.

13.

Xin Chen Xiaodong Zhang 《Computer》2003,36(3):63-70

The diverse server, client, and unique file object types used today slow Web performance. Caching alone offers limited performance relief because it cannot handle many different file types easily. One solution combines caching with Web prefetching: obtaining the Web data a client might need from data about that client's past surfing activity. The prediction by partial match model, for example, makes prefetching decisions by reviewing URLs clients have accessed on a particular server, then structuring them in a Markov predictor tree. The authors propose a variation of this model that builds common surfing patterns and regularities into the tree.
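To make the prediction-by-partial-match idea concrete, the following is a simplified sketch, not the authors' variation: it records which URL followed each recent context of length 1 to `max_order`, and predicts from the longest context that has been seen before. A flat dictionary keyed by context tuples stands in for the Markov predictor tree; the function names, `max_order`, and the sample sessions are assumptions made for illustration.

```python
from collections import Counter, defaultdict

def build_model(sessions, max_order=2):
    """Map each context (tuple of recent URLs) to counts of the next URL."""
    model = defaultdict(Counter)
    for session in sessions:
        for i in range(1, len(session)):
            for order in range(1, max_order + 1):
                if i - order < 0:
                    break
                context = tuple(session[i - order:i])
                model[context][session[i]] += 1
    return model

def predict(model, history, max_order=2):
    """Predict the next URL from the longest previously seen context."""
    for order in range(min(max_order, len(history)), 0, -1):
        context = tuple(history[-order:])
        if context in model:
            return model[context].most_common(1)[0][0]
    return None  # No matching context: make no prefetching decision.

# Hypothetical access sessions on one server.
sessions = [
    ["/home", "/news", "/sports"],
    ["/home", "/news", "/sports"],
    ["/blog", "/news", "/weather"],
]
model = build_model(sessions)
next_after_blog_news = predict(model, ["/blog", "/news"])
next_after_news = predict(model, ["/news"])
```

Falling back from longer to shorter contexts is what "partial match" refers to: the two-URL context `("/blog", "/news")` overrides the more general statistics for `("/news",)` alone.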

14.

A taxonomy of web prediction algorithms

Josep Domenech Bernardo de la Ossa Julio Sahuquillo Jose A. Gil Ana Pont 《Expert systems with applications》2012,39(9):8496-8502

Web prefetching techniques are an attractive solution to reduce the user-perceived latency. These techniques are driven by a prediction engine or algorithm that guesses the next actions of web users. A large number of prediction algorithms have been proposed since the first prefetching approach was published, although only over the last two or three years have they begun to be successfully implemented in commercial products. These algorithms can be implemented in any element of the web architecture and can use a wide variety of information as input. This affects their structure, data system, computational resources and accuracy. Knowing the input information and understanding how it can be handled to make predictions can help to improve the design of current prediction engines and, consequently, prefetching techniques. This paper analyzes fifty of the most relevant algorithms proposed over 15 years of prefetching research and proposes a taxonomy where the algorithms are classified according to the input data they use. For each group, the main advantages and shortcomings are highlighted.

15.

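Analysis of Web Prefetching Models (cited by: 1; self-citations: 0; citations by others: 1)

Wang Shike, Wu Ji, Jin Shiyao 《微机发展》 (Microcomputer Development) 2005,15(8):1-3,7

The rapid growth of the WWW has led to network congestion and server overload. Caching is regarded as one of the effective ways to reduce server load, relieve network congestion, and lower client access latency, but its effect is limited. Prefetching has been introduced to further improve WWW performance. This paper first introduces the basic idea of Web prefetching and the feasibility of research on it, then analyzes existing Web prefetching models, and finally gives the key properties that a Web prefetching model should have.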

16.

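Research on Web Prefetching Technology (cited by: 1; self-citations: 0; citations by others: 1)

Niu Wei, Zhang Yanyuan 《微计算机应用》 (Microcomputer Applications) 2008,29(7)

Prefetching is a main approach to raising the cache hit ratio and solving the Web access latency problem. This paper studies Web page prefetching, applies data mining to Web prefetching, and designs a Web prefetching model that provides users with personalized service. It describes in detail a method for preprocessing Web logs and proposes a new prefetching replacement algorithm.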

17.

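Design of a Web Usage Pattern Mining Model Based on Application Server Information

MEI Ying 《数字社区&智能家居》 (Digital Community & Smart Home) 2008,(14)

This paper introduces data mining of Web usage patterns, analyzes the limitations of Web server logs as source data, proposes Web usage pattern mining based on application server information, and on this basis improves the traditional Web usage pattern mining model.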

18.

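Research on a Web Transaction Clustering Method Based on Rough Approximation

Shen Qing, Han Xie, Jiang Yunliang 《计算机工程与设计》 (Computer Engineering and Design) 2007,28(18):4469-4471

Web usage mining is the application of data mining techniques to Web information repositories. It predicts user browsing behavior from knowledge obtained by mining Web server logs and is an important research direction in Web mining. The discovered knowledge, or some unexpected rules, are often imprecise and incomplete, which calls for soft computing techniques such as rough sets. A clustering method based on rough approximation is proposed that can cluster Web transactions from Web access logs. With this method, Web log records can be mined effectively to discover the patterns in which users access Web pages.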

19.

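Research on Compression and Application of Markov Prediction Models

Shi Lei, Yao Yao 《计算机应用》 (Journal of Computer Applications) 2007,27(11):2746-2749

Markov prediction models are the basis of Web prefetching and personalized recommendation. The large number of Web objects causes a sharp increase in the number of browsing transition states, which gives the prediction model a very high space complexity. Based on Web site link structure (WLS), a similarity measure based on row similarity and column similarity is proposed for the transition probability matrix of the Markov prediction model. The similarity matrix is computed first; row similarity and column similarity are then used to find similar pages and compress them together, reducing the number of states in the Markov model. Experiments show that the model has good overall performance and compression, and maintains high prediction precision and recall in terms of prefetching efficiency.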
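The abstract does not define its similarity measure, so the sketch below only illustrates the general idea under explicit assumptions: two states are candidates for merging when both their rows (outgoing transition probabilities) and their columns (incoming transition probabilities) of the matrix are alike. Cosine similarity, the `min` combination, and the 0.9 threshold are illustrative choices, not the paper's WLS-based measure.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def similar_states(P, threshold=0.9):
    """Return index pairs whose rows AND columns of P are both similar."""
    n = len(P)
    cols = [[P[i][j] for i in range(n)] for j in range(n)]
    pairs = []
    for i in range(n):
        for j in range(i + 1, n):
            row_sim = cosine(P[i], P[j])        # similar outgoing behavior
            col_sim = cosine(cols[i], cols[j])  # similar incoming behavior
            if min(row_sim, col_sim) >= threshold:
                pairs.append((i, j))
    return pairs

# Hypothetical 4-state transition probability matrix (each row sums to 1).
P = [
    [0.0, 0.0, 0.5, 0.5],
    [0.0, 0.0, 0.5, 0.5],
    [0.5, 0.5, 0.0, 0.0],
    [0.6, 0.4, 0.0, 0.0],
]
pairs = similar_states(P)
```

Merging each returned pair into a single state would halve this toy model from four states to two; how the merged transition probabilities are recomputed is left open here.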

20.

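A Data Preprocessing Method for Web Log Mining Based on Reconstructing Web Site Structure

Yuan Jian, Jin Xin 《小型微型计算机系统》 (Journal of Chinese Computer Systems) 2011,32(7)

In Web log mining, data preprocessing is the foundation of the whole process and directly affects the quality and results of mining. Most Web pages currently use a frame-based layout, but traditional preprocessing techniques do not filter frame pages; even when they do, the page structure becomes disordered, so correct information cannot be provided for path completion. This paper therefore proposes a data preprocessing method for Web log mining based on reconstructing the Web site structure, together with a path completion method built on it.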