A Subsequence Matching Algorithm that Supports Normalization Transform in Time-Series Databases期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

A Subsequence Matching Algorithm that Supports Normalization Transform in Time-Series Databases

Authors:	Woong-Kee Loh Sang-Wook Kim Kyu-Young Whang

Affiliation:	(1) Department of Computer Science & Advanced Information Technology Research Center (AITrc), Korea Advanced Institute of Science and Technology (KAIST), Korea;(2) College of Information and Communications, Hanyang University, Korea

Abstract:	In this paper, an algorithm is proposed for subsequence matching that supports normalization transform in time-series databases. Normalization transform enables finding sequences with similar fluctuation patterns even though they are not close to each other before the normalization transform. Simple application of existing subsequence matching algorithms to support normalization transform is not feasible since the algorithms do not have information for normalization transform of subsequences of arbitrary lengths. Application of the existing whole matching algorithm supporting normalization transform to the subsequence matching is feasible, but requires an index for every possible length of the query sequence causing serious overhead on both storage space and update time. The proposed algorithm generates indexes only for a small number of different lengths of query sequences. For subsequence matching it selects the most appropriate index among them. Better search performance can be obtained by using more indexes. In this paper, the approach is called index interpolation. It is formally proved that the proposed algorithm does not cause false dismissal. The search performance can be traded off with storage space by adjusting the number of indexes. For performance evaluation, a series of experiments is conducted using the indexes for only five different lengths out of lengths 256512 of the query sequence. The results show that the proposed algorithm outperforms the sequential scan by up to 2.4 times on the average when the selectivity of the query is 10^–2 and up to 14.6 times when it is 10^–5. Since the proposed algorithm performs better with smaller selectivities, it is suitable for practical situations, where the queries with smaller selectivities are much more frequent.

Keywords:	subsequence matching normalization transform index interpolation time-series databases
本文献已被 SpringerLink 等数据库收录！

设为首页 | 免责声明 | 关于勤云 | 加入收藏