Similar literature
 20 similar documents found; search took 620 ms
1.
Dynamic time warping (DTW), which finds the minimum-cost path by providing non-linear alignments between two time series, has been widely used as a distance measure for time series classification and clustering. However, DTW does not account for the relative importance of the phase difference between a reference point and a testing point. This may lead to misclassification, especially in applications where the shape similarity between two sequences is a major consideration for accurate recognition. Therefore, we propose a novel distance measure, called weighted DTW (WDTW), which is a penalty-based DTW. Our approach penalizes points with a higher phase difference between a reference point and a testing point in order to prevent minimum distance distortion caused by outliers. The rationale underlying the proposed distance measure is demonstrated with some illustrative examples. A new weight function, called the modified logistic weight function (MLWF), is also proposed to systematically assign weights as a function of the phase difference between a reference point and a testing point. By applying different weights to adjacent points, the proposed algorithm can enhance the detection of similarity between two time series. We show that some popular distance measures such as DTW and Euclidean distance are special cases of our proposed WDTW measure. We extend the proposed idea to other variants of DTW such as derivative dynamic time warping (DDTW) and propose the weighted version of DDTW. We have compared the performance of our proposed procedures with other popular approaches using public data sets available through the UCR Time Series Data Mining Archive for both time series classification and clustering problems. The experimental results indicate that the proposed approaches can achieve improved accuracy for time series classification and clustering problems.
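
The penalty idea in this abstract can be illustrated with a minimal Python sketch. The abstract does not give the formula for the weight function, so the logistic form below, the parameter names `g` and `w_max`, the squared point cost and the function names `mlwf` and `wdtw` are all assumptions made for illustration, not the authors' definition.

```python
import math

def mlwf(phase_diff, length, g=0.05, w_max=1.0):
    """Logistic weight for a phase difference |i - j| (assumed form)."""
    midpoint = length / 2.0
    return w_max / (1.0 + math.exp(-g * (phase_diff - midpoint)))

def wdtw(a, b, g=0.05, w_max=1.0):
    """Cumulative weighted DTW cost with squared point differences."""
    n, m = len(a), len(b)
    length = max(n, m)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Points that are far apart in time receive a larger weight.
            w = mlwf(abs(i - j), length, g, w_max)
            d = w * (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

In this sketch, `g = 0` makes every weight equal to `w_max / 2`, so the result is plain DTW up to a constant factor; larger `g` makes off-diagonal matches progressively more expensive.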

2.
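Chang Bingguo, Zang Hongying. Journal of Computer Applications (《计算机应用》), 2018, 38(7): 1910-1915
The traditional dynamic time warping (DTW) measure is prone to excessive warping and has high computational complexity and low efficiency. To address these problems, a path-correction-based dynamic time warping measure (UDTW) is proposed. First, a piecewise dimensionality-reduction method, piecewise local maximum smoothing (PLM), extracts the feature information of the sequence and reduces the computational cost of UDTW. Second, to respect the shape similarity of time series, a dynamic penalty coefficient is applied to excessively warped paths to correct the degree of warping. Finally, on the basis of the improved distance, a 1-nearest-neighbor classifier is used to classify the time series data, improving the accuracy and efficiency of time series similarity measurement. Experimental results on 15 UCR datasets show that UDTW achieves higher classification accuracy than traditional DTW and classifies 3 of the datasets with 100% accuracy. Compared with derivative DTW (DDTW), UDTW improves classification accuracy by up to 71.8%, while PLM-UDTW reduces execution time by 99% without affecting classification accuracy.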

3.

There exist a variety of distance measures which operate on time series kernels. The objective of this article is to compare those distance measures in a support vector machine setting. A support vector machine is a state-of-the-art classifier for static (non-time series) datasets and usually outperforms k-Nearest Neighbour; however, it is often noted that 1-NN DTW is a robust baseline for time-series classification. Through a collection of experiments we determine that the most effective distance measure is Dynamic Time Warping and the most effective classifier is kNN. However, a surprising result is that the pairing of kNN and DTW is not the most effective model. Instead, we have discovered via experimentation that Dynamic Time Warping paired with the Gaussian Support Vector Machine is the most accurate time series classifier. Finally, with good reason we recommend a model that is slightly inferior in terms of accuracy, Time Warp Edit Distance paired with the Gaussian Support Vector Machine, as it has a better theoretical basis. We also discuss the reduction in computational cost achieved by using a Support Vector Machine, finding that the Negative Kernel paired with the Dynamic Time Warping distance produces the greatest reduction in computational cost.


4.
Among many existing distance measures for time series data, Dynamic Time Warping (DTW) distance has been recognized as one of the most accurate and suitable distance measures due to its flexibility in sequence alignment. However, DTW distance calculation is computationally intensive. Especially in very large time series databases, a sequential scan through the entire database is impractical, even with random access that exploits some index structures, since the high dimensionality of time series data incurs extremely high I/O cost. More specifically, a sequential structure consumes high CPU but low I/O costs, while an index structure requires low CPU but high I/O costs. In this work, we therefore propose a novel indexed sequential structure called TWIST (Time Warping in Indexed Sequential sTructure) which benefits from both sequential access and index structure. When a query sequence is issued, TWIST calculates lower bounding distances between a group of candidate sequences and the query sequence, and then identifies the data access order in advance, thereby greatly reducing the number of both sequential and random accesses. Our indexed sequential structure achieves significant speedup in the querying process. In addition, our method shows superiority over existing rival methods in terms of query processing time, number of page accesses, and storage requirement, with no false dismissals guaranteed.

5.
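Liu Shuai, Liu Changliang, Zhen Chenggang. Journal of Computer Applications (《计算机应用》), 2019, 39(4): 1229-1233
In wind turbine fault early warning, the original dynamic time warping (DTW) algorithm cannot effectively measure the distance between multivariate time series data of wind turbines. To address this problem, a dynamic time warping algorithm based on hesitant fuzzy sets (HFS-DTW) is proposed. The algorithm is an extension of the original DTW algorithm, can measure distances for both univariate and multivariate time series data, and is better than the original DTW algorithm in both accuracy and speed. With the sub-series similarity distance as the objective function, the imperialist competitive algorithm (ICA) is used to optimize the sub-sequence length and step-size parameters of HFS-DTW. Case studies show that, compared with DTW alone and with HFS-DTW using non-optimal parameters, HFS-DTW with optimal parameters mines more multidimensional feature-point information and outputs similarity sequences of multidimensional feature points with richer detail. Based on the proposed algorithm, a wind turbine gearbox fault can be warned of 10 days in advance.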

6.
Dynamic Time Warping (DTW) is a popular method for measuring the similarity of time series. It is widely used in various domains. A major drawback of DTW is its high computational complexity. To address this problem, pruning techniques to calculate the exact DTW distance, as well as DTW approximation methods, have become important approaches. In this paper, we introduce Blocked Dynamic Time Warping (BDTW), a new similarity measure which works on a run-length encoded time series representation. BDTW utilizes any repetitive values (zero and nonzero) in time series to reduce DTW computation time. BDTW closely approximates the DTW distance, and it is significantly faster than traditional DTW for time series with high levels of value repetition. Moreover, BDTW can be combined with time series representation methods which provide constant segments, to serve as a close approximation method even for time series without value repetition. Constrained BDTW, the BDTW upper bound and the BDTW lower bound are discussed as variations of BDTW. The BDTW upper bound and BDTW lower bound are presented as a new DTW upper bound and lower bound which can be efficiently applied to time series with high levels of value repetition for pruning unpromising alignments and matches in the exact DTW calculation. We show the effectiveness of BDTW and its variations on different applications using the following datasets: Almanac of Minutely Power, Refit Smart Homes, as well as the 85 datasets from the University of California, Riverside time series classification archive (UCR archive).
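
The run-length encoded representation mentioned in this abstract can be sketched in a few lines of Python. This is a generic encoder and decoder only; the `(value, run_length)` pair layout and the function names are assumptions for illustration, and the block does not implement BDTW itself.

```python
from itertools import groupby

def run_length_encode(series):
    """Collapse each run of equal consecutive values into (value, run_length)."""
    return [(value, len(list(run))) for value, run in groupby(series)]

def run_length_decode(pairs):
    """Expand (value, run_length) pairs back into the original series."""
    return [value for value, count in pairs for _ in range(count)]
```

A series with long constant stretches shrinks to a handful of pairs, which is why a warping measure defined on the encoded form can touch far fewer cells than one defined on the raw samples.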

7.
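An accelerated time series warping algorithm based on early abandoning   (Total citations: 3; self-citations: 0; citations by others: 3)
Dynamic time warping (DTW) distance is an important distance measure for time series similarity search, but its exact computation is a performance bottleneck. To address this problem, a method named EA_DTW is proposed to accelerate the exact computation of the DTW distance. When computing the distance of each cell in the cumulative distance matrix, the method checks whether the value exceeds a threshold and, once it does, abandons the distance computation of the remaining related cells early. A theoretical analysis of the EA_DTW procedure is also given. Comparative experiments show that EA_DTW improves the computational efficiency of DTW, and more markedly so when the threshold is small relative to the DTW distance.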
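
The early-termination idea can be sketched as follows. This is a simplified illustration rather than the EA_DTW procedure itself: the absolute-difference point cost, the row-by-row layout and the function name `dtw_early_abandon` are assumptions. Because point costs are non-negative, a cell whose cumulative cost already exceeds the threshold cannot lie on any path whose final cost is within the threshold, so it is marked unreachable.

```python
def dtw_early_abandon(a, b, threshold):
    """Exact DTW cost if it is <= threshold, otherwise infinity."""
    inf = float("inf")
    prev = [inf] * (len(b) + 1)
    prev[0] = 0.0
    for i in range(1, len(a) + 1):
        curr = [inf] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            best = min(prev[j], curr[j - 1], prev[j - 1])
            if best == inf:
                continue  # every predecessor was pruned, so skip this cell
            c = abs(a[i - 1] - b[j - 1]) + best
            if c <= threshold:
                curr[j] = c  # keep only cells that can still beat the threshold
        if all(v == inf for v in curr):
            return inf  # the whole row was pruned: abandon the computation
        prev = curr
    return prev[len(b)]
```

The smaller the threshold is relative to the true distance, the earlier whole rows are pruned, which matches the behaviour the abstract reports.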

8.
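To address the problem that dynamic time warping (DTW) algorithms cannot improve computation speed while maintaining classification accuracy, an elastic coarse-grained dynamic time warping (CG-DTW) algorithm based on the idea of naive granular computing is proposed. First, a good time-series granularity is obtained by computing variance features of the series, and the granular features replace the original sequence. Next, the DTW algorithm is run on these features, allowing the elasticity between the compared series granules to be adjusted dynamically so as to obtain the relatively optimal corresponding granules. Finally, the DTW distance is computed under the optimal granule correspondence. An early-abandoning strategy based on a lower-bound function is also introduced to further improve the efficiency of CG-DTW. Experimental results show that the proposed algorithm runs about 21.4% faster than the classical algorithm and is nearly 32.3 percentage points more accurate than dimensionality-reduction strategies. For long sequences in particular, CG-DTW maintains accuracy while achieving high running efficiency. In practical applications, CG-DTW can handle the classification of uncertain long sequences.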

9.
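Jiang Yifan, Ye Qing. Journal of Computer Applications (《计算机应用》), 2019, 39(4): 1041-1045
In data mining tasks such as time series classification, class-based similarity behaves very differently across datasets, so a reasonable and effective similarity measure is critical. Traditional methods such as Euclidean distance, cosine distance and dynamic time warping compute similarity by a formula over the data alone and ignore the influence that the label knowledge contained in each dataset has on the similarity measure. To solve this problem, a time series similarity metric learning method based on a Siamese neural network (SNN) is proposed. The method learns neighborhood relations between data from the supervision provided by sample labels and builds an efficient distance measure between time series. Similarity measurement and confirmatory classification experiments on UCR time series datasets show that, compared with ED/DTW-1NN, SNN clearly improves overall classification quality. Although the 1-nearest-neighbor (1NN) classifier based on dynamic time warping (DTW) outperforms the SNN-based 1NN classifier on some datasets, SNN is better than DTW in the complexity and speed of similarity computation during classification. The proposed method therefore clearly improves the efficiency of similarity measurement on classification datasets and performs well on the classification of high-dimensional, complex time series.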

10.
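An efficient lower-bounding technique for time series   (Total citations: 3; self-citations: 0; citations by others: 3)
For time series data, a new lower-bounding technique based on dynamic time warping is proposed. The technique first reduces the dimensionality of the original sequence using the linear representation of piecewise aggregate approximation, generates a grid minimum bounding rectangle approximation of the query sequence, and then measures the lower-bound distance between the two using the dynamic time warping distance. Experimental results show that, compared with earlier related techniques, this lower bound yields larger lower-bound distances, is tighter, prunes the search space more strongly and runs faster, which benefits time series data mining.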
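
To make the notion of a DTW lower bound concrete, here is a minimal Python sketch of a different and widely used bound, LB_Keogh. It is not the grid-based technique of this abstract, which is not specified in enough detail to reproduce. The sketch assumes equal-length series, a warping band of half-width `r` and squared point costs; the function name is chosen for illustration.

```python
def lb_keogh(query, candidate, r):
    """Sum of squared overshoots of candidate beyond the query's envelope."""
    total = 0.0
    for i, c in enumerate(candidate):
        # Envelope of the query over the indices a band of width r can reach.
        window = query[max(0, i - r): i + r + 1]
        upper, lower = max(window), min(window)
        if c > upper:
            total += (c - upper) ** 2
        elif c < lower:
            total += (lower - c) ** 2
    return total
```

Each candidate point must be matched to some query point inside the band, and the cost of that match is at least the squared overshoot, so the sum never exceeds the band-constrained DTW cost. A larger (tighter) bound prunes more candidates before the full DTW computation is needed.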

11.
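The dynamic time warping algorithm with limited alignment path length (LDTW) suffers from high time complexity and a heavy computational load. Based on LDTW, a dynamic time warping algorithm with fixed alignment path length (FDTW) is proposed. The strategy for controlling the alignment path length in LDTW is changed from restricting it to an interval to fixing it at a specific value, which correspondingly shrinks the range of elements computed in the cumulative cost matrix. Experimental results on UCR time series datasets show that FDTW matches LDTW in classification accuracy but has a smaller time overhead during classification, and that it effectively reduces the number of cumulative cost matrix elements computed, improving computational efficiency.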

12.
Similarity search is a core module of many data analysis tasks, including search by example, classification, and clustering. For time series data, Dynamic Time Warping (DTW) has been proven a very effective similarity measure, since it minimizes the effects of shifting and distortion in time. However, the cost of computing DTW, which is quadratic in the length of the matched sequences, makes its direct application to databases of long time series very expensive. We propose a technique that decomposes the sequences into a number of segments and uses cheap approximations thereof to compute fast lower bounds for their warping distances. We present several progressively tighter bounds, depending on whether warping constraints exist. Finally, we develop an index and a multi-step technique that uses the proposed bounds and performs two levels of filtering to efficiently process similarity queries. A thorough experimental study suggests that our method consistently outperforms state-of-the-art methods for DTW similarity search.

13.
Dynamic time warping (DTW) distance has been effectively used in mining time series data in a multitude of domains. However, in its original formulation DTW is extremely inefficient in comparing long sparse time series, containing mostly zeros and some unevenly spaced nonzero observations. The original DTW distance does not take advantage of this sparsity, leading to redundant calculations and a prohibitively large computational cost for long time series. We derive a new time warping similarity measure (AWarp) for sparse time series that works on the run-length encoded representation of sparse time series. The complexity of AWarp is quadratic in the number of observations as opposed to the range of time of the time series. Therefore, AWarp can be several orders of magnitude faster than DTW on sparse time series. AWarp is exact for binary-valued time series and a close approximation of the original DTW distance for any-valued series. We discuss useful variants of AWarp: bounded (both upper and lower), constrained, and multidimensional. We show applications of AWarp to three data mining tasks including clustering, classification, and outlier detection, which are otherwise not feasible using classic DTW, while producing equivalent results. Potential areas of application include bot detection, human activity classification, search trend analysis, seismic analysis, and unusual review pattern mining.

14.
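A new method for learning the optimal DTW warping window   (Total citations: 1; self-citations: 0; citations by others: 1)
Chen Qian, Hu Guyu. Computer Science (《计算机科学》), 2012, 39(8): 191-195
In time series similarity queries, the DTW (Dynamic Time Warping) distance is the classical measure that supports time warping, and DTW with a constrained warping window is its most common practical form. The problems of the traditional method for learning the optimal DTW warping window are analyzed; on this basis, the concept of time distance is introduced and a new method for learning the optimal DTW warping window is proposed. Because the time distance is a by-product of the DTW computation, the method can improve the classification accuracy of DTW with almost no additional computation. Experiments show that with the new learning method the classification accuracy of DTW with the optimal warping window improves markedly, exceeding that of ERP (Edit Distance with Real Penalty) and LCSS (Longest Common SubSequence) and approaching that of TWED (Time Warp Edit Distance).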
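
For context, the following Python sketch shows window-constrained DTW and a conventional way of choosing the window by leave-one-out 1-nearest-neighbor accuracy. It illustrates the baseline setting only: the time-distance method proposed in this abstract is not reproduced, and the function names, the squared point cost and the selection rule are assumptions.

```python
def dtw_window(a, b, w):
    """DTW cost restricted to cells with |i - j| <= w (Sakoe-Chiba band)."""
    n, m = len(a), len(b)
    w = max(w, abs(n - m))  # the band must at least reach the final cell
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - w), min(m, i + w) + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def loo_accuracy(series, labels, w):
    """Leave-one-out accuracy of a 1-NN classifier using dtw_window."""
    hits = 0
    for i, s in enumerate(series):
        others = (k for k in range(len(series)) if k != i)
        nearest = min(others, key=lambda k: dtw_window(s, series[k], w))
        hits += labels[nearest] == labels[i]
    return hits / len(series)

def best_window(series, labels, candidates):
    """First candidate window with the highest leave-one-out accuracy."""
    return max(candidates, key=lambda w: loo_accuracy(series, labels, w))
```

With `w = 0` the measure reduces to the squared Euclidean distance on equal-length series, and widening the band allows progressively more warping.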

15.
Dynamic time warping (DTW) has proven itself to be an exceptionally strong distance measure for time series. DTW in combination with one-nearest neighbor, one of the simplest machine learning methods, has been difficult to convincingly outperform on the time series classification task. In this paper, we present a simple technique for time series classification that exploits DTW’s strength on this task. But instead of directly using DTW as a distance measure to find nearest neighbors, the technique uses DTW to create new features which are then given to a standard machine learning method. We experimentally show that our technique improves over one-nearest neighbor DTW on 31 out of 47 UCR time series benchmark datasets. In addition, this method can be easily extended to be used in combination with other methods. In particular, we show that when combined with the symbolic aggregate approximation (SAX) method, it improves over SAX on 37 out of 47 UCR datasets. Thus the proposed method also provides a mechanism to combine distance-based methods like DTW with feature-based methods like SAX. We also show that combining the proposed classifiers through ensembles further improves the performance on time series classification.

16.
In recent years Dynamic Time Warping (DTW) has emerged as the distance measure of choice for virtually all time series data mining applications. For example, virtually all applications that process data from wearable devices use DTW as a core sub-routine. This is the result of significant progress in improving DTW’s efficiency, together with multiple empirical studies showing that DTW-based classifiers at least equal (and generally surpass) the accuracy of all their rivals across dozens of datasets. Thus far, most of the research has considered only the one-dimensional case, with practitioners generalizing to the multi-dimensional case in one of two ways, dependent or independent warping. In general, it appears the community believes either that the two ways are equivalent, or that the choice is irrelevant. In this work, we show that this is not the case. The two most commonly used multi-dimensional DTW methods can produce different classifications, and neither one dominates over the other. This seems to suggest that one should learn the best method for a particular application. However, we will show that this is not necessary; a simple, principled rule can be used on a case-by-case basis to predict which of the two methods we should trust at the time of classification. Our method allows us to ensure that classification results are at least as accurate as the better of the two rival methods, and, in many cases, our method is significantly more accurate. We demonstrate our ideas with the most extensive set of multi-dimensional time series classification experiments ever attempted.
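
The difference between the two generalizations named in this abstract can be shown with a short Python sketch. The function names and the squared point cost are assumptions for illustration; the selection rule the authors propose is not reproduced here.

```python
def dtw(a, b, dist):
    """Plain cumulative DTW cost under a caller-supplied point distance."""
    inf = float("inf")
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d = dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[len(a)][len(b)]

def dtw_independent(x, y):
    """Warp every dimension separately and add up the per-dimension costs."""
    dims = len(x[0])
    return sum(
        dtw([p[k] for p in x], [q[k] for q in y], lambda u, v: (u - v) ** 2)
        for k in range(dims)
    )

def dtw_dependent(x, y):
    """Warp all dimensions together along one shared alignment path."""
    return dtw(x, y, lambda p, q: sum((u - v) ** 2 for u, v in zip(p, q)))

# Two 2-D series whose dimensions are shifted in opposite directions.
x = [(0, 0), (1, 0), (0, 1), (0, 0)]
y = [(0, 0), (0, 1), (1, 0), (0, 0)]
```

For `x` and `y` above, independent warping aligns each dimension perfectly and returns 0, while dependent warping cannot satisfy both shifts with one path and returns a positive cost, so the two methods can rank neighbors differently.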

17.
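The DTW (Dynamic Time Warping) algorithm is widely used for aligning sequence data in order to measure the distance between sequences, but its high time complexity limits its use on long sequences. A sequence similarity alignment algorithm based on an adaptive search window (ADTW) is proposed. The algorithm uses the Piecewise Aggregate Approximation (PAA) strategy to sample the sequences into low-resolution sequences, computes the alignment path on the low-resolution sequences, predicts the path deviation from the gradient changes on the low-resolution distance matrix, and limits how far the path search window expands. It then progressively raises the sequence resolution, corrects the path within the search window and computes a new search window, finally obtaining the DTW distance and the similarity alignment path quickly. Compared with FastDTW, ADTW improves computational efficiency by about 20% at the same measurement accuracy, and its time complexity is O(n).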
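
The PAA down-sampling step named in this abstract is simple enough to sketch in Python. This shows the coarsening step only, not the adaptive window search; the function name and the integer segment boundaries are assumptions, and `segments` is assumed not to exceed the series length.

```python
def paa(series, segments):
    """Replace each of `segments` roughly equal chunks by its mean."""
    n = len(series)
    reduced = []
    for k in range(segments):
        start = k * n // segments
        end = (k + 1) * n // segments
        chunk = series[start:end]
        reduced.append(sum(chunk) / len(chunk))
    return reduced
```

Halving the number of segments at each level gives the kind of resolution pyramid on which a coarse alignment path can be computed first and then refined.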

18.
Clustering of stationary time series has become an important tool in many scientific applications, such as medicine and finance. Time series clustering methods are based on the calculation of suitable similarity measures which identify the distance between two or more time series. These measures are computed either in the time domain or in the spectral domain. Since the computation of time domain measures is rather cumbersome, we resort to spectral domain methods. A new measure of distance is proposed, based on the so-called cepstral coefficients, which carry information about the log spectrum of a stationary time series. These coefficients are estimated by means of a semiparametric model which assumes that the log-likelihood ratio of two or more unknown spectral densities has a linear parametric form. After estimation, the estimated cepstral distance measure is given as an input to a clustering method to produce the disjoint groups of data. Simulated examples show that the method yields good results, even when the processes are not necessarily linear. These cepstral-based clustering algorithms are applied to biological time series. In particular, the proposed methodology effectively identifies distinct and biologically relevant classes of amino acid sequences with the same physicochemical properties, such as hydrophobicity.

19.
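Guo Xiaofang, Li Feng. Computer Engineering and Applications (《计算机工程与应用》), 2012, 48(23): 111-114, 119
To improve the efficiency of similarity measurement for multivariate time series, a principal component analysis (PCA) method with the extended Frobenius norm (Eros) is adopted: principal components and eigenvalues are used to construct a principal-component similarity factor for comparing the similarity between multivariate time series matrices. To verify the effectiveness of the method, experiments were run on three datasets (two real, one synthetic). The results show that the method has certain advantages over the earlier Euclidean distance (ED) and dynamic time warping (DTW) similarity measures.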

20.
Scaling and time warping in time series querying   (Total citations: 3; self-citations: 0; citations by others: 3)
The last few years have seen an increasing understanding that dynamic time warping (DTW), a technique that allows local flexibility in aligning time series, is superior to the ubiquitous Euclidean distance for time series classification, clustering, and indexing. More recently, it has been shown that for some problems, uniform scaling (US), a technique that allows global scaling of time series, may be just as important. In this work, we note that for many real world problems, it is necessary to combine both DTW and US to achieve meaningful results. This is particularly true in domains where we must account for the natural variability of human actions, including biometrics, query by humming, motion-capture/animation, and handwriting recognition. We introduce the first technique which can handle both DTW and US simultaneously. Our techniques involve search pruning by means of a lower bounding technique and multi-dimensional indexing to speed up the search. We demonstrate the utility and effectiveness of our method on a wide range of problems in industry, medicine, and entertainment.
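
Uniform scaling on its own can be sketched in Python as follows. This is an illustration under assumptions: the nearest-index resampling rule, the squared Euclidean cost, the brute-force search over prefix lengths and the function names are not taken from the paper, and the lower-bounding and indexing steps are omitted. `min_len` is assumed to be at least 1.

```python
def rescale(series, new_length):
    """Stretch or shrink a series to new_length by nearest-index resampling."""
    n = len(series)
    return [series[j * n // new_length] for j in range(new_length)]

def uniform_scaling_distance(query, candidate, min_len, max_len):
    """Best squared Euclidean distance over rescaled prefixes of candidate."""
    best = float("inf")
    for p in range(min_len, min(max_len, len(candidate)) + 1):
        stretched = rescale(candidate[:p], len(query))
        d = sum((q - c) ** 2 for q, c in zip(query, stretched))
        best = min(best, d)
    return best
```

Replacing the squared Euclidean sum with a DTW cost inside the loop would be one way to combine global scaling with local warping, at a correspondingly higher computational cost.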
