Similar articles
Found 20 similar articles (search time: 15 ms)
1.
Knowledge and Information Systems - In the research area of time series classification, the ensemble shapelet transform algorithm is one of the state-of-the-art algorithms for classification....

2.
Time-series classification (TSC) problems present a specific challenge for classification algorithms: how to measure similarity between series. A shapelet is a time-series subsequence that allows for TSC based on local, phase-independent similarity in shape. Shapelet-based classification uses the similarity between a shapelet and a series as a discriminatory feature. One benefit of the shapelet approach is that shapelets are comprehensible, and can offer insight into the problem domain. The original shapelet-based classifier embeds the shapelet-discovery algorithm in a decision tree, and uses information gain to assess the quality of candidates, finding a new shapelet at each node of the tree through an enumerative search. Subsequent research has focused mainly on techniques to speed up the search. We examine how best to use the shapelet primitive to construct classifiers. We propose a single-scan shapelet algorithm that finds the best $k$ shapelets, which are used to produce a transformed dataset, where each of the $k$ features represents the distance between a time series and a shapelet. The primary advantages over the embedded approach are that the transformed data can be used in conjunction with any classifier, and that there is no recursive search for shapelets. We demonstrate that the transformed data, in conjunction with more complex classifiers, gives greater accuracy than the embedded shapelet tree. We also evaluate three similarity measures that produce equivalent results to information gain in less time. Finally, we show that by conducting post-transform clustering of shapelets, we can enhance the interpretability of the transformed data. We conduct our experiments on 29 datasets: 17 from the UCR repository, and 12 we provide ourselves.
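The transform step described in this abstract can be illustrated with a minimal sketch. It assumes the shapelets are already given (here they are hand-picked toy values, not the output of the paper's single-scan search), uses plain Euclidean distance without any normalization, and the function names `shapelet_distance` and `shapelet_transform` are hypothetical.

```python
import math

def shapelet_distance(series, shapelet):
    """Smallest Euclidean distance between the shapelet and any
    equal-length window of the series (assumption: no normalization)."""
    m = len(shapelet)
    best = math.inf
    for start in range(len(series) - m + 1):
        d = math.sqrt(sum((series[start + i] - shapelet[i]) ** 2 for i in range(m)))
        best = min(best, d)
    return best

def shapelet_transform(dataset, shapelets):
    """Map each series to a row of k features, one distance per shapelet."""
    return [[shapelet_distance(s, sh) for sh in shapelets] for s in dataset]

# Toy data: the first series contains the bump [1, 2, 1], the second is flat.
dataset = [[0, 0, 1, 2, 1, 0], [0, 0, 0, 0, 0, 0]]
shapelets = [[1, 2, 1], [0, 0]]  # hypothetical, hand-picked shapelets
features = shapelet_transform(dataset, shapelets)
```

Each row of `features` is an ordinary fixed-length feature vector, which is why the transformed data can be passed to any standard classifier.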

3.
4.
The problem of anomaly detection in time series has received a lot of attention in the past two decades. However, existing techniques cannot locate where the anomalies are within anomalous time series, or they require users to provide the length of potential anomalies. To address these limitations, we propose a self-learning online anomaly detection algorithm that automatically identifies anomalous time series, as well as the exact locations where the anomalies occur in the detected time series. In addition, for multivariate time series, it is difficult to detect anomalies due to the following challenges. First, anomalies may occur in only a subset of dimensions (variables). Second, the locations and lengths of anomalous subsequences may be different in different dimensions. Third, some anomalies may look normal in each individual dimension but anomalous when dimensions are considered in combination. To mitigate these problems, we introduce a multivariate anomaly detection algorithm which detects anomalies and identifies the dimensions and locations of the anomalous subsequences. We evaluate our approaches on several real-world datasets, including two CPU manufacturing datasets from Intel. We demonstrate that our approach can successfully detect the correct anomalies without requiring any prior knowledge about the data.

5.
Zhang Hanbo, Wang Peng, Fang Zicheng, Wang Zeyu, Wang Wei. World Wide Web, 2021, 24(2): 511-539
World Wide Web - In recent years, time series classification with shapelets, due to the high accuracy and good interpretability, has attracted considerable interest. These approaches extract or...

6.
7.
8.
Pattern Analysis and Applications - In the process of early classification, earliness and accuracy are two key indicators to evaluate the performance of classification, and early classification...

9.
Time series classification is related to many different domains, such as health informatics, finance, and bioinformatics. Due to its broad applications, researchers have developed many algorithms for this kind of task, e.g., multivariate time series classification. Among the classification algorithms, k-nearest neighbor (k-NN) classification (particularly 1-NN) combined with dynamic time warping (DTW) achieves state-of-the-art performance. The deficiency is that when the data set grows large, the time consumption of 1-NN with DTW becomes very expensive. In contrast to 1-NN with DTW, feature-based classification methods are more efficient but less effective, since their performance usually depends on the quality of hand-crafted features. In this paper, we aim to improve the performance of traditional feature-based approaches through feature learning techniques. Specifically, we propose a novel deep learning framework, multi-channels deep convolutional neural networks (MC-DCNN), for multivariate time series classification. This model first learns features from individual univariate time series in each channel, and combines information from all channels as feature representation at the final layer. Then, the learnt features are fed into a multilayer perceptron (MLP) for classification. Finally, extensive experiments on real-world data sets show that our model is not only more efficient than the state of the art but also competitive in accuracy. This study implies that feature learning is worth investigating for the problem of time series classification.

10.
Data Mining and Knowledge Discovery - We present XEM, an eXplainable-by-design Ensemble method for Multivariate time series classification. XEM relies on a new hybrid ensemble method that combines...

11.
A general procedure is derived for simulating univariate and multivariate nonnormal distributions using polynomial transformations of order five. The procedure allows for the additional control of the fifth and sixth moments. The ability to control higher moments increases the precision in the approximations of nonnormal distributions and lowers the skew and kurtosis boundary relative to the competing procedures considered. Tabled values of constants are provided for approximating various probability density functions. A numerical example is worked to demonstrate the multivariate procedure. The results of a Monte Carlo simulation are provided to demonstrate that the procedure generates specified population parameters and intercorrelations.

12.
Time series classification is a supervised learning problem aimed at labeling temporally structured multivariate sequences of variable length. The most common approach reduces time series classification to a static problem by suitably transforming the set of multivariate input sequences into a rectangular table composed of a fixed number of columns. Then, one of the alternative efficient methods for classification is applied for predicting the class of new temporal sequences. In this paper, we propose a new classification method, based on a temporal extension of discrete support vector machines, that benefits from the notions of warping distance and softened variable margin. Furthermore, in order to transform a temporal dataset into a rectangular shape, we also develop a new method based on fixed cardinality warping distances. Computational tests performed on both benchmark and real marketing temporal datasets indicate the effectiveness of the proposed method in comparison to other techniques.

13.
Multivariate time series (MTS) data are widely available in different fields including medicine, finance, bioinformatics, science and engineering. Modelling MTS data accurately is important for many decision making activities. One area that has been largely overlooked so far is the particular type of time series where the data set consists of a large number of variables but with a small number of observations. In this paper we describe the development of a novel computational method based on Natural Computation and sparse matrices that bypasses the size restrictions of traditional statistical MTS methods, makes no distribution assumptions, and also locates the associated parameters. Extensive results are presented, where the proposed method is compared with both traditional statistical and heuristic search techniques and evaluated on a number of criteria. The results have implications for a wide range of applications involving the learning of short MTS models.

14.
Time series classification tries to mimic the human understanding of similarity. When it comes to long or larger time series datasets, state-of-the-art classifiers reach their limits because of unreasonably high training or testing times. One representative example is the 1-nearest-neighbor dynamic time warping classifier (1-NN DTW) that is commonly used as the benchmark to compare to. It has several shortcomings: it has a quadratic time complexity in the time series length and its accuracy degenerates in the presence of noise. To reduce the computational complexity, early abandoning techniques, cascading lower bounds, or recently, a nearest centroid classifier have been introduced. Still, classification times on datasets of a few thousand time series are on the order of hours. We present our Bag-Of-SFA-Symbols in Vector Space classifier that is accurate, fast and robust to noise. We show that it is significantly more accurate than 1-NN DTW while being multiple orders of magnitude faster. Its low computational complexity combined with its good classification accuracy makes it relevant for use cases like long or large amounts of time series or real-time analytics.

15.
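Existing similarity measures for multivariate time series struggle to balance measurement accuracy against computational efficiency. To address this problem, the multivariate time series is first fitted piecewise across its dimensions; next, the mean of the series points on each segment is taken as a feature; finally, with the feature sequences as input, the dynamic time warping algorithm is used to measure similarity. Experimental results show that the proposed method is simple to configure and effectively reduces computational complexity while preserving measurement accuracy.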
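The pipeline in this abstract (segment, take per-segment means, compare with dynamic time warping) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' method: the segments here are simply equal-length chunks rather than the result of a piecewise fit, the local cost is the Euclidean distance between mean vectors, and the names `segment_means` and `dtw` are hypothetical.

```python
import math

def segment_means(series, n_segments):
    """series: list of time points, each a list of d values.
    Returns one mean vector per segment (assumption: equal-length
    segments, and len(series) >= n_segments)."""
    n, d = len(series), len(series[0])
    feats = []
    for k in range(n_segments):
        chunk = series[k * n // n_segments:(k + 1) * n // n_segments]
        feats.append([sum(p[j] for p in chunk) / len(chunk) for j in range(d)])
    return feats

def dtw(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping over vector sequences."""
    cost = [[math.inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            local = math.dist(a[i - 1], b[j - 1])  # Euclidean local cost
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[len(a)][len(b)]

# Two toy bivariate series of length 4, reduced to 2 segments each.
x = [[0, 0], [0, 0], [1, 1], [1, 1]]
y = [[0, 0], [1, 1], [1, 1], [1, 1]]
fx, fy = segment_means(x, 2), segment_means(y, 2)
distance = dtw(fx, fy)
```

Because DTW is quadratic in sequence length, running it on a few segment means instead of every raw point is where the reduction in computation comes from.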

16.
A strategy for improving the speed of the previously proposed evolving neuro-fuzzy model (ENFM) is presented in this paper to make it more appropriate for online applications. By considering a recursive extension of Gath-Geva clustering, the ENFM takes advantage of elliptical clusters for defining the validity region of its neurons, which leads to better modeling with fewer neurons. But this necessitates computing the inverse and determinant of the covariance matrices, which is time consuming in online applications with a large number of input variables. In this paper a strategy for recursive estimation of the singular value decomposition components of covariance matrices is proposed, which converts the burdensome computations to calculating the inverse and determinant of a diagonal matrix while keeping the advantages of elliptical clusters. The proposed method is applied to online detection of epileptic seizures in addition to prediction of the Mackey-Glass time series and modeling of a time-varying heat exchanger. Simulation results show that the time required for training and testing the fast ENFM is far less than for its basic model. Moreover, its modeling ability is similar to that of the ENFM, which is superior to other online modeling approaches.
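Why the decomposition helps can be seen from a standard linear-algebra identity, sketched here under the assumption that the covariance matrix $\Sigma$ is symmetric positive definite with decomposition $\Sigma = U S U^{\top}$, where $U$ is orthogonal and $S = \mathrm{diag}(s_1, \dots, s_n)$. The symbols $x$ and $v$ (an input and a cluster center) are illustrative and not taken from the paper.

```latex
\[
\Sigma^{-1} = U S^{-1} U^{\top}, \qquad
\det \Sigma = \prod_{i=1}^{n} s_i, \qquad
(x - v)^{\top} \Sigma^{-1} (x - v) = \sum_{i=1}^{n} \frac{\bigl(u_i^{\top}(x - v)\bigr)^{2}}{s_i}
\]
```

Here $u_i$ is the $i$-th column of $U$. Once $U$ and $S$ are tracked recursively, the inverse and determinant involve only the diagonal entries $s_i$.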

17.
Machine Learning - An increasing number of applications require to recognize the class of an incoming time series as quickly as possible without unduly compromising the accuracy of the prediction....

18.
In recent years, dynamic time warping (DTW) has begun to become the most widely used technique for comparison of time series data where extensive a priori knowledge is not available. However, a multivariate comparison method is often expected to consider the correlation between the variables, as this correlation carries the real information in many cases. Thus, principal component analysis (PCA) based similarity measures, such as the PCA similarity factor (SPCA), are used in many industrial applications. In this paper, we present a novel algorithm called correlation based dynamic time warping (CBDTW), which combines DTW and PCA based similarity measures. To preserve correlation, multivariate time series are segmented and the local dissimilarity function of DTW is derived from SPCA. The segments are obtained by bottom-up segmentation using special, PCA related costs. Our novel technique was validated on two databases, the database of the signature verification competition 2004 and the commonly used AUSLAN dataset. We show that CBDTW outperforms the standard SPCA and the most commonly used Euclidean distance based multivariate DTW on datasets with complex correlation structure.
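For reference, the PCA similarity factor mentioned in this abstract is commonly written as below. This is the usual textbook form, given as a sketch: the notation ($L$ and $M$ as $p \times k$ matrices whose columns are the first $k$ principal components of the two data sets, $\theta_{ij}$ as the angle between component $i$ of one set and component $j$ of the other) is an assumption and may differ from the paper's.

```latex
\[
S_{\mathrm{PCA}} = \frac{1}{k}\,\mathrm{trace}\!\left(L^{\top} M M^{\top} L\right)
= \frac{1}{k} \sum_{i=1}^{k} \sum_{j=1}^{k} \cos^{2}\theta_{ij}
\]
```

The value lies between 0 and 1, with 1 meaning the two $k$-dimensional principal subspaces coincide; a dissimilarity such as $1 - S_{\mathrm{PCA}}$ is one plausible way to turn it into a local cost.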

19.
20.
Multivariate time series may contain outliers of different types. In the presence of such outliers, applying standard multivariate time series techniques becomes unreliable. A robust version of multivariate exponential smoothing is proposed. The method is affine equivariant, and involves the selection of a smoothing parameter matrix by minimizing a robust loss function. It is shown that the robust method results in much better forecasts than the classic approach in the presence of outliers, and performs similarly when the data contain no outliers. Moreover, the robust procedure yields an estimator of the smoothing parameter less subject to downward bias. As a byproduct, a cleaned version of the time series is obtained, as is illustrated by means of a real data example.
