Similar Literature
1.
In this paper we describe a method to analyze the structure and dynamics of the 30 largest North American companies. The method combines the tools of symbolic time series analysis (Daw et al. in Rev Sci Instrum 74:916–930, 2003) with the nearest neighbor single linkage clustering algorithm (Mantegna and Stanley in An introduction to econophysics: Correlations and complexity in finance, Cambridge University Press, UK, 2000). Data symbolization yields a metric distance between two time series, which is used to construct a minimal spanning tree from which an ultrametric distance can be computed. From the analysis of time series data of companies included in the Dow Jones Industrial Average, we derive a hierarchical organization of these companies. In particular, we detect different clusters of companies that correspond to their common production activities or their strong interrelationships. The resulting classification of companies can be used to study deep relationships among different branches of economic activity and to construct financial portfolios.
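The pipeline in the abstract above (symbolize, measure distance, build a minimal spanning tree, read off an ultrametric distance) can be sketched as follows. This is only an illustration under stated assumptions: the binary up/down symbolization and the normalized Hamming distance are stand-ins chosen for brevity, not the symbolization or metric used by the authors, and all function names are hypothetical.

```python
def symbolize(series):
    """Assumed symbolization: 1 if the value rose from the previous step, else 0."""
    return [1 if b > a else 0 for a, b in zip(series, series[1:])]


def hamming_distance(s, t):
    """Fraction of positions where two equal-length symbol strings differ."""
    return sum(x != y for x, y in zip(s, t)) / len(s)


def minimal_spanning_tree(dist):
    """Prim's algorithm on a dense symmetric distance matrix.

    Returns a list of edges (i, j, weight)."""
    n = len(dist)
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        # Cheapest edge that connects the tree to a node outside it.
        i, j = min(
            ((i, j) for i in in_tree for j in range(n) if j not in in_tree),
            key=lambda e: dist[e[0]][e[1]],
        )
        edges.append((i, j, dist[i][j]))
        in_tree.add(j)
    return edges


def ultrametric(n, edges):
    """Subdominant ultrametric: largest edge weight on the tree path between i and j.

    Edges are merged in ascending order, as in single linkage clustering."""
    members = {i: [i] for i in range(n)}  # component label -> node list
    label = list(range(n))                # node -> component label
    u = [[0.0] * n for _ in range(n)]
    for i, j, w in sorted(edges, key=lambda e: e[2]):
        a, b = label[i], label[j]
        for x in members[a]:
            for y in members[b]:
                u[x][y] = u[y][x] = w
        for y in members[b]:
            label[y] = a
        members[a].extend(members.pop(b))
    return u
```

Merging the tree edges in ascending order is what makes the result equivalent to single linkage: two companies join the same cluster at the height of the largest edge on the path between them.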

2.
Distinguishing between linear and nonlinear time series, or among nonlinear time series generated by different underlying processes, is challenging, as second-order properties are generally insufficient for the task. Different nonlinear processes have different nonconstant bispectral signatures, whereas the bispectral density function of a Gaussian or linear time series is constant. Based on this, we propose a procedure to distinguish among various nonlinear time series and between nonlinear and linear time series by applying a hierarchical clustering algorithm based on distance measures computed from the squared modulus of the estimated normalized bispectra. We find that clustering with a distance measure computed by averaging the ratio of normalized bispectral periodogram ordinates over the intersection of the principal domains of each pair of time series performs well, provided that extreme bispectral values are trimmed before the ratios are taken. Additionally, we show through simulation studies that the distance procedure performs better than a significance test that we derive. Moreover, it is robust with respect to the choice of smoothing parameter in estimating the bispectrum. As an example, we apply the method to a set of time series of intensities of gamma-ray bursts, some of which exhibit nonlinear behavior; this enables us to identify gamma-ray bursts that may be emanating from the same type of astral event.

3.
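To analyze crowd evacuation behavior in emergencies from the perspective of emotion, this paper reviews existing research on emotional contagion and summarizes the behavioral characteristics of crowds in emergencies. Individuals in the crowd are described as agents, and a multi-agent emotional contagion model is proposed. Its main framework consists of a perception layer, an emotion layer, a contagion layer, a behavior layer and an action layer. Three conditions under which emotional contagion arises and three rules of emotional contagion are identified, and an emotional contagion algorithm is proposed. Taking into account individual personality and the distance between individuals, the evacuation speed of each individual is computed from emotion intensity and crowd compactness. A simulation was implemented in C#, and a real earthquake evacuation case was used to verify that the simulated evacuation time is largely consistent with the observed time. Compared with earlier emotional contagion models based on the epidemic approach, the proposed model better describes how emotional contagion spreads from the local to the global level. Experimental results show that the proposed model can simulate emotion-driven crowd aggregation behavior and may provide a visual analysis method for drawing up emergency evacuation plans.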

4.
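Hierarchical querying of time series similarity   Cited by: 1 (self-citations: 0, citations by others: 1)
A new segmentation method based on important points is proposed that converts time series data into a trend sequence. In similarity comparison, trends are compared first, and the results are then compared by Euclidean distance. Experimental results show that the algorithm is effective.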

5.
Dynamic Time Warping (DTW) is a popular and efficient distance measure used in classification and clustering algorithms applied to time series data. By computing the DTW distance not on raw data but on the time series of the (first, discrete) derivative of the data, we obtain the so-called Derivative Dynamic Time Warping (DDTW) distance measure. DDTW used alone usually performs poorly, but on some datasets it gives good results, sometimes much better than DTW. To improve the performance of the two distance measures, we can combine them into a single new (parametric) distance function. The literature contains examples of combining DTW and DDTW in algorithms for supervised classification of time series data. In this paper, we demonstrate that a combination of DTW and DDTW can also be applied in a method of time series clustering (unsupervised classification). In particular, we focus on hierarchical clustering (with average linkage) of univariate (one-dimensional) time series data. We construct a new parametric distance function, combining DTW and DDTW, where a single real-valued parameter controls the contribution of each of the two measures to the total value of the combined distance. The parameter is tuned in the initial phase of the clustering algorithm. Using this technique in clustering methods requires a different approach (to address certain specific problems) than for supervised methods. In the clustering process we use three internal cluster validation measures (measures which do not use labels) and three external cluster validation measures (measures which do use clustering data labels). Internal measures are used to select an optimal value of the parameter of the algorithm, whereas external measures give information about the overall performance of the new method and enable comparison with other distance functions.
Computational experiments are performed on a large real-world database (UCR Time Series Classification Archive: 84 datasets) from a very broad range of fields, including medicine, finance, multimedia and engineering. The experimental results demonstrate the effectiveness of the proposed approach for hierarchical clustering of time series data. The method with the new parametric distance function outperforms DTW (and DDTW) on the database used. The results are confirmed by graphical and statistical comparison.
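To make the idea of a single parameter blending DTW and DDTW concrete, here is a minimal sketch. The abstract does not give the exact form of the combined function, so the convex combination below, the absolute-difference local cost, the simple first difference used as the derivative, and the names `dtw`, `derivative`, `combined_distance` and `alpha` are all assumptions made for illustration.

```python
def dtw(a, b):
    """Classic dynamic time warping distance with absolute-difference local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor cells.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def derivative(x):
    """First discrete derivative (simple forward difference, an assumed choice)."""
    return [b - a for a, b in zip(x, x[1:])]


def combined_distance(a, b, alpha):
    """Assumed convex combination: alpha = 0 gives pure DTW, alpha = 1 pure DDTW."""
    return (1 - alpha) * dtw(a, b) + alpha * dtw(derivative(a), derivative(b))
```

With this form, two series that differ only by a vertical offset are far apart under DTW but identical under DDTW, which is the kind of trade-off a tuned `alpha` would balance.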

6.
The quantity of unstructured and semi-structured data available is growing rapidly. Adding structure to such data by grouping similar items into fuzzy categories (or granules) can be a productive approach, and can lead to additional knowledge (e.g. by monitoring association and other relations between classes). Formal concept analysis (and fuzzy formal concept analysis) enables us to identify hierarchical structure arising from similarities in attribute values. However, in an environment where source data is updated, this data-driven approach may lead to concept lattices whose structure varies over time (that is, the number of concepts and their relation to each other may change significantly as updates are processed). In this paper, we describe a novel way of measuring the distance between concept lattices. The method can be applied to comparison of lattices derived from the same set of objects using different attributes or to different sets of objects categorised by the same attributes. We prove that the proposed method is a distance metric and illustrate its use by means of examples.

7.
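For decision problems with a hierarchical criteria structure, and considering the uncertainty and hesitancy in how decision makers express preferences as well as the reference dependence and loss aversion in their cognition, an interval-valued hesitant fuzzy multi-level TODIM multi-criteria decision-making method is proposed. First, a new interval-valued hesitant fuzzy distance measure is proposed by modifying the traditional Jaccard distance. Then, the idea of the analytic hierarchy process is combined with the TODIM method to establish an interval-valued hesitant fuzzy decision method that can solve decision problems with a hierarchical criteria structure. Finally, the method is applied to a cloud manufacturing resource selection problem, and comparative analysis confirms its effectiveness and usability.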

8.
Hierarchies of partitions are generally represented by dendrograms (direct representation). They can also be represented by saliency maps or minimum spanning trees. In this article, we precisely study the links between these three representations. In particular, we provide a new bijection between saliency maps and hierarchies based on quasi-flat zones as often used in image processing and we characterize saliency maps and minimum spanning trees as solutions to constrained minimization problems where the constraint is quasi-flat zones preservation. In practice, these results make up a toolkit for designing new hierarchical methods where one can choose the most convenient representation. They also invite us to process non-image data with morphological hierarchies. More precisely, we show the practical interest of the proposed framework for: (i) hierarchical watershed image segmentations, (ii) combinations of different hierarchical segmentations, (iii) hierarchicalizations of some non-hierarchical image segmentation methods based on regional dissimilarities, and (iv) hierarchical analysis of geographic data.

9.
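A method for separating traffic flow time series   Cited by: 3 (self-citations: 0, citations by others: 3)
Analyzing traffic flow time series with cluster analysis can reveal typical patterns of traffic flow variation. Time series are usually clustered using Euclidean distance and the K-means algorithm, but analysis shows that this approach alone cannot effectively separate traffic flow time series with different trends. To address this problem, a method is proposed that introduces dynamic time warping and grey relational degree into the similarity measure for traffic flow time series and combines them with hierarchical clustering to separate the series further. Experiments show that hierarchical clustering based on grey relational degree achieves a good further separation of traffic flow time series.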
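As a rough illustration of how a grey relational degree can serve as a similarity measure between two series, the sketch below uses the widely cited form of the grey relational coefficient with distinguishing coefficient `rho = 0.5`. The abstract does not state which variant the authors use, so this is an assumption; taking the minimum and maximum differences over the pair only (rather than over a whole set of comparison series) and skipping normalization are further simplifications, and the function names are hypothetical.

```python
def grey_relational_grade(x, y, rho=0.5):
    """Mean grey relational coefficient between two equal-length series.

    Returns a value in (0, 1]; 1.0 means the series coincide pointwise."""
    deltas = [abs(a - b) for a, b in zip(x, y)]
    d_min, d_max = min(deltas), max(deltas)
    if d_max == 0:
        return 1.0  # identical series; avoids dividing by zero
    coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in deltas]
    return sum(coeffs) / len(coeffs)


def grey_distance(x, y, rho=0.5):
    """Dissimilarity derived from the grade, usable as input to hierarchical clustering."""
    return 1.0 - grey_relational_grade(x, y, rho)
```

A pairwise matrix of `grey_distance` values could then be passed to any hierarchical clustering routine in place of Euclidean distances.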

10.
Although the distance between binary codes can be computed quickly in Hamming space, linear search is not practical for large-scale datasets. Attention has therefore turned to efficient approximate nearest neighbor search, in which hierarchical clustering trees (HCT) are widely used. However, HCT select cluster centers randomly and build indexes with the entire binary code, which degrades search performance. In this paper, we first propose a new clustering algorithm that chooses cluster centers on the basis of relative distances and builds the hierarchical clustering trees from a more homogeneous partition of the dataset than HCT uses. Then, we present an algorithm to compress binary codes by extracting distinctive bits according to the standard deviation of each bit. Consequently, a new index is proposed using compressed binary codes based on hierarchical decomposition of binary spaces. Experiments conducted on reference datasets and a dataset of one billion binary codes demonstrate the effectiveness and efficiency of our method.
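The bit-selection step described above can be illustrated with a short sketch. It assumes that "distinctive" means the bits with the largest standard deviation across the dataset (for a 0/1 bit with mean p this is sqrt(p(1 - p)), largest when the bit is balanced); the abstract does not spell out the selection rule, and the helper names are hypothetical.

```python
from statistics import pstdev


def distinctive_bits(codes, k):
    """Indices of the k bit positions with the largest standard deviation."""
    n_bits = len(codes[0])
    spread = [pstdev([code[b] for code in codes]) for b in range(n_bits)]
    ranked = sorted(range(n_bits), key=lambda b: spread[b], reverse=True)
    return sorted(ranked[:k])


def compress(code, keep):
    """Project a binary code onto the selected bit positions."""
    return [code[b] for b in keep]


def hamming(a, b):
    """Number of positions at which two equal-length binary codes differ."""
    return sum(x != y for x, y in zip(a, b))
```

A bit that is constant over the whole dataset has zero standard deviation and carries no information for distinguishing codes, so dropping it shortens the index without changing any Hamming distance.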

11.
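A pattern similarity measure for traffic flow time series   Cited by: 1 (self-citations: 0, citations by others: 1)
Because traffic flow time series are high-dimensional and noisy, an online segmentation algorithm based on trend change, goodness of fit, and a minimum-distance-and-percentage principle is designed for dimensionality reduction of the time series. The segmented series are given a five-tuple piecewise linear representation, on which five common shape-similarity distances for time series are defined. A hierarchical clustering algorithm is used to analyze how well these distances identify different traffic flow states, and thereby to determine the pattern similarity measure for traffic flow time series. An empirical analysis using fixed loop detector data from several sections on the east side of Shanghai's North-South Elevated Road shows that the combination of pattern distance and Euclidean distance is the best pattern similarity measure for traffic time series.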

12.
13.
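杨艳林, 叶枫, 吕鑫, 余霖, 刘璇. 《计算机科学》 (Computer Science), 2016, 43(2): 245-249
Similarity mining of hydrological time series is an important aspect of hydrological time series mining and is of great significance for flood forecasting, flood control scheduling, and related tasks. Based on the characteristics of hydrological data, a similarity mining method for hydrological time series based on DTW clustering is proposed. The method first applies wavelet denoising, feature-point segmentation and semantic partitioning to the data, then hierarchically clusters the resulting subsequences by DTW distance and symbolizes them. Next, a candidate set is selected according to the edit distance between symbol sequences. Finally, exact matching by the DTW distance between series yields the similar hydrological time series. Experiments on daily water level data from Liuhe station on the Chu River show that the proposed method effectively narrows the candidate set and makes searching for semantically similar hydrological time series more efficient.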
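The candidate-filtering step, in which symbol sequences are compared by edit distance before the more expensive exact matching, can be sketched as below. The Levenshtein distance is standard; the threshold `max_edits`, the dictionary layout of the library, and the function names are hypothetical choices for illustration, not details from the paper.

```python
def edit_distance(s, t):
    """Levenshtein distance between two symbol sequences (row-by-row DP)."""
    prev = list(range(len(t) + 1))
    for i, a in enumerate(s, 1):
        curr = [i]
        for j, b in enumerate(t, 1):
            curr.append(min(
                prev[j] + 1,             # delete a
                curr[j - 1] + 1,         # insert b
                prev[j - 1] + (a != b),  # substitute, free if symbols match
            ))
        prev = curr
    return prev[-1]


def select_candidates(query, library, max_edits):
    """Names of library sequences within max_edits edits of the query."""
    return [name for name, symbols in library.items()
            if edit_distance(query, symbols) <= max_edits]
```

Only the sequences that survive this cheap filter would need to be compared with the full DTW distance in the final matching stage.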

14.
Forward stepwise regression analysis selects critical attributes using the same set of data throughout. Regression analysis is, however, not capable of splitting data to construct piecewise regression models. Regression trees are known to be an effective data mining tool for constructing piecewise models by iteratively splitting the data set and selecting attributes into a hierarchical tree model. However, the sample size shrinks sharply after a few levels of data splitting, which makes attribute selection unreliable. In this research, we propose a method to construct a piecewise regression model effectively by extending the sample-efficient regression tree (SERT) approach, which combines forward selection in regression analysis with regression tree methodologies. The proposed method attempts to maximize the use of the dataset's degrees of freedom while attaining unbiased model estimates. Hypothetical and actual semiconductor yield-analysis cases are used to illustrate the method and its effective search for critical factors to be included in the dataset's underlying model.

15.
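Similarity measurement of time series is the research foundation of time series data mining and provides a reliable guarantee for the efficiency and accuracy of data mining tasks. A hierarchical segmentation and similarity measurement method for time series is proposed. The method first identifies the extreme points in a time series, segments the series hierarchically according to the characteristics of those extreme points, and, on that basis, measures the similarity between time series by defining a new distance formula. Time series are clustered using the proposed similarity measure, and the experimental results show that the method measures similarity between time series effectively, yields clear clustering results, and has good practicality and application prospects.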

16.
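Three-dimensional wavelet video coding based on motion compensation   Cited by: 2 (self-citations: 0, citations by others: 2)
A new three-dimensional wavelet video coding scheme based on motion compensation is proposed. The original image sequence is decomposed by a temporal wavelet transform along the motion trajectories and by a two-dimensional wavelet transform in space, yielding different spatio-temporal three-dimensional frequency subbands. The wavelet coefficients in these subbands are then organized into a three-dimensional orientation hierarchical tree structure and compressed with an improved SPIHT zerotree coding algorithm. Experiments show that the method not only improves video coding efficiency but also makes rate control easy and supports coding that is scalable in temporal and spatial resolution.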

17.
High speed paper currency recognition by neural networks.   Cited by: 9 (self-citations: 0, citations by others: 9)
In this paper a new technique is proposed to improve the recognition ability and the transaction speed in classifying Japanese and US paper currency. Two types of data sets, time series data and Fourier power spectra, are used in this study. In both cases, they are used directly as inputs to the neural network. We also describe a new method for evaluating recognition ability. In addition, a technique is proposed to reduce the input scale of the neural network without hindering improvements in recognition ability. This technique uses only a subset of the original data set, obtained using random masks. The recognition ability achieved with the large data set and with a reduced data set is discussed. The results of using a reduced data set of the Fourier power spectra and of the time series data are also compared.

18.
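High-energy physics computing is a typical form of data-intensive computing. It mainly uses a file-based hierarchical storage scheme in which data are placed on storage devices of different performance according to how frequently they are accessed (their access hotness). However, data hotness is currently predicted by heuristic algorithms based on human experience, whose accuracy is low. A method is proposed that uses a long short-term memory (LSTM) network to predict the future access hotness of files, covering network architecture design, training, and the prediction algorithm. The method builds time series of file access features by partitioning dynamic time windows and predicts the access trends of different data. Experimental results on the LHAASO high-energy physics experiment dataset show that, compared with algorithms such as SVM and MLP, the method improves prediction accuracy by about 30% and is more widely applicable.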

19.
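A fast segmentation and symbolization method for time series   Cited by: 1 (self-citations: 0, citations by others: 1)
任江涛, 何武, 印鉴, 张毅. 《计算机科学》 (Computer Science), 2005, 32(9): 166-169
As an important class of complex data, time series have become one of the most actively studied objects in data mining. Mining time series usually first requires segmenting the series and converting them into symbol sequences over a limited alphabet, to facilitate subsequent time series pattern mining. Because current time series segmentation methods are complex and inefficient, this paper proposes a simple and efficient segmentation method based on inflection-point detection, uses the dynamic time warping measure to compute the dissimilarity of subsequences of unequal length, and finally applies a hierarchical clustering algorithm to classify and symbolize the subsequences. Experiments show that the proposed method is feasible and that the results have a fairly clear physical meaning.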
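The first stage of the method above, cutting a series at detected inflection points, might look like the following sketch. It assumes an inflection point is a sample where the direction of change flips (a local maximum or minimum); the paper's actual detection rule, including any noise threshold, is not given in the abstract, so this is an illustrative simplification with hypothetical function names.

```python
def turning_points(series):
    """Indices where the slope changes sign (strict local maxima or minima).

    Flat stretches are ignored because their slope product is zero."""
    points = []
    for i in range(1, len(series) - 1):
        left = series[i] - series[i - 1]
        right = series[i + 1] - series[i]
        if left * right < 0:
            points.append(i)
    return points


def segment(series):
    """Split a series into monotone pieces that share their boundary samples."""
    cuts = [0] + turning_points(series) + [len(series) - 1]
    return [series[a:b + 1] for a, b in zip(cuts, cuts[1:])]
```

The resulting pieces generally have unequal lengths, which is why the abstract compares them with dynamic time warping rather than a pointwise distance before clustering them into symbols.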

20.
Previous studies have shown that a random walk model is an appropriate time series model for explaining exchange rate time series. This analysis is based on the assumption that the variance of an exchange rate time series is homogeneous with respect to time. This paper shows that this assumption may be violated for exchange rate time series. The monthly exchange rate of the German Deutschemark per U.S. dollar is considered. The data range from March 1973 to December 1984. The starting point roughly coincides with the beginning of the floating rate regime. It is seen that a non-linear model would be more appropriate than a linear model for explaining this exchange rate time series.
