Similar literature
1.
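Zhang Hao, Chen Lifei, Guo Gongde. 《计算机科学》 (Computer Science), 2015, 42(5): 114-118, 141
A symbolic sequence is a finite set of symbols arranged in a fixed order. Such sequences arise in many data mining applications, for example gene sequences, protein sequences, and speech sequences. As one of the main approaches to sequence mining, sequence clustering is valuable for identifying the intrinsic structure of sequence data; at the same time, because similarity between symbolic sequences is hard to measure, sequence clustering remains an open problem. This paper first proposes a new similarity measure for symbolic sequences, which introduces a length normalization factor to address the sensitivity of existing measures to sequence length and thereby makes the measure more effective. Building on this measure, it then proposes a new clustering method that constructs an acyclic connected graph from the sample similarities and performs hierarchical clustering of symbolic sequences by partitioning the graph. Experimental results on several real-world datasets show that the new method, using the normalized measure, effectively improves the clustering accuracy of symbolic sequences.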
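The pipeline in this abstract (a length-normalized similarity, an acyclic connected graph, then a graph partition) can be sketched roughly as follows. The abstract does not state the actual measure, so the LCS-based similarity, the function names, and the toy sequences are assumptions made purely for illustration; this is not the authors' method.

```python
from itertools import combinations
from math import sqrt


def lcs_length(a, b):
    """Length of the longest common subsequence of two symbol sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def normalized_similarity(a, b):
    """Stand-in similarity (an assumption): LCS length divided by the
    geometric mean of the two lengths, which keeps the value in [0, 1]."""
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / sqrt(len(a) * len(b))


def spanning_tree_clusters(sequences, k):
    """Grow a maximum spanning tree with Kruskal's algorithm and stop once
    k components remain; up to ties, this matches cutting the k-1 weakest
    edges of the finished tree."""
    parent = list(range(len(sequences)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    edges = sorted(
        ((normalized_similarity(sequences[i], sequences[j]), i, j)
         for i, j in combinations(range(len(sequences)), 2)),
        reverse=True,
    )
    components = len(sequences)
    for _, i, j in edges:
        if components <= k:
            break
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            components -= 1

    clusters = {}
    for i in range(len(sequences)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())
```

Calling `spanning_tree_clusters(sequences, k)` returns lists of sequence indices, one list per cluster. Varying `k` from 1 up to the number of sequences walks through the levels of the hierarchy.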

2.
We overview and discuss several methods for the Fourier analysis of symbolic data, such as DNA sequences, emphasizing their mutual connections. We consider the indicator sequence approach, the vector and the symbolic autocorrelation methods, and methods such as the spectral envelope, that for each frequency optimize the symbolic-to-numeric mapping to emphasize any periodic data features. We discuss the equivalence or connections between these methods. We show that it is possible to define the autocorrelation function of symbolic data, assuming only that we can compare any two symbols and decide if they are equal or distinct. The autocorrelation is a numeric sequence, and its Fourier transform can also be obtained by summing the squares of the Fourier transforms of the indicator sequences (zero/one sequences indicating the position of the symbols). Another interpretation of the spectrum is given, borrowing from the spectral envelope concept: among all symbolic-to-numeric mappings there is one that maximizes the spectral energy at each frequency, and leads to the spectrum.
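The equivalence stated in this abstract can be checked numerically with a short Python sketch. It assumes a circular (periodic) autocorrelation and a plain DFT, and it reads "squares" as squared magnitudes; the function names and the choice of circular boundary handling are illustrative assumptions, not taken from the paper.

```python
import cmath


def circular_autocorrelation(seq):
    """r[k] counts positions i where seq[i] equals seq[(i + k) mod n].
    Only an equality test between symbols is needed."""
    n = len(seq)
    return [sum(1 for i in range(n) if seq[i] == seq[(i + k) % n])
            for k in range(n)]


def dft(x):
    """Direct O(n^2) discrete Fourier transform of a numeric sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * f * t / n) for t in range(n))
            for f in range(n)]


def indicator_spectrum(seq):
    """Sum over symbols of the squared DFT magnitude of each zero/one
    indicator sequence."""
    spectrum = [0.0] * len(seq)
    for symbol in sorted(set(seq)):
        indicator = [1.0 if s == symbol else 0.0 for s in seq]
        for f, value in enumerate(dft(indicator)):
            spectrum[f] += abs(value) ** 2
    return spectrum
```

For any sequence, the real part of `dft(circular_autocorrelation(seq))` should agree with `indicator_spectrum(seq)` up to rounding error.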

3.
In order to gain probabilistic results, ensemble simulation techniques are increasingly applied in the weather and climate sciences (as well as in various other scientific disciplines). In many cases, however, only mean results or other abstracted quantities such as percentiles are used for further analyses and dissemination of the data. In this work, we aim at a more detailed visualization of the temporal development of the whole ensemble that takes the variability of all single members into account. We propose a visual analytics tool that allows an effective analysis process based on a hierarchical clustering of the time-dependent scalar fields. The system includes a flow chart that shows the ensemble members' cluster affiliation over time, reflecting the whole cluster hierarchy. The hierarchy can be dynamically explored using a visualization derived from a dendrogram. As an aid in linking the different views, we have developed an adaptive coloring scheme that takes into account cluster similarity and the containment relationships. Finally, standard visualizations of the involved field data (cluster means, ground truth data, etc.) are also incorporated. We include results of our work on real-world datasets to showcase the utility of our approach.

4.
The Hierarchical Hidden Markov Model: Analysis and Applications (total citations: 20; self-citations: 0; citations by others: 20)
Shai Fine, Yoram Singer, Naftali Tishby. Machine Learning, 1998, 32(1): 41-62
We introduce, analyze and demonstrate a recursive hierarchical generalization of the widely used hidden Markov models, which we name Hierarchical Hidden Markov Models (HHMM). Our model is motivated by the complex multi-scale structure which appears in many natural sequences, particularly in language, handwriting and speech. We seek a systematic unsupervised approach to the modeling of such structures. By extending the standard Baum-Welch (forward-backward) algorithm, we derive an efficient procedure for estimating the model parameters from unlabeled data. We then use the trained model for automatic hierarchical parsing of observation sequences. We describe two applications of our model and its parameter estimation procedure. In the first application we show how to construct hierarchical models of natural English text. In these models different levels of the hierarchy correspond to structures on different length scales in the text. In the second application we demonstrate how HHMMs can be used to automatically identify repeated strokes that represent combinations of letters in cursive handwriting.

5.
We study run-time issues, such as site allocation and query scheduling policies, in executing read-only queries in a hierarchical, distributed memory, multicomputer system. The particular architecture considered is based on the hypercube interconnection. The data are stored in a base cube, which is controlled by a control cube and host node hierarchy. Input query trees are transformed into operation sequence trees, and the operation sequences become the units of scheduling. These sequences are scheduled dynamically at run-time. Algorithms for dynamic site allocation are provided. Several query scheduling policies that support interquery concurrency are also studied. Average query completion times and initiation delays are obtained for the various policies using simulations.

6.
This paper proposes a novel approach to structuring behavioral knowledge based on symbolization of human whole body motions, hierarchical classification of the motions, and extraction of the causality among the motions. The motion patterns are encoded into parameters of corresponding Hidden Markov Models (HMMs), where each HMM abstracts the dynamics of a motion pattern and is hereafter referred to as a “motion symbol”. The motion symbols allow motion recognition and synthesis. The motion symbols are organized into a hierarchical tree structure representing the property of spatial similarity among the motion patterns, and this tree is referred to as the “motion symbol tree”. Seamless motion is segmented into a sequence of motion primitives, each of which is classified as a motion symbol based on the motion symbol tree. The seamless motion results in a sequence of the motion symbols, which is stochastically represented as transitions between the motion symbols by an N-gram model. The motion symbol N-gram model is referred to as the “motion symbol graph”. The motion symbol graph extracts the temporal causality among the human behaviors. The integration of the motion symbol tree and the motion symbol graph makes it possible to recognize motion patterns fast and predict human behavior during observation. The experiments on a motion dataset of radio calisthenics and on a large motion dataset provided by the CMU motion database validate the proposed framework.
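The N-gram step over motion symbols can be illustrated with a minimal bigram (N = 2) sketch in Python. It assumes the motion has already been segmented and classified, so each motion symbol is stood in for by a plain string label; the labels and function names are hypothetical and the HMM-based symbols of the paper are not modeled here.

```python
from collections import Counter, defaultdict


def train_bigram(symbol_sequences):
    """Estimate P(next symbol | current symbol) from labeled sequences."""
    counts = defaultdict(Counter)
    for seq in symbol_sequences:
        for current, following in zip(seq, seq[1:]):
            counts[current][following] += 1
    return {
        current: {s: c / sum(followers.values()) for s, c in followers.items()}
        for current, followers in counts.items()
    }


def predict_next(model, current):
    """Most probable next symbol, or None if the symbol was never followed."""
    transitions = model.get(current)
    if not transitions:
        return None
    return max(transitions, key=transitions.get)
```

The dictionary returned by `train_bigram` plays the role of a weighted transition graph between symbols, and `predict_next` reads off the strongest outgoing edge for the symbol observed so far.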

7.
A method is derived whereby cross-correlation techniques using pseudo-random sequences can be used to estimate the weighting sequence from experimental data of arbitrary lengths without incurring error due to partial pseudo-random sequence periods. The method consists of forming an ensemble average of the output data before cross-correlation is performed. Slight modifications of the method give increased accuracy when the non-zero off-peak value of the autocorrelation function is not ignored, or when it is known that the weighting sequence is shorter than the pseudo-random sequence period.
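The idea of averaging the output by phase before cross-correlating can be sketched in Python as below. This is a noise-free toy under stated assumptions rather than the paper's derivation: the input is a period-7 maximal-length sequence mapped to +1/-1, the system is in periodic steady state, the weighting sequence fits within one period, and the off-peak correction is a straightforward algebraic inversion chosen for illustration. All function names are hypothetical.

```python
def prbs7():
    """One period (length 7) of a maximal-length sequence, mapped to +1/-1."""
    bits = [1, 0, 0]
    while len(bits) < 7:
        bits.append(bits[-2] ^ bits[-3])  # recurrence s[n] = s[n-2] XOR s[n-3]
    return [2 * b - 1 for b in bits]


def simulate_output(h, u, length):
    """Noise-free steady-state response to the periodically repeated input."""
    n_period = len(u)
    return [sum(h[j] * u[(n - j) % n_period] for j in range(len(h)))
            for n in range(length)]


def fold_average(y, n_period):
    """Ensemble average: mean of all output samples that share a phase.
    The data length need not be a multiple of the period, but must cover
    at least one full period."""
    sums = [0.0] * n_period
    counts = [0] * n_period
    for n, value in enumerate(y):
        sums[n % n_period] += value
        counts[n % n_period] += 1
    return [s / c for s, c in zip(sums, counts)]


def estimate_weighting_sequence(u, y):
    """Cross-correlate one input period with the phase-averaged output."""
    n_period = len(u)
    y_bar = fold_average(y, n_period)
    phi = [sum(u[n] * y_bar[(n + k) % n_period] for n in range(n_period)) / n_period
           for k in range(n_period)]
    # The off-peak autocorrelation of the +/-1 sequence is -1/N, not 0.
    # Undo that bias (valid when the weighting sequence fits in one period).
    total = sum(phi)
    return [(p + total) * n_period / (n_period + 1) for p in phi]
```

With `u = prbs7()` and 40 output samples (not a multiple of 7), `estimate_weighting_sequence(u, y)` should return the original weighting sequence padded with zeros to length 7, up to rounding error.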

8.
Discovering High-Order Periodic Patterns (total citations: 2; self-citations: 2; citations by others: 0)
Discovery of periodic patterns in time series data has become an active research area with many applications. These patterns can be hierarchical in nature, where a higher-level pattern may consist of repetitions of lower-level patterns. Unfortunately, the presence of noise may prevent these higher-level patterns from being recognized in the sense that two portions (of a data sequence) that support the same (high-level) pattern may have different layouts of occurrences of basic symbols. There may not exist any common representation in terms of raw symbol combinations; and hence such a (high-level) pattern may not be expressed by any previous model (defined on raw symbols or symbol combinations) and would not be properly recognized by any existing method. In this paper, we propose a novel model, namely meta-pattern, to capture these high-level patterns. As a more flexible model, the number of potential meta-patterns could be very large. A substantial difficulty lies in how to identify the proper pattern candidates. However, the well-known Apriori property is not able to provide sufficient pruning power. A new property, namely component location property, is identified and used to conduct the candidate generation so that an efficient computation-based mining algorithm can be developed. Last, but not least, we apply our algorithm to some real and synthetic sequences and some interesting patterns are discovered.

9.
Graphical Models, 2005, 67(3): 150-165
In this paper, we propose a hierarchical approach to 3D scattered data interpolation and approximation with compactly supported radial basis functions. Our numerical experiments suggest that the approach integrates the best aspects of scattered data fitting with locally and globally supported basis functions. Employing locally supported functions leads to an efficient computational procedure, while a coarse-to-fine hierarchy makes our method insensitive to the density of scattered data and allows us to restore large parts of missing data. Given a point cloud distributed over a surface, we first use spatial downsampling to construct a coarse-to-fine hierarchy of point sets. Then we interpolate (approximate) the sets starting from the coarsest level. We interpolate (approximate) a point set of the hierarchy, as an offsetting of the interpolating function computed at the previous level. The resulting fitting procedure is fast, memory efficient, and easy to implement.

10.
Video annotation is an important issue in video content management systems. Rapid growth of the digital video data has created a need for efficient and reasonable mechanisms that can ease the annotation process. In this paper, we propose a novel hierarchical clustering based system for video annotation. The proposed system generates a top-down hierarchy of the video streams using hierarchical k-means clustering. A tree-based structure is produced by dividing the video recursively into sub-groups, each of which consists of similar content. Based on the visual features, each node of the tree is partitioned into its children using k-means clustering. Each sub-group is then represented by its key frame, which is selected as the closest frame to the centroid of the corresponding cluster, and then can be displayed at the higher level of the hierarchy. The experiments show that a very good hierarchical view of the video sequences can be created for annotation in terms of efficiency.
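The recursive partitioning and key-frame selection described here can be sketched in Python as follows. It is a minimal illustration under assumptions: each frame is reduced to a small numeric feature tuple, k-means is plain Lloyd's algorithm with a deterministic start, and the names `build_tree`, `leaf_size`, and the dictionary layout are all hypothetical rather than taken from the paper.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def kmeans(points, k, iterations=20):
    """Lloyd's algorithm with evenly spaced initial centroids.
    Returns the non-empty clusters as lists of positions into `points`."""
    centroids = [points[i * len(points) // k] for i in range(k)]
    groups = []
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for idx, p in enumerate(points):
            nearest = min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            groups[nearest].append(idx)
        # Keep the old centroid if a cluster happens to be empty.
        centroids = [mean([points[i] for i in g]) if g else centroids[c]
                     for c, g in enumerate(groups)]
    return [g for g in groups if g]


def build_tree(features, frame_ids, k=2, leaf_size=2):
    """Top-down hierarchy: each node stores its frames, a key frame (the
    frame closest to the node centroid), and its child nodes."""
    points = [features[i] for i in frame_ids]
    centroid = mean(points)
    key_frame = min(frame_ids, key=lambda i: squared_distance(features[i], centroid))
    node = {"key_frame": key_frame, "frames": list(frame_ids), "children": []}
    if len(frame_ids) > leaf_size:
        groups = kmeans(points, k)
        if len(groups) > 1:  # stop if k-means could not split the node
            node["children"] = [
                build_tree(features, [frame_ids[i] for i in g], k, leaf_size)
                for g in groups
            ]
    return node
```

A viewer could show only the `key_frame` of each child when a node is collapsed and descend into `children` on demand, which is one plausible way to browse such a tree.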

11.
In this paper we discuss a hierarchical relaxation method for solving a system of equations. The method proceeds by adjoining to a given system of equations whose solution is sought, an auxiliary hierarchy of systems. The relaxation procedure consists of a judicious mixing of relaxation steps in the different members of the hierarchy. When the choice of the hierarchy and the mixing of relaxation steps are appropriate, the entire procedure provides an acceleration of the relaxation process toward a determination of the solution of the original system. The procedure lends itself to parallel implementation, even in an asynchronous mode. We discuss these aspects of hierarchical relaxation as well.

12.
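A multi-level Count-Min sketch is constructed to summarize the hierarchical structure in streaming data. By defining a family of pairwise independent XOR hash functions over the multi-level data domain U*, stream tuples are mapped into a three-dimensional L×D×W array of counters, where L is the number of levels, D is the number of hash functions drawn uniformly at random from the family, and W is the range of the hash functions. On top of this structure, a breadth-first query strategy is used to find the multi-level frequent itemsets and to estimate their frequencies. Experiments show that the structure improves considerably on directly stacking multiple Count-Min structures in terms of update time, storage space, and estimation accuracy.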
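The L×D×W counter layout and the breadth-first query can be sketched in Python as follows. Several things here are assumptions made for illustration: the hierarchy is a binary prefix tree over integer items (one bit per level), the hash family is the common ((a·x + b) mod p) mod W construction rather than the XOR-based family named in the abstract, and the class and method names are hypothetical. The sketch also keeps one independent Count-Min array per level, which is closer to the stacked baseline than to the paper's optimized structure.

```python
import random


class HierarchicalCountMin:
    """L x D x W counters: one Count-Min array per level of a binary
    prefix hierarchy."""

    PRIME = 2_147_483_647  # 2**31 - 1

    def __init__(self, levels, depth, width, seed=0):
        rng = random.Random(seed)
        self.levels, self.depth, self.width = levels, depth, width
        # Stand-in hash family (an assumption): ((a*x + b) mod p) mod W.
        self.params = [[(rng.randrange(1, self.PRIME), rng.randrange(self.PRIME))
                        for _ in range(depth)] for _ in range(levels)]
        self.counts = [[[0] * width for _ in range(depth)] for _ in range(levels)]

    def _cells(self, level, key):
        for d, (a, b) in enumerate(self.params[level]):
            yield d, ((a * key + b) % self.PRIME) % self.width

    def update(self, item, amount=1):
        """Add an item (an integer of `levels` bits) at every level."""
        for level in range(self.levels):
            prefix = item >> (self.levels - 1 - level)
            for d, w in self._cells(level, prefix):
                self.counts[level][d][w] += amount

    def estimate(self, level, prefix):
        """Count-Min estimate: the minimum over the D rows, never too low."""
        return min(self.counts[level][d][w] for d, w in self._cells(level, prefix))

    def frequent_prefixes(self, threshold):
        """Breadth-first search from the top level: only prefixes whose
        estimate reaches the threshold are expanded to their children."""
        found, frontier = [], [0, 1]
        for level in range(self.levels):
            heavy = [p for p in frontier if self.estimate(level, p) >= threshold]
            found.extend((level, p, self.estimate(level, p)) for p in heavy)
            frontier = [child for p in heavy for child in (2 * p, 2 * p + 1)]
        return found
```

Because estimates never fall below the true counts and a prefix is at least as frequent as any of its descendants, pruning in `frequent_prefixes` cannot drop a truly frequent prefix; it can only report some extra ones when hash collisions inflate an estimate.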

13.
In hierarchical classification, classes are arranged in a hierarchy represented by a tree or a forest, and each example is labeled with a set of classes located on paths from roots to leaves or internal nodes. In other words, both multiple and partial paths are allowed. A straightforward approach to learn a hierarchical classifier, usually used as a baseline method, consists in learning one binary classifier for each node of the hierarchy; the hierarchical classifier is then obtained using a top-down evaluation procedure. The main drawback of this naive approach is that these binary classifiers are constructed independently, when it is clear that there are dependencies between them that are motivated by the hierarchy and the evaluation procedure employed. In this paper, we present a new decomposition method in which each node classifier is built taking into account other classifiers, its descendants, and the loss function used to measure the goodness of hierarchical classifiers. Following a bottom-up learning strategy, the idea is to optimize the loss function at every subtree assuming that all classifiers are known except the one at the root. Experimental results show that the proposed approach has accuracies comparable to state-of-the-art hierarchical algorithms and is better than the naive baseline method described above. Moreover, the benefits of our proposal include the possibility of parallel implementations, as well as the use of all available well-known techniques to tune binary classification SVMs.
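The top-down evaluation procedure used by the naive baseline can be sketched in a few lines of Python. The sketch covers only prediction with already-trained per-node binary classifiers; the data layout (a `children` dictionary, classifiers as callables) and the function name are assumptions for illustration, and the paper's bottom-up training method is not shown.

```python
def predict_top_down(children, classifiers, example, roots):
    """Return every class reached by walking down from the roots.

    A node is evaluated only if its parent accepted the example, so the
    result may contain several paths, and a path may stop at an internal
    node when none of its children accept.
    """
    predicted, stack = set(), list(roots)
    while stack:
        node = stack.pop()
        if classifiers[node](example):
            predicted.add(node)
            stack.extend(children.get(node, []))
    return predicted
```

In this layout each classifier is any function that maps an example to a truthy or falsy value, so a trained binary model could be wrapped as `lambda x: model_for_node.predict(x)` (a hypothetical wrapper, not a specific library call).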

14.
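The sequential recommendation task predicts the item a user will interact with next from the user's historical behavior sequence. Many studies have shown that the dependence of the predicted item on the user's historical behavior sequence is multi-level. Existing multi-scale methods are heuristic designs in the implicit representation space and cannot explicitly infer the hierarchical structure. To this end, this paper proposes a dynamic hierarchical Transformer that simultaneously learns multi-scale implicit representations and an explicit hierarchy tree. The dynamic hierarchical Transformer adopts a multi-layer structure...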

15.
Temporal sequence learning is one of the most critical components for human intelligence. In this paper, a novel hierarchical structure for complex temporal sequence learning is proposed. Hierarchical organization, a prediction mechanism, and one-shot learning characterize the model. In the lowest level of the hierarchy, we use a modified Hebbian learning mechanism for pattern recognition. Our model employs both active 0 and active 1 sensory inputs. A winner-take-all (WTA) mechanism is used to select active neurons that become the input for sequence learning at higher hierarchical levels. Prediction is an essential element of our temporal sequence learning model. By correct prediction, the machine indicates it knows the current sequence and does not require additional learning. When the prediction is incorrect, one-shot learning is executed and the machine learns the new input sequence as soon as the sequence is completed. A four-level hierarchical structure that isolates letters, words, sentences, and strophes is used in this paper to illustrate the model.

16.
17.
The relationship between parallel isometric array languages and sequential isometric array languages is examined. Their hierarchical structures are investigated and a hierarchy is established by introducing parallel context-free array languages (PCFAL), derivation bounded array languages (DBAL), linear array languages (LAL), and extended regular array languages (ERAL). It is interesting to find that some fundamental aspects that hold in one-dimensional string languages do not hold in their two-dimensional counterparts. Some parsing techniques are also explored. It is shown that while parallel parsing grammars may be simpler to write and parallel processing usually takes less time than sequential ones, the nature of parallel parsing is very complicated. Finally, several future research topics concerned with parallel isometric array languages including their complexities, hierarchical structures, and application to pattern recognition are discussed.

18.
Visual data comprise multi-scale and inhomogeneous signals. In this paper, we exploit these characteristics and develop a compact data representation technique based on a hierarchical tensor-based transformation. In this technique, an original multi-dimensional dataset is transformed into a hierarchy of signals to expose its multi-scale structures. The signal at each level of the hierarchy is further divided into a number of smaller tensors to expose its spatially inhomogeneous structures. These smaller tensors are further transformed and pruned using a tensor approximation technique. Our hierarchical tensor approximation supports progressive transmission and partial decompression. Experimental results indicate that our technique can achieve higher compression ratios and quality than previous methods, including wavelet transforms, wavelet packet transforms, and single-level tensor approximation. We have successfully applied our technique to multiple tasks involving multi-dimensional visual data, including medical and scientific data visualization, data-driven rendering and texture synthesis.

19.
Compression of Human Motion Capture Data Using Motion Pattern Indexing (total citations: 1; self-citations: 0; citations by others: 1)
In this work, a novel scheme is proposed to compress human motion capture data based on hierarchical structure construction and motion pattern indexing. For a given sequence of 3D motion capture data of the human body, the 3D markers are first organized into a hierarchy where each node corresponds to a meaningful part of the human body. Then, the motion sequence corresponding to each body part is coded separately. Based on the observation that there is a high degree of spatial and temporal correlation among the 3D marker positions, we strive to identify motion patterns that form a database for each meaningful body part. Thereafter, a sequence of motion capture data can be efficiently represented as a series of motion pattern indices. As a result, a higher compression ratio has been achieved when compared with the prior art, especially for long sequences of motion capture data with repetitive motion styles. Another distinction of this work is that it provides means for flexible and intuitive global and local distortion controls.

20.