Similar Documents
1.
In this paper, a new algorithm is developed to reduce the computational complexity of Ward's method. The proposed approach uses a dynamic k-nearest-neighbor list to avoid determining a cluster's nearest neighbor at some steps of the cluster merge. The double linked algorithm (DLA) can significantly reduce the computing time of the fast pairwise nearest neighbor (FPNN) algorithm by obtaining an approximate solution of hierarchical agglomerative clustering. In this paper, we propose a method to resolve the problem of a non-optimal solution for DLA while keeping the corresponding advantage of low computational complexity. The computational complexity of the proposed method DKNNA+FS (dynamic k-nearest-neighbor algorithm with a fast search), in terms of the number of distance calculations, is O(N²), where N is the number of data points. Compared to FPNN with a fast search (FPNN+FS), the proposed method using the same fast search algorithm (DKNNA+FS) can reduce the computing time by a factor of 1.90-2.18 for the data set from a real image. In comparison with FPNN+FS, DKNNA+FS can reduce the computing time by a factor of 1.92-2.02 using the data set generated from three images. Compared to DLA with a fast search (DLA+FS), DKNNA+FS can decrease the average mean square error by 1.26% for the same data set.
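As a rough illustration of the primitive these PNN-style methods accelerate, the sketch below runs a naive agglomeration loop that repeatedly merges the cluster pair with the smallest Ward merge cost. This is a hedged toy baseline only, not the proposed DKNNA+FS algorithm, whose contribution is precisely the dynamic k-nearest-neighbor bookkeeping that avoids this exhaustive pair scan:

```python
import numpy as np

def ward_cost(na, ca, nb, cb):
    # Ward's merge cost: the increase in total squared error from merging
    # a cluster of size na (centroid ca) with one of size nb (centroid cb).
    diff = ca - cb
    return na * nb / (na + nb) * float(diff @ diff)

def pnn_ward(points, target_k):
    # Naive pairwise-nearest-neighbor agglomeration: repeatedly merge the
    # pair of clusters with the smallest Ward cost until target_k remain.
    clusters = [(1, np.asarray(p, dtype=float)) for p in points]
    while len(clusters) > target_k:
        best, pair = None, None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                cost = ward_cost(*clusters[i], *clusters[j])
                if best is None or cost < best:
                    best, pair = cost, (i, j)
        i, j = pair
        (na, ca), (nb, cb) = clusters[i], clusters[j]
        merged = (na + nb, (na * ca + nb * cb) / (na + nb))
        clusters = [c for t, c in enumerate(clusters) if t not in (i, j)]
        clusters.append(merged)
    return clusters
```

Each merge scans all cluster pairs, which is what nearest-neighbor lists are meant to avoid.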

2.
In this paper, we present a modified filtering algorithm (MFA) that uses center variations to speed up the clustering process. Our method first divides clusters into static and active groups. We use the information of cluster displacements to reject unlikely cluster centers for all nodes in the kd-tree. We reduce the computational complexity of the filtering algorithm (FA) by finding candidates for each node mainly from the set of active cluster centers. Two conditions for determining the set of candidate cluster centers for each node from active clusters are developed. Our approach differs from the major available algorithm, which passes no information from one stage of iteration to the next. Theoretical analysis shows that our method can reduce the computational complexity of FA, in terms of the number of distance calculations, at each stage of iteration by a factor of FC/AC, where FC and AC are the numbers of total clusters and active clusters, respectively. Compared with FA, our algorithm effectively reduces the computing time and the number of distance calculations. It is noted that our proposed algorithm generates the same clusters as those produced by hard k-means clustering. The superiority of our method is more remarkable when a larger data set of higher dimension is used.

3.
In this paper, a novel encoding algorithm for vector quantization is presented. Our method uses a set of transformed codewords and partial distortion rejection to determine the reproduction vector of an input vector. Experimental results show that our algorithm is superior to available methods, reducing both the computing time and the number of distance calculations significantly. Compared with the best available method for reducing the number of distance computations, our approach can reduce the number of distance calculations by 32.3-67.1%. Compared with the best available encoding algorithm for vector quantization, our method can further reduce the computing time by 19.7-23.9%. The performance of our method improves when a larger codebook is used and is only weakly correlated with codebook size.
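Partial distortion rejection refers to the general trick of abandoning a codeword mid-computation once its accumulated squared error already exceeds the best distortion found so far. A minimal sketch of that primitive (the paper's transformed-codeword set is not reproduced here):

```python
def encode_pde(x, codebook):
    # Nearest-codeword search with partial distortion rejection: stop
    # accumulating a codeword's squared error as soon as it already
    # exceeds the best (smallest) distortion found so far.
    best_d, best_i = float("inf"), -1
    for i, c in enumerate(codebook):
        d = 0.0
        for xj, cj in zip(x, c):
            d += (xj - cj) ** 2
            if d >= best_d:      # reject this codeword early
                break
        else:                    # loop ran to completion: new best codeword
            best_d, best_i = d, i
    return best_i, best_d
```

The early break is what reduces the number of per-dimension distance operations relative to a full search.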

4.
Pairwise nearest neighbor (PNN) is an effective data clustering method that always generates good clustering results, but with high computational complexity. The fast exact PNN (FPNN) algorithm proposed by Fränti et al. is an effective method to speed up PNN that generates the same clustering results as PNN. In this paper, we present a novel method to improve the FPNN algorithm. Our algorithm uses the property that cluster distances increase as the cluster merge process proceeds, and adopts a fast search algorithm to reject impossible candidate clusters. Experimental results show that our proposed method can effectively reduce the number of distance calculations and the computation time of the FPNN algorithm. Compared with FPNN, our proposed approach reduces the computation time and the number of distance calculations by factors of 24.8 and 146.4, respectively, for the data set from three real images. It is noted that our method generates the same clustering results as those produced by PNN and FPNN.

5.
In this paper, we present a fast global k-means clustering algorithm, referred to as MFGKM, that makes use of the cluster membership and geometrical information of a data point. The algorithm uses a set of inequalities developed in this paper to determine a starting point for the jth cluster center of global k-means clustering. Adopting multiple cluster center selection (MCS) for MFGKM, we also develop another clustering algorithm called MFGKM+MCS. MCS determines more than one starting point for each step of cluster split, whereas the available fast and modified global k-means clustering algorithms select one starting point for each cluster split. Our proposed method MFGKM obtains the lowest distortion, while MFGKM+MCS may give the lowest computing time. Compared to the modified global k-means clustering algorithm, MFGKM reduces the computing time and the number of distance calculations by factors of 3.78-5.55 and 21.13-31.41, respectively, with an average distortion reduction of 5,487 for the Statlog data set. Compared to the fast global k-means clustering algorithm, MFGKM+MCS reduces the computing time by a factor of 5.78-8.70 with an average distortion reduction of 30,564 on the same data set. The advantages of our proposed methods are more remarkable when a data set of higher dimension is divided into more clusters.

6.
A New Line Symmetry Distance and Its Application to Data Clustering
In this paper, a new line-symmetry-based distance is first proposed. The properties of the proposed distance are then elaborately described. Kd-tree-based nearest neighbor search is used to reduce the complexity of computing the proposed line-symmetry-based distance. Thereafter, an evolutionary clustering technique is developed that uses the new line-symmetry-based distance measure for assigning points to different clusters. Adaptive mutation and crossover probabilities are used to accelerate the proposed c...

7.
In some applications of industrial robots, the robot manipulator must traverse a pre-specified Cartesian path with its hand tip while the links of the robot move safely among obstacles cluttered in the robot's scene (environment). To reduce the cost of collision detection, one approach is to reduce the number of collision checks by enclosing a few real obstacles with a larger (artificial) bounding volume (a cluster), e.g., by their convex hull [4, 14], without cutting the specified path. In this paper, we propose a recursive algorithm composed of four procedures to tackle the problem of clustering convex polygons cluttered around a specified path in a dynamic environment. A key observation is that the number k of clusters is actually determined by the specified path, not by any criterion used in clustering. Based on this fact, an initial set of k clusters can be rapidly generated. The initial set of clusters and its cardinality are then further refined to satisfy the minimum Euclidean distance criterion imposed in clustering. Compared to the heuristic algorithm in [14], the complexity of the proposed algorithm is reduced by one order with respect to the number n of obstacles. Simulations are performed in both static and dynamic environments; they show that the recursive algorithm is very efficient and produces a smaller number k of clusters.

8.
The density-based notion of clustering is widely used due to its easy implementation and its ability to detect arbitrarily shaped clusters in the presence of noisy data points without requiring prior knowledge of the number of clusters. Density-based spatial clustering of applications with noise (DBSCAN) is the first algorithm proposed in the literature that uses the density-based notion for cluster detection. Since most real data sets today contain adjacent nested clusters, DBSCAN is clearly not suitable for detecting adjacent clusters of variable density, because it uses the global density parameters neighborhood radius N_rad and minimum number of points in a neighborhood N_pts. The efficiency of DBSCAN therefore depends on these initial parameter settings: for DBSCAN to work properly, the neighborhood radius must be less than the distance between two clusters; otherwise the algorithm merges the two clusters and detects them as a single cluster. In this paper: 1) we propose an improved version of the DBSCAN algorithm that detects adjacent clusters of varying density by using the concept of neighborhood difference within the density-based approach, without introducing much additional computational complexity over the original DBSCAN algorithm; 2) we validate our experimental results using the space density indexing (SDI) internal cluster measure, recently proposed by one of the authors, to demonstrate the quality of the proposed clustering method. Our experimental results also suggest that the proposed method is effective in detecting adjacent nested clusters of variable density.
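For reference, the baseline procedure that the proposed method modifies can be sketched as a minimal O(N²) DBSCAN with the global parameters eps (neighborhood radius) and min_pts; the paper's neighborhood-difference extension is not shown:

```python
def region_query(points, i, eps):
    # Indices of all points within Euclidean distance eps of points[i].
    return [j for j, q in enumerate(points)
            if sum((a - b) ** 2 for a, b in zip(points[i], q)) <= eps * eps]

def dbscan(points, eps, min_pts):
    # Minimal DBSCAN: assign a cluster id to every density-reachable point;
    # points reachable from no core point are labeled -1 (noise).
    labels = [None] * len(points)       # None = unvisited
    cid = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = region_query(points, i, eps)
        if len(seeds) < min_pts:
            labels[i] = -1              # noise (may become a border point later)
            continue
        labels[i] = cid
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cid         # noise upgraded to border point
            if labels[j] is not None:
                continue
            labels[j] = cid
            nbrs = region_query(points, j, eps)
            if len(nbrs) >= min_pts:    # j is a core point: expand through it
                queue.extend(nbrs)
        cid += 1
    return labels
```

The single global eps is exactly the limitation the abstract targets: two adjacent clusters of different densities cannot both be separated and kept whole with one radius.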

9.
In this article, a new symmetry-based genetic clustering algorithm is proposed which automatically evolves the number of clusters as well as the proper partitioning of a data set. Strings comprise both real numbers and the don't-care symbol in order to encode a variable number of clusters. Here, the assignment of points to different clusters is done based on a point-symmetry-based distance rather than the Euclidean distance. A newly proposed point-symmetry-based cluster validity index, the Sym-index, is used as a measure of the validity of the corresponding partitioning. The algorithm is therefore able to detect both convex and non-convex clusters irrespective of their sizes and shapes, as long as they possess the point symmetry property. Kd-tree-based nearest neighbor search is used to reduce the complexity of computing the point-symmetry-based distance. A proof of the convergence property of the variable-string-length GA with point-symmetry-based distance clustering (VGAPS-clustering) technique is also provided. The effectiveness of VGAPS-clustering compared to the variable-string-length genetic K-means algorithm (GCUK-clustering) and a recently developed weighted-sum-validity-function-based hybrid niching genetic algorithm (HNGA-clustering) is demonstrated on nine artificial and five real-life data sets.
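One common formulation of a point-symmetry-based distance from this literature can be sketched as follows; the exact scaling and the choice of knear here are assumptions for illustration, not necessarily the article's precise definition:

```python
import numpy as np

def point_symmetry_distance(x, c, data, knear=2):
    # Point-symmetry distance of point x w.r.t. candidate center c (one common
    # formulation, hedged): reflect x through c, average the distances from the
    # reflected point to its knear nearest data points, and scale by the
    # Euclidean distance between x and c. A symmetric counterpart in the data
    # makes the reflected point land near an actual point, shrinking the value.
    reflected = 2.0 * c - x
    d = np.sort(np.linalg.norm(data - reflected, axis=1))[:knear]
    return float(d.mean() * np.linalg.norm(x - c))
```

A kd-tree over `data` (as in the abstract) would replace the full sort to find the knear nearest neighbors of the reflected point.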

10.
In this paper we use a program transformational approach to obtain an asymptotically improved may-alias analysis algorithm. We derive an O(N³) time algorithm for computing an intra-procedural flow-sensitive may-alias analysis, where N denotes the number of edges in the program control flow graph (CFG). Our algorithm improves the previous O(N⁵) time algorithm by Hind et al. [19]. Our time complexity improvement comes without any deterioration in space complexity. We also show that for a large subclass of programs, in which the in-degree and out-degree of all CFG nodes are bounded by a constant, our algorithm is linear in the sum of the number of edges in the CFG of the program and the size of the output, i.e., the size of the computed alias information, and is therefore asymptotically optimal. Our transformational algorithm derivation technique also leads to a simplified yet precise analysis of time complexity. The work in this paper was done when the author was a graduate student at New York University. This paper was originally submitted when the author was a Research Staff Member at the IBM T.J. Watson Research Center.

11.
In this paper, we present a fast codebook re-quantization algorithm (FCRA) using codewords of a codebook being re-quantized as the training vectors to generate the re-quantized codebook. Our method is different from the available approach, which uses the original training set to generate a re-quantized codebook. Compared to the traditional approach, our method can reduce the computing time dramatically, since the number of codewords of a codebook being re-quantized is usually much smaller than the number of original training vectors. Our method first classifies codewords of a re-quantized codebook into static and active groups. This approach uses the information of codeword displacements between successive partitions to reject impossible candidates in the partition process of codebook re-quantization. By implementing a fast search algorithm used for vector quantization encoding (MFAUPI) in the partition step of FCRA, the computational complexity of codebook re-quantization can be further reduced significantly. Using MFAUPI, the computing time of FCRA can be reduced by a factor of 1.55-3.78. Compared with the available approach OARC (optimization algorithm for re-quantization codebook), our proposed method can reduce the codebook re-quantization time by a factor of about 8005 using a training set of six real images. This reduction factor is increased when the re-quantized codebook size and/or training set size are increased. It is noted that our proposed algorithm can generate the same re-quantized codebook as that produced by the OARC.

12.
The distance transform (DT) is an image computation tool which can be used to extract information about the shape and the position of the foreground pixels relative to each other. It converts a binary image into a grey-level image, where each pixel has a value corresponding to the distance to the nearest foreground pixel. The time complexity of computing the distance transform depends entirely on the distance metric used; in particular, the more exact the distance transform is, the worse the execution time becomes. Nowadays, quite often thousands of images must be processed in a limited time, and it is practically impossible for a sequential computer to compute the distance transform in real time. To provide efficient distance transform computation, it is highly desirable to develop a parallel algorithm for this operation. In this paper, based on the diagonal propagation approach, we first provide an O(N²) time sequential algorithm to compute the chessboard distance transform (CDT) of an N×N image, which is a DT using the chessboard distance metric. Based on the proposed sequential algorithm, the CDT of a 2D binary image array of size N×N can be computed in O(log N) time on the EREW PRAM model using O(N²/log N) processors, in O(log log N) time on the CRCW PRAM model using O(N²/log log N) processors, and in O(log N) time on the hypercube computer using O(N²/log N) processors. Following the mapping proposed by Lee and Horng, an algorithm for the medial axis transform is also efficiently derived. The medial axis transform of a 2D binary image array of size N×N can be computed in O(log N) time on the EREW PRAM model using O(N²/log N) processors, in O(log log N) time on the CRCW PRAM model using O(N²/log log N) processors, and in O(log N) time on the hypercube computer using O(N²/log N) processors. The proposed parallel algorithms are composed of a set of prefix operations. In each prefix-operation phase, only the increase (add-one) operation and the minimum operation are employed, so the algorithms are especially efficient in practical applications.
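The sequential building block can be illustrated with the classic two-pass chamfer scan, which computes the exact chessboard (L∞) distance transform of a binary image in O(N²) time. This is a generic sketch of the CDT primitive, not the paper's diagonal-propagation formulation:

```python
def chessboard_dt(img):
    # Two-pass chamfer computation of the chessboard distance transform:
    # each pixel receives the L-infinity distance to the nearest foreground
    # (value 1) pixel. With an 8-neighborhood and unit weights, two raster
    # scans give the exact chessboard distance.
    n, m = len(img), len(img[0])
    INF = n + m
    d = [[0 if img[i][j] else INF for j in range(m)] for i in range(n)]
    for i in range(n):                      # forward (top-left to bottom-right)
        for j in range(m):
            for di, dj in ((-1, -1), (-1, 0), (-1, 1), (0, -1)):
                a, b = i + di, j + dj
                if 0 <= a < n and 0 <= b < m:
                    d[i][j] = min(d[i][j], d[a][b] + 1)
    for i in range(n - 1, -1, -1):          # backward (bottom-right to top-left)
        for j in range(m - 1, -1, -1):
            for di, dj in ((1, 1), (1, 0), (1, -1), (0, 1)):
                a, b = i + di, j + dj
                if 0 <= a < n and 0 <= b < m:
                    d[i][j] = min(d[i][j], d[a][b] + 1)
    return d
```

The paper's contribution is recasting this inherently sequential propagation as prefix (add-one and minimum) operations suitable for PRAM and hypercube models.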

13.
The problem of k-nearest neighbors (kNN) is to find the k nearest neighbors of a query point in a given data set. Among available methods, the principal axis search tree (PAT) algorithm has consistently good performance in finding the k nearest neighbors, using the PAT structure and a node elimination criterion. In this paper, a novel kNN search algorithm is proposed. The proposed algorithm stores projection values for all data points in leaf nodes. If a leaf node of the PAT cannot be rejected by the node elimination criterion, the data points in that leaf node are further checked using their pre-stored projection values to reject more impossible data points. Experimental results show that the proposed method can effectively reduce the number of distance calculations and the computation time of the PAT algorithm, especially for data sets of large dimension or for a search tree with a large number of data points per leaf node.
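The pruning rests on a simple fact: along any unit axis, the projection gap |proj(p) − proj(q)| lower-bounds the Euclidean distance, so a point whose gap already exceeds the current k-th best distance cannot be a nearer neighbor. A flat (tree-free) sketch of that rejection idea, with the axis supplied by the caller (PAT derives one per node from the data):

```python
import numpy as np

def knn_axis_reject(data, query, k, axis):
    # Projection-based candidate rejection: visit points in order of their
    # projection gap along a unit axis; once the gap reaches the current
    # k-th best distance, every remaining point can be rejected without a
    # full distance computation.
    proj = data @ axis
    qp = float(query @ axis)
    order = np.argsort(np.abs(proj - qp))     # nearest-in-projection first
    best = []                                 # (distance, index), k smallest
    checked = 0
    for i in order:
        gap = abs(proj[i] - qp)
        if len(best) == k and gap >= best[-1][0]:
            break                             # all remaining points rejected
        d = float(np.linalg.norm(data[i] - query))
        checked += 1
        best.append((d, int(i)))
        best.sort()
        best = best[:k]
    return [i for _, i in best], checked
```

`checked` counts full distance computations; the break shows how many are saved.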

14.
We consider the problem of computing the diameter of a set of n points in d-dimensional Euclidean space under the Euclidean distance function. We describe an algorithm that in O(dn log n + n²) time finds, with high probability, an arbitrarily close approximation of the diameter. For large values of d, the complexity bound of our algorithm is a substantial improvement over the complexity bounds of previously known exact algorithms. Computing and approximating the diameter are fundamental primitives in high-dimensional computational geometry and find practical application, for example, in clustering operations for image databases.
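As a feel for the primitive (not the paper's randomized algorithm), a common cheap heuristic is the two-sweep scan: take an arbitrary point, find the farthest point q from it, then the farthest point from q. The result is a lower bound on the diameter, computed in O(dn) time:

```python
import numpy as np

def approx_diameter(points, seed=0):
    # Two-sweep diameter lower bound: p -> farthest q from p -> farthest r
    # from q. Returns ||q - r||, which never exceeds the true diameter.
    pts = np.asarray(points, dtype=float)
    p = pts[seed]
    q = pts[np.argmax(np.linalg.norm(pts - p, axis=1))]
    r = pts[np.argmax(np.linalg.norm(pts - q, axis=1))]
    return float(np.linalg.norm(q - r))
```

On well-spread point sets this bound is often tight (e.g., it is exact on the corners of a square), which is why it is a common starting estimate before more careful refinement.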

15.
Objective: Variational optical flow computation based on Markov random fields (MRF) is a comparatively robust optical flow method, but its computational efficiency is very low. Belief propagation (BP) is an efficient global optimization algorithm for MRFs. This paper proposes an MRF variational optical flow model and implements it with a parallel BP method, greatly improving computational efficiency. Method: The data term of the proposed MRF model adopts the basic optical flow constraint equation derived by Horn et al. from the brightness constancy assumption, adjusted with a non-quadratic penalty function to smooth boundary effects. To achieve efficient parallel processing on the CUDA platform, an optimized BP-based parallel MRF optical flow method is proposed. When minimizing the MRF optical flow energy function with belief propagation, it uses a four-layer three-dimensional network structure for parallel computation, each layer corresponding to message propagation in one direction of the 4-neighborhood MRF model; within each layer, multiple threads are assigned to each pixel and the messages to be passed are computed with a parallel reduction, greatly lowering the per-thread load and substantially improving efficiency. Results: On a rotating-sphere image sequence, computational efficiency improved by a factor of 314. On three image sequences (rotating sphere, Yosemite, and RubberWhale), compared with the Horn algorithm, the Weickert algorithm, Hossen's parallel Lucas algorithm, and Grauer-Gray's parallel MRF algorithm, the proposed method achieved the lowest average endpoint error (AEE): 0.13, 0.55, and 0.34, respectively. Conclusion: This paper proposes a new MRF optical flow model and realizes optimized parallel computation on the CUDA platform. Experimental results show that the proposed parallel method greatly improves computational efficiency while maintaining accuracy. The method demands a large amount of memory, which limits the number of sampling points when processing high-resolution images and makes large displacements difficult to compute.

16.
A new subdivision method for computing the nearest univariate GCD is described and analyzed. It is based on an exclusion test and an inclusion test. The exclusion test for a cell exploits the Taylor expansion of the polynomial at the center of the cell. The inclusion test uses Smale's α-theorems to certify the existence and uniqueness of a solution in a cell. Under the condition of simple roots for the distance minimization problem, we analyze the complexity of the algorithm in terms of a condition number, which is the inverse of the distance to the set of degenerate systems. We report on experiments with representative examples to illustrate the behavior of the algorithm.

17.
B. DasGupta, X. He, T. Jiang, M. Li, J. Tromp. Algorithmica, 1999, 25(2-3): 176-195
Different phylogenetic trees for the same group of species are often produced either by procedures that use diverse optimality criteria [16] or from different genes [12] in the study of molecular evolution. Comparing these trees to find their similarities and dissimilarities (i.e., distance) is thus an important issue in computational molecular biology. Several distance metrics, including the nearest neighbor interchange (nni) distance and the subtree-transfer distance, have been proposed and extensively studied in the literature. This article considers a natural extension of the subtree-transfer distance, called the linear-cost subtree-transfer distance, and studies the complexity and efficient approximation algorithms for this distance as well as its relationship to the nni distance. The linear-cost subtree-transfer model seems more suitable than the (unit-cost) subtree-transfer model in some applications. The following is a list of our results: 1. The linear-cost subtree-transfer distance is in fact identical to the nni distance on unweighted phylogenies. 2. There is an algorithm to compute an optimal linear-cost subtree-transfer sequence between unweighted phylogenies in O(n·2^O(d)) time, where d denotes the linear-cost subtree-transfer distance. Such an algorithm is useful when d is small. 3. Computing the linear-cost subtree-transfer distance between two weighted phylogenetic trees is NP-hard, provided we allow multiple leaves of a tree to share the same label (i.e., the trees are not necessarily uniquely labeled). 4. There is an efficient approximation algorithm for computing the linear-cost subtree-transfer distance between weighted phylogenies with performance ratio 2. Received May 8, 1997; revised February 20, 1998.

18.
Objective: The signed distance function has important applications in level-set image segmentation, visual feature extraction, and other areas of image processing. As image resolutions grow higher, the efficiency of computing the signed distance function directly affects image processing speed. To enable real-time processing of high-resolution images, this paper proposes a parallel algorithm based on the dimensionality-reduction method and improves that method for parallel computation. Method: The dimensionality-reduction method converts the 2D distance computation into two 1D distance computations and uses the parabola lower-envelope method for the 1D distances; it is currently the fastest method for computing signed distances. First, exploiting the independence of the row and column computations, a parallel version of the dimensionality-reduction method is proposed. The parallel method is then improved with a parallel version of the parabola lower-envelope algorithm: multiple threads compute the parabola lower envelope in parallel over segments, i.e., each pixel performs parabola-intersection computations with its neighboring pixels within a segment in parallel to quickly search the lower envelope, thereby realizing segment-parallel distance function computation. All parallel algorithms are implemented with GPU general-purpose parallel computing on the CUDA platform. Results: Experiments on nine images with different resolutions and different curve content show that, with a distance computation error below 1, the parallel dimensionality-reduction algorithm takes less than 0.06 s for every test image, more than a 10-fold efficiency improvement over the serial method; the improved parallel dimensionality-reduction algorithm takes less than 0.03 s for every test image, roughly a 20-fold improvement over the serial method. Conclusion: The method achieves fast parallel computation of the signed distance function; its advantage is that it still allows real-time processing when the image resolution is high.
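The serial 1D building block referenced here — the parabola lower-envelope method of Felzenszwalb and Huttenlocher — can be sketched as follows; the segment-parallel GPU variant described in the abstract is not shown:

```python
def dt1d(f):
    # 1D squared-distance transform via the lower envelope of parabolas
    # (Felzenszwalb-Huttenlocher): f[q] is the source cost at position q,
    # and the result is d[q] = min_p ((q - p)^2 + f[p]).
    n = len(f)
    INF = float("inf")
    d = [0.0] * n
    v = [0] * n               # positions of parabolas on the envelope
    z = [0.0] * (n + 1)       # boundaries between consecutive parabolas
    k = 0
    z[0], z[1] = -INF, INF
    for q in range(1, n):
        # Intersection of the parabola rooted at q with the envelope's last one.
        s = ((f[q] + q * q) - (f[v[k]] + v[k] ** 2)) / (2 * q - 2 * v[k])
        while s <= z[k]:      # new parabola hides the previous one: pop it
            k -= 1
            s = ((f[q] + q * q) - (f[v[k]] + v[k] ** 2)) / (2 * q - 2 * v[k])
        k += 1
        v[k] = q
        z[k] = s
        z[k + 1] = INF
    k = 0
    for q in range(n):        # read the envelope left to right
        while z[k + 1] < q:
            k += 1
        d[q] = (q - v[k]) ** 2 + f[v[k]]
    return d
```

Running it over all rows and then all columns of an image (the dimensionality reduction above) yields the exact 2D squared Euclidean distance transform; very large finite values stand in for "no source here".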

19.
20.
NSGA-II is a multi-objective evolutionary algorithm with excellent performance that has become very popular in recent years. To further improve the efficiency of NSGA-II on bi-objective optimization problems, an on-demand layering strategy is adopted, and a new method for sorting the non-dominated front set into layers is proposed to replace the original layering method of NSGA-II. Compared with the O(N²) time complexity of NSGA-II, the time complexity of the new method is reduced to O(kN + N log N), where k is the number of front layers (k <...
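For two minimization objectives, non-dominated sorting admits an O(N log N) layering via binary search, which is the kind of speedup over NSGA-II's O(N²) sorting described here. A generic sketch of that idea (assuming distinct objective values; not necessarily the paper's exact procedure):

```python
import bisect

def front_layers(points):
    # O(N log N) non-dominated sorting for two minimization objectives:
    # after sorting by f1, each point is placed on the first front whose
    # most recently added point does not dominate it. Within a front,
    # f2 values of added points strictly decrease, so the last-added f2
    # is that front's minimum.
    order = sorted(range(len(points)), key=lambda i: points[i])
    last_f2 = []                  # last_f2[k] = f2 of newest point on front k
    front = [0] * len(points)
    for i in order:
        f2 = points[i][1]
        k = bisect.bisect_right(last_f2, f2)   # first front with last_f2 > f2
        if k == len(last_f2):
            last_f2.append(f2)                 # open a new front layer
        else:
            last_f2[k] = f2
        front[i] = k
    return front
```

The `last_f2` list stays sorted, so every placement is a single binary search, giving k layers in O(N log N) total.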
