1.

Locally linear metric adaptation with application to semi-supervised clustering and image retrieval

Hong Chang 《Pattern recognition》2006,39(7):1253-1264

Many computer vision and pattern recognition algorithms are very sensitive to the choice of an appropriate distance metric. Some recent research sought to address a variant of the conventional clustering problem called semi-supervised clustering, which performs clustering in the presence of some background knowledge or supervisory information expressed as pairwise similarity or dissimilarity constraints. However, existing metric learning methods for semi-supervised clustering mostly perform global metric learning through a linear transformation. In this paper, we propose a new metric learning method that performs nonlinear transformation globally but linear transformation locally. In particular, we formulate the learning problem as an optimization problem and present three methods for solving it. Through some toy data sets, we show empirically that our locally linear metric adaptation (LLMA) method can handle some difficult cases that cannot be handled satisfactorily by previous methods. We also demonstrate the effectiveness of our method on some UCI data sets. Besides applying LLMA to semi-supervised clustering, we have also used it to improve the performance of content-based image retrieval systems through metric learning. Experimental results based on two real-world image databases show that LLMA significantly outperforms other methods in boosting the image retrieval performance.

2.

A video database retrieval system based on color features (total citations: 2; self-citations: 0; citations by others: 2)

许伟许宏丽《计算机工程与设计》2006,27(7):1208-1210

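To provide effective video retrieval and browsing in a video database, an efficient index must be built. Because video data has a hierarchical structure, clustering can be applied after shot boundary detection to process the shot key frames at different similarity scales and so index the video data. The system uses color features: it detects shots with the Twin Comparison algorithm, extracts key frames by histogram averaging, and clusters the key frames with the K-means algorithm to build the video database index. Experimental results show that the system supports fast video browsing and retrieval well.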
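
The shot detection step named in this abstract can be illustrated with a small sketch. The abstract does not give the system's thresholds or exact difference measure, so the code below is a simplified reading of the twin-comparison idea under stated assumptions: frames are represented by color histograms, the difference is an L1 distance, and consecutive differences are accumulated (the function names and parameters `t_high` and `t_low` are hypothetical).

```python
def histogram_difference(h1, h2):
    """L1 distance between two color histograms of equal length."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def twin_comparison(histograms, t_high, t_low):
    """Simplified twin-comparison shot boundary detection.

    A frame-to-frame difference >= t_high is reported as an abrupt cut.
    A run of differences in [t_low, t_high) is accumulated; if the
    accumulated value reaches t_high, the run is reported as a gradual
    transition. Returns a list of (kind, first_frame, last_frame).
    """
    boundaries = []
    start = None          # first frame of a candidate gradual transition
    accumulated = 0.0
    for i in range(1, len(histograms)):
        d = histogram_difference(histograms[i - 1], histograms[i])
        if d >= t_high:
            boundaries.append(("cut", i, i))
            start, accumulated = None, 0.0
        elif d >= t_low:
            if start is None:
                start, accumulated = i, 0.0
            accumulated += d
        else:
            if start is not None and accumulated >= t_high:
                boundaries.append(("gradual", start, i - 1))
            start, accumulated = None, 0.0
    # Close a candidate transition that runs to the end of the video.
    if start is not None and accumulated >= t_high:
        boundaries.append(("gradual", start, len(histograms) - 1))
    return boundaries
```

The two thresholds are what let the method separate hard cuts from dissolves and fades: a single high threshold would miss slow transitions, and a single low one would over-segment.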

3.

Efficient matching and indexing of graph models in content-based retrieval

Berretti S. Del Bimbo A. Vicario E. 《IEEE transactions on pattern analysis and machine intelligence》2001,23(10):1089-1105

In retrieval from image databases, evaluation of similarity, based both on the appearance of spatial entities and on their mutual relationships, depends on content representation based on attributed relational graphs. This kind of modeling entails complex matching and indexing, which presently prevents its usage within comprehensive applications. In this paper, we provide a graph-theoretical formulation for the problem of retrieval based on the joint similarity of individual entities and of their mutual relationships and we expound its implications on indexing and matching. In particular, we propose the usage of metric indexing to organize large archives of graph models, and we propose an original look-ahead method which represents an efficient solution for the (sub)graph error correcting isomorphism problem needed to compute object distances. Analytic comparison and experimental results show that the proposed look-ahead improves the state-of-the-art in state-space search methods and that the combined use of the proposed matching and indexing scheme permits the management of the complexity of a typical application of retrieval by spatial arrangement.

4.

Relaxational metric adaptation and its application to semi-supervised clustering and content-based image retrieval

Hong Chang, William K. Cheung 《Pattern recognition》2006,39(10):1905-1917

The performance of many supervised and unsupervised learning algorithms is very sensitive to the choice of an appropriate distance metric. Previous work in metric learning and adaptation has mostly been focused on classification tasks by making use of class label information. In standard clustering tasks, however, class label information is not available. In order to adapt the metric to improve the clustering results, some background knowledge or side information is needed. One useful type of side information is in the form of pairwise similarity or dissimilarity information. Recently, some novel methods (e.g., the parametric method proposed by Xing et al.) for learning global metrics based on pairwise side information have been shown to demonstrate promising results. In this paper, we propose a nonparametric method, called relaxational metric adaptation (RMA), for the same metric adaptation problem. While RMA is local in the sense that it allows locally adaptive metrics, it is also global because even patterns not in the vicinity can have long-range effects on the metric adaptation process. Experimental results for semi-supervised clustering based on both simulated and real-world data sets show that RMA outperforms Xing et al.'s method under most situations. Besides applying RMA to semi-supervised learning, we have also used it to improve the performance of content-based image retrieval systems through metric adaptation. Experimental results based on two real-world image databases show that RMA significantly outperforms other methods in improving the image retrieval performance.

5.

Deep subspace clustering with automatic weight learning

江雨燕邵金李平《计算机工程》2022,48(8):77

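Subspace clustering is a clustering approach for high-dimensional data, characterized by its data self-representation and high clustering accuracy. Traditional subspace clustering algorithms focus on constructing an optimal similarity graph for the input data and then partitioning it, so the clustering quality depends heavily on similarity graph learning. The clustering with adaptive neighbors (CAN) algorithm improves similarity graph learning by adaptively assigning optimal neighbors according to the distances between data points to build the similarity graph and the cluster structure. However, existing CAN algorithms struggle to capture local data structure when performing nonlinear clustering of high-dimensional data, which limits clustering accuracy and generalization. This paper proposes a deep subspace clustering algorithm that combines automatic weight learning with structured information. An autoencoder maps the data into a nonlinear latent space and reduces its dimensionality; latent features are adaptively assigned different weights to handle noisy features; and the reconstruction error of the autoencoder is minimized to preserve the local structure of the data. The similarity graph is learned with the CAN method, and the correlations among features are iteratively strengthened in the latent representation to preserve the global structure of the data. Experimental results show that the algorithm reaches accuracies of 0.7801, 0.8743, and 0.7421 on the ORL, COIL-20, and UMIST data sets, respectively, outperforming algorithms such as LRR, LRSC, SSC, and KSSC.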

6.

Fast video segment retrieval by Sort-Merge feature selection, boundary refinement, and lazy evaluation

Yan Liu John R. Kender 《Computer Vision and Image Understanding》2003,92(2-3):147

We present a fast video retrieval system with three novel characteristics. First, it exploits the methods of machine learning to construct automatically a hierarchy of small subsets of features that are progressively more useful for indexing. These subsets are induced by a new heuristic method called Sort-Merge feature selection, which exploits a novel combination of Fastmap for dimensionality reduction and Mahalanobis distance for likelihood determination. Second, because these induced feature sets form a hierarchy with increasing classification accuracy, video segments can be segmented and categorized simultaneously in a coarse-fine manner that efficiently and progressively detects and refines their temporal boundaries. Third, the feature set hierarchy enables an efficient implementation of query systems by the approach of lazy evaluation, in which new queries are used to refine the retrieval index in real-time. We analyze the performance of these methods, and demonstrate them in the domain of a 75-min instructional video and a 30-min baseball video.

7.

Fast similarity search and clustering of video sequences on the world-wide-web (total citations: 3; self-citations: 0; citations by others: 3)

Cheung S.-S. Zakhor A. 《Multimedia, IEEE Transactions on》2005,7(3):524-537

We define similar video content as video sequences with almost identical content but possibly compressed at different qualities, reformatted to different sizes and frame-rates, undergone minor editing in either spatial or temporal domain, or summarized into keyframe sequences. Building a search engine to identify such similar content in the World-Wide Web requires: 1) robust video similarity measurements; 2) fast similarity search techniques on large databases; and 3) intuitive organization of search results. In a previous paper, we proposed a randomized technique called the video signature (ViSig) method for video similarity measurement. In this paper, we focus on the remaining two issues by proposing a feature extraction scheme for fast similarity search, and a clustering algorithm for identification of similar clusters. Similar to many other content-based methods, the ViSig method uses high-dimensional feature vectors to represent video. To warrant a fast response time for similarity searches on high dimensional vectors, we propose a novel nonlinear feature extraction scheme on arbitrary metric spaces that combines the triangle inequality with the classical Principal Component Analysis (PCA). We show experimentally that the proposed technique outperforms PCA, Fastmap, Triangle-Inequality Pruning, and Haar wavelet on signature data. To further improve retrieval performance, and provide better organization of similarity search results, we introduce a new graph-theoretical clustering algorithm on large databases of signatures. This algorithm treats all signatures as an abstract threshold graph, where the distance threshold is determined based on local data statistics. Similar clusters are then identified as highly connected regions in the graph. By measuring the retrieval performance against a ground-truth set, we show that our proposed algorithm outperforms simple thresholding, single-link and complete-link hierarchical clustering techniques.
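
To make the threshold-graph idea in this abstract concrete, here is a minimal sketch. It is a deliberate simplification and not the authors' algorithm: it uses one fixed threshold instead of one derived from local data statistics, and it takes plain connected components as clusters rather than "highly connected regions". The function name and parameters are hypothetical.

```python
def threshold_graph_clusters(items, distance, threshold):
    """Cluster items as connected components of a threshold graph.

    Two items are joined by an edge when distance(a, b) <= threshold.
    Returns a sorted list of clusters, each a sorted list of item indices.
    """
    n = len(items)
    parent = list(range(n))  # union-find forest over item indices

    def find(i):
        # Find the component root, halving paths along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if distance(items[i], items[j]) <= threshold:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())
```

Plain connected components behave like single-link clustering cut at the threshold, which the abstract lists as a baseline; requiring denser connectivity inside a cluster is what would distinguish the proposed method.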

8.

Video retrieval combining sparse coding and pyramid matching

甘玲汪子彧《计算机工程与应用》2013,49(21):191-194

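To address the imprecision of representing basic features by vector quantization in video retrieval systems built on pyramid matching, this paper incorporates sparse coding into video retrieval. After the basic features of a video are represented by sparse coding, the pyramid method matches them multiple times, and the results of these matches are combined linearly to give the corrected similarity measure. Retrieval experiments on UCF50 show that the method significantly improves retrieval accuracy.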

9.

Efficient Content-Based Image Retrieval through Metric Histograms (total citations: 1; self-citations: 0; citations by others: 1)

Traina A. J. M. Traina C. Bueno J. M. Chino F. J. T. Azevedo-Marques P. 《World Wide Web》2003,6(2):157-185

This paper presents a new and efficient method for content-based image retrieval employing the color distribution of images. This new method, called metric histogram, takes advantage of the correlation among adjacent bins of histograms, reducing the dimensionality of the feature vectors extracted from images, leading to faster and more flexible indexing and retrieval processes. The proposed technique works on each image independently from the others in the dataset, therefore there is no pre-defined number of color regions in the resulting histogram. Thus, it is not possible to use traditional comparison algorithms such as Euclidean or Manhattan distances. To allow the comparison of images through the new feature vectors given by metric histograms, a new metric distance function MHD( ) is also proposed. This paper shows the improvements in timing and retrieval discrimination obtained using metric histograms over traditional ones, even when using images with different spatial resolution or thumbnails. The experimental evaluation of the new method, for answering similarity queries over two representative image databases, shows that the metric histograms surpass the retrieval ability of traditional histograms because they are invariant to geometrical and brightness image transformations, and answer the queries up to 10 times faster than the traditional ones.

10.

Efficient and robust fuzzy-weighted human action video retrieval

张涵韩毅李跃新《计算机应用研究》2019,36(3)

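To improve the robustness and efficiency of human action video retrieval, a fuzzy-weighted human action video retrieval method is proposed. The method uses the 3D Harris operator to detect spatio-temporal interest points in a video and extracts the gradient information at these points to build feature vectors. It then applies fuzzy clustering to build clustered feature vectors, which makes the feature vectors more resistant to interference. Next, it matches pairs of gradient vectors in the clustered feature vectors, builds a fuzzy weight matrix, and computes the similarity between the query video and each video in the database. Finally, video retrieval experiments on the KTH database, evaluated on precision, recall, and retrieval time, show that the method performs best.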

11.

An approach to video retrieval by video clip (total citations: 14; self-citations: 0; citations by others: 14)

彭宇新 Ngo Chong-Wah 董庆杰郭宗明肖建国《软件学报》2003,14(8):1409-1417

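Video clip retrieval is the main form of content-based video retrieval. It must solve two problems: (1) automatically segmenting from the video library the multiple clips that are similar to the query clip; and (2) ranking these similar clips from most to least similar. This paper is a first attempt to solve both problems with matching theory from graph theory. For problem (1), retrieval is divided into two stages: shot retrieval and clip retrieval. In the shot retrieval stage, camera motion information is used to divide a shot with large variation into several sub-shots of consistent content, and the similarity of two shots is computed from the similarity of their corresponding sub-shots. In the clip retrieval stage, candidate similar clips are first obtained by examining the continuity of similar shots, and the Hungarian algorithm for maximum matching is then used to determine the truly similar clips. For problem (2), the visual, granularity, order, and interference factors in judging clip similarity are considered, and the Kuhn-Munkres algorithm for optimal matching is combined with dynamic programming to measure clip similarity. Comparative experiments show that the proposed method achieves higher retrieval accuracy and faster retrieval speed in clip retrieval.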
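
The maximum-matching step mentioned in this abstract can be sketched with the classic augmenting-path (Hungarian) method for unweighted bipartite graphs. The sketch assumes, hypothetically, that shot similarity has already been thresholded into a list `adjacency`, where `adjacency[u]` holds the candidate-clip shots similar to query shot `u`; how the paper builds that graph is not stated in the abstract.

```python
def max_bipartite_matching(adjacency, n_right):
    """Maximum cardinality bipartite matching by augmenting paths.

    adjacency[u] lists the right-side vertices adjacent to left vertex u.
    Returns (matching_size, match_right), where match_right[v] is the
    left vertex matched to right vertex v, or -1 if v is unmatched.
    """
    match_right = [-1] * n_right

    def try_augment(u, visited):
        # Look for an augmenting path starting at left vertex u.
        for v in adjacency[u]:
            if v in visited:
                continue
            visited.add(v)
            # Take v if it is free, or if its current partner can be
            # moved to a different right vertex.
            if match_right[v] == -1 or try_augment(match_right[v], visited):
                match_right[v] = u
                return True
        return False

    size = 0
    for u in range(len(adjacency)):
        if try_augment(u, set()):
            size += 1
    return size, match_right
```

A one-to-one matching keeps a single candidate shot from being counted as similar to several query shots, so the matching size is a stricter signal of clip similarity than a raw count of similar shot pairs.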

12.

Adaptive partial graph learning and fusion for incomplete multi-view clustering

Xiao Zheng Xinwang Liu Jiajia Chen En Zhu 《International Journal of Intelligent Systems》2022,37(1):991-1009

Most existing multi-view clustering methods assume that different feature views of data are fully observed. However, it is common that only portions of data features can be obtained in many practical applications. The presence of incomplete feature views hinders the performance of the conventional multi-view clustering methods to a large extent. Recently proposed incomplete multi-view clustering methods often focus on directly learning a common representation or a consensus affinity similarity graph from available feature views while ignoring the valuable information hidden in the missing views. In this study, we present a novel incomplete multi-view clustering method via adaptive partial graph learning and fusion (APGLF), which can capture the local data structure both within and across views. Specifically, we use the available data of each view to learn a corresponding view-specific partial graph, in which the within-view local structure can be well preserved. Then we design a cross-view graph fusion term to learn a consensus complete graph for different views, which can take advantage of the complementary information hidden in the view-specific partial graphs learned from incomplete views. In addition, a rank constraint is imposed on the graph Laplacian matrix of the fused graph to better recover the optimal cluster structure of original data. Therefore, APGLF integrates within-view partial graph learning, cross-view partial graph fusion and cluster structure recovering into a unified framework. Experiments on five incomplete multi-view data sets are conducted to validate the efficacy of APGLF when compared with eight state-of-the-art methods.

13.

Efficient motion data indexing and retrieval with local similarity measure of motion strings

Shuangyuan Wu Shihong Xia Zhaoqi Wang Chunpeng Li 《The Visual computer》2009,25(5-7):499-508

Widely used in data-driven computer animation, motion capture data exhibits its complexity both spatially and temporally. The indexing and retrieval of motion data is a hard task that is not totally solved. In this paper, we present an efficient motion data indexing and retrieval method based on self-organizing map and Smith–Waterman string similarity metric. Existing motion clips are first used to train a self-organizing map and then indexed by the nodes of the map to get the motion strings. The Smith–Waterman algorithm, a local similarity measure method for string comparison, is used in clustering the motion strings. Then the motion motif of each cluster is extracted for the retrieval of example-based query. As an unsupervised learning approach, our method can cluster motion clips automatically without needing to know their motion types. Experiment results on a dataset of various kinds of motion show that the proposed method not only clusters the motion data accurately but also retrieves appropriate motion data efficiently.
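
The Smith–Waterman local similarity measure used in this abstract can be sketched as follows. The scoring values (`match`, `mismatch`, `gap`) are illustrative assumptions, since the abstract does not give the paper's scoring scheme; the inputs can be any sequences, such as strings of self-organizing-map node labels.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score between sequences a and b.

    Uses the standard Smith-Waterman recurrence with a linear gap
    penalty. Cell scores are floored at zero, so an alignment can
    start and end anywhere in either sequence.
    """
    rows, cols = len(a) + 1, len(b) + 1
    h = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            h[i][j] = max(
                0,                             # start a new alignment
                h[i - 1][j - 1] + substitution,  # align a[i-1] with b[j-1]
                h[i - 1][j] + gap,             # gap in b
                h[i][j - 1] + gap,             # gap in a
            )
            best = max(best, h[i][j])
    return best
```

Because the score is local, two motion strings that share only a common sub-motion still receive a high similarity, which is the property that makes the measure suitable for finding shared motifs.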

14.

ALSBIR: A local-structure-based image retrieval

Yanling Chi 《Pattern recognition》2007,40(1):244-261

The general-purpose shape retrieval problem is a challenging task. In particular, there is demand for an ideal technique that can work in a clustered environment, meet the requirements of a perceptual similarity measure on partial queries, and overcome the curse of dimensionality and adverse environments. This paper reports our study on one local structural approach that addresses these issues. Shape representation and indexing are two key points in shape retrieval. The proposed approach combines a novel local-structure-based shape representation and a new histogram indexing structure. The former makes possible partial shape matching of objects without requiring segmentation (separation) of objects from a complex background, while the latter has an advantage in indexing performance. The search time is linearly proportional to the input complexity. In addition, the method is relatively robust under adverse environments. It is able to infer retrieval results from incomplete information of an input by first extracting consistent and structurally unique local neighborhood information from inputs or models, and then voting on the optimal matches. Thousands of images have been used to test the proposed concepts on sensitivity analysis, similarity-based retrieval, partial query and mixed object query. Very encouraging experimental results with respect to efficiency and effectiveness have been obtained.

15.

Partially supervised speaker clustering

Tang H Chu SM Hasegawa-Johnson M Huang TS 《IEEE transactions on pattern analysis and machine intelligence》2012,34(5):959-971

Content-based multimedia indexing, retrieval, and processing as well as multimedia databases demand the structuring of the media content (image, audio, video, text, etc.), one significant goal being to associate the identity of the content to the individual segments of the signals. In this paper, we specifically address the problem of speaker clustering, the task of assigning every speech utterance in an audio stream to its speaker. We offer a complete treatment to the idea of partially supervised speaker clustering, which refers to the use of our prior knowledge of speakers in general to assist the unsupervised speaker clustering process. By means of an independent training data set, we encode the prior knowledge at the various stages of the speaker clustering pipeline via 1) learning a speaker-discriminative acoustic feature transformation, 2) learning a universal speaker prior model, and 3) learning a discriminative speaker subspace, or equivalently, a speaker-discriminative distance metric. We study the directional scattering property of the Gaussian mixture model (GMM) mean supervector representation of utterances in the high-dimensional space, and advocate exploiting this property by using the cosine distance metric instead of the Euclidean distance metric for speaker clustering in the GMM mean supervector space. We propose to perform discriminant analysis based on the cosine distance metric, which leads to a novel distance metric learning algorithm, linear spherical discriminant analysis (LSDA). We show that the proposed LSDA formulation can be systematically solved within the elegant graph embedding general dimensionality reduction framework. Our speaker clustering experiments on the GALE database clearly indicate that 1) our speaker clustering methods based on the GMM mean supervector representation and vector-based distance metrics outperform traditional speaker clustering methods based on the "bag of acoustic features" representation and statistical model-based distance metrics, 2) our advocated use of the cosine distance metric yields consistent increases in the speaker clustering performance as compared to the commonly used Euclidean distance metric, 3) our partially supervised speaker clustering concept and strategies significantly improve the speaker clustering performance over the baselines, and 4) our proposed LSDA algorithm further leads to state-of-the-art speaker clustering performance.
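
The contrast this abstract draws between cosine and Euclidean distance can be shown with a small sketch. The vectors in the usage note are toy values, not GMM mean supervectors, and the function names are hypothetical.

```python
import math


def euclidean_distance(u, v):
    """Straight-line distance; sensitive to vector magnitude."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine_distance(u, v):
    """One minus the cosine of the angle between u and v.

    Depends only on direction, not on magnitude. Both vectors must
    be non-zero.
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_u * norm_v)
```

For example, `[1.0, 0.0]` and `[10.0, 0.0]` point in the same direction, so their cosine distance is essentially zero even though their Euclidean distance is large. That is the behavior one would want if, as the abstract argues, the data scatter mainly along directions.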

16.

Microstructure pattern extraction based image retrieval

Priyanka S. 《Multimedia Tools and Applications》2020,79(3-4):2263-2283

Computer vision techniques enhanced by the advent of deep learning have become a quintessential part of our day-to-day life. The application of such computer vision techniques in image retrieval can be termed query-based image retrieval. Conventional methods have limitations such as increased dimensionality, reduced accuracy, high time consumption, and dependence on indexing for retrieval. In order to overcome these limitations, this research work aims to develop a new image retrieval system by developing an image preprocessing mechanism via a target prediction technique, which isolates the object from the background. Further, a Micro-structure based Pattern Extraction (MPE) technique is implemented to extract the patterns from the preprocessed image, where the diagonal patterns are generated for increasing the accuracy of the retrieval process. Consequently, the Convolutional Neural Network (CNN) is utilized to reduce the dimensionality of the features, and the similarity learning approach is utilized to map the selected features with trained features based on the distance metric. The performance of the proposed system is evaluated by using various measures. Thereby, the efficiency of the proposed technique is ascertained by comparing it with the existing techniques.

17.

Design of a relevance feedback algorithm for video retrieval based on Bayesian learning

邓丽刘小辉金立左费树岷《计算机应用研究》2008,25(3):934-935

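To address the question of how to improve retrieval precision in information retrieval, a video retrieval algorithm based on relevance feedback is proposed. The retrieval problem is described in a probabilistic framework, and the probability distribution is updated from the user's actions by Bayesian learning, which provides automatic relevance feedback and improves retrieval precision. Experiments show that the algorithm retrieves markedly more accurately than the video retrieval method based on the nearest feature line (NFL).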
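
The Bayesian update behind this kind of relevance feedback can be sketched generically. The abstract does not specify the paper's user model, so the likelihood below (an exponential of feature distance, with a hypothetical `scale` parameter) is purely an illustrative assumption; only the posterior update itself is standard Bayes' rule.

```python
import math


def feedback_likelihood(distance, marked_relevant, scale=1.0):
    """Assumed user model: P(feedback | candidate is the target).

    A candidate close to an item the user marked relevant is likely to
    be the target; one close to an item marked irrelevant is unlikely.
    """
    closeness = math.exp(-distance / scale)
    return closeness if marked_relevant else 1.0 - closeness


def bayes_update(prior, likelihoods):
    """Posterior over candidates: proportional to prior * likelihood."""
    unnormalized = [p * l for p, l in zip(prior, likelihoods)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("feedback is impossible under every candidate")
    return [u / total for u in unnormalized]
```

Each round of feedback feeds the previous posterior back in as the new prior, so the ranking of database videos sharpens as the user keeps marking results.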

18.

Personalized video similarity measure

Jialie Shen Zhiyong Cheng 《Multimedia Systems》2011,17(5):421-433

As an effective technique to manage and explore large-scale video collections, personalized video search has received great attention in recent years. One of the key problems in the related technique development is how to design and evaluate the similarity measures. Most of the existing approaches simply adopt traditional Euclidean distance or its variants. Consequently, they generally suffer from two main disadvantages: (1) low effectiveness: retrieval accuracy is poor, mainly because very little research has been carried out on designing an effective fusion scheme for integrating multimodal information (e.g., text, audio and visual) from video sequences; and (2) poor scalability: the development process of the video similarity metrics is largely disconnected from that of the relevant database access methods (indexing structures). This article reports a new distance metric called personalized video distance to effectively fuse information about individual preference and multimodal properties into a compact signature. Moreover, a novel hashing-based indexing structure has been designed to facilitate a fast retrieval process and better scalability. A set of comprehensive empirical studies has been carried out based on two large video test collections and carefully designed queries with different complexities. We observe significant improvements over the existing techniques on various aspects.

19.

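A color-based image representation and global similarity retrieval technique (total citations: 6; self-citations: 0; citations by others: 6)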

曹奎冯玉才王元珍《计算机研究与发展》2001,38(9):1121-1126

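Content-based image retrieval is a current research focus in the image database field. This paper presents an image representation that describes the visual features of an image and uses it to compute the global similarity between images. First, color invariants are extracted from the image through an analysis of the color space. This color information is then analyzed in the frequency domain, and a K-L transform is applied to the result; the resulting low-dimensional vector is the color representation of the image. On this basis, the paper also discusses image similarity measures and the corresponding image retrieval techniques, and presents experimental results and an evaluation of retrieval performance.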

20.

Image retrieval based on spatial features (total citations: 2; self-citations: 1; citations by others: 1)

史婷婷李岩《计算机应用》2008,28(9):2292-2296

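This paper proposes SCH, a new image feature descriptor based on spatial features. The MCVAE algorithm, which combines color vector angle and Euclidean distance, is used to detect the edges of the original color image, while a new "max-min component color invariant model" is used to quantize the original image. An edge correlation matrix is built for the edge pixels, and a color histogram describes the local color distribution of the non-edge pixels. A new sin similarity measure then evaluates the similarity between image features. For the experiments, a content-based image retrieval prototype system, "SttImageRetrieval", was developed with VC++ 6.0, and a comprehensive image database, "IMAGEDB", was built on an Oracle 9i database. Analysis of the experimental results shows that retrieval with the SCH descriptor is markedly more accurate than retrieval based only on color statistics.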