1.

许源薛向阳《计算机科学》2006,33(11):134-138

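Accurately extracting high-level semantic features from video helps improve content-based video retrieval. Local high-level semantic features of video describe the objects in an image frame. Considering the characteristics of the objects themselves and of the specific scenes in which they appear, we propose an algorithm that combines the local and global information of image frames to extract local high-level semantic features of video. Experimental results on the TRECVID2005 dataset show that this method performs better than methods based on local information alone or on global information alone.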

2.

Unsupervised Video Hashing via Deep Neural Network

Chao Ma Yun Gu Chen Gong Jie Yang Deying Feng 《Neural Processing Letters》2018,47(3):877-890

Hashing is a common solution for content-based multimedia retrieval: it encodes high-dimensional feature vectors into short binary codes. Previous work mainly focuses on the image hashing problem. However, these methods cannot be used directly for video hashing, as videos contain not only spatial structure within each frame but also temporal correlation between successive frames. Several researchers proposed to handle this by encoding the extracted key frames, but these frame-based methods are time-consuming in real applications. Other researchers proposed to characterize the video by averaging the spatial features of its frames, so that existing hashing methods can be adopted. Unfortunately, this sort of “video” feature does not take the correlation between frames into consideration and may lose the temporal information. Therefore, in this paper, we propose a novel unsupervised video hashing framework via deep neural network, which performs video hashing by incorporating the temporal structure as well as the conventional spatial structure. Specifically, the spatial features of videos are obtained with a convolutional neural network, and the temporal features are established via long short-term memory. After that, a time series pooling strategy is employed to obtain a single feature vector for each video. The obtained spatio-temporal feature can be applied to many existing unsupervised hashing methods. Experimental results on two real datasets indicate that, by employing the spatio-temporal features, our hashing method significantly improves the performance of existing methods that use only the spatial features, and also obtains higher mean average precision than state-of-the-art video hashing methods.
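
The pipeline in the abstract above (per-frame features, pooling over time, then hashing to a short binary code) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the per-frame features are taken as given, mean pooling stands in for the time series pooling strategy (whose exact form the abstract does not give), and random hyperplane projection stands in for the unsupervised hashing methods the paper plugs in. The names `pool_video_features`, `make_hasher`, and `hamming_distance` are hypothetical.

```python
import numpy as np


def pool_video_features(frame_features: np.ndarray) -> np.ndarray:
    """Collapse a (num_frames, dim) array into one (dim,) vector.

    Assumption: plain mean pooling over the time axis.
    """
    return frame_features.mean(axis=0)


def make_hasher(dim: int, num_bits: int, seed: int = 0):
    """Return a function that maps a (dim,) vector to num_bits binary digits.

    Assumption: random hyperplanes (a classic locality-sensitive hash),
    used here only as a stand-in for a learned hashing method.
    """
    rng = np.random.default_rng(seed)
    planes = rng.standard_normal((num_bits, dim))

    def hash_vector(v: np.ndarray) -> np.ndarray:
        # One bit per hyperplane: which side of the plane the vector lies on.
        return (planes @ v > 0).astype(np.uint8)

    return hash_vector


def hamming_distance(a: np.ndarray, b: np.ndarray) -> int:
    """Number of bit positions in which two binary codes differ."""
    return int(np.count_nonzero(a != b))
```

Retrieval then amounts to hashing the query video and ranking database videos by Hamming distance to the query code, which is cheap because the codes are short.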

3.

Novel automatic video cut detection technique using Gabor filtering

Tudor Barbu 《Computers & Electrical Engineering》2009,35(5):712-721

Video shot transition identification constitutes an important computer vision research field, being applied, as an essential step, in many other digital video analysis domains: video scene detection, video compression, video indexing, video content retrieval and video object tracking. This paper approaches the video cut transition detection domain, providing a novel feature-based automatic identification method. We propose a feature extraction technique that uses 2D Gabor filtering, computing tridimensional image feature vectors for the video frames. Most shot cut detection techniques use a thresholding operation to discriminate between the inter-frame difference metric values and thus identify the video break points. Our identification approach is not threshold-based; it uses an automatic unsupervised distance classification procedure instead of a threshold. Thus, we provide a region-growing based classification approach that proves to be very efficient in clustering the distances between feature vectors of consecutive frames. The two resulting distance classes determine a satisfactory video shot detection.
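
The idea of splitting inter-frame distances into two classes without a hand-set threshold can be sketched as follows. This is not the region-growing procedure from the paper, whose details the abstract does not give; a one-dimensional 2-means clustering is swapped in as a simple unsupervised stand-in, and the function name `two_class_split` is hypothetical.

```python
import numpy as np


def two_class_split(distances, iters: int = 50) -> np.ndarray:
    """Label each inter-frame distance 0 ("no cut") or 1 ("cut").

    Assumption: 1-D 2-means clustering, seeded with the smallest and the
    largest distance as the two initial class centers.
    """
    d = np.asarray(distances, dtype=float)
    lo, hi = d.min(), d.max()
    labels = np.zeros(d.shape, dtype=int)
    for _ in range(iters):
        # Assign each distance to the nearer of the two centers.
        labels = (np.abs(d - hi) < np.abs(d - lo)).astype(int)
        if labels.all() or not labels.any():
            break  # one class is empty, nothing left to refine
        new_lo, new_hi = d[labels == 0].mean(), d[labels == 1].mean()
        if new_lo == lo and new_hi == hi:
            break  # centers stopped moving
        lo, hi = new_lo, new_hi
    return labels
```

If `distances[i]` is the distance between frames `i` and `i + 1`, then `np.flatnonzero(labels) + 1` gives the frame indices at which a new shot would start under this sketch.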

4.

Panoramic Image Stitching for Video Sequences (total citations: 10; self-citations: 0; citations by others: 10)

朱云芳叶秀清顾伟康《中国图象图形学报》2006,11(8):1150-1155

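We propose a method for stitching a panorama from a video sequence. The discussion focuses on sequences that contain large non-rigid moving objects, although the method applies equally to pure-background sequences with no moving objects. To compute the projective relationship between frames, an affine model describes the camera motion, and feature-point matching is used to estimate the model parameters. Because matches computed by correlation have relatively low accuracy, RANSAC (Random Sample Consensus) is used to filter the matches so that the camera motion parameters can be estimated accurately. The frames are projected using the motion parameters; multiple frames are then differenced and the results intersected to estimate the region occupied by moving objects in each frame, and finally the panorama is computed. A comparison with results from previous work shows that the method produces high-quality panoramas.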
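
The RANSAC filtering step described above can be sketched for an affine camera-motion model. This is a minimal illustration under stated assumptions, not the authors' implementation: the iteration count, the inlier tolerance `tol` (in pixels), and the function names `fit_affine` and `ransac_affine` are all hypothetical choices.

```python
import numpy as np


def fit_affine(src: np.ndarray, dst: np.ndarray) -> np.ndarray:
    """Least-squares 2x3 affine matrix A such that dst ~ A @ [x, y, 1]."""
    X = np.hstack([src, np.ones((len(src), 1))])  # (n, 3) homogeneous points
    sol, *_ = np.linalg.lstsq(X, dst, rcond=None)  # (3, 2)
    return sol.T  # (2, 3)


def ransac_affine(src, dst, iters: int = 200, tol: float = 1.0, seed: int = 0):
    """Estimate an affine model from point matches that contain outliers.

    Returns the 2x3 matrix refit on the largest inlier set, plus a boolean
    inlier mask over the input matches.
    """
    rng = np.random.default_rng(seed)
    X = np.hstack([src, np.ones((len(src), 1))])
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(iters):
        # Three matches are the minimum needed to determine an affine map.
        idx = rng.choice(len(src), size=3, replace=False)
        A = fit_affine(src[idx], dst[idx])
        err = np.linalg.norm(X @ A.T - dst, axis=1)
        inliers = err < tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    if best_inliers.sum() < 3:
        raise ValueError("not enough inliers to fit an affine model")
    return fit_affine(src[best_inliers], dst[best_inliers]), best_inliers
```

Refitting on all inliers at the end, rather than keeping the three-point model, averages out noise in the individual matches.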

5.

Character‐Object Interaction Retrieval using the Interaction Bisector Surface

X. Zhao M.G. Choi T. Komura 《Computer Graphics Forum》2017,36(2):119-129

6.

Robust object tracking via superpixels and keypoints

Shen Mingyu Zhang Yonggang Wang Ronggui Yang Juan Xue Lixia Hu Min 《Multimedia Tools and Applications》2018,77(19):25109-25129

7.

Video Object Detection Guided by Dual Optical Flow Networks

尉婉青禹晶史薪琪肖创柏《中国图象图形学报》2021,26(10):2473-2484

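Objective: Convolutional neural networks are widely used in object detection. The task of video object detection is to classify and localize moving objects in image sequences. Most existing video object detection methods build on a still-image object detector and exploit the temporal correlation specific to video to address the missed and false detections caused by occlusion, blur, and similar effects on moving objects. Method: We propose a video object detection model guided by dual optical flow networks. Within a two-stage object detection framework, two different optical flow networks estimate the flow fields for neighboring frames at different distances, and these fields are used to fuse features across multiple frames. For neighboring frames close to the current frame, an optical flow network for small-displacement motion estimation estimates the flow field; for neighboring frames farther away, an optical flow network for large-displacement motion estimation is used. Guided by the optical flow, the features of multiple neighboring frames are fused to compensate the features of the current frame. Result: Experiments show that the model achieves an mAP (mean average precision) of 76.4%, which is 28.9%, 8.0%, 0.6%, and 0.2% higher than the TCN (temporal convolutional networks) model, the TPN+LSTM (tubelet proposal network and long short term memory network) model, the D (& T loss) model, and the FGFA (flow-guided feature aggregation) model, respectively. Conclusion: By exploiting the temporal correlation specific to video, the model uses the dual optical flow networks to compensate the features of the current frame accurately from neighboring frames, which improves the accuracy of video object detection and mitigates missed and false detections.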

8.

Motion Flow-Based Video Retrieval (total citations: 2; self-citations: 0; citations by others: 2)

Chih-Wen Su Liao H.-Y.M. Hsiao-Rong Tyan Chia-Wen Lin Duan-Yu Chen Kuo-Chin Fan 《Multimedia, IEEE Transactions on》2007,9(6):1193-1201

In this paper, we propose the use of motion vectors embedded in MPEG bitstreams to generate so-called “motion flows”, which are applied to perform video retrieval. By using the motion vectors directly, we do not need to consider the shape of a moving object and its corresponding trajectory. Instead, we simply “link” the local motion vectors across consecutive video frames to form motion flows, which are then recorded and stored in a video database. In the video retrieval phase, we propose a new matching strategy to execute the video retrieval task. Motions that do not belong to the mainstream motion flows are filtered out by our proposed algorithm. The retrieval process can be triggered by query-by-sketch or query-by-example. The experiment results show that our method is indeed superb in the video retrieval process.

9.

Visual Localization with a Hierarchical Feature Network Based on Dynamic Traversal

蒋雪源陈青梅黄初华《计算机工程》2021,47(9):197-202

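Using a hierarchical feature network to estimate the camera pose of a query image can suffer from retrieval failures and slow retrieval. After analyzing hierarchical feature networks, we propose a visual localization method that uses dynamic traversal and pre-clustering. Images are pre-clustered according to the scene map; global image descriptors are used to obtain a set of candidate frames, which is traversed dynamically for the query image; local feature descriptors are used for feature-point matching; and the PnP algorithm estimates the camera pose of the query image. On this basis, a hierarchical feature network built on MobileNetV3 is constructed to extract global descriptors and local feature points accurately. Comparisons on typical datasets with mainstream visual localization methods such as AS, CSL, DenseVLAD, and NetVLAD show that the method improves candidate-frame retrieval efficiency in scenes with illumination and seasonal changes, and improves both pose estimation accuracy and candidate-frame retrieval speed.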

10.

Multiple hierarchical deep hashing for large scale image retrieval

Liangfu Cao Lianli Gao Jingkuan Song Fumin Shen Yuan Wang 《Multimedia Tools and Applications》2018,77(9):10471-10484

Learning-based hashing methods are becoming the mainstream for large scale visual search. They consist of two main components: hash code learning for training data and hash function learning for encoding new data points. The performance of a content-based image retrieval system crucially depends on the feature representation, and Convolutional Neural Networks (CNNs) have proven effective for extracting high-level visual features for large scale image retrieval. In this paper, we propose a Multiple Hierarchical Deep Hashing (MHDH) approach for large scale image retrieval. MHDH seeks to integrate multiple hierarchical non-linear transformations with a hidden neural network layer for hash code generation. The learned binary codes represent potential concepts that connect to class labels. Extensive experiments on two popular datasets demonstrate the superiority of MHDH over both supervised and unsupervised hashing methods.

11.

A new robust video watermarking algorithm based on SURF features and block classification

Zhila Bahrami Fardin Akhlaghian Tab 《Multimedia Tools and Applications》2018,77(1):327-345

In this paper, we propose a robust, block-classification-based, semi-blind video watermarking algorithm that uses visual cryptography and SURF (Speeded-Up Robust Features) features to enhance robustness, stability, imperceptibility, and real-time performance. A method of selecting the best frames in each shot and the best regions or blocks within those frames is proposed to avoid a frame-by-frame method for generating the owner's share, which enhances robustness and reduces time complexity. In our method, the owner's share is generated using the classification of selected robust blocks within the chosen frames along with the corresponding watermark information. In the extraction process, SURF features are employed to match the feature points of the selected frames against all frames in order to detect the selected frames. Moreover, we resynchronize the embedded regions from the distorted video to the original sequence using SURF feature point matching. Afterwards, based on these matched feature points, rotation and scaling parameters are estimated. Next, the selected blocks are retrieved using the stored side information. Eventually, the watermark information is reconstructed. Selecting the best frames and best regions and employing SURF features make our method highly robust against various kinds of attacks, including image processing attacks, geometrical attacks, and temporal attacks. Experimental results confirm the superiority of our scheme over previous related methods in terms of real-world applicability, robustness, and imperceptibility.

12.

An Image Retrieval Method Based on Contour Feature Points (total citations: 1; self-citations: 0; citations by others: 1)

陈文兵成海燕陈允杰徐钦《计算机工程》2012,38(12):197-200

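Traditional shape-based image retrieval methods have low retrieval efficiency. To address this problem, we propose an image retrieval method based on feature points of the object contour. The Mean Shift algorithm extracts the object of interest; the local extrema of the object's curvature serve as feature points, and the object is represented by the feature vector of these points. A distance-matching mechanism between the feature vectors of the query object and the retrieved objects is defined to match or recognize objects. Experiments show that, compared with traditional methods, the method achieves higher recall and precision.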
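
Selecting contour points at local curvature extrema can be sketched as below. This is a minimal illustration, not the authors' procedure: it assumes the contour is already extracted as a closed polyline with no repeated consecutive points, uses periodic finite differences for the derivatives, and applies no smoothing. The names `contour_curvature` and `curvature_extrema` are hypothetical.

```python
import numpy as np


def contour_curvature(points: np.ndarray) -> np.ndarray:
    """Signed curvature at each vertex of a closed contour, shape (n, 2).

    Uses kappa = (x' y'' - y' x'') / (x'^2 + y'^2)^(3/2) with periodic
    central differences, so the last point wraps around to the first.
    """
    x, y = points[:, 0], points[:, 1]
    dx = (np.roll(x, -1) - np.roll(x, 1)) / 2.0
    dy = (np.roll(y, -1) - np.roll(y, 1)) / 2.0
    ddx = np.roll(x, -1) - 2.0 * x + np.roll(x, 1)
    ddy = np.roll(y, -1) - 2.0 * y + np.roll(y, 1)
    return (dx * ddy - dy * ddx) / (dx**2 + dy**2) ** 1.5


def curvature_extrema(points: np.ndarray) -> np.ndarray:
    """Indices of vertices whose curvature is a strict local max or min."""
    k = contour_curvature(points)
    prev, nxt = np.roll(k, 1), np.roll(k, -1)
    is_max = (k > prev) & (k > nxt)
    is_min = (k < prev) & (k < nxt)
    return np.flatnonzero(is_max | is_min)
```

On an ellipse sampled counter-clockwise, for example, the extrema fall at the four axis endpoints: curvature peaks at the ends of the major axis and dips at the ends of the minor axis. Real contours are noisy, so smoothing the contour before differencing would usually be needed.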

13.

Vision-Based Ground Target Detection for Unmanned Airships (total citations: 1; self-citations: 0; citations by others: 1)

赵基宇胡士强《计算机工程》2012,38(8):170-172

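To address the loss of detail information in ground target detection from an unmanned airship, we propose a detection method for static and moving targets. The Lucas-Kanade method tracks feature points within the target region, which enables continuous detection of static targets. Tracking image feature points also gives an estimate of the global motion between adjacent frames, which is used to motion-compensate the images; moving targets are then detected from the compensated frame-difference image. Experiments and analysis on real video captured by the Shanghai Jiao Tong University “致远一号” (Zhiyuan-1) unmanned airship verify the effectiveness of the method.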

14.

3D model retrieval based on color + geometry signatures

Yong-Jin Liu Yi-Fu Zheng Lu Lv Yu-Ming Xuan Xiao-Lan Fu 《The Visual computer》2012,28(1):75-86

Color plays a significant role in the recognition of 3D objects and scenes from the perspective of cognitive psychology. In this paper, we propose a new 3D model retrieval method, focusing on not only the geometric features but also the color features of 3D mesh models. Firstly, we propose a new sampling method that samples the models in the regions of either geometry-high-variation or color-high-variation. After collecting geometry + color sensitive sampling points, we cluster them into several classes by using a modified ISODATA algorithm. Then we calculate the feature histogram of each model in the database using these clustered sampling points. For model retrieval, we compare the histogram of an input model to the stored histograms in the database to find the most similar models. To evaluate the retrieval method based on the new color + geometry signatures, we use the precision/recall performance metric to compare our method with several classical methods. Experimental results show that color information does help improve the accuracy of 3D model retrieval, which is consistent with the postulate in psychophysics that color should strongly influence the recognition of objects.
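
The retrieval step, comparing the query model's feature histogram with the stored histograms, can be sketched as a ranking by histogram distance. The abstract does not say which comparison measure is used, so the L1 distance between normalized histograms is an assumption here, and `normalize` and `rank_by_histogram` are hypothetical names.

```python
import numpy as np


def normalize(hist) -> np.ndarray:
    """Scale a histogram of non-negative counts so that its bins sum to 1."""
    h = np.asarray(hist, dtype=float)
    return h / h.sum()


def rank_by_histogram(query, database):
    """Rank database histograms from most to least similar to the query.

    Assumption: similarity is measured by the L1 distance between
    normalized histograms (0 means identical, 2 means disjoint).
    Returns (indices into database, matching distances), best first.
    """
    q = normalize(query)
    dists = np.array([np.abs(q - normalize(h)).sum() for h in database])
    order = np.argsort(dists, kind="stable")
    return order, dists[order]
```

Normalizing first makes the comparison independent of how many sampling points each model contributed to its histogram.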

15.

Shape-based indexing scheme for camera view invariant 3-D object retrieval

Hyoung Joong Kim Yoon-Sik Tak Eenjun Hwang 《Multimedia Tools and Applications》2010,47(1):7-29

Camera view invariant 3-D object retrieval is an important issue in many traditional and emerging applications such as security, surveillance, computer-aided design (CAD), virtual reality, and place recognition. One straightforward method for camera view invariant 3-D object retrieval is to consider all the possible camera views of 3-D objects. However, capturing and maintaining such views require an enormous amount of time and labor. In addition, all camera views should be indexed for reasonable retrieval performance, which requires extra storage space and maintenance overhead. In the case of shape-based 3-D object retrieval, such overhead could be relieved by considering the symmetric shape feature of most objects. In this paper, we propose a new shape-based indexing and matching scheme of real or rendered 3-D objects for camera view invariant object retrieval. In particular, in order to remove redundant camera views to be indexed, we propose a camera view skimming scheme, which includes: i) mirror shape pairing and ii) camera view pruning according to the symmetrical patterns of object shapes. Since our camera view skimming scheme considerably reduces the number of camera views to be indexed, it could relieve the storage requirement and improve the matching speed without sacrificing retrieval accuracy. Through various experiments, we show that our proposed scheme can achieve excellent performance.

16.

Object recognition using discriminative parts

Ying-Ho Liu Anthony J.T. Lee Fu Chang 《Computer Vision and Image Understanding》2012,116(7):854-867

The existing object recognition methods can be classified into two categories: interest-point-based and discriminative-part-based. The interest-point-based methods do not perform well if the interest points cannot be selected very carefully. The performance of the discriminative-part-based methods is not stable if viewpoints change, because they select discriminative parts from the interest points. In addition, the discriminative-part-based methods often do not provide an incremental learning ability. To address these problems, we propose a novel method that consists of three phases. First, we use some sliding windows that are different in scale to retrieve a number of local parts from each model object and extract a feature vector for each local part retrieved. Next, we construct prototypes for the model objects by using the feature vectors obtained in the first phase. Each prototype represents a discriminative part of a model object. Then, we establish the correspondence between the local parts of a test object and those of the model objects. Finally, we compute the similarity between the test object and each model object, based on the correspondence established. The test object is recognized as the model object that has the highest similarity with the test object. The experimental results show that our proposed method outperforms or is comparable with the compared methods in terms of recognition rates on the COIL-100 dataset, Oxford buildings dataset and ETH-80 dataset, and recognizes all query images of the ZuBuD dataset. It is robust to distortion, occlusion, rotation, and changes in viewpoint and illumination. In addition, we accelerate the recognition process using the C4.5 decision tree technique, and the proposed method has the ability to build prototypes incrementally.

17.

Combining topological and view-based features for 3D model retrieval

Pengjie Li Huadong Ma Anlong Ming 《Multimedia Tools and Applications》2013,65(3):335-361

With the rapid increase in the number of 3D models, 3D model retrieval methods have received significant research attention. Most of the existing methods focus on taking advantage of one kind of feature. These methods cannot achieve ideal retrieval results for different classes of 3D models. In this paper, we propose a novel 3D model retrieval algorithm by combining topological and view-based features. To preserve the topological structure of the 3D model, a multiresolutional Reeb graph (MRG) is constructed according to the salient topological points. The view-based features are extracted from the images, which are rendered at each of the topological points. To preserve the spatial structure information of the images, we modify the bag-of-features (BOF) method by using the combined shell-sector model. We take the view-based features as the attribute information of the corresponding MRG nodes. The comparison between two 3D models is transformed into the problem of computing the similarity of the corresponding MRGs. Finally, we calculate the similarity between the query model and the models in the databases by adapting the earth mover's distance method. Experimental results on two standard benchmarks show that our algorithm can achieve satisfactory retrieval performance.

18.

Image retrieval based on AND/OR-construction models

Huang Yin-Fu Hsieh Yun-Shin 《Multimedia Tools and Applications》2020,79(37-38):27293-27320

With the rapid development of the Internet, finding desired images among numerous images has become an important research topic. In this paper, we propose an image retrieval system that improves retrieval time and accuracy. Since the performance of image retrieval is deeply influenced by image features and retrieval methods, five different types of features and five different methods are used to find the best combination for an image retrieval system. First, we segment out the main object in an image and then extract its features. Next, relevant features are selected from the original feature set to facilitate image retrieval, using the SAHS algorithm. Then, five methods based on AND/OR-construction are proposed to build the image retrieval model, using the relevant features. Finally, the experimental results not only show that our methods are more effective than other state-of-the-art methods but also present some observations not explored in previous research.

19.

Fast moving object detection with non-stationary background

Jiman Kim Xiaofei Wang Hai Wang Chunsheng Zhu Daijin Kim 《Multimedia Tools and Applications》2013,67(1):311-335

The detection of moving objects under a free-moving camera is a difficult problem because the camera and object motions are mixed together and the objects are often detected as separate components. To tackle this problem, we propose a fast moving object detection method using optical flow clustering and Delaunay triangulation as follows. First, we extract the corner feature points using the Harris corner detector and compute optical flow vectors at the extracted corner feature points. Second, we cluster the optical flow vectors using the K-means clustering method and reject the outlier feature points using the Random Sample Consensus algorithm. Third, we classify each cluster as camera or object motion using the scatteredness of its optical flow vectors. Fourth, we compensate the camera motion using the multi-resolution block-based motion propagation method and detect the objects using background subtraction between the previous frame and the motion-compensated current frame. Finally, we merge the separately detected objects using Delaunay triangulation. The experimental results using the Carnegie Mellon University database show that the proposed moving object detection method outperforms other existing methods in terms of detection accuracy and processing time.

20.

A hierarchical visual model for video object summarization

Liu D Hua G Chen T 《IEEE transactions on pattern analysis and machine intelligence》2010,32(12):2178-2190
