Similar Literature
20 similar documents found (search time: 890 ms)
1.
Trackers based on Siamese networks cast tracking as a similarity-matching problem between a target template and a search region. Using shallow networks and offline training, these trackers perform well in simple scenarios. However, because they lack semantic information, they struggle to meet the accuracy requirements of the task in complex backgrounds and other challenging scenarios. To address this problem, we propose a new model that uses an improved ResNet-22 network to extract deep features carrying richer semantic information. Multilayer feature fusion produces a high-quality score map that reduces the influence of distractors in complex backgrounds. In addition, we propose a more powerful Corner Distance IoU (intersection-over-union) loss function so that the algorithm can regress the bounding box more accurately. In our experiments, the tracker was extensively evaluated on the object tracking benchmarks OTB2013 and OTB2015 and the visual object tracking benchmarks VOT2016 and VOT2017, where it achieved competitive performance, demonstrating the effectiveness of the method.
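The abstract does not spell out the Corner Distance IoU formulation. The sketch below is one plausible reading in the spirit of distance-penalized IoU losses such as DIoU, with the penalty measured on the two box corners instead of the box centers; the function name and the exact normalization are assumptions, not the paper's definition.

```python
def corner_distance_iou_loss(pred, gt):
    """Hedged sketch of a corner-distance IoU loss.
    Boxes are (x1, y1, x2, y2). Follows the DIoU-style pattern of
    penalising 1 - IoU by a normalised squared-distance term, here
    measured on the top-left and bottom-right corners."""
    # Intersection rectangle and IoU
    ix1, iy1 = max(pred[0], gt[0]), max(pred[1], gt[1])
    ix2, iy2 = min(pred[2], gt[2]), min(pred[3], gt[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_p = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_g = (gt[2] - gt[0]) * (gt[3] - gt[1])
    iou = inter / (area_p + area_g - inter + 1e-9)
    # Squared distances between corresponding corners
    d_tl = (pred[0] - gt[0]) ** 2 + (pred[1] - gt[1]) ** 2
    d_br = (pred[2] - gt[2]) ** 2 + (pred[3] - gt[3]) ** 2
    # Diagonal of the smallest enclosing box, for normalisation
    cx1, cy1 = min(pred[0], gt[0]), min(pred[1], gt[1])
    cx2, cy2 = max(pred[2], gt[2]), max(pred[3], gt[3])
    diag = (cx2 - cx1) ** 2 + (cy2 - cy1) ** 2 + 1e-9
    return 1.0 - iou + (d_tl + d_br) / (2.0 * diag)
```

For perfectly overlapping boxes the loss vanishes, and the corner-distance term keeps the gradient informative even when the boxes do not overlap at all (where plain IoU is flat).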

2.
Transferring visual prior for online object tracking
Visual priors learned from generic real-world images can be transferred to represent objects in a scene. Motivated by this, we propose an algorithm that transfers a visual prior learned offline to online object tracking. From a collection of real-world images, we learn an overcomplete dictionary to represent the visual prior. This prior knowledge is generic, and the training image set need not contain any observation of the target object. During tracking, the learned visual prior is transferred to construct an object representation by sparse coding and multiscale max pooling. With this representation, a linear classifier is learned online to distinguish the target from the background and to account for appearance variations of both over time. Tracking is then carried out within a Bayesian inference framework, in which the learned classifier constructs the observation model and a particle filter estimates the tracking result sequentially. Experiments on a variety of challenging sequences, with comparisons to several state-of-the-art methods, demonstrate that transferring a visual prior yields more robust object tracking.
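As a concrete illustration of the pooling step above, the sketch below applies multiscale (spatial-pyramid-style) max pooling to a grid of per-patch sparse codes. The grid levels and the array layout are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

def multiscale_max_pool(codes, grid_levels=(1, 2, 4)):
    """Multiscale max pooling of local sparse codes.
    codes is (H, W, K): a K-dim sparse code per local patch on an
    H x W grid. Returns the concatenated pooled feature vector."""
    H, W, K = codes.shape
    feats = []
    for g in grid_levels:
        # Partition the grid into g x g cells
        rows = np.array_split(np.arange(H), g)
        cols = np.array_split(np.arange(W), g)
        for rs in rows:
            for cs in cols:
                cell = codes[np.ix_(rs, cs)]         # (h, w, K) cell
                feats.append(cell.max(axis=(0, 1)))  # per-dimension max
    return np.concatenate(feats)
```

With levels (1, 2, 4) and a 4x4 patch grid, the output concatenates 1 + 4 + 16 = 21 pooled K-dim vectors, preserving coarse spatial layout while staying robust to small target deformations.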

3.
Many visual tracking methods are based on sparse representation models, and most are either purely generative or purely discriminative, which makes tracking difficult when the object undergoes large pose changes, illumination variation, or partial occlusion. To address this issue, this paper proposes a collaborative object tracking model with local sparse representation. The key idea is to develop a local sparse representation-based discriminative model (SRDM) and a local sparse representation-based generative model (SRGM). In the SRDM module, the target appearance is modeled by local sparse codes that serve as training data for a linear classifier discriminating the target from the background. In the SRGM module, the target appearance is represented by a sparse coding histogram, and a sparse coding-based similarity measure computes the distance between the histograms of a target candidate and the target template. Finally, a collaborative similarity measure combines the two models, and the resulting likelihood of each target candidate is fed into a particle filter framework to estimate the target state sequentially over time. Experiments on publicly available benchmark video sequences show that the proposed tracker is robust and effective.

4.
This paper addresses visual tracking in videos containing object intersections, pose changes, occlusions, illumination changes, motion blur, and backgrounds with similar color distributions. We apply a structural local sparse representation method to analyze the background region around the target, then down-weight features that are prominent in the background and add new information to the target model. In addition, a weighted search method is proposed to find the best candidate target region; to a certain extent, it mitigates the local-optimum problem. The proposed scheme, designed to track a single person through complex scenarios, has been tested on several video sequences and compared with existing tracking methods on the same videos. Experimental results show that the proposed scheme is very promising in terms of robustness to occlusions, appearance changes, and similarly colored backgrounds.

5.
Sparse representation has been attracting increasing attention in visual tracking. However, most sparse representation-based trackers focus only on modeling the target appearance and do not consider how the sparse representation should be learned when the training samples are imprecise; they may therefore drift or fail in challenging scenes. In this paper, we present a novel online tracking algorithm that integrates online multiple instance learning into a recent sparse representation scheme. For tracking, an integrated sparse representation combining texture, intensity, and local spatial information is proposed to model the target; this representation accounts for both occlusion and appearance change. An efficient online learning approach then selects the most discriminative features to separate the target from background samples. In addition, the sparse representation is dynamically updated online with respect to the current context. Both qualitative and quantitative evaluations on challenging benchmark video sequences demonstrate that the proposed tracking algorithm performs favorably against several state-of-the-art methods.

6.
In this paper, we propose a tracking algorithm that robustly handles appearance variations during tracking. Our method is based on a seeds-active appearance model built on structural sparse coding. To cope with illumination changes, heavy occlusion, and the appearance self-updating problem, we propose a mixed online learning scheme for modeling the target appearance. The proposed tracking scheme involves three stages: training, detection, and tracking. In the training stage, an incremental SVM is learned that directly measures the difference between candidate samples and the target; the mixed generative-discriminative formulation separates even highly correlated positive candidate images well. In the detection stage, the trained weight vector is used to separate the target object in positive candidate images with respect to the seed images. In the tracking stage, we employ a particle filter to track the object through an adaptive appearance-updating algorithm with seeds-active constrained sparse representation. In a set of comprehensive experiments, our algorithm demonstrates better performance than alternatives reported in the current literature.

7.
To address the drift that conventional sparse-representation trackers suffer in complex backgrounds, this paper proposes a sparse-optimization tracking method with local awareness. First, the target region determined in the first frame is partitioned into non-overlapping, uniform blocks, and the target is modeled jointly with global and local features. A locally aware verification step is then proposed to constrain the sparse-optimization matching process and determine the best-matching sample. Finally, a decision scheme is introduced into template updating to detect occlusion, and different update strategies are applied according to the occlusion state, yielding a more complete template set. The method was tested on 10 standard benchmark video sequences and compared with popular trackers in terms of tracking quality and success rate. The results show that the proposed tracker is accurate and adaptable under partial occlusion, target deformation, complex backgrounds, and other conditions.

8.
陈利霞  李子  袁华  欧阳宁 《电视技术》2015,39(17):16-20
To address the problem that image-fusion algorithms based on sparse representation over a single trained dictionary ignore local image features, an image-fusion algorithm based on block-classified sparse representation is proposed. According to differences in local features, image blocks are classified into three structure types: smooth, edge, and texture, and separate redundant dictionaries are trained for the edge and texture classes. Smooth blocks are fused by arithmetic averaging; edge and texture blocks are fused by sparse representation over their corresponding dictionaries, and the residual of the sparse representation of edge blocks is fused by wavelet transform. Experimental results show that, compared with single-dictionary sparse representation, the algorithm achieves clear improvements in both subjective and objective fusion-quality measures, and also runs faster.

9.
In video surveillance, a binocular master-slave system containing a PTZ (pan-tilt-zoom) camera can simultaneously capture panoramic information and high-resolution views of a tracked target, and has therefore been widely studied and applied. To meet the needs of intelligent video surveillance, a binocular PTZ master-slave tracking method based on a ground-plane constraint is proposed. The method has two stages: offline calibration and online real-time tracking. In the offline stage, the homography induced by the ground plane is computed from target correspondences between the two camera views; a method is proposed to estimate this homography automatically from the synchronized video streams of the two cameras, which, unlike traditional approaches, requires neither a calibration object nor manual intervention. The principal point and equivalent focal length of the camera are then estimated from matched feature points. During online tracking, the homography links the coordinate systems of the master and slave cameras, and the principal point and equivalent focal length calibrated offline are used to estimate the slave camera's control parameters, realizing master-slave tracking. Compared with other algorithms, the method works in wide-baseline configurations, adapts to changes in target depth, and meets real-time requirements. Multiple indoor and outdoor experiments verify its effectiveness.
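The offline-calibrated homography links ground-plane points across the two views; a minimal sketch of the online coordinate mapping is below. The function name and the plain NumPy formulation are illustrative assumptions.

```python
import numpy as np

def map_master_to_slave(H, pt):
    """Map a ground-plane point from the master view to the slave view
    using the 3x3 homography H estimated in the offline stage.
    pt is (x, y) in master-image pixels; returns (x', y') in the slave image."""
    p = np.array([pt[0], pt[1], 1.0])  # homogeneous coordinates
    q = H @ p                          # projective transfer
    return (q[0] / q[2], q[1] / q[2])  # back to inhomogeneous pixels
```

The mapped point gives the slave PTZ camera its aiming target; pan/tilt/zoom commands are then derived from the calibrated principal point and equivalent focal length.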

10.
To reduce the computational complexity of sparse-representation trackers, a visual tracking algorithm based on a local structured transform-domain sparse appearance model is proposed within a particle-filter framework. The algorithm extracts overlapping local image blocks near the target region and computes the two-dimensional discrete cosine transform of every block to obtain its transform-domain coefficients. The energy-compaction property of the transform domain is exploited to reduce both the dictionary dimension and the number of candidate samples, and compressing the coefficients with a certain degree of freedom suppresses the influence of noise and occlusion. Sparse codes of the local blocks are obtained from the truncated samples and dictionary, the sparse vectors of all blocks in the current target region are fused by weighting into the region's sparse representation, and a decision model selects the optimal tracking result. Experiments comparing against three recent trackers show that the proposed algorithm matches or exceeds their performance while greatly reducing the computational complexity of the ℓ1-norm minimization.
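The transform-domain step above (2-D DCT of each local block, then keeping only the energy-rich low-frequency coefficients) can be sketched as follows; the block size and the square low-frequency truncation are assumptions for illustration.

```python
import numpy as np

def dct_matrix(n):
    # Orthonormal DCT-II basis matrix (rows are basis vectors)
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    M = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    M[0, :] = np.sqrt(1.0 / n)
    return M

def dct2(block):
    # Separable 2-D DCT: transform rows, then columns
    M = dct_matrix(block.shape[0])
    N = dct_matrix(block.shape[1])
    return M @ block @ N.T

def low_freq_feature(block, k):
    # Energy compaction: keep only the k x k low-frequency corner,
    # shrinking the dictionary dimension and candidate sample size
    return dct2(block)[:k, :k].ravel()
```

For natural-image patches most of the signal energy concentrates in the low-frequency corner, so discarding the rest shrinks the ℓ1 problem while also shedding high-frequency noise.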

11.
Object tracking based on sparse representation formulates tracking as searching for the candidate with minimal reconstruction error in the target template subspace. The key problem lies in modeling the target robustly against varying appearance. The appearance model in most sparsity-based trackers has two main problems. First, global structural information and local features are insufficiently combined, because appearance is modeled separately by holistic and local sparse representations. Second, the discriminative information between the target and the background is not fully utilized, because the background is rarely considered in modeling. In this study, we develop a robust visual tracking algorithm that models the target with a discriminative sparse appearance model. A discriminative dictionary is trained from local target patches and the background: the patches capture local features, while their position distribution implies the global structure of the target, so the learned dictionary can fully represent the target. Incorporating the background into dictionary learning also enhances its discriminative capability. Upon modeling the target as a sparse coding histogram over this learned dictionary, our tracker is embedded into a Bayesian state inference framework to locate the target. We also present a model update scheme in which the update rate is adjusted automatically; in conjunction with this update strategy, the proposed tracker can handle occlusion and alleviate drifting. Comparative results on challenging benchmark image sequences show that the tracking method performs favorably against several state-of-the-art algorithms.

12.
To handle nonlinear changes in target appearance and foreground/background in practical visual tracking, this paper proposes an optimized tracking method based on partial least squares (PLS) representation and stochastic gradients. Tracking is cast as a joint optimization of representation error and classification loss. First, to improve stability against appearance changes of the foreground and background, the nonlinearity of PLS is used to represent the foreground/background information of the target region, and multiple linear appearance models built by spatial clustering describe the region's dynamic changes, forming a constrained appearance feature library. A deterministic search mechanism is then proposed to construct the joint objective, minimizing the representation error and classification loss together. Exploiting the structure of the appearance model, a stochastic-gradient classifier updates the model features incrementally, achieving stable and accurate tracking. Comparative experiments in multiple scenes show that the algorithm copes effectively with various complex changes of the target's foreground and background.

13.
Bag-of-words models have been widely used to obtain a global representation for action recognition. However, these models ignore structural information, such as the spatial and temporal context, in the action representation. In this paper, we propose a novel structured-codebook construction method that encodes spatial and temporal contextual information among local features for video representation. Given a set of training videos, our method first extracts local motion and appearance features. Next, we encode the spatial and temporal context among local features by constructing correlation matrices for local spatio-temporal features. We then discover common movement patterns to construct the structured codebook, after which actions can be represented by a set of sparse coefficients with respect to it. Finally, a simple linear SVM classifier predicts the action class from this representation. Our method has two main advantages over traditional methods. First, it automatically discovers mid-level common movement patterns that capture rich spatial and temporal context. Second, it is robust to unwanted background local features, mainly because most of them cannot be sparsely represented by the common patterns and are treated as residual errors that are not encoded into the action representation. We evaluate the proposed method on two popular benchmarks, the KTH action dataset and the UCF sports dataset; experimental results demonstrate the advantages of our structured codebook construction.

14.
Object tracking is an important task given its many applications. However, most object tracking methods are based on visible images and may fail when the visible imagery is unreliable, for example under poor illumination. To address this issue, this paper presents a fusion tracking method that combines information from RGB and thermal infrared images (RGB-T), exploiting the fact that infrared images capture the thermal radiation of objects and thus provide complementary features. In particular, a fusion tracking method based on dynamic Siamese networks with multi-layer fusion, termed DSiamMFT, is proposed. Visible and infrared images are first processed by two dynamic Siamese networks, the visible and infrared networks, respectively. Multi-layer feature fusion then adaptively integrates multi-level deep features between the two networks. Response maps produced from the different fused layers are combined through an elementwise fusion approach into a final response map, from which the target is located. Extensive experiments on large datasets with various challenging scenarios show that the proposed method is highly competitive against state-of-the-art RGB-T trackers and significantly improves tracking performance compared to single-modality methods.

15.
Face recognition algorithm based on an improved deep network
Current face-recognition algorithms extract features either with hand-crafted descriptors or automatically via deep learning. This paper proposes a face-recognition algorithm that extracts features automatically with an improved deep network, yielding more discriminative features. The algorithm first preprocesses images with ZCA (zero-phase component analysis) whitening, reducing feature correlation and lowering the complexity of network training. A deep feature extractor is then built from convolution, pooling, and multi-layer sparse autoencoders; the convolution kernels are obtained by separate unsupervised learning. After pre-training and fine-tuning, the improved deep network yields an automatic deep feature extractor. Finally, a softmax regression model classifies the extracted features. Experiments on several standard face databases show improvements over both traditional methods and plain deep-learning methods.
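The ZCA whitening preprocessing step can be sketched as below; the epsilon regularizer and the sample layout (rows are flattened images, columns are pixels) are assumptions.

```python
import numpy as np

def zca_whiten(X, eps=1e-5):
    """ZCA whitening: decorrelate features while staying close to the
    original pixel space. X is (n_samples, n_features)."""
    X = X - X.mean(axis=0)        # zero-mean per feature
    cov = X.T @ X / X.shape[0]    # feature covariance matrix
    U, S, _ = np.linalg.svd(cov)  # cov is symmetric PSD
    # Symmetric whitening transform: rotate, rescale, rotate back
    W = U @ np.diag(1.0 / np.sqrt(S + eps)) @ U.T
    return X @ W
```

Unlike PCA whitening, the final rotation back (U.T) keeps the whitened data maximally similar to the input, which is why it is a popular preprocessing step before learning convolution kernels unsupervised.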

16.
Siamese tracking is one of the most promising object tracking approaches today owing to its balance of performance and speed. However, it still performs poorly under challenges such as low light or extreme weather. This is caused by the inherent limitations of visible images, and a common remedy is to introduce infrared data as an aid to improve tracking robustness. However, most existing RGBT trackers are variants of MDNet (Hyeonseob Nam and Bohyung Han, Learning multi-domain convolutional neural networks for visual tracking, in: Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 4293–4302), which have significant limitations in operational efficiency. Conversely, the potential of Siamese tracking in the RGBT field has not been effectively exploited, because of its reliance on large-scale training data. To resolve this dilemma, in this paper we propose an end-to-end Siamese RGBT tracking framework based on cross-modal feature enhancement and self-attention (SiamFEA). Drawing on the idea of transfer learning, we employ local fine-tuning to reduce the dependence on large-scale RGBT data and verify the feasibility of this approach; we then propose a reliable fusion approach to efficiently fuse the features of the two modalities. Specifically, we first propose a cross-modal feature enhancement module to exploit the complementary properties of the dual modalities, followed by capturing non-local attention in the channel and spatial dimensions for adaptive weighted fusion. Our network was trained end-to-end on the LasHeR training set (Chenglong Li, Wanlin Xue, Yaqing Jia, Zhichen Qu, Bin Luo, Jin Tang, LasHeR: A Large-scale High-diversity Benchmark for RGBT Tracking, CoRR abs/2104.13202, 2021) and reaches a new state of the art on GTOT (C. Li, H. Cheng, S. Hu, X. Liu, J. Tang, L. Lin, Learning collaborative sparse representation for grayscale-thermal tracking, IEEE Trans. Image Process. 25(12) (2016) 5743–5756), RGBT234 (C. Li, X. Liang, Y. Lu, N. Zhao, J. Tang, RGB-T object tracking: Benchmark and baseline, Pattern Recognition, vol. 96, p. 106977, 2019), and LasHeR while running in real time.

17.
权伟  陈锦雄  余南阳 《电子学报》2014,42(5):875-882
To study long-term visual tracking in unconstrained environments, an object-tracking method based on online learning of multiple detections is proposed. The method uses random ferns as the base detector structure. Through online learning, the global and local appearance of the target, together with synchronized objects discovered by scene learning, are used as the base training data, giving the detector independent detection capability for these several kinds of objects. Since the detector's components play different roles, the detection results for the three kinds of objects are fused from a measurement perspective: the configuration probability of the target given the detections is computed to determine the target position and accomplish the tracking task. Experiments on real video sequences verify the method's effectiveness and stability, and its improved tracking performance over existing trackers.

18.
19.
付晓  沈远彤  付丽华  杨迪威 《电子学报》2018,46(5):1041-1046
Sparse autoencoder networks have achieved notable results in natural language processing, image processing, and other fields. Prior work shows that increasing the number of features the network extracts improves a sparse autoencoder's performance, but this also makes training very time-consuming. To shorten training time as much as possible, this paper proposes a fast sparse-autoencoder algorithm based on feature clustering. The algorithm first determines the number of essential features from the optimal K-means cluster count, obtains the essential features by network training, and then increases feature diversity through rotation and warping, improving performance while reducing training time. Experiments on the standard handwritten-digit database MNIST and the face database CMU-PIE show that the algorithm shortens training time substantially while also improving accuracy.

20.
To improve the performance of generative trackers under occlusion, background clutter, and other complex conditions, an ℓ0-norm regularization constraint is introduced into the sparse-coding model to reduce redundant coding information and improve the reconstruction of the target appearance. A new fast iterative algorithm based on non-convex accelerated proximal gradients is proposed to solve the resulting non-convex, non-smooth optimization problem. An incremental low-rank learning strategy is also designed: unlike traditional methods, which must learn the low-rank structure from the target observations as a whole, the proposed method uses ℓ0-regularized sparse coding to learn and update the target's low-rank feature subspace online. Experiments on multiple video sequences show that the ℓ0-regularized incremental low-rank learning method effectively improves tracking accuracy and robustness; compared with eight strong trackers, the algorithm achieves the best results on both center-error robustness and overlap-rate robustness.
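The paper's non-convex accelerated proximal gradient solver is not reproduced here; as a simpler stand-in, the sketch below solves the same ℓ0-constrained sparse coding problem with plain iterative hard thresholding. The step-size rule and iteration count are assumptions.

```python
import numpy as np

def iht_sparse_code(D, y, k, n_iter=100, step=None):
    """Iterative hard thresholding for min ||y - D x||^2 s.t. ||x||_0 <= k.
    D is the dictionary (m x n), y the observation, k the sparsity level."""
    if step is None:
        # 1 / Lipschitz constant of the gradient (squared spectral norm)
        step = 1.0 / np.linalg.norm(D, 2) ** 2
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        x = x + step * D.T @ (y - D @ x)   # gradient step on the residual
        idx = np.argsort(np.abs(x))[:-k]   # indices of all but the k largest
        x[idx] = 0.0                       # hard threshold: enforce ||x||_0 <= k
    return x
```

The hard-thresholding projection is the proximal operator of the ℓ0 constraint, which is what makes the problem non-convex and non-smooth in the first place.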


