Similar Literature
 20 similar documents found (search time: 37 ms)
1.
Objective: Stereoscopic video is increasingly popular for the immersive, realistic experience it provides, while visual saliency detection can automatically predict, locate, and mine important visual information, helping machines filter massive amounts of multimedia data effectively. To improve salient-region detection in stereoscopic video, a stereoscopic video saliency detection model fusing multi-dimensional binocular perception characteristics is proposed. Method: Saliency is computed along three dimensions of the stereoscopic video: spatial, depth, and temporal. First, a 2D saliency map is computed from spatial image features using a Bayesian model; next, a depth saliency map is obtained from binocular perception features; then, motion features of local inter-frame regions are computed with the Lucas-Kanade optical flow method to obtain a temporal saliency map; finally, the three saliency maps are fused with a method driven by the magnitude of global-regional differences to obtain the final distribution of salient regions in the stereoscopic video. Results: Experiments on different types of stereoscopic video sequences show that the model achieves 80% precision and 72% recall at relatively low computational complexity, outperforming existing saliency detection models. Conclusion: The proposed model effectively locates salient regions in stereoscopic video and can be applied to stereoscopic video/image coding, stereoscopic video/image quality assessment, and related fields.
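The temporal branch above hinges on Lucas-Kanade optical flow. Below is a minimal sketch of that step, assuming OpenCV and 8-bit grayscale frames; the grid sampling, the linear fusion weights, and the helper names (`temporal_saliency`, `fuse`) are illustrative stand-ins, not the paper's actual global-regional fusion.

```python
import cv2
import numpy as np

def temporal_saliency(prev_gray, curr_gray, grid_step=16):
    """prev_gray, curr_gray: consecutive uint8 grayscale frames."""
    h, w = prev_gray.shape
    # Sample a regular grid of points to track (stand-in for "local regions").
    ys, xs = np.mgrid[grid_step//2:h:grid_step, grid_step//2:w:grid_step]
    pts = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(np.float32)
    pts = pts.reshape(-1, 1, 2)
    # Pyramidal Lucas-Kanade optical flow between the two frames.
    nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray, pts, None)
    mag = np.linalg.norm((nxt - pts).reshape(-1, 2), axis=1)
    mag[status.ravel() == 0] = 0.0  # drop points that failed to track
    # Splat magnitudes back onto the grid and upsample to a dense map.
    sal = cv2.resize(mag.reshape(ys.shape), (w, h),
                     interpolation=cv2.INTER_LINEAR)
    return cv2.normalize(sal, None, 0.0, 1.0, cv2.NORM_MINMAX)

def fuse(spatial, depth, temporal, w=(0.4, 0.3, 0.3)):
    # Stand-in linear fusion; the paper fuses the maps according to
    # global-regional difference magnitudes instead.
    return w[0] * spatial + w[1] * depth + w[2] * temporal
```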

2.
Perceptually salient regions of stereoscopic images significantly affect visual comfort (VC). In this paper, we propose a new objective approach for predicting the VC of stereoscopic images according to visual saliency. The approach has two stages. The first extracts foreground saliency and depth contrast from a disparity map to generate a depth saliency map, which is then combined with 2D saliency to obtain a stereoscopic visual saliency map. The second extracts saliency-weighted VC features and feeds them into a prediction metric to produce VC scores for the stereoscopic images. We demonstrate the effectiveness of the proposed approach against conventional prediction methods on the IVY Lab database, with performance gains ranging from 0.016 to 0.198 in terms of correlation coefficients.

3.
Noting that previous stereoscopic image saliency detection models do not fully consider the influence of stereoscopic visual comfort and disparity-map distribution on salient-region detection, a saliency computation model incorporating a stereoscopic visual comfort factor is proposed. For color-image saliency extraction, the model first segments the input image into superpixels with the SLIC algorithm, merges color-similar regions, and then computes 2D image saliency. For depth saliency, the disparity map is first preprocessed, and saliency is then computed from region contrast. Finally, the 2D saliency map and the depth saliency map are fused under the stereoscopic visual comfort factor to produce the stereoscopic image saliency map. Experiments on different types of stereoscopic images show that the model achieves 85% precision and 78% recall, outperforming commonly used saliency detection models and agreeing well with the attention mechanism of human stereoscopic vision.
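A minimal sketch of the 2D branch described above, assuming scikit-image: SLIC superpixels followed by a simple region-contrast score. The color-similar region merging, disparity preprocessing, and comfort-factor fusion from the abstract are omitted, and `region_contrast_saliency` is an illustrative name.

```python
import numpy as np
from skimage.color import rgb2lab
from skimage.segmentation import slic

def region_contrast_saliency(rgb_image, n_segments=300):
    lab = rgb2lab(rgb_image)
    labels = slic(rgb_image, n_segments=n_segments, compactness=10)
    ids = np.unique(labels)
    # Mean Lab color of each superpixel.
    means = np.array([lab[labels == i].mean(axis=0) for i in ids])
    # Saliency of a region = summed color distance to all other regions.
    dist = np.linalg.norm(means[:, None, :] - means[None, :, :], axis=2)
    scores = dist.sum(axis=1)
    scores = (scores - scores.min()) / (np.ptp(scores) + 1e-8)
    # Map per-region scores back to a dense per-pixel saliency map.
    return scores[np.searchsorted(ids, labels)]
```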

4.
Objective: Image/video retargeting for intelligent adaptive display has attracted wide attention in recent years. Compared with image retargeting and 2D video retargeting, 3D video retargeting must preserve both disparity and temporal coherence. Existing 3D video retargeting methods preserve disparity but ignore adjusting disparity comfort. To address the visual discomfort caused by excessive disparity and abrupt disparity changes, a stereoscopic video retargeting method based on joint spatio-temporal disparity optimization is proposed, which keeps the video's disparity range within a comfort zone. Method: A uniform mesh is built on the original video, saliency and disparity are extracted, and the average saliency of each grid cell is computed. A shape-preservation energy term is built from the principle of similarity transformation; a temporal-preservation energy term is built from object trajectories and the disparity variation of the original video; and a disparity-comfort adjustment energy term is built from the vergence-accommodation mechanism of the human eye. Weighted by the saliency of each grid cell, all energy terms are solved jointly to obtain the optimized mesh vertex coordinates, which determine the mesh deformation and generate a video of the target aspect ratio. Results: Experiments show that, compared with seam-carving-based stereoscopic video retargeting, the proposed method performs better in shape preservation, temporal preservation, and disparity comfort. Evaluated with existing objective quality metrics, it also outperforms uniform scaling and seam carving at low computational cost: the per-frame running time is at least 98% lower than that of seam carving. Conclusion: The proposed joint spatio-temporal method optimizes disparity in both the temporal and comfort dimensions while preserving temporal coherence, showing good stability and robustness. It can retarget 3D video to displays of different sizes while maintaining stereoscopic visual comfort.

5.
This paper presents an accurate saliency detection algorithm customized for 3D images that contain an abundant depth cue. First, a depth feature is calculated from the positions of sharp regions within the focal stack. Then, a coarse saliency map is computed by subtracting the background region from the all-focus image according to the depth feature. Finally, the contrast information in the coarse saliency map is employed to obtain the final result. Experiments on a light-field dataset demonstrate that our approach favorably outperforms five state-of-the-art methods in terms of precision, recall, and F-measure. Moreover, the depth feature proves to be a valuable complement to existing visual saliency analysis when background regions are complex or similar to the salient object regions.
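A minimal sketch of the depth-from-focus idea behind the depth feature, assuming OpenCV: rank the focal-stack slices by local sharpness (variance-of-Laplacian style) and take the index of the sharpest slice per pixel as a coarse depth. This is a generic depth-from-focus recipe, not the paper's exact formulation.

```python
import cv2
import numpy as np

def focal_stack_depth(stack):
    """stack: list of float32 grayscale slices focused at increasing depths."""
    sharpness = []
    for slice_img in stack:
        lap = cv2.Laplacian(slice_img, cv2.CV_32F, ksize=3)
        # Local sharpness = locally averaged squared Laplacian response.
        sharpness.append(cv2.boxFilter(lap * lap, -1, (9, 9)))
    sharpness = np.stack(sharpness, axis=0)     # (n_slices, h, w)
    depth_index = np.argmax(sharpness, axis=0)  # sharpest slice per pixel
    return depth_index.astype(np.float32) / max(len(stack) - 1, 1)
```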

6.
This paper presents a new attention model for detecting visual saliency in news video. In the proposed model, bottom-up (low-level) features and top-down (high-level) factors are used to compute bottom-up and top-down saliency respectively, and the two saliency maps are fused after a normalization operation. In the bottom-up model, a quaternion discrete cosine transform at multiple scales and in multiple color spaces detects static saliency; meanwhile, multi-scale local-motion and global-motion conspicuity maps are computed and integrated into a motion saliency map. To effectively suppress background motion noise, a simple histogram of average optical flow is adopted to calculate motion contrast. The bottom-up saliency map is then obtained by combining the static and motion saliency maps. In the top-down model, high-level stimuli in news video, such as faces, persons, cars, speakers, and flashes, generate the top-down saliency map. The proposed method has been extensively tested with three popular evaluation metrics on two widely used eye-tracking datasets. Experimental results demonstrate its effectiveness in saliency detection for news video compared to several state-of-the-art methods.

7.
罗晓林  罗雷 《计算机科学》2016,43(Z6):171-174, 183
To address the compression of multi-view video, a coding algorithm based on visual saliency analysis is proposed. Exploiting the fact that the human eye is more sensitive to distortion in salient regions, the algorithm improves multi-view coding efficiency by controlling the coding quality of salient versus non-salient regions. First, a video saliency filter fusing color and motion information extracts pixel-level visual saliency maps of the multi-view video frames; then, the saliency maps of all views are converted into macroblock-level saliency; finally, the principles of perceptual video coding are used to adapt macroblock quality to saliency. Experimental results show that the algorithm effectively improves the rate-distortion efficiency and subjective quality of multi-view video coding.
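A minimal sketch of the macroblock-level quality control step: average pixel saliency per 16x16 macroblock, mapped to a QP offset so that salient blocks are quantized more finely. The base QP and offset range are illustrative assumptions, not the paper's calibrated values.

```python
import numpy as np

def macroblock_qp_map(saliency, base_qp=32, max_offset=6, mb=16):
    """saliency: per-pixel map in [0, 1]; returns a per-macroblock QP grid."""
    h, w = saliency.shape
    qp = np.full((h // mb, w // mb), base_qp, dtype=np.int32)
    for by in range(h // mb):
        for bx in range(w // mb):
            s = saliency[by*mb:(by+1)*mb, bx*mb:(bx+1)*mb].mean()
            # Salient blocks get a negative QP offset (better quality),
            # non-salient blocks a positive one (coarser quantization).
            qp[by, bx] = int(round(base_qp - max_offset * (2.0 * s - 1.0)))
    return np.clip(qp, 0, 51)  # H.264/HEVC QP range
```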

8.
Perceptually salient regions have a significant effect on visual comfort in stereoscopic 3D (S3D) images. The conventional method of combining saliency maps is linear combination, which often weakens the saliency influence and significantly distorts the original disparity range. In this paper, we propose visual comfort enhancement in S3D images using saliency-adaptive nonlinear disparity mapping. First, we obtain saliency-adaptive disparity maps with visual sensitivity to maintain the disparity-based saliency influence. Then, we perform nonlinear disparity mapping based on a sigmoid function to minimize disparity distortions. Finally, we generate visually comfortable S3D images via depth-image-based rendering (DIBR). Experimental results demonstrate that the proposed method successfully improves visual comfort, producing S3D images with high mean opinion scores (MOS) while keeping the overall viewing image quality.
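A minimal sketch of sigmoid-based nonlinear disparity mapping: compress the disparity range into a target comfort zone while keeping small disparities nearly linear. The steepness and target range are assumed parameters, not the paper's fitted values.

```python
import numpy as np

def sigmoid_disparity_map(d, d_target=20.0, steepness=0.1):
    """d: disparity map in pixels; returns remapped disparities in
    (-d_target, d_target) via a centered sigmoid."""
    return d_target * (2.0 / (1.0 + np.exp(-steepness * d)) - 1.0)
```

With these values the mapping has unit slope near zero disparity, so small disparities pass through almost unchanged, while disparities beyond roughly ±40 px saturate smoothly toward the ±20 px comfort bound.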

9.
Objective: Viewing stereoscopic image content can cause visual discomfort. Based on the influence of disparity on stereoscopic visual comfort, a visual comfort enhancement method combining global linear and local nonlinear disparity remapping is proposed. Method: First, considering binocular fusion limits and the visual attention mechanism, global and local disparity statistics are extracted from the stereoscopic image in combination with spatial frequency and stereoscopic saliency, and an objective visual comfort prediction model is built with support vector regression as the constraint that controls the degree of remapping. Then, the comfort of an input stereoscopic image is analyzed with the constructed model; for images predicted as insufficiently comfortable, a two-stage disparity remapping strategy is applied: a global linear remapping of the disparity range, followed by a local nonlinear remapping of disparities within extracted potentially uncomfortable regions. Finally, the comfort-enhanced stereoscopic image is rendered from the remapped disparity map. Results: Experiments on the IVY Lab stereoscopic image comfort database show that, compared with representative comfort enhancement methods, the proposed method improves the visual comfort of uncomfortable stereoscopic images more effectively while preserving the overall sense of depth. Conclusion: Driven by comfort prediction models built from different stereoscopic image features, the method automatically performs the global linear and local nonlinear disparity remapping, improving visual comfort while minimizing the weakening of depth perception caused by disparity changes, thereby enhancing the overall 3D experience.
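A minimal sketch of the comfort-prediction stage, assuming scikit-learn: disparity statistics in, comfort score out, via support vector regression. The five statistics here are illustrative stand-ins for the paper's spatial-frequency- and saliency-weighted disparity features.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

def disparity_features(disparity):
    """Global disparity statistics of one stereoscopic image."""
    d = disparity.ravel()
    return np.array([d.mean(), d.std(), d.min(), d.max(),
                     np.percentile(np.abs(d), 95)])

def train_comfort_model(X, y):
    """X: one feature row per image; y: subjective comfort scores."""
    model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0))
    model.fit(X, y)
    return model
```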

10.
Objective: Visual saliency plays an important role in many vision-driven applications, and these application areas are shifting from 2D to 3D vision, so saliency models based on RGB-D data have attracted wide attention. Unlike 2D image saliency, RGB-D saliency involves cues of many different modalities, which both complement and compete with each other; exploiting and fusing these cues effectively remains a challenge. Traditional fusion models struggle to exploit the strengths of multi-modal cues, so this work studies multi-modal cue fusion in the formation of RGB-D saliency. Method: An RGB-D saliency detection model based on a superpixel-level conditional random field is proposed. Saliency cues of different modalities are extracted, including planar, depth, and motion cues. A conditional random field is built over superpixels; a global energy function that combines the influence of the multi-modal cues with a smoothness constraint on neighboring saliency values is designed as the optimization objective, capturing the interaction mechanism among the cues. The weights of the multi-modal cues in the energy function are learned by a convolutional neural network. Results: On two public RGB-D video saliency datasets, the model was compared with six saliency detection methods and outperforms the state of the art on all datasets and evaluation metrics. Relative to the second-best results, its AUC (area under curve), sAUC (shuffled AUC), SIM (similarity), PCC (Pearson correlation coefficient), and NSS (normalized scanpath saliency) scores improve by 2.3%, 2.3%, 18.9%, 21.6%, and 56.2% on the IRCCyN dataset, and by 2.0%, 1.4%, 29.1%, 10.6%, and 23.3% on the DML-iTrack-3D dataset. An internal comparison also verifies that the proposed fusion outperforms traditional fusion methods. Conclusion: The conditional random field and convolutional neural network in the proposed model exploit the strengths of cues from different modalities and fuse them effectively, improving saliency detection performance for vision-driven applications.

11.
With the maturation of three-dimensional (3D) technologies, display systems can provide higher visual quality to enrich the viewing experience. However, the depth information required for 3D displays is not available in conventionally recorded 2D content. Therefore, the conversion of existing 2D video to 3D video becomes an important issue for emerging 3D applications. This paper presents a system that automatically converts 2D video to 3D format. The proposed system combines three major depth cues: depth from motion, scene depth from geometrical perspective, and fine-granularity depth from relative position. It uses a block-based method incorporating a joint bilateral filter to efficiently generate visually comfortable depth maps and to diminish blocky artifacts. By means of the generated depth map, 2D video can readily be converted into 3D format. Moreover, for conventional 2D displays, a 2D image/video depth-perception enhancement application is also presented: with depth-aware adjustment of color saturation, contrast, and edges, the stereo effect of 2D content can be enhanced. A user study on subjective quality shows that the proposed method has promising results on depth quality and visual comfort.
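A minimal sketch of the depth-map refinement step: a joint bilateral filter guided by the color frame smooths a block-based depth map while keeping depth edges aligned with image edges. This assumes opencv-contrib-python (the `cv2.ximgproc` module), and the filter parameters are illustrative.

```python
import cv2

def refine_depth(color_bgr, block_depth):
    """color_bgr: uint8 guidance frame; block_depth: float32 depth map
    with blocky artifacts from block-based estimation."""
    return cv2.ximgproc.jointBilateralFilter(
        color_bgr, block_depth, d=15, sigmaColor=25, sigmaSpace=7)
```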

12.
Objective: A 3D image quality-of-experience (QoE) assessment method that matches users' visual characteristics helps to reflect viewers' perceptual experience of 3D images or video accurately and objectively, offering guidance for optimizing 3D content. Existing methods evaluate stereoscopic images from only one or two of the dimensions of image distortion, depth perception, and visual comfort, so the accuracy of their results can be improved. To evaluate the visual experience of 3D images more comprehensively and accurately, a multi-dimensional perceptual QoE assessment algorithm is proposed. Method: First, natural scene statistics are extracted from the difference and fusion images of the left and right views to represent distortion features. Then, sensitive regions are extracted from the depth image; depth-change histograms of these regions before and after distortion are computed, and the number of matched keypoints is counted with the scale-invariant feature transform (SIFT) matching algorithm; together these represent depth-perception features. Next, the mean and amplitude of disparity in visually salient regions represent comfort features. Finally, the features of the three dimensions are normalized and concatenated into a QoE feature vector, and a support vector regression (SVR) model is trained to produce the final QoE score. Results: Experiments on the LIVE and Waterloo IVC databases show correlations with subjective perception of 0.942 and 0.858, respectively. Conclusion: The method fully exploits the characteristics of stereoscopic images; its results outperform several classic algorithms and agree better with users' subjective experience.
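A minimal sketch of the SIFT-matching cue, assuming OpenCV >= 4.4 (where SIFT lives in the main module): count matches between reference and distorted depth images that survive Lowe's ratio test, with fewer surviving matches suggesting stronger structural degradation.

```python
import cv2

def sift_match_count(ref_gray, dist_gray, ratio=0.75):
    """ref_gray, dist_gray: uint8 grayscale depth images."""
    sift = cv2.SIFT_create()
    k1, d1 = sift.detectAndCompute(ref_gray, None)
    k2, d2 = sift.detectAndCompute(dist_gray, None)
    if d1 is None or d2 is None:
        return 0
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    matches = matcher.knnMatch(d1, d2, k=2)
    # Lowe's ratio test keeps only distinctive matches.
    good = [p[0] for p in matches
            if len(p) == 2 and p[0].distance < ratio * p[1].distance]
    return len(good)
```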

13.
A visual attention-based bit allocation strategy for video compression is proposed. Saliency-based attention prediction is used to detect interesting regions in video. From the top salient locations of the computed saliency map, a guidance map is generated to steer the bit allocation through a new constrained global optimization approach, which can be solved in closed form and independently of the video frame content. Fifty video sequences (300 frames each) and eye-tracking data from 14 subjects were collected to evaluate both the accuracy of the attention prediction model and the subjective quality of the encoded video. Results show that the area under the curve of the guidance map is 0.773 ± 0.002, significantly above chance (0.500). Using a new eye-tracking-weighted PSNR (EWPSNR) measure of subjective quality, more than 90% of the video clips encoded with the proposed method achieve better subjective quality than standard encoding at a matched bit rate. The improvement in EWPSNR is up to more than 2 dB, and 0.79 dB on average.
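A minimal sketch of an EWPSNR-style measure in the spirit of the paper: weight the squared error by a normalized fixation-density map before computing PSNR. The exact weighting the authors use may differ.

```python
import numpy as np

def ewpsnr(ref, dist, fixation_map, peak=255.0):
    """ref, dist: images as arrays; fixation_map: nonnegative weights of
    the same spatial size (e.g., a smoothed eye-tracking density)."""
    w = fixation_map / (fixation_map.sum() + 1e-12)
    err = ref.astype(np.float64) - dist.astype(np.float64)
    wmse = np.sum(w * err ** 2)  # fixation-weighted mean squared error
    return 10.0 * np.log10(peak ** 2 / (wmse + 1e-12))
```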

14.
Visual discomfort is one of the most frequent complaints of viewers watching 3D images and videos. Large disparity and large amounts of motion are its two main causes. To quantify their influence, three objectives are set in this paper. The first is a comparative analysis of the influence of different types of motion (static stereoscopic image, planar motion, and in-depth motion) on visual discomfort. The second is an investigation of the influencing factors for each motion type, such as disparity offset, disparity amplitude, and velocity. The third is to propose an objective model of visual discomfort. Thirty-six synthetic stereoscopic video stimuli with different types of motion are used in this study. In the subjective test, an efficient paired-comparison method called Adaptive Square Design (ASD) was used to reduce the number of comparisons per observer while keeping the results reliable. The experimental results show that motion does not always induce more visual discomfort than static conditions, and that in-depth motion generally induces more visual discomfort than planar motion. The relative disparity between foreground and background, and the motion velocity, are identified as the main factors for visual discomfort. Based on the subjective results, an objective model for comparing visual discomfort induced by different types of motion is proposed, which correlates highly with subjective perception.

15.
Visual saliency is an important research topic in computer vision due to its numerous possible applications: it helps to focus on regions of interest instead of processing whole images or videos. Detecting visual saliency in still images has been widely addressed in the literature with several formulations. However, visual saliency detection in videos has attracted little attention and is a more challenging task because of the additional temporal information. A common approach to obtaining a spatio-temporal saliency map is to combine a static saliency map and a dynamic one. In our work, we model the dynamic textures of a dynamic scene with local binary patterns to compute the dynamic saliency map, and we use color features to compute the static saliency map. Both maps are computed using a bio-inspired mechanism of the human visual system with a discriminant formulation known as center-surround saliency, and are fused appropriately. The proposed model has been extensively evaluated on diverse publicly available datasets containing many videos of dynamic scenes; comparison with state-of-the-art methods shows that it achieves competitive results.
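A minimal sketch of the dynamic branch, assuming scikit-image and SciPy: local binary patterns on the frame difference as a crude dynamic-texture code, scored by a simple center-surround deviation. This is a strong simplification of the paper's discriminant center-surround formulation.

```python
import numpy as np
from scipy.ndimage import uniform_filter
from skimage.feature import local_binary_pattern

def lbp_dynamic_saliency(prev_gray, curr_gray, P=8, R=1.0, surround=15):
    """prev_gray, curr_gray: consecutive uint8 grayscale frames."""
    # Temporal texture: LBP of the frame difference highlights regions
    # whose local structure changes between frames.
    diff = np.abs(curr_gray.astype(np.int16)
                  - prev_gray.astype(np.int16)).astype(np.uint8)
    codes = local_binary_pattern(diff, P, R, method="uniform")
    # Center-surround: deviation of each code from its local neighborhood mean.
    sal = np.abs(codes - uniform_filter(codes, size=surround))
    return (sal - sal.min()) / (np.ptp(sal) + 1e-8)
```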

16.
The visual brain fuses the left and right images projected onto the two eyes from a stereoscopic 3D (S3D) display, perceives parallax, and rebuilds a sense of depth. In this process, the eyes adjust vergence and accommodation to adapt to the depth and parallax of the points they gaze at, and conflicts between accommodation and vergence when viewing S3D content can lead to visual discomfort. A variety of approaches have been taken to understanding the perceptual bases of discomfort felt when viewing S3D, including extreme disparities or disparity gradients, negative disparities, and dichoptic presentations, but less effort has been applied to the role of eye movements as they relate to visual discomfort. To study eye movements in the context of S3D viewing discomfort, a Shifted-S3D-Image-Database (SSID) is constructed from 11 original natural-scene S3D images and their 6 shifted versions. We conducted eye-tracking experiments on humans viewing the SSID images while simultaneously collecting their judgments of experienced visual discomfort. From the collected eye-tracking data, regions of interest (ROIs) were extracted by kernel density estimation over the fixation data, and an empirical formula was fitted between the disparities of the salient objects marked by the ROIs and the mean opinion scores (MOS). Finally, the eye-tracking data were used to analyze eye-movement characteristics related to S3D image quality: fifteen eye-movement features were extracted, and a visual discomfort prediction model was learned using a support vector regressor (SVR). By analyzing the correlations between features and MOS, we conclude that angular disparity features correlate strongly with human judgments of discomfort.
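A minimal sketch of the ROI-extraction step, assuming SciPy: a kernel density estimate over the fixation points, thresholded into an ROI mask. The bandwidth (SciPy's default) and threshold are illustrative.

```python
import numpy as np
from scipy.stats import gaussian_kde

def fixation_rois(fix_xy, frame_shape, thresh=0.6):
    """fix_xy: (N, 2) array of fixation coordinates (x, y);
    frame_shape: (h, w). Returns a boolean ROI mask."""
    kde = gaussian_kde(fix_xy.T)          # dataset shape (2, N)
    h, w = frame_shape
    ys, xs = np.mgrid[0:h, 0:w]
    density = kde(np.vstack([xs.ravel(), ys.ravel()])).reshape(h, w)
    return density / density.max() > thresh
```

Evaluating the KDE over every pixel is slow for large frames; evaluating on a coarse grid and upsampling is a common shortcut.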

17.
Monocular image 2D-to-3D conversion fusing objectness and visual saliency
Inspired by objectness measures and visual saliency, a center-surround depth distribution hypothesis for object windows is proposed for monocular 2D-to-3D image conversion, and a monocular depth estimation algorithm fusing objectness measures and visual saliency is presented. First, the visual saliency of the image is computed and mapped to depth; second, a number of windows are randomly sampled on the image and their objectness measures computed; third, an energy function is defined to measure how strongly depth and objectness influence each other, and the estimates of both are refined by iterative optimization; finally, 3D video is synthesized from the depth information. Experimental results show that incorporating objectness significantly improves the quality of saliency-based 2D-to-3D depth estimation, ensuring discontinuous depth transitions at object boundaries and smooth transitions elsewhere.

18.
周静波  黄伟 《控制与决策》2021,36(7):1707-1713
Saliency detection models based on low-rank matrix recovery (LRMR) decompose image features into a low-rank component associated with the background and a sparse component associated with salient objects, and obtain the salient objects from the sparse component. Existing saliency detection methods rarely consider the relationship between the low-rank and sparse components, which leads to scattered or incomplete detections. A low-rank-matrix-recovery-based salient object detection and refinement method is proposed to circumvent this limitation. First, an initial saliency map is computed with an ℓ1-norm sparsity constraint and a Laplacian regularization term. In the refinement stage, since nonlocal ℓ0 optimization can effectively model the relationships between salient regions and their neighboring regions, nonlocal ℓ0 gradient optimization is applied together with the initial saliency map to minimize the variation of saliency values within salient regions, ensuring the completeness of the detected objects. Experiments on four salient object detection datasets verify the superiority of the proposed algorithm.
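A minimal sketch of the low-rank plus sparse decomposition underlying LRMR saliency: a basic robust-PCA scheme alternating singular-value thresholding and entrywise soft thresholding. The Laplacian regularization and nonlocal ℓ0 refinement from the abstract are not included.

```python
import numpy as np

def soft(x, t):
    """Entrywise soft thresholding."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def rpca(F, lam=None, mu=1.0, iters=100):
    """F: feature matrix (features x regions). Returns (L, S) with
    F ~ L (low-rank, background) + S (sparse, salient)."""
    if lam is None:
        lam = 1.0 / np.sqrt(max(F.shape))
    L = np.zeros_like(F)
    S = np.zeros_like(F)
    Y = np.zeros_like(F)  # dual variable for the equality constraint
    for _ in range(iters):
        # Low-rank update: singular-value thresholding.
        U, sig, Vt = np.linalg.svd(F - S + Y / mu, full_matrices=False)
        L = U @ np.diag(soft(sig, 1.0 / mu)) @ Vt
        # Sparse update: entrywise soft thresholding.
        S = soft(F - L + Y / mu, lam / mu)
        Y += mu * (F - L - S)  # dual ascent step
    return L, S
```

Per-region saliency can then be read off as the column-wise magnitude of the sparse component S.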

19.
周莺  张基宏  梁永生  柳伟 《计算机科学》2015,42(11):118-122
To extract the salient regions of a video as observed by the human eye more accurately and effectively, a spatio-temporal saliency extraction method based on visual motion characteristics is proposed. The method first obtains a spatial saliency map by analyzing the frequency-domain log spectrum of each frame, and a temporal saliency map from global motion estimation and block matching. It then fuses the spatio-temporal saliency maps dynamically, following the visual characteristics of human video viewing and the subjective perception of videos with different motion characteristics. The experimental analysis covers both subjective and objective measures: visual inspection and quantitative metrics both show that, compared with other classic methods, the salient regions extracted by the proposed method reflect human gaze regions more accurately.
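A minimal sketch of log-spectrum spatial saliency in the spirit of the method above (the classic spectral-residual recipe); the temporal branch, global motion estimation plus block matching, is omitted.

```python
import cv2
import numpy as np

def spectral_residual_saliency(gray):
    """gray: grayscale frame; returns a saliency map at 64x64 resolution."""
    img = cv2.resize(gray.astype(np.float32), (64, 64))
    f = np.fft.fft2(img)
    log_amp = np.log(np.abs(f) + 1e-8)
    phase = np.angle(f)
    # Spectral residual = log amplitude minus its local average.
    residual = log_amp - cv2.blur(log_amp, (3, 3))
    sal = np.abs(np.fft.ifft2(np.exp(residual + 1j * phase))) ** 2
    sal = cv2.GaussianBlur(sal.astype(np.float32), (9, 9), 2.5)
    return cv2.normalize(sal, None, 0.0, 1.0, cv2.NORM_MINMAX)
```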

20.
This paper presents a spatio-temporal saliency model that predicts eye movements during free viewing of video. The model is inspired by the biology of the first stages of the human visual system: it extracts from the video stream two signals corresponding to the two main outputs of the retina, parvocellular and magnocellular. Both signals are then split into elementary feature maps by cortical-like filters. These feature maps are used to form two saliency maps, a static and a dynamic one, which are then fused into a spatio-temporal saliency map. The model is evaluated by comparing the salient areas of each frame predicted by the spatio-temporal saliency map to the eye positions of different subjects during a free-viewing experiment on a large database (17,000 frames). In parallel, the static and dynamic pathways are analyzed to understand what is more or less salient and for what types of video the model is a good or a poor predictor of eye movement.
