Similar literature
20 similar documents found (search time: 72 ms)
1.
Although deep CNN-based super-resolution methods have achieved outstanding performance, their memory cost and computational complexity severely limit their practical deployment. Knowledge distillation (KD), which efficiently transfers knowledge from a cumbersome network (teacher) to a compact network (student), has demonstrated its advantages in some computer vision applications. The representation of knowledge, which is vital for knowledge transfer and student learning, is generally defined in a hand-crafted manner or taken directly from intermediate features. In this paper, we propose a model-agnostic meta knowledge distillation method under the teacher-student architecture for the single image super-resolution task. It provides a more flexible and accurate way to help teachers transmit knowledge in accordance with the abilities of students via knowledge representation networks (KRNets) with learnable parameters. Specifically, texture-aware dynamic kernels are generated from local information to decompose the distillation problem into texture-wise supervision, which further improves the recovery of high-frequency details. In addition, the KRNets are optimized in a meta-learning manner to ensure that knowledge transfer and student learning both improve the reconstruction quality of the student. Experiments conducted on various single image super-resolution datasets demonstrate that our proposed method outperforms existing distillation methods that rely on predefined knowledge representations and can help super-resolution algorithms achieve better reconstruction quality without introducing any extra inference complexity.

2.
No-reference image quality assessment is of great importance to numerous image processing applications, and various methods have been widely studied with promising results. These methods exploit handcrafted features in the transform or spatial domain that are discriminative for image degradations. However, abundant a priori knowledge is required to extract these handcrafted features. The convolutional neural network (CNN) was recently introduced into no-reference image quality assessment; it integrates feature learning and regression into one optimization process, so the network yields an effective model for estimating image quality. However, the image quality score obtained by the CNN is the mean of all image patch scores and does not account for properties of the human visual system, such as its sensitivity to the edges and contours of images. In this paper, we combine the CNN with the Prewitt magnitude of segmented images and obtain the image quality score as the mean of the products of the image patch scores and weights derived from the segmented images. Experimental results on various image distortion types demonstrate that the proposed algorithm achieves good performance.
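
The pooling step in item 2 (a mean of products of patch scores and edge-based weights) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how the weights are derived from the segmented image, so the sketch assumes each weight is the mean Prewitt gradient magnitude of the patch. The names `prewitt_magnitude` and `pooled_score` and the sample patches are hypothetical.

```python
import math

PREWITT_X = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]
PREWITT_Y = [[-1, -1, -1], [0, 0, 0], [1, 1, 1]]

def prewitt_magnitude(patch):
    """Mean Prewitt gradient magnitude over the interior pixels of a 2D patch."""
    h, w = len(patch), len(patch[0])
    total, count = 0.0, 0
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            gx = sum(PREWITT_X[i][j] * patch[r + i - 1][c + j - 1]
                     for i in range(3) for j in range(3))
            gy = sum(PREWITT_Y[i][j] * patch[r + i - 1][c + j - 1]
                     for i in range(3) for j in range(3))
            total += math.hypot(gx, gy)
            count += 1
    return total / count if count else 0.0

def pooled_score(patch_scores, patch_weights):
    """Mean of the products score_i * weight_i over all patches."""
    if len(patch_scores) != len(patch_weights) or not patch_scores:
        raise ValueError("need equally sized, non-empty score and weight lists")
    products = [s * w for s, w in zip(patch_scores, patch_weights)]
    return sum(products) / len(products)

# Hypothetical data: a flat patch and a patch containing a vertical step edge.
flat = [[5] * 4 for _ in range(4)]
edge = [[0, 0, 10, 10] for _ in range(4)]
weights = [prewitt_magnitude(flat), prewitt_magnitude(edge)]
score = pooled_score([0.5, 0.25], weights)  # the flat patch contributes nothing
```

With raw gradient magnitudes as weights, the pooled score is not on the same scale as the patch scores; whether and how the weights are normalized is a design choice the abstract leaves open.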

3.
Video frame interpolation is a technology that generates high frame rate videos from low frame rate videos by exploiting the correlation between consecutive frames. Convolutional neural networks (CNNs) currently exhibit outstanding performance in image processing and computer vision. Many CNN variants have been proposed for video frame interpolation that estimate either dense motion flows or kernels for moving objects. However, most methods focus on estimating accurate motion. In this study, we thoroughly analyze the advantages of both motion estimation schemes and propose a cascaded system that maximizes the advantages of both. The proposed cascaded network consists of three autoencoder networks that perform the initial frame interpolation and its refinement. Quantitative and qualitative evaluations demonstrate that the proposed cascaded structure performs promisingly compared with existing state-of-the-art methods.

4.
Sketch-based image retrieval (SBIR), which uses free-hand sketches to search for images containing similar objects or scenes, is attracting more and more attention because sketches have become easier to obtain with the development of touch devices. However, the task is difficult because of the large differences between sketches and images. In this paper, we propose a cross-domain representation learning framework to reduce these differences for SBIR. The framework translates sketches into images using information learned in both the sketch domain and the image domain by the proposed domain migration generative adversarial network (DMGAN). Furthermore, to reduce the representation gap between the generated images and natural images, a similarity learning network (SLN) is proposed with a newly designed loss function that incorporates semantic information. Extensive experiments have been conducted from different aspects, including comparisons with state-of-the-art methods. The results show that the proposed DMGAN and SLN are effective for SBIR.

5.
In this study, we propose a new deep learning architecture named Multi-Level Dense Network (MLDNet) for multi-focus image fusion (MFIF). We introduce shallow and dense feature extraction in our feature extraction module to extract image features more robustly. In particular, we extract the features through densely connected convolutional layers, from a mixture of many distributions ranging from the prior to the complex distribution, and then fuse the extracted features to form dense local feature maps. We add global feature fusion to the proposed architecture in order to merge the dense local feature maps of each source image into a fused image representation for the reconstruction of the final fused image. Our proposed MLDNet learns feature extraction, feature fusion and reconstruction within the same network to provide an end-to-end solution for MFIF. Experimental results demonstrate that our proposed method achieves strong performance compared with several state-of-the-art MFIF methods.

6.
Light Field (LF) image angular super-resolution aims to synthesize a high angular resolution LF image from a low angular resolution one, and is drawing increased attention because of its wide applications. To reconstruct a high angular resolution LF image, many learning-based LF image angular super-resolution methods have been proposed. However, most existing methods are based on the LF Epipolar Plane Image or the Epipolar Plane Image volume representation, which underuses the LF image structure. The spatial correlation within LF views and the angular correlations between neighboring LF views, both of which reflect the LF image structure, are not fully explored, and this reduces LF angular super-resolution quality. To alleviate this problem, this paper introduces an Epipolar Plane Image Volume Stack (EPI-VS) representation for LF angular super-resolution. The EPI-VS is constituted by arranging all LF views in a raster order, which helps in exploring the spatial correlation within LF views and the angular correlations between neighboring LF views. Based on this representation, we further propose an LF angular super-resolution network. 3D convolutions are applied throughout the super-resolution network to better accommodate the input EPI-VS data and to allow information to propagate between the two spatial dimensions and the one directional dimension of the EPI-VS data. Extensive experiments on synthetic and real-world LF scenes demonstrate the effectiveness of the proposed network. Moreover, we illustrate the superiority of our network by applying it to a scene depth estimation task.
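
The raster-order arrangement mentioned in item 6 can be illustrated with a short sketch. It assumes a plain row-by-row (row-major) raster over a U x V grid of views; the abstract does not specify the exact scan pattern, and `stack_views_raster`, `view_index` and the toy light field are hypothetical names and data.

```python
def stack_views_raster(light_field):
    """Flatten a U x V grid of H x W views into a (U*V) x H x W volume, row by row."""
    return [view for row in light_field for view in row]

def view_index(u, v, num_cols):
    """Position of view (u, v) inside the raster-ordered stack."""
    return u * num_cols + v

# Hypothetical 2 x 3 light field of 1 x 2 views; each pixel stores its (u, v) label.
light_field = [[[[(u, v), (u, v)]] for v in range(3)] for u in range(2)]
volume = stack_views_raster(light_field)
```

The result has one directional axis (the stacked views) and two spatial axes, which is the kind of 3D input a 3D convolution can traverse.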

7.
余家林  孙季丰  李万益 《电子学报》2016,44(8):1899-1908
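To reconstruct 3D human poses from multi-view images accurately and effectively, this paper proposes a human pose estimation algorithm based on multi-kernel sparse coding. First, to address the ambiguity problem in pose estimation over consecutive frames, we design an HA-SIFT descriptor for representing multi-view images, which simultaneously encodes the local topology of the human body, the relative positions of the limbs, and appearance information. Then, under the multiple kernel learning framework, we build an objective function that accounts for both the intrinsic manifold structure of the feature space and the geometric information of the pose space, and optimize it in Hilbert space to update the sparse codes, the overcomplete dictionary, and the multi-kernel weights. Finally, the 3D human pose corresponding to an unknown input is estimated as a linear combination of pose dictionary atoms. Experimental results show that the proposed method achieves higher estimation accuracy than kernel sparse coding, Laplacian sparse coding, and Bayesian sparse coding.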

8.
To address the performance degradation of existing presentation attack detection methods under illumination variation, this paper proposes a two-stream vision transformer framework (TSViT) based on transfer learning in two complementary spaces. Face images in the RGB color space and in the multi-scale retinex with color restoration (MSRCR) space are fed to TSViT to learn discriminative features for presentation attack detection. To effectively fuse features from the two sources (RGB color space images and MSRCR images), a feature fusion method based on self-attention is built, which captures the complementarity of the two feature sets. Experiments and analysis on the Oulu-NPU, CASIA-MFSD, and Replay-Attack databases show that the framework outperforms most existing methods in intra-database testing and generalizes well in cross-database testing.

9.
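Image restoration combining sparse representation with gradient distribution matching (cited 1 time in total: 1 self-citation, 0 citations by others)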
刘哲  杨静  陈路 《光电子.激光》2015,26(6):1186-1193
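To address the inability of traditional sparse-representation-based image restoration methods to accurately recover small-scale image details, an image restoration method that combines sparse representation with gradient distribution matching is proposed. First, building on the sparse-representation image restoration model, a parameterized hyper-Laplacian distribution is used to estimate the gradient distribution of the original image. Then, a global constraint is imposed on the image gradient distribution, and a gradient histogram matching operation is used to match the gradient distribution so that the gradient distribution of the restored image is as close as possible to that of the original image. Simulation results show that the proposed algorithm achieves good restoration results and recovers image detail with high accuracy.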
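
The idea of histogram matching in item 9 can be sketched with a rank-based mapping. This is only an illustration under simplifying assumptions: it matches two equally sized lists of gradient samples directly, whereas the abstract estimates the target distribution with a parameterized hyper-Laplacian model. The function name `match_histogram` and the sample values are hypothetical.

```python
def match_histogram(values, reference):
    """Replace each value by the reference sample of the same rank (equal lengths)."""
    if len(values) != len(reference):
        raise ValueError("this sketch assumes equally many samples")
    # Indices of `values` from smallest to largest.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ref_sorted = sorted(reference)
    matched = [0.0] * len(values)
    for rank, idx in enumerate(order):
        matched[idx] = ref_sorted[rank]
    return matched

# Hypothetical gradient samples: an over-smoothed image has weaker gradients
# than the distribution estimated for the original image.
matched = match_histogram([0.1, -0.3, 0.0, 0.2], [-2.0, 0.0, 0.5, 3.0])
```

After matching, the samples keep their original ordering but follow the reference distribution exactly, which is the sense in which the restored gradients are pulled toward the target histogram.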

10.
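Aircraft and space imaging guidance equipment flying at high speed through the atmosphere are disturbed by turbulence, which causes the images received by the optical system to suffer from blur degradation, pixel displacement, and a reduced signal-to-noise ratio. Research on techniques and methods for restoring degraded images has therefore become an important way for space optical imaging systems to obtain higher-quality images. Based on a systematic review and analysis of research progress in degraded image restoration, this paper first introduces the image degradation model and then gives a classification of degraded image restoration methods. It next presents, fairly systematically, several new single-image restoration methods: deterministic regularization methods, stochastic regularization methods, methods based on local similarity, and example-based learning methods. It then analyzes the characteristics of video restoration and introduces several recent, representative video restoration methods. Finally, it summarizes the main difficulties of image restoration. The review offers a useful reference for advancing research and development of degraded image restoration technology in China.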

11.
In this paper, we propose a new adaptive bit rate (ABR) streaming method. The method is based on estimating and monitoring users' video streaming experience, that is, their quality of experience (QoE). It ensures a good user QoE and optimises bandwidth utilisation by monitoring the video buffer fill rate to keep data traffic minimal. First, we build a QoE evaluation model based on network bandwidth, video segment representation, and dropped video frame rate parameters. Second, following our QoE evaluation model, we formulate an ABR method that uses the reinforcement learning (RL) paradigm to select video representations and a breakpoint detection mechanism to monitor end-user QoE variation. The proposed ABR method is called "QoE-aware adaptive bit rate (Q2ABR)" and is composed of three individual modules: one for QoE estimation using machine learning methods, one for QoE variation monitoring using the breakpoint detection mechanism, and one for video representation selection using reinforcement learning. The design objective of Q2ABR is to ensure the overall QoE of these users while keeping the standard deviation of the users' QoE values to a minimum. Third, the performance of the Q2ABR method is evaluated and compared with several existing ABR approaches from the literature using real traces that we collected in different transport scenarios (such as bus and train, among others). Since this method uses the user's perception of video quality as a regulator for optimising the overall video distribution network, good results are ensured in terms of the user's experience and buffer fill rate.

12.
孙超  吕俊伟  刘峰  周仁来 《激光与红外》2017,47(12):1559-1564
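To address the low spatial resolution and poor imaging quality of infrared images, an infrared image super-resolution method based on transfer learning is proposed. The method improves on convolutional-neural-network-based super-resolution for natural images: the number of network layers is increased for deeper learning, multiple layers of small convolution kernels are cascaded so that more image information can be used, and the network is trained with the "difference image" as the target, which shortens training time and speeds up convergence. Using transfer learning, the natural-image super-resolution network is then trained a second time with a small number of high-quality infrared images as target samples, so that the fine-tuned network weights are transferred to infrared image super-resolution. Experimental results show that the CNN-based super-resolution method can be transferred effectively to infrared images and that the improved network achieves better super-resolution performance on both natural and infrared images, which verifies the effectiveness and superiority of the proposed method.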

13.
Double JPEG compression detection plays a vital role in multimedia forensics, where it is used to determine whether a JPEG image is authentic or manipulated. However, it remains a challenging task when the quality factor of the first compression is much higher than that of the second compression, as well as when the targeted image blocks are quite small. In this work, we present a novel end-to-end deep learning framework that takes raw DCT coefficients as input to distinguish between single and double compressed images and performs better in the two cases above. Our proposed framework can be divided into two stages. In the first stage, we adopt an auxiliary DCT layer with sixty-four 8 × 8 DCT kernels. Using a specific layer to extract DCT coefficients, instead of extracting them directly from the JPEG bitstream, allows our proposed framework to work even if the double compressed images are stored in the spatial domain, e.g. in PGM, TIFF or other bitmap formats. The second stage is a deep neural network with multiple convolutional blocks to extract more effective features. We have conducted extensive experiments on three different image datasets. The experimental results demonstrate the superiority of our framework over other state-of-the-art double JPEG compression detection methods in the literature, whether hand-crafted or learned using deep networks, especially in the two cases mentioned above. Furthermore, our proposed framework can detect triple and even multiple JPEG compressed images, which, as far as we know, is rarely addressed in the literature.
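
The sixty-four 8 × 8 DCT kernels in item 13 can be built from the standard orthonormal 2D DCT-II basis, as sketched below. This is an assumption about the layer, not a reproduction of it: the abstract does not give the kernel normalization, the stride, or any level shift or quantization handling. The names `dct_kernel`, `KERNELS` and `block_dct` are hypothetical, and the sketch applies the kernels to a single block rather than as a convolution layer.

```python
import math

N = 8  # JPEG block size

def dct_kernel(u, v):
    """8 x 8 orthonormal DCT-II basis image for frequency index (u, v)."""
    au = math.sqrt((1 if u == 0 else 2) / N)
    av = math.sqrt((1 if v == 0 else 2) / N)
    return [[au * av
             * math.cos((2 * x + 1) * u * math.pi / (2 * N))
             * math.cos((2 * y + 1) * v * math.pi / (2 * N))
             for y in range(N)] for x in range(N)]

# One kernel per frequency pair: 8 * 8 = 64 kernels in total.
KERNELS = [dct_kernel(u, v) for u in range(N) for v in range(N)]

def block_dct(block):
    """Inner product of one 8 x 8 block with each kernel: its 64 DCT coefficients."""
    return [sum(k[x][y] * block[x][y] for x in range(N) for y in range(N))
            for k in KERNELS]

# A constant block has only a DC component; all AC coefficients vanish.
coeffs = block_dct([[1.0] * N for _ in range(N)])
```

Sliding these fixed kernels over an image with a stride of 8 would reproduce block-wise DCT coefficients from pixel data, which is why such a layer can work on images stored in bitmap formats.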

14.
Displaying images on different devices requires resizing the media. Traditional image resizing methods result in quality degradation. Content-aware retargeting algorithms aim to resize images for display on a new device while preserving the important contents of the image. Quality assessment of retargeted images can be employed to choose among the outputs of different retargeting methods or to help optimize such methods. In this paper we propose a learning-based quality assessment method for retargeted images. An optical flow algorithm is used to find the correspondence between regions in the scaled and retargeted images. Three groups of features are defined to cover different aspects of distortion that are important to human observers. Area-related features are used to detect how well the areas of salient regions are retained and how many geometrical deformities are produced in the image. To better assess the retargeted image, we also introduce features that show how well the aspect ratios of objects are retained. More importantly, we introduce the concept of measuring the homogeneity of the distribution of deformities throughout the image. Experimental results demonstrate that our quality estimation method correlates better with subjective scores and outperforms existing methods.

15.
Image restoration and simplification methods that respect important features such as edges play a fundamental role in digital image processing. However, known edge-preserving methods like common nonlinear diffusion methods tend to round vertices for large diffusion times. In this paper, we adapt the diffusion tensor for anisotropic diffusion to avoid this effect in images containing rotated and sheared rectangles, respectively. In this context, we propose a new method for estimating rotation angles and shear parameters based on the so-called structure tensor. Further, we show how the knowledge of appropriate diffusion tensors can be used in variational models. Numerical examples including orientation estimation, denoising and segmentation demonstrate the good performance of our methods.
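
The structure tensor mentioned in item 15 can be used to read off a dominant orientation, as the following sketch shows. It is a textbook illustration rather than the estimator proposed in the paper: it uses central differences, sums the tensor over the whole image without Gaussian smoothing, and returns only the dominant gradient angle (no shear parameters). The function name and the test image are hypothetical.

```python
import math

def structure_tensor_orientation(image):
    """Angle (radians) of the dominant gradient direction of a 2D image."""
    h, w = len(image), len(image[0])
    jxx = jxy = jyy = 0.0
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            gx = (image[r][c + 1] - image[r][c - 1]) / 2.0  # derivative along columns (x)
            gy = (image[r + 1][c] - image[r - 1][c]) / 2.0  # derivative along rows (y)
            # Accumulate the entries of J = sum of (grad I)(grad I)^T.
            jxx += gx * gx
            jxy += gx * gy
            jyy += gy * gy
    # Orientation of the eigenvector belonging to the largest eigenvalue of J.
    return 0.5 * math.atan2(2.0 * jxy, jxx - jyy)

# Hypothetical test image: intensity ramp along the diagonal x + y.
ramp = [[float(r + c) for c in range(6)] for r in range(6)]
theta = structure_tensor_orientation(ramp)  # gradient points along 45 degrees
```

Edges run perpendicular to this angle, so for a rotated rectangle the dominant gradient direction gives the rotation of one pair of sides.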

16.
The high performance of state-of-the-art deep learning methods for 3D hand pose estimation depends heavily on a large annotated training set. However, obtaining annotations for 3D hand poses is difficult and time-consuming. To leverage unannotated images and reduce the annotation cost, we propose a semi-supervised method based on Multi-Task and Multi-View Consistency (MTMVC) for hand pose estimation. First, we obtain the joints from heatmap prediction and coordinate regression in parallel and encourage their consistency. Second, we introduce multi-view consistency to encourage the predicted poses to be rotation-invariant. Third, to make the network pay more attention to the hand region, we propose a spatially weighted consistency. Experiments on four public datasets show that our proposed MTMVC outperforms existing semi-supervised hand pose estimation methods, and that with only half of the annotations, the accuracy of our method is comparable to that of several state-of-the-art fully supervised methods.

17.
Due to light absorption and scattering, captured underwater images usually suffer from severe color distortion and contrast reduction. To address these problems, we combine the merits of deep learning and conventional image enhancement technology to improve underwater image quality. We first propose a two-branch network that compensates for the globally distorted color and the locally reduced contrast, respectively. Adopting this global-local network greatly eases the learning problem, so that it can be handled by a lightweight network architecture. To cope with the complex and changeable underwater environment, we then design a compressed-histogram equalization to complement the data-driven deep learning model, whose parameters are fixed after training. The proposed compression strategy is able to generate vivid results without introducing over-enhancement or extra computing burden. Experiments demonstrate that our method significantly outperforms several state-of-the-art methods in both qualitative and quantitative evaluations.

18.
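Pedestrian detection in infrared images has long been both an active and a difficult research topic in computer vision. Traditional infrared pedestrian detection methods require hand-designed features to represent the target. To overcome this drawback, this paper takes a deep learning approach and proposes a convolutional neural network for infrared pedestrian detection that builds target representation features automatically. Based on an analysis of how convolutional neural networks work, an initial network structure for infrared pedestrian detection is designed and then adjusted through experiments to obtain the final detection network. Pedestrian detection experiments on a database of real captured infrared human images show that the method detects pedestrians in infrared images robustly while keeping the false alarm rate low, and that it outperforms traditional methods.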

19.
Underwater images play an essential role in acquiring and understanding underwater information. High-quality underwater images can guarantee the reliability of underwater intelligent systems. Unfortunately, underwater images are characterized by low contrast, color casts, blurring, low light, and uneven illumination, which severely affect the perception and processing of underwater information. To improve the quality of acquired underwater images, numerous methods have been proposed, particularly with the emergence of deep learning technologies. However, the performance of underwater image enhancement methods is still unsatisfactory due to a lack of sufficient training data and effective network structures. In this paper, we address this problem with a conditional generative adversarial network (cGAN), in which the clear underwater image is produced by a multi-scale generator. In addition, we employ a dual discriminator to capture local and global semantic information, which forces the results generated by the multi-scale generator to be realistic and natural. Experiments on real-world and synthetic underwater images demonstrate that the proposed method performs favorably against state-of-the-art underwater image enhancement methods.

20.
In this paper, we propose novel algorithms for total variation (TV) based image restoration and parameter estimation utilizing variational distribution approximations. Within the hierarchical Bayesian formulation, the reconstructed image and the unknown hyperparameters for the image prior and the noise are estimated simultaneously. The proposed algorithms provide approximations to the posterior distributions of the latent variables using variational methods. We show that some of the current approaches to TV-based image restoration are special cases of our framework. Experimental results show that the proposed approaches provide competitive performance without any assumptions about the unknown hyperparameters and clearly outperform existing methods when additional information is included.
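
For readers unfamiliar with the TV functional behind item 20, the sketch below computes a common discrete form of it. It assumes the isotropic variant with forward differences and a zero difference at the far border; the abstract does not state which discretization the authors use, and the function name and sample images are hypothetical.

```python
import math

def total_variation(image):
    """Isotropic discrete TV: sum over pixels of the forward-difference gradient norm."""
    h, w = len(image), len(image[0])
    tv = 0.0
    for r in range(h):
        for c in range(w):
            dx = image[r][c + 1] - image[r][c] if c + 1 < w else 0.0
            dy = image[r + 1][c] - image[r][c] if r + 1 < h else 0.0
            tv += math.hypot(dx, dy)
    return tv

flat = [[3.0] * 4 for _ in range(4)]             # no variation at all
step = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]  # one unit jump in each row
```

A TV prior penalizes this quantity, so piecewise-constant images with sharp edges (such as `step`) stay cheap while noisy, oscillating images become expensive.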
