Similar Documents
20 similar documents found (search time: 46 ms)
1.
Infrared and visible image fusion aims to synthesize a single fused image containing salient targets and abundant texture details even under extreme illumination conditions. However, existing image fusion algorithms fail to take the illumination factor into account in the modeling process. In this paper, we propose an illumination-aware progressive image fusion network, termed PIAFusion, which adaptively maintains the intensity distribution of salient targets and preserves texture information in the background. Specifically, we design an illumination-aware sub-network to estimate the illumination distribution and calculate the illumination probability. Moreover, we utilize the illumination probability to construct an illumination-aware loss that guides the training of the fusion network. The cross-modality differential aware fusion module and the halfway fusion strategy fully integrate common and complementary information under the constraint of the illumination-aware loss. In addition, a new benchmark dataset for infrared and visible image fusion, i.e., Multi-Spectral Road Scenarios (available at https://github.com/Linfeng-Tang/MSRS), is released to support network training and comprehensive evaluation. Extensive experiments demonstrate the superiority of our method over state-of-the-art alternatives in terms of target maintenance and texture preservation. In particular, our progressive fusion framework can integrate meaningful information from the source images round the clock according to illumination conditions. Furthermore, the application to semantic segmentation demonstrates the potential of PIAFusion for high-level vision tasks. Our code will be available at https://github.com/Linfeng-Tang/PIAFusion.
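As a rough illustration of how an illumination probability can steer such a loss, the sketch below weights per-pixel intensity losses against the two sources; the weighting scheme and all names (`illumination_aware_loss`, `illum_prob`) are our own assumptions, not the authors' implementation.

```python
# Hypothetical sketch of an illumination-aware intensity loss (not the
# authors' code). Assumes illum_prob is the probability that the scene is
# well lit (daytime), as predicted by an illumination sub-network.
import torch
import torch.nn.functional as F

def illumination_aware_loss(fused, visible, infrared, illum_prob):
    """fused/visible/infrared: (B,1,H,W); illum_prob: (B,) in [0,1]."""
    w_vis = illum_prob.view(-1, 1, 1, 1)   # trust the visible image in good light
    w_ir = 1.0 - w_vis                     # trust the infrared image in darkness
    loss_vis = F.l1_loss(fused, visible, reduction="none")
    loss_ir = F.l1_loss(fused, infrared, reduction="none")
    return (w_vis * loss_vis + w_ir * loss_ir).mean()

# usage
fused = torch.rand(2, 1, 64, 64)
vis, ir = torch.rand(2, 1, 64, 64), torch.rand(2, 1, 64, 64)
p = torch.tensor([0.9, 0.1])  # e.g. day vs. night probability
print(illumination_aware_loss(fused, vis, ir, p).item())
```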

2.
In the realm of conventional deep-learning-based pan-sharpening approaches, there has been an ongoing struggle to harmonize the input panchromatic (PAN) and multi-spectral (MS) images across varied channels. Existing methods have often been stymied by spectral distortion and inadequate texture representation. To address these limitations, we present an innovative constraint-based image generation strategy tailored to the pan-sharpening task. Our method employs a multi-scale conditional invertible neural network, named PSCINN, which converts the ground-truth MS image into a downscaled MS image and a latent variable, under the guidance of the PAN image. Subsequently, the resampled latent variable, drawn from a prior distribution, and the low-resolution MS image are harnessed to predict the pan-sharpened image in an information-preserving manner, with the PAN image providing essential guidance during the reversion process. Furthermore, we carefully design a conditional invertible block with a tractable Jacobian determinant for spectral information recovery. This structure effectively pre-processes the conditioning PAN image into practical texture information, preventing potential contamination of the spectral information in the pan-sharpened result. The proposed PSCINN outperforms existing state-of-the-art pan-sharpening methodologies in both objective and subjective results. Further experiments underscore a substantial enhancement in perceived quality attributable to our method. The source code for PSCINN will be available at https://github.com/jiaming-wang/PSCINN.
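For readers unfamiliar with invertible networks, the following is a minimal, generic sketch of a conditional affine coupling layer with a tractable log-Jacobian determinant, the kind of building block such architectures rely on; it is an assumption-laden illustration, not the PSCINN code.

```python
# Generic conditional affine coupling layer: exactly invertible, with a
# cheap log |Jacobian determinant| (illustrative sketch only).
import torch
import torch.nn as nn

class ConditionalAffineCoupling(nn.Module):
    def __init__(self, channels, cond_channels, hidden=64):
        super().__init__()
        self.half = channels // 2
        # A small conv net predicts scale and shift from one half + condition.
        self.net = nn.Sequential(
            nn.Conv2d(self.half + cond_channels, hidden, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(hidden, 2 * (channels - self.half), 3, padding=1),
        )

    def forward(self, x, cond):
        x1, x2 = x[:, :self.half], x[:, self.half:]
        s, t = self.net(torch.cat([x1, cond], dim=1)).chunk(2, dim=1)
        s = torch.tanh(s)                     # keep the scaling stable
        y2 = x2 * torch.exp(s) + t            # invertible affine transform
        log_det = s.flatten(1).sum(dim=1)     # log |Jacobian determinant|
        return torch.cat([x1, y2], dim=1), log_det

    def inverse(self, y, cond):
        y1, y2 = y[:, :self.half], y[:, self.half:]
        s, t = self.net(torch.cat([y1, cond], dim=1)).chunk(2, dim=1)
        s = torch.tanh(s)
        x2 = (y2 - t) * torch.exp(-s)         # exact inverse, no information loss
        return torch.cat([y1, x2], dim=1)

x = torch.rand(1, 8, 16, 16)        # latent feature map
pan = torch.rand(1, 1, 16, 16)      # conditioning PAN feature
layer = ConditionalAffineCoupling(8, cond_channels=1)
y, log_det = layer(x, pan)
print(torch.allclose(layer.inverse(y, pan), x, atol=1e-5))  # True
```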

3.
Accurate retinal vessel segmentation is very challenging. Recently, deep learning based methods have greatly improved performance. However, non-vascular structures usually harm performance, and low-contrast small vessels are hard to detect after several down-sampling operations. To solve these problems, we design a deep fusion network (DF-Net) comprising multiscale fusion, feature fusion, and classifier fusion for multi-source vessel image segmentation. The multiscale fusion module allows the network to detect blood vessels at different scales. The feature fusion module fuses deep features with vessel responses extracted by a Frangi filter to obtain a compact yet domain-invariant feature representation. The classifier fusion module provides the network with additional supervision. DF-Net also predicts the parameters of the Frangi filter, avoiding manual parameter tuning. The learned Frangi filter enhances the feature map of the multiscale network and restores edge information lost in down-sampling. The proposed end-to-end network is easy to train, and the inference time for one image is 41 ms on a GPU. The model outperforms state-of-the-art methods and achieves accuracies of 96.14%, 97.04%, and 98.02% on three publicly available fundus image datasets, DRIVE, STARE, and CHASEDB1, respectively. The code is available at https://github.com/y406539259/DF-Net.
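As a concrete illustration of combining a Frangi vesselness response with learned features, the sketch below uses scikit-image's `frangi` filter and appends the response as an extra feature channel; the fixed sigmas and the concatenation scheme are our assumptions (DF-Net learns the filter parameters instead).

```python
# Sketch of fusing a Frangi vesselness response with learned features,
# a rough analogue of the feature fusion module described above.
import numpy as np
from skimage.filters import frangi

def vesselness_channel(image, sigmas=(1, 2, 3)):
    """Compute a multi-scale Frangi vesselness map for a 2-D fundus image."""
    return frangi(image, sigmas=sigmas, black_ridges=False)

# usage: append the vesselness map as an extra input channel
image = np.random.rand(128, 128)              # stand-in for a green-channel fundus image
deep_features = np.random.rand(128, 128, 16)  # stand-in for CNN feature maps
vessels = vesselness_channel(image)
fused = np.concatenate([deep_features, vessels[..., None]], axis=-1)
print(fused.shape)  # (128, 128, 17)
```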

4.
Establishing reliable correspondences with a deep neural network is an important task in computer vision, and it generally requires a permutation-equivariant architecture and rich contextual information. In this paper, we design a Permutation-Equivariant Split Attention Network (PESA-Net) to gather rich contextual information for the feature matching task. Specifically, we propose a novel “Split–Squeeze–Excitation–Union” (SSEU) module. The SSEU module not only generates multiple paths to exploit the geometrical context of putative correspondences from different aspects, but also adaptively captures channel-wise global information by explicitly modeling the interdependencies between feature channels. In addition, we construct a block by fusing the SSEU module, a multi-layer perceptron, and several normalizations. The proposed PESA-Net is able to effectively infer the probabilities of correspondences being inliers or outliers and simultaneously recover the relative pose via the essential matrix. Experimental results demonstrate that PESA-Net surpasses state-of-the-art approaches for pose estimation and outlier rejection on both outdoor and indoor scenes (i.e., YFCC100M and SUN3D). Source code: https://github.com/x-gb/PESA-Net.
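The sketch below shows a generic split/squeeze-excitation/union pattern over 2-D feature maps to convey the idea of channel-wise gating across multiple paths; it is only loosely analogous to the actual SSEU module, which operates on putative correspondences, and every detail here (path count, reduction ratio) is assumed.

```python
# Generic squeeze-and-excitation channel attention over multiple split
# paths, in the spirit of the SSEU module (illustrative sketch only).
import torch
import torch.nn as nn

class SplitSqueezeExcite(nn.Module):
    def __init__(self, channels, paths=2, reduction=4):
        super().__init__()
        self.paths = nn.ModuleList(
            nn.Conv2d(channels, channels, 1) for _ in range(paths)
        )
        self.fc = nn.Sequential(                      # squeeze -> excite
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        # Split: process the input along parallel paths, then union (sum).
        u = sum(path(x) for path in self.paths)
        # Squeeze: global average pooling gives per-channel descriptors.
        s = u.mean(dim=(2, 3))
        # Excite: per-channel gates model channel interdependencies.
        g = self.fc(s).unsqueeze(-1).unsqueeze(-1)
        return u * g

x = torch.rand(2, 32, 8, 8)   # stand-in for correspondence features
print(SplitSqueezeExcite(32)(x).shape)
```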

5.
Pansharpening fuses a high-spatial-resolution panchromatic image with a simultaneously acquired multispectral image of lower spatial resolution. In this paper, we propose a Laplacian pyramid pansharpening network architecture that accurately fuses a high-spatial-resolution panchromatic image and a low-spatial-resolution multispectral image to obtain a higher-spatial-resolution multispectral image. The proposed architecture considers three aspects. First, we use the Laplacian pyramid method, whose blur kernels are designed according to the sensors’ modulation transfer functions, to separate the images into multiple scales and fully exploit the crucial spatial information at each scale. Second, we develop a fusion convolutional neural network (FCNN) for each scale and combine them to form the final multi-scale network architecture. Specifically, we use recursive layers in the FCNN to share parameters across and within pyramid levels, significantly reducing the number of network parameters. Third, a total loss consisting of multiple across-scale loss functions is employed for training, yielding higher accuracy. Extensive quantitative and qualitative experiments on benchmark datasets demonstrate that the proposed architecture outperforms state-of-the-art pansharpening methods. Code is available at https://github.com/ChengJin-git/LPPN.
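A minimal Laplacian pyramid decomposition and its exact reconstruction are sketched below with a generic Gaussian kernel; note that the paper instead designs the blur kernels from the sensors' modulation transfer functions.

```python
# Minimal Laplacian pyramid decomposition/reconstruction (illustrative;
# the paper derives the blur kernels from sensor MTFs, not a plain Gaussian).
import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def laplacian_pyramid(img, levels=3, sigma=1.0):
    pyr = []
    current = img.astype(np.float64)
    for _ in range(levels):
        low = gaussian_filter(current, sigma)
        down = low[::2, ::2]                       # decimate
        up = zoom(down, 2, order=1)[:current.shape[0], :current.shape[1]]
        pyr.append(current - up)                   # band-pass detail
        current = down
    pyr.append(current)                            # coarsest residual
    return pyr

def reconstruct(pyr):
    current = pyr[-1]
    for detail in reversed(pyr[:-1]):
        up = zoom(current, 2, order=1)[:detail.shape[0], :detail.shape[1]]
        current = up + detail
    return current

img = np.random.rand(64, 64)
pyr = laplacian_pyramid(img)
print(np.allclose(reconstruct(pyr), img))  # True: the decomposition is lossless
```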

6.
Due to the huge gap between the high dynamic range of natural scenes and the limited (low) range of consumer-grade cameras, a single shot can hardly record all the information of a scene. Multi-exposure image fusion (MEF) is an effective way to solve this problem by integrating multiple shots with different exposures; it is in nature an enhancement problem. During fusion, two perceptual factors, informativeness and visual realism, should be considered simultaneously. To achieve this goal, this paper presents a deep perceptual enhancement network for MEF, termed DPE-MEF. Specifically, DPE-MEF contains two modules: one gathers content details from the inputs, while the other handles color mapping/correction for the final results. Extensive experimental results and ablation studies show the efficacy of our design and demonstrate its superiority over other state-of-the-art alternatives both quantitatively and qualitatively. We also verify the flexibility of the proposed strategy in improving the exposure quality of single images. Moreover, DPE-MEF can fuse more than 60 pairs of 720p images per second on an Nvidia 2080Ti GPU, making it attractive for practical use. Our code is available at https://github.com/dongdong4fei/DPE-MEF.

7.
Despite the tremendous achievements of deep convolutional neural networks (CNNs) in many computer vision tasks, understanding how they actually work remains a significant challenge. In this paper, we propose a novel two-step understanding method, the Salient Relevance (SR) map, which aims to shed light on how deep CNNs recognize images and learn features from the areas they attend to, referred to as attention areas. Our method starts with a layer-wise relevance propagation (LRP) step, which estimates a pixel-wise relevance map over the input image. We then construct a context-aware saliency map, the SR map, from the LRP-generated map; it predicts areas close to the foci of attention rather than the isolated pixels that LRP reveals. In the human visual system, region-level information is more important for recognition than pixel-level information, so our approach closely simulates human recognition. Experimental results on the ILSVRC2012 validation dataset with two well-established deep CNN models, AlexNet and VGG-16, clearly demonstrate that our approach identifies not only key pixels but also the attention areas that contribute to the underlying network's comprehension of the given images. As such, the SR map constitutes a convenient visual interface that unveils the visual attention of the network and reveals which types of objects the model has learned to recognize after training. The source code is available at https://github.com/Hey1Li/Salient-Relevance-Propagation.

8.
殷学梅, 周军华, 朱耀琴. 《计算机应用》 (Journal of Computer Applications), 2018, 38(10): 3017-3024
In traditional workflow-based collaborative design, difficulties in communication and task coordination among designers from different disciplines lead to low product design efficiency. To address this problem, a "one-element, three-layer" data model for complex products and a data-driven collaborative design technique for complex products are proposed. First, information modeling of complex products is completed using multi-dimensional, multi-granularity data modeling and ontology description methods. Then, ontology-based semantic retrieval is used to accomplish data subscription for tasks in the collaborative design process. Finally, a task collaboration technique for complex products based on data subscription/publication is implemented. Experimental results show that the data-driven collaborative design technique resolves the difficulties of communication and task coordination among designers from different disciplines in traditional collaborative design and realizes a spiral ascent of the collaborative design process for complex products, thereby improving product design efficiency.

9.
In this paper, an unsupervised learning-based approach is presented for fusing bracketed exposures into high-quality images, avoiding the need for interim conversion to intermediate high dynamic range (HDR) images. An objective quality measure, the colored multi-exposure fusion structural similarity index measure (MEF-SSIMc), is optimized to update the network parameters, so unsupervised learning is realized without any ground-truth (GT) images. Furthermore, a reference-free gradient fidelity term is added to the loss function to recover and supplement image information in the fused image. As shown in the experiments, the proposed algorithm performs well in terms of structure, texture, and color. In particular, it maintains the order of variations in the original image brightness, suppresses edge blurring and halo effects, and produces good visual effects with good quantitative evaluation indicators. Our code will be publicly available at https://github.com/cathying-cq/UMEF.
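One plausible form of such a reference-free gradient fidelity term is sketched below, pushing the fused image's gradients toward the per-pixel strongest gradients among the exposures; this specific formulation is our assumption, not the authors' exact loss.

```python
# Illustrative sketch of a reference-free gradient fidelity term.
import torch
import torch.nn.functional as F

def image_gradients(x):
    """Finite-difference gradients of a (B,1,H,W) batch."""
    dx = x[:, :, :, 1:] - x[:, :, :, :-1]
    dy = x[:, :, 1:, :] - x[:, :, :-1, :]
    return dx, dy

def gradient_fidelity(fused, exposures):
    """exposures: list of (B,1,H,W) tensors of bracketed shots."""
    fdx, fdy = image_gradients(fused)
    # Per-pixel strongest gradient magnitude among all exposures.
    gdx = torch.stack([image_gradients(e)[0] for e in exposures]).abs().max(dim=0).values
    gdy = torch.stack([image_gradients(e)[1] for e in exposures]).abs().max(dim=0).values
    # Match the fused image's gradient magnitudes to those targets.
    return F.l1_loss(fdx.abs(), gdx) + F.l1_loss(fdy.abs(), gdy)

fused = torch.rand(2, 1, 32, 32)
shots = [torch.rand(2, 1, 32, 32) for _ in range(3)]
print(gradient_fidelity(fused, shots).item())
```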

10.
Visual impairment assistance systems play a vital role in improving the standard of living for visually impaired people (VIP). With the development of deep learning technologies and assistive devices, many assistive technologies for VIP have achieved remarkable success in environmental perception and navigation. In particular, convolutional neural network (CNN)-based models have surpassed the level of human recognition and achieved strong generalization ability. However, the large memory and computation consumption of CNNs has been one of the main barriers to deploying them in resource-limited systems for visual impairment assistance applications. To this end, various cheap convolutions (e.g., group convolution, depth-wise convolution, and shift convolution) have recently been used to reduce memory and computation, but they require specific architecture designs, and directly replacing standard convolutions with these cheap ones lowers the discriminability of the compressed networks. In this paper, we propose to use knowledge distillation to improve the performance of compact student networks with cheap convolutions. In our case, the teacher is a network with standard convolutions, while the student is a simple transformation of the teacher architecture without complicated redesign. In particular, we introduce a novel online distillation method, which constructs the teacher network online without pre-training and conducts mutual learning between the teacher and student networks, to improve the performance of the student model. Extensive experiments demonstrate that the proposed approach simultaneously reduces the memory and computation overhead of cutting-edge CNNs on different datasets, including CIFAR-10/100 and ImageNet ILSVRC 2012, with superior performance compared to previous CNN compression and acceleration methods. The code is publicly available at https://github.com/EthanZhangYC/OD-cheap-convolution.
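A minimal sketch of online mutual distillation between a teacher and a compact student is given below; the temperature, weighting, and function names are assumed values for illustration, not those of the released code.

```python
# Sketch of online mutual distillation: both networks learn from the
# labels and from each other's soft outputs (generic illustration).
import torch
import torch.nn.functional as F

def mutual_distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    kd_s = F.kl_div(                       # student mimics the teacher
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits.detach() / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    kd_t = F.kl_div(                       # teacher also learns from the student
        F.log_softmax(teacher_logits / T, dim=1),
        F.softmax(student_logits.detach() / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    ce = F.cross_entropy(student_logits, labels) + F.cross_entropy(teacher_logits, labels)
    return ce + alpha * (kd_s + kd_t)

s, t = torch.randn(8, 10), torch.randn(8, 10)   # student/teacher logits
y = torch.randint(0, 10, (8,))
print(mutual_distillation_loss(s, t, y).item())
```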

11.
A principal task in dissecting the genetics of complex traits is to identify causal genes for disease phenotypes. Millions of genes have been sequenced in the data-driven genomics era, but their causal relationships with disease phenotypes remain largely unresolved, owing to the difficulty of elucidating underlying causal genes with laboratory-based strategies. Here, we propose an innovative deep learning computational modeling alternative (the DPPCG framework) for identifying causal (coding) genes for a specific disease phenotype. For male infertility, we introduce proteins as intermediate cell variables, leveraging integrated deep knowledge representations (Word2vec, ProtVec, Node2vec, and Space2vec) quantitatively expressed as ‘protein deep profiles’. We adopt a deep convolutional neural network (CNN) classifier to model the relationships between protein deep profiles and male infertility, training deep CNN models for single-label binary classification and eight-class multi-label classification. We demonstrate the capabilities of the DPPCG framework by integrating and fully harnessing heterogeneous biomedical big data, including literature, protein sequences, protein–protein interactions, gene expressions, and gene–phenotype relationships, and by effectively and indirectly predicting 794 causal genes of male infertility and associated pathological processes. We present this research in an interactive ‘Smart Protein’ intelligent (demo) system (http://www.smartprotein.cloud/public/home). Researchers can benefit from our intelligent system by (i) accessing a shallow gene/protein-radar service covering research status and a knowledge-graph-based vertical search; (ii) querying and downloading protein deep profile matrices; (iii) accessing intelligent recommendations for causal genes of male infertility and associated pathological processes, along with references for model architectures, parameter settings, and training outputs; and (iv) carrying out personalized analyses such as online K-Means clustering.

12.
The application of crowdsourced social media data in flood mapping and other disaster management initiatives is a burgeoning field of research, but not one without challenges. In identifying these challenges and making appropriate recommendations for future directions, it is vital that we learn from the past by taking a constructively critical look at highly praised projects in this field, which through real-world implementations have pioneered the use of crowdsourced geospatial data in modern disaster management. These real-world applications represent natural experiments, each with myriad lessons that cannot easily be gained from computer-confined simulations. This paper reports on lessons learnt from a three-year implementation of one such highly praised project, PetaJakarta.org. The lessons presented derive from the key success factors and the challenges associated with the project. To help address some of the identified challenges, desirable characteristics of future social-media-based disaster mapping systems are discussed. It is envisaged that the lessons and insights shared in this study will prove invaluable within the broader context of designing socio-technical systems for crowdsourcing and harnessing disaster-related information.

13.
Event classification is inherently sequential and multimodal. Therefore, deep neural models need to dynamically focus on the most relevant time window and/or modality of a video. In this study, we propose the Multimodal Attentive Fusion Network (MAFnet), an architecture that can dynamically fuse visual and audio information for event recognition. Inspired by prior studies in neuroscience, we couple both modalities at different levels of the visual and audio paths. Furthermore, the network dynamically highlights the modality that is relevant for classifying events in a given time window. Experimental results on the AVE (Audio-Visual Event), UCF51, and Kinetics-Sounds datasets show that the approach effectively improves accuracy in audio-visual event classification. Code is available at https://github.com/numediart/MAFnet.
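To make the idea of dynamic modality highlighting concrete, here is a minimal gating sketch that softly weights audio and visual features per time window; it is a generic illustration, not the MAFnet architecture itself.

```python
# Minimal attention-based audio-visual fusion: a learned gate decides how
# much each modality contributes at each time window (illustrative only).
import torch
import torch.nn as nn

class ModalityAttentionFusion(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(2 * dim, 2), nn.Softmax(dim=-1))

    def forward(self, audio, visual):
        """audio, visual: (B, T, dim) per-time-window features."""
        w = self.gate(torch.cat([audio, visual], dim=-1))  # (B, T, 2)
        # The weighted sum dynamically highlights the more relevant modality.
        return w[..., :1] * audio + w[..., 1:] * visual

audio = torch.rand(4, 10, 128)   # e.g. 10 one-second windows
visual = torch.rand(4, 10, 128)
fused = ModalityAttentionFusion(128)(audio, visual)
print(fused.shape)  # torch.Size([4, 10, 128])
```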

14.
Research on universal adversarial perturbations (UAPs) is significant for trustworthy deep learning. To free UAPs from dependency on the training data and the target model, exploring procedural noise functions is a feasible approach. However, current procedural adversarial noise attacks exhibit characteristics such as visually significant anisotropy and gradient artifacts that may impair the stealthiness of adversarial examples. This study proposes a novel model-free and data-free UAP method based on procedural noise functions, with two variants: a Simplex noise attack and a Worley noise attack. The attack can deceive neural networks while producing a more aesthetic rendering effect. A detailed empirical study validates the effectiveness of the proposed attack. Extensive experiments show that the UAPs generated by the proposed method achieve considerable attack performance on the ImageNet and CIFAR-10 datasets. Moreover, this study evaluates the performance and robustness of existing defense methods against the proposed UAPs, with the potential to advance research on the robustness of neural networks in real applications. The code is available at https://github.com/momo1986/adversarial_example_simplex_worley.
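A simplified Worley-noise perturbation under an L-infinity budget might look like the sketch below; the point count, normalization, and budget are illustrative assumptions rather than the paper's settings.

```python
# Sketch of a Worley-noise universal perturbation: per-pixel distance to
# the nearest random feature point, normalized and scaled to an
# L-infinity budget (simplified illustration, not the paper's code).
import numpy as np

def worley_noise(h, w, n_points=20, seed=0):
    rng = np.random.default_rng(seed)
    pts = rng.uniform(0, 1, size=(n_points, 2)) * [h, w]
    ys, xs = np.mgrid[0:h, 0:w]
    coords = np.stack([ys, xs], axis=-1).astype(np.float64)   # (h, w, 2)
    d = np.linalg.norm(coords[:, :, None, :] - pts[None, None], axis=-1)
    noise = d.min(axis=-1)                                    # nearest feature point
    return (noise - noise.min()) / (noise.max() - noise.min() + 1e-8)

def apply_uap(image, eps=8 / 255):
    noise = worley_noise(*image.shape[:2])
    perturbation = eps * (2 * noise[..., None] - 1)           # in [-eps, eps]
    return np.clip(image + perturbation, 0.0, 1.0)

img = np.random.rand(32, 32, 3)       # stand-in for a CIFAR-10 image
adv = apply_uap(img)
print(np.abs(adv - img).max() <= 8 / 255 + 1e-9)  # True: budget respected
```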

15.
In this work, we address the challenging case of answering count queries in web search, such as “number of songs by John Lennon”. Prior methods merely answer these with a single, sometimes puzzling, number or return a ranked list of text snippets with different numbers. This paper proposes a methodology for answering count queries with inference, contextualization, and explanatory evidence. Unlike previous systems, our method infers final answers from multiple observations, supports semantic qualifiers for the counts, and provides evidence by enumerating representative instances. Experiments with a wide variety of queries, including an existing benchmark, show the benefits of our method and the influence of specific parameter settings. Our code, data, and an interactive system demonstration are publicly available at https://github.com/ghoshs/CoQEx and https://nlcounqer.mpi-inf.mpg.de/.

16.
17.
In the image fusion field, the design of deep-learning-based fusion methods is far from routine: it is invariably fusion-task specific and requires careful consideration. The most difficult part of the design is choosing an appropriate strategy to generate the fused image for the specific task at hand. Thus, devising a learnable fusion strategy is a very challenging problem in the image fusion community. To address this problem, a novel end-to-end fusion network architecture (RFN-Nest) is developed for infrared and visible image fusion. We propose a residual fusion network (RFN), based on a residual architecture, to replace the traditional fusion approach. A novel detail-preserving loss function and a feature-enhancing loss function are proposed to train the RFN. The fusion model is learned with a novel two-stage training strategy. In the first stage, we train an auto-encoder based on an innovative nest connection (Nest) concept. Next, the RFN is trained using the proposed loss functions. Experimental results on public-domain datasets show that our end-to-end fusion network delivers better performance than state-of-the-art methods in both subjective and objective evaluation. The code of our fusion method is available at https://github.com/hli1221/imagefusion-rfn-nest.
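The following sketch conveys the flavor of a learnable residual fusion block operating on encoder features from the two modalities; the layer layout is assumed and is not the released RFN.

```python
# Sketch of a learnable residual fusion block in the spirit of RFN: deep
# features from the two sources are fused by convolutions with a residual
# shortcut, replacing a hand-crafted fusion rule (illustrative only).
import torch
import torch.nn as nn

class ResidualFusionBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.shortcut = nn.Conv2d(2 * channels, channels, 1)   # residual path
        self.body = nn.Sequential(
            nn.Conv2d(2 * channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )

    def forward(self, feat_ir, feat_vis):
        x = torch.cat([feat_ir, feat_vis], dim=1)
        return self.body(x) + self.shortcut(x)   # learned fusion, not a fixed rule

ir = torch.rand(1, 64, 32, 32)    # encoder features of the infrared image
vis = torch.rand(1, 64, 32, 32)   # encoder features of the visible image
print(ResidualFusionBlock(64)(ir, vis).shape)
```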

18.
Tremendous advances in different areas of knowledge are producing vast volumes of data, a quantity so large that it has made the development of new computational algorithms necessary. Among these are machine learning models and specific data mining techniques that can be useful in any area of knowledge. Computational tools for data analysis are increasingly required, given the need to extract meaningful information from such large volumes of data. However, there are no free-access libraries, modules, or web services that offer a broad array of analytical techniques in a user-friendly environment for non-specialist users. Those that exist raise high usability barriers for the untrained, as they usually have specific installation requirements, demand in-depth programming knowledge, or may be expensive. As an alternative, we have developed DMAKit, a user-friendly web platform powered by DMAKit-lib, a new library implemented in Python, which facilitates the analysis of data of different kinds and origins. Our tool implements a wide array of state-of-the-art data mining and pattern recognition techniques, allowing the user to quickly build classification, prediction, or clustering models, perform statistical evaluation, and analyze features of different attributes in diverse datasets without requiring any programming knowledge. DMAKit is especially useful for users who have large volumes of data to analyze but lack the informatics, mathematical, or statistical background to implement models. We expect this platform to let anyone interested extract information and analyze patterns through data mining techniques with no specific knowledge required. In particular, we present several case studies in biology, biotechnology, and biomedicine that highlight the applicability of our tool in easing the labor of non-specialist users applying data analysis and pattern recognition techniques. DMAKit is available for non-commercial use as an open-access library, licensed under the GNU General Public License, version 3.0 (GPL-3.0). The web platform is publicly available at https://pesb2.cl/dmakitWeb. Demonstrative and tutorial videos for the web platform are available at https://pesb2.cl/dmakittutorials/. Complete URLs for relevant content are listed in the Data Availability section.

19.
This paper presents a foreground extraction method for live-streaming videos using the dual-side cameras on mobile devices. Compared to conventional methods, which estimate both the foreground and background models from the front camera, the proposed method uses the rear camera to infer the reference background model. To this end, a short-term trajectory analysis is first performed to cluster point trajectories from the front camera, and then a long-term trajectory analysis compares the paths of the clustered trajectories with the reference path obtained from the rear camera. In particular, clusters with high correlation are classified as background using a Gaussian mixture model. Additionally, a pixel-wise segmentation map is obtained via graph-based segmentation. Experimental results show that the proposed method is robust under a variety of camera motions, outperforming state-of-the-art methods. Code and dataset can be found at https://github.com/YCL92/dualCamSeg.
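As an illustration of the background/foreground decision, the sketch below fits a two-component Gaussian mixture to synthetic cluster-to-reference correlation scores and labels the high-correlation component as background; the scores and model choice are assumptions for demonstration.

```python
# Sketch of classifying trajectory clusters as background via a Gaussian
# mixture over their correlation with the rear-camera reference path
# (synthetic scores; not the authors' implementation).
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Correlation of each front-camera trajectory cluster with the reference path.
correlations = np.concatenate([
    rng.normal(0.9, 0.05, 30),   # background-like clusters follow camera motion
    rng.normal(0.2, 0.10, 10),   # foreground-like clusters move independently
])
correlations = np.clip(correlations, -1.0, 1.0).reshape(-1, 1)

gmm = GaussianMixture(n_components=2, random_state=0).fit(correlations)
labels = gmm.predict(correlations)
background = labels == np.argmax(gmm.means_.ravel())  # high-correlation component
print(background.sum(), "clusters classified as background")
```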

20.
In the conceptual design stage, designers usually initiate a design concept through an association activity. This activity helps designers collect and retrieve reference information on the current design subject instead of starting from scratch. By modifying previous designs, designers can create a new design in a much shorter time. To computerize this process, this paper proposes an intelligent design retrieval system involving soft computing techniques for both feature and object association functions. A feature association method that utilizes fuzzy relations and fuzzy composition is developed to increase the search spectrum. Meanwhile, object association functions composed by a fuzzy neural network allow designers to control the similarity of retrieved designs. Our implementation shows that the intelligent design retrieval system, with its two soft-computing-based association functions, can retrieve target reference designs as expected.
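Fuzzy composition here presumably refers to the classic max-min composition of fuzzy relations; the sketch below shows that operation on made-up membership matrices, purely as a textbook illustration.

```python
# Max-min composition of fuzzy relations (textbook illustration with
# invented membership values, not the paper's actual relation matrices).
import numpy as np

def max_min_composition(R, S):
    """Compose fuzzy relations R: X->Y and S: Y->Z into one X->Z relation."""
    # (R o S)(x, z) = max over y of min(R(x, y), S(y, z))
    return np.max(np.minimum(R[:, :, None], S[None, :, :]), axis=1)

# Example: feature-to-style relation composed with style-to-design relation.
R = np.array([[0.8, 0.3],       # how strongly each query feature suggests a style
              [0.4, 0.9]])
S = np.array([[0.7, 0.2, 0.5],  # how strongly each style matches stored designs
              [0.1, 0.6, 0.9]])
print(max_min_composition(R, S))  # fuzzy match degrees of designs to features
```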
