20 similar documents found (search time: 22 ms)
1.
Multimedia Tools and Applications - In image captioning, exploring advanced semantic concepts is very important for boosting captioning performance. Although much progress has been made in this...
3.
In this work, we present a novel multi-scale feature fusion network (M-FFN) for the image captioning task that incorporates discriminative features and scene contextual information of an image. We construct the multi-scale feature fusion network by combining spatial transformation and multi-scale feature pyramid networks through a feature fusion block, enriching spatial and global semantic information. In particular, we use the multi-scale feature pyramid network to incorporate global contextual information by employing atrous convolutions on the top layers of a convolutional neural network (CNN). The spatial transformation network is applied to the early layers of the CNN to remove intra-class variability caused by spatial transformations. Further, the feature fusion block integrates both global contextual information and spatial features to encode the visual information of an input image. Moreover, a spatial-semantic attention module is incorporated to learn attentive contextual features that guide the captioning module. The efficacy of the proposed model is evaluated on the COCO dataset.
4.
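The absolute position used by attention-based recommendation models during feature extraction is static, isolated information. To overcome this shortcoming, a recommendation model with a relative-position attention mechanism built on a translation architecture is proposed. The user's historical behaviors are ordered chronologically and relative position representations are constructed; these representations are added both when computing the attention weights and in the output. The attention encoder and decoder layers are deepened, and average attention is used for preprocessing. Experimental results show that, compared with attention-based models, the proposed model better captures dynamic changes in user preferences, mines deeper information, and is better suited to long sequences.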
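The idea of adding relative position representations to both the attention weights and the output can be sketched as follows. This is a minimal illustration, not the model from the abstract: it assumes the representations are added to the keys (for the weights) and to the values (for the output), that offsets are clipped to a window `max_dist`, and that all names (`relative_attention`, `rel_k`, `rel_v`) are hypothetical.

```python
import math

def clip(x, k):
    """Clip an offset to the window [-k, k]."""
    return max(-k, min(k, x))

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def relative_attention(q, k, v, rel_k, rel_v, max_dist):
    """Single-head self-attention with relative position representations.

    q, k, v: lists of n vectors of dimension d (one per behavior, in time order).
    rel_k, rel_v: tables of 2 * max_dist + 1 vectors of dimension d, indexed by
    the clipped offset (j - i) + max_dist. Both tables are assumptions here.
    """
    n, d = len(q), len(q[0])
    out = []
    for i in range(n):
        scores = []
        for j in range(n):
            r = clip(j - i, max_dist) + max_dist
            # Relative position enters the attention weight through the key.
            key = [kj + rk for kj, rk in zip(k[j], rel_k[r])]
            scores.append(sum(a * b for a, b in zip(q[i], key)) / math.sqrt(d))
        weights = softmax(scores)
        row = [0.0] * d
        for j in range(n):
            r = clip(j - i, max_dist) + max_dist
            # Relative position enters the output through the value.
            for c in range(d):
                row[c] += weights[j] * (v[j][c] + rel_v[r][c])
        out.append(row)
    return out
```

With both tables set to zero vectors, the function reduces to ordinary scaled dot-product attention, which makes the effect of the relative terms easy to isolate.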
6.
To cope with the ambiguity of spatial relative position concepts, we propose a new definition of the relative position between two objects in a fuzzy set framework. This definition is based on a morphological and fuzzy pattern-matching approach, and consists of comparing an object to a fuzzy landscape representing the degree of satisfaction of a directional relationship to a reference object. It has good formal properties, is flexible, fits intuition, and can be used for structural pattern recognition under imprecision. Moreover, it also applies in 3D and to fuzzy objects derived from images.
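A fuzzy landscape of this kind can be illustrated with a small sketch. The abstract does not give the formula, so the code below assumes one common formulation: the membership of a point decreases linearly with the smallest angle between the chosen direction and the vectors joining reference points to that point. The function names, the Cartesian (x, y) convention, and the min/mean/max summary are all assumptions made for illustration.

```python
import math

def direction_membership(p, reference, alpha):
    """Degree to which point p lies in direction alpha (radians) of the reference.

    Assumed form: 1 when the smallest angle is 0, falling linearly to 0 at pi/2.
    Points are (x, y) tuples; reference is a set of such tuples.
    """
    if p in reference:
        return 1.0
    ux, uy = math.cos(alpha), math.sin(alpha)
    best = math.pi
    for qx, qy in reference:
        dx, dy = p[0] - qx, p[1] - qy
        cos_angle = (dx * ux + dy * uy) / math.hypot(dx, dy)
        cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding
        best = min(best, math.acos(cos_angle))
    return max(0.0, 1.0 - 2.0 * best / math.pi)

def directional_degrees(target, reference, alpha):
    """Compare a target object to the landscape: (min, mean, max) membership."""
    values = [direction_membership(p, reference, alpha) for p in target]
    return min(values), sum(values) / len(values), max(values)
```

For example, `directional_degrees([(2, 0), (2, 2)], {(0, 0)}, 0.0)` evaluates how well a two-point object satisfies "to the right of" a single reference point; the interval between the minimum and maximum reflects the imprecision of the relation.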
7.
The socio-economic development of the World Wide Web gathered considerable momentum through Web 2.0, and the Web of Data is currently adding a further technological driver. The tremendous growth of media data combined with structured (linked) data promises further opportunities for digital marketplaces. Although the integration of media content with linked data is only beginning, there are already working groups and projects addressing this issue. We show how media and existing data sets can be seamlessly integrated, enabling an extended user experience when interacting with media content on the web. We focus on automatic semantic enhancement services that can link arbitrary, openly accessible data, and we introduce opportunities for media annotation, fragmentation and presentation. Our use case scenario is based on the Red Bull Content Pool, a media management system for videos, images and articles about Red Bull related content covering a multitude of sports events.
8.
Neural Computing and Applications - The automatic narration of a natural scene is an important trait in artificial intelligence that unites computer vision and natural language processing. Caption...
10.
The impetus behind Semantic Web research remains the vision of supplementing availability with utility; that is, the World Wide Web provides availability of digital media, but the Semantic Web will allow presently available digital media to be used in unseen ways. An example of such an application is multimedia retrieval. At present, there are vast amounts of digital media available on the web. Once this media gets associated with machine-understandable metadata, the web can serve as a potentially unlimited supplier for multimedia web services, which could populate themselves by searching for keywords and subsequently retrieving images or articles, which is precisely the type of system that is proposed in this paper. Such a system requires solid interoperability, a central ontology, semantic agent search capabilities, and standards. Specifically, this paper explores this cross-section of image annotation and Semantic Web services, models the web service components that constitute such a system, discusses the sequential, cooperative execution of these Semantic Web services, and introduces intelligent storage of image semantics as part of a semantic link space.
11.
The paper presents a new adaptive full-reference method for measuring the quality of image enhancement algorithms. The method is based on the analysis of basic edges: sharp edges that are distant from other edges. The proposed basic edges metric calculates error values in two areas related to typical artifacts of image enhancement algorithms: the basic edges area and the basic edges neighborhood. The metric is illustrated with an application to image resampling and image deblurring, but it is also applicable to image deringing and image denoising.
12.
Recent photography techniques such as sculpting with light show great potential for compositing beautiful images from fixed-viewpoint photos under multiple illuminations. The process relies heavily on the artists’ experience and skills with the available tools. An apparent trend in recent works is to facilitate the interaction, making it less time-consuming and accessible not only to experts but also to novices. We propose a method that automatically creates enhanced light montages that are comparable to those produced by artists. It detects and emphasizes cues that are important for perception by introducing a technique to extract depth and shape edges from an unconstrained light stack. Studies show that these cues are associated with silhouettes and suggestive contours, which artists use to sketch and construct the layout of paintings. Textures, due to perspective distortion, offer essential cues that depict shape and surface slant. We balance the emphasis between depth edges and reflectance textures to enhance the sense of both shape and reflectance properties. Our light montage technique works perfectly with a few to hundreds of illuminations for each scene. Experiments show great results for static scenes, making it practical for small objects, interiors and small-scale outdoor scenes. Dynamic scenes may be captured using spatially distributed light setups such as light domes. The approach could also be applied to time-lapse photos, with the sun as the main light source.
13.
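To address the low contrast, dark appearance, blurred edges, and heavy noise of X-ray images, a new image enhancement method combining the wavelet transform with fuzzy theory is proposed. First, the X-ray image is decomposed by the wavelet transform into a low-frequency subband and high-frequency subbands. Next, the low-frequency subband, which carries the basic appearance of the image and most of its energy, is processed with a generalized fuzzy operator, which effectively raises image contrast and local brightness. The high-frequency subbands, which contain noise and detail, are denoised by soft thresholding, and a newly defined enhancement operator enhances detail while denoising. Finally, the processed subbands are reconstructed by the inverse wavelet transform. Experimental results show that the method effectively removes image noise, improves contrast and sharpness, and gives good visual results.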
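The decompose, threshold, and reconstruct pipeline described above can be sketched as follows. This is only an illustration under stated assumptions: it uses a one-level orthonormal Haar transform (the abstract does not name the wavelet), applies plain soft thresholding to the three detail subbands, and leaves the low-frequency subband untouched because the generalized fuzzy operator and the new enhancement operator are not specified. All function names are hypothetical.

```python
def soft_threshold(x, t):
    """Shrink x toward zero by t; values with |x| <= t become 0."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0

def haar2d(img):
    """One-level 2D Haar transform of an image with even height and width."""
    h, w = len(img) // 2, len(img[0]) // 2
    ll, d1, d2, d3 = ([[0.0] * w for _ in range(h)] for _ in range(4))
    for i in range(h):
        for j in range(w):
            a, b = img[2 * i][2 * j], img[2 * i][2 * j + 1]
            c, d = img[2 * i + 1][2 * j], img[2 * i + 1][2 * j + 1]
            ll[i][j] = (a + b + c + d) / 2.0  # low-frequency approximation
            d1[i][j] = (a - b + c - d) / 2.0  # difference across columns
            d2[i][j] = (a + b - c - d) / 2.0  # difference across rows
            d3[i][j] = (a - b - c + d) / 2.0  # diagonal difference
    return ll, d1, d2, d3

def inverse_haar2d(ll, d1, d2, d3):
    """Inverse of haar2d."""
    h, w = len(ll), len(ll[0])
    img = [[0.0] * (2 * w) for _ in range(2 * h)]
    for i in range(h):
        for j in range(w):
            s, x, y, z = ll[i][j], d1[i][j], d2[i][j], d3[i][j]
            img[2 * i][2 * j] = (s + x + y + z) / 2.0
            img[2 * i][2 * j + 1] = (s - x + y - z) / 2.0
            img[2 * i + 1][2 * j] = (s + x - y - z) / 2.0
            img[2 * i + 1][2 * j + 1] = (s - x - y + z) / 2.0
    return img

def denoise(img, threshold):
    """Soft-threshold the detail subbands and reconstruct the image."""
    ll, d1, d2, d3 = haar2d(img)

    def shrink(band):
        return [[soft_threshold(x, threshold) for x in row] for row in band]

    return inverse_haar2d(ll, shrink(d1), shrink(d2), shrink(d3))
```

A contrast-raising step for the low-frequency subband would slot in between `haar2d` and `inverse_haar2d`; it is omitted here rather than guessed.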
14.
The importance of describing relationships between objects has been highlighted in works in very different areas, including image understanding. Among these relationships, directional relative position relations are important since they provide important information about the spatial arrangement of objects in the scene. Such concepts are rather ambiguous and defy precise definitions, but human beings have a rather intuitive and common way of understanding and interpreting them. In this context, fuzzy methods are therefore appropriate for providing consistent definitions that integrate both quantitative and qualitative knowledge, thus providing a computational representation and interpretation of imprecise spatial relations expressed in a linguistic way. Several fuzzy approaches have been developed in the literature, and the aim of this paper is to review and compare them according to their properties and according to the types of questions they seek to answer.
15.
Journal of Real-Time Image Processing - Dynamic range compression has become an important function used in modern digital video cameras to improve the visual quality of color images suffering from low...
16.
In this survey, we argue that using structured vocabularies is crucial to the success of image annotation. We analyze the literature on image annotation uses and user needs, and we stress the need for automatic annotation. We briefly describe the difficulties this task poses for machines and how it relates to controlled vocabularies. We survey contributions in the field, showing how structures are introduced. First, we present studies that use unstructured vocabulary, focusing on those introducing links between categories or between features. Then we review work using structured vocabularies as an input and analyze how the structure is exploited.
17.
In this work, a method to enhance images based on a new artificial life model is presented. The model is inspired by the behavior of a herbivorous organism that selects its food in a given environment. This organism travels through the image iteratively, selecting the most suitable food and eating parts of it in each iteration. The path that the organism follows through the image is defined by a priori knowledge about the environment and how it should move in it. Here, we model the control and perception centers of the organism, as well as its actions and their effects on the environment. To demonstrate the efficiency of our method, quantitative and qualitative results of the enhancement of synthetic and real images with low contrast and different levels of noise are presented. The results obtained confirm the ability of the new artificial life model to improve the contrast of the objects in the input images.
18.
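To address the difficulty of choosing the order of the fractional differential in image processing, the paper first analyzes two properties: within a certain range, the enhancement effect of the fractional differential increases with its order; and the higher the average brightness of an image, the larger the just-noticeable luminance difference. Based on the global gray-level distribution and the local gray values of the image, an adaptive piecewise function is then constructed to determine the fractional order. Experimental results show that the method automatically finds the best differential order: the enhanced images show a clear visual improvement, with visual quality close to or exceeding that obtained at the optimal order, and contrast clearly higher than that obtained at the optimal order.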
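To make the role of the differential order concrete, the sketch below computes the coefficients of a fractional difference under the Grünwald-Letnikov definition, which is an assumption here since the abstract does not state which definition is used. The adaptive piecewise function that selects the order is not specified in the abstract, so `order` is simply passed in; the function names are hypothetical.

```python
def gl_coefficients(order, n_terms):
    """First n_terms Grünwald-Letnikov coefficients for a given order.

    Uses the recurrence w_0 = 1, w_k = w_{k-1} * (k - 1 - order) / k,
    which yields 1, -order, order * (order - 1) / 2, ...
    """
    coeffs = [1.0]
    for k in range(1, n_terms):
        coeffs.append(coeffs[-1] * (k - 1 - order) / k)
    return coeffs

def fractional_difference(signal, order, n_terms=3):
    """Apply the truncated fractional difference along a 1D row of gray values.

    Samples before the start of the signal are treated as absent.
    """
    coeffs = gl_coefficients(order, n_terms)
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, c in enumerate(coeffs):
            if i - k >= 0:
                acc += c * signal[i - k]
        out.append(acc)
    return out
```

With `order=1.0` the coefficients reduce to the ordinary first difference, and with `order=0.0` the signal is returned unchanged; intermediate orders interpolate between the two, which is why selecting the order matters.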
19.
The modified slant and modified slant Haar transforms have been proved efficient for image bandwidth compression. At times, it may be desirable to use existing hardware and software facilities for different purposes. This paper investigates the utility of the modified slant and modified slant Haar transforms for dynamic range reduction and image enhancement. Both linear and nonlinear filtering techniques are used, and the results are interpreted.