Similar Documents
20 similar documents found (search time: 19 ms).
1.
Automatic affect recognition in real-world environments is an important task towards natural interaction between humans and machines. In recent years, several advances have been made in determining emotional states with the use of Deep Neural Networks (DNNs). In this paper, we propose an emotion recognition system that utilizes the raw text, audio and visual information in an end-to-end manner. To capture the emotional states of a person, robust features need to be extracted from the various modalities. To this end, we utilize Convolutional Neural Networks (CNNs) and propose a novel transformer-based architecture for the text modality that can robustly capture the semantics of sentences. We develop an audio model to process the audio channel, and adopt a variation of a high-resolution network (HRNet) to process the visual modality. To fuse the modality-specific features, we propose novel attention-based methods. To capture the temporal dynamics in the signal, we utilize Long Short-Term Memory (LSTM) networks. Our model is trained on the SEWA dataset of the AVEC 2017 research sub-challenge on emotion recognition, and produces state-of-the-art results in the text, visual and multimodal domains, and comparable performance in the audio case, when compared with the winning papers of the challenge, which use several hand-crafted and DNN features. Code is available at: https://github.com/glam-imperial/multimodal-affect-recognition.
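To make the attention-based fusion and temporal modelling concrete, the following is a minimal PyTorch sketch; the feature dimensions, the scalar per-modality scoring, and the two-output regression head are illustrative assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    """Minimal sketch: score each modality embedding, softmax over modalities,
    fuse by weighted sum, then model temporal dynamics with an LSTM.
    Dimensions and the scoring scheme are illustrative assumptions."""
    def __init__(self, dim=256, hidden=128, n_outputs=2):
        super().__init__()
        self.score = nn.Linear(dim, 1)                 # one attention score per modality
        self.lstm = nn.LSTM(dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_outputs)       # e.g. arousal/valence

    def forward(self, text_feat, audio_feat, visual_feat):
        # each input: (batch, time, dim) features from modality-specific encoders
        stacked = torch.stack([text_feat, audio_feat, visual_feat], dim=2)  # (B, T, 3, D)
        weights = torch.softmax(self.score(stacked), dim=2)                 # (B, T, 3, 1)
        fused = (weights * stacked).sum(dim=2)                              # (B, T, D)
        out, _ = self.lstm(fused)
        return self.head(out)                                               # (B, T, n_outputs)

# toy usage with random features standing in for CNN/transformer encoders
B, T, D = 4, 10, 256
model = AttentionFusion()
pred = model(torch.randn(B, T, D), torch.randn(B, T, D), torch.randn(B, T, D))
print(pred.shape)  # torch.Size([4, 10, 2])
```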

2.
With recent advances in Earth observation techniques, the availability of multi-sensor data acquired over the same geographical area has increased greatly, which makes it possible to jointly depict the underlying land-cover phenomenon using different sensor data. In this paper, a novel multi-attentive hierarchical fusion net (MAHiDFNet) is proposed to realize feature-level fusion and classification of hyperspectral imagery (HSI) with Light Detection and Ranging (LiDAR) data. More specifically, a triple-branch HSI-LiDAR Convolutional Neural Network (CNN) backbone is first developed to simultaneously extract the spatial, spectral and elevation features of the land-cover objects. On this basis, a hierarchical fusion strategy is adopted to fuse the resulting feature embeddings. In the shallow feature fusion stage, we propose a novel modality attention (MA) module to generate modality-integrated features. By fully considering the correlation and heterogeneity between different sensor data, feature interaction and integration is realized by the proposed MA module. At the same time, self-attention modules are adopted to highlight the modality-specific features. In the deep feature fusion stage, the obtained modality-specific features and modality-integrated features are fused to construct the hierarchical feature fusion framework. Experiments on three real HSI-LiDAR datasets demonstrate the effectiveness of the proposed framework. The code will be made public at https://github.com/SYFYN0317/-MAHiDFNet.
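A minimal PyTorch sketch of a modality-attention style gate over the three branch features is shown below; the pooled feature size and the softmax gating form are assumptions for illustration, not the actual MA module.

```python
import torch
import torch.nn as nn

class ModalityAttention(nn.Module):
    """Sketch of a modality-attention style fusion: learn per-modality weights
    from the concatenated spatial/spectral/elevation embeddings and return a
    modality-integrated feature. Channel sizes and gating form are assumptions."""
    def __init__(self, dim=64, n_modalities=3):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Linear(dim * n_modalities, n_modalities),
            nn.Softmax(dim=-1),
        )

    def forward(self, spatial, spectral, elevation):
        # each input: (batch, dim) pooled features from one CNN branch
        feats = torch.stack([spatial, spectral, elevation], dim=1)   # (B, 3, dim)
        weights = self.gate(feats.flatten(1)).unsqueeze(-1)          # (B, 3, 1)
        integrated = (weights * feats).sum(dim=1)                    # (B, dim)
        return integrated

fused = ModalityAttention()(torch.randn(8, 64), torch.randn(8, 64), torch.randn(8, 64))
print(fused.shape)  # torch.Size([8, 64])
```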

3.
The International Society for the Study of Vascular Anomalies (ISSVA) provides a classification for vascular anomalies that enables specialists to unambiguously classify diagnoses. This classification is only available in PDF format and is not machine-readable, nor does it provide unique identifiers that allow for structured registration. In this paper, we describe the process of transforming the ISSVA classification into an ontology. We also describe the structure of this ontology, as well as two applications of the ontology using examples from the domain of rare-disease research. We drew on the expertise of an ontology expert and a clinician during the development process. We semi-automatically added mappings to relevant external ontologies, using automated ontology-matching systems and manual assessment by experts. The ISSVA ontology should contribute to making data for vascular anomaly research more Findable, Accessible, Interoperable, and Reusable (FAIR). The ontology is available at https://bioportal.bioontology.org/ontologies/ISSVA.
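As a rough illustration of what an ontology term with an external mapping can look like, here is a small rdflib sketch; the base IRI, class name and mapped external IRI are hypothetical and chosen only for the example.

```python
from rdflib import Graph, Namespace, RDF, RDFS, Literal, URIRef
from rdflib.namespace import OWL, SKOS

g = Graph()
ISSVA = Namespace("http://example.org/issva#")   # hypothetical base IRI, not the published one
g.bind("issva", ISSVA)
g.bind("skos", SKOS)

# hypothetical class with a label and a mapping to an external ontology term
cls = ISSVA["InfantileHemangioma"]
g.add((cls, RDF.type, OWL.Class))
g.add((cls, RDFS.label, Literal("Infantile hemangioma", lang="en")))
g.add((cls, SKOS.exactMatch, URIRef("http://purl.obolibrary.org/obo/HP_0001028")))

print(g.serialize(format="turtle"))
```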

4.
As a special group, visually impaired people (VIP) find it difficult to access and use visual information in the same way as sighted individuals. In recent years, benefiting from the development of computer hardware and deep learning techniques, significant progress has been made in assisting VIP with visual perception. However, most existing datasets are annotated in a single scenario and lack sufficient annotations for diverse obstacles, so they cannot meet the realistic needs of VIP. To address this issue, we propose a new dataset called Walk On The Road (WOTR), which has nearly 190 K objects, with approximately 13.6 objects per image. Specifically, WOTR contains 15 categories of common obstacles and 5 categories of road-judging objects, covering multiple scenarios of walking on sidewalks, tactile pavings, crossings, and other locations. Additionally, we offer a series of baselines by training several advanced object detectors on WOTR. Furthermore, we propose a simple but effective PC-YOLO that obtains excellent detection results on the WOTR and PASCAL VOC datasets. The WOTR dataset is available at https://github.com/kxzr/WOTR.
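Assuming the annotations follow a PASCAL VOC-style XML layout (an assumption; the actual WOTR format may differ), a short sketch for tallying objects per image and per category could look like this:

```python
import xml.etree.ElementTree as ET
from collections import Counter
from pathlib import Path

def dataset_stats(annotation_dir):
    """Count objects per image and per category from VOC-style XML files.
    The directory layout and XML schema are assumptions, not the WOTR spec."""
    per_image, per_category = [], Counter()
    for xml_file in Path(annotation_dir).glob("*.xml"):
        root = ET.parse(xml_file).getroot()
        names = [obj.findtext("name") for obj in root.findall("object")]
        per_image.append(len(names))
        per_category.update(names)
    avg = sum(per_image) / len(per_image) if per_image else 0.0
    return avg, per_category

if __name__ == "__main__":
    avg_objects, counts = dataset_stats("WOTR/Annotations")  # hypothetical path
    print(f"average objects per image: {avg_objects:.1f}")
    print(counts.most_common(5))
```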

5.
The Internet of Musical Things (IoMusT) refers to the extension of the Internet of Things paradigm to the musical domain. Interoperability is a central issue within this domain, where heterogeneous Musical Things serving radically different purposes are envisioned to communicate with each other. Automatic discovery of resources is also a desirable feature in IoMusT ecosystems. However, existing musical protocols are not adequate to support discoverability and interoperability across the wide heterogeneity of Musical Things: they are typically inflexible, lack high resolution, and are not equipped with inference mechanisms that could exploit, on board, information about the whole application environment. Besides, they hardly ever support easy integration with the Web. In addition, IoMusT applications are often characterized by strict requirements on the latency of the exchanged messages. Semantic Web of Things technologies have the potential to overcome the limitations of existing musical protocols by enabling discoverability and interoperability across heterogeneous Musical Things. In this paper, we propose the Musical Semantic Event Processing Architecture (MUSEPA), a semantically based architecture designed to meet the IoMusT requirements of low-latency communication, discoverability, interoperability, and automatic inference. The architecture is based on the CoAP protocol, a semantic publish/subscribe broker, and the adoption of shared ontologies for describing Musical Things and their interactions. The code implementing MUSEPA can be accessed at: https://github.com/CIMIL/MUSEPA/.
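A minimal sketch of a Musical Thing pushing a semantic self-description to a CoAP endpoint with aiocoap is shown below; the broker address, resource path and Turtle payload are illustrative assumptions rather than MUSEPA's actual registration interface.

```python
import asyncio
from aiocoap import Context, Message, Code

async def announce_musical_thing():
    """Sketch: a Musical Thing publishes an RDF description of itself over CoAP.
    The endpoint URI and Turtle payload below are hypothetical."""
    payload = b"""
    @prefix ex: <http://example.org/iomust#> .
    ex:smartGuitar1 a ex:MusicalThing ;
        ex:providesResource ex:pitchStream .
    """
    context = await Context.create_client_context()
    request = Message(code=Code.POST,
                      uri="coap://broker.local/sparql/update",  # hypothetical broker endpoint
                      payload=payload)
    response = await context.request(request).response
    print("broker replied:", response.code)

if __name__ == "__main__":
    asyncio.run(announce_musical_thing())
```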

6.
Infrared and visible image fusion aims to synthesize a single fused image containing salient targets and abundant texture details, even under extreme illumination conditions. However, existing image fusion algorithms fail to take the illumination factor into account in the modeling process. In this paper, we propose an illumination-aware progressive image fusion network, termed PIAFusion, which adaptively maintains the intensity distribution of salient targets and preserves texture information in the background. Specifically, we design an illumination-aware sub-network to estimate the illumination distribution and calculate the illumination probability. Moreover, we utilize the illumination probability to construct an illumination-aware loss to guide the training of the fusion network. The cross-modality differential-aware fusion module and the halfway fusion strategy completely integrate common and complementary information under the constraint of the illumination-aware loss. In addition, a new benchmark dataset for infrared and visible image fusion, i.e., Multi-Spectral Road Scenarios (available at https://github.com/Linfeng-Tang/MSRS), is released to support network training and comprehensive evaluation. Extensive experiments demonstrate the superiority of our method over state-of-the-art alternatives in terms of target maintenance and texture preservation. In particular, our progressive fusion framework can integrate meaningful information from source images round the clock, according to illumination conditions. Furthermore, an application to semantic segmentation demonstrates the potential of PIAFusion for high-level vision tasks. Our code will be available at https://github.com/Linfeng-Tang/PIAFusion.
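One plausible form of such an illumination-aware intensity loss is sketched below in PyTorch; it weights per-image L1 terms by the illumination probability and is an assumption for illustration, not necessarily PIAFusion's exact formulation.

```python
import torch

def illumination_aware_loss(fused, visible, infrared, p_day):
    """Illustrative illumination-aware intensity loss (not necessarily PIAFusion's
    exact definition): under daytime illumination the fused image is pulled toward
    the visible intensities, at night toward the infrared intensities.
    fused, visible, infrared: (B, C, H, W); p_day: (B,) illumination probability in [0, 1]."""
    day_term = torch.mean(torch.abs(fused - visible), dim=(1, 2, 3))     # per-image L1, shape (B,)
    night_term = torch.mean(torch.abs(fused - infrared), dim=(1, 2, 3))
    return torch.mean(p_day * day_term + (1 - p_day) * night_term)

# toy usage
B, C, H, W = 2, 1, 64, 64
loss = illumination_aware_loss(torch.rand(B, C, H, W), torch.rand(B, C, H, W),
                               torch.rand(B, C, H, W), torch.tensor([0.9, 0.2]))
print(loss.item())
```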

7.
Tremendous advances in different areas of knowledge are producing vast volumes of data, a quantity so large that it has made the development of new computational algorithms necessary. Among the algorithms developed, we find Machine Learning models and specific data mining techniques that might be useful for all areas of knowledge. The use of computational tools for data analysis is increasingly required, given the need to extract meaningful information from such large volumes of data. However, there are no free-access libraries, modules, or web services that comprise a vast array of analytical techniques in a user-friendly environment for non-specialist users. Those that exist raise high usability barriers for the untrained, as they usually have specific installation requirements, require in-depth programming knowledge, or may be expensive. As an alternative, we have developed DMAKit, a user-friendly web platform powered by DMAKit-lib, a new library implemented in Python, which facilitates the analysis of data of different kinds and origins. Our tool implements a wide array of state-of-the-art data mining and pattern recognition techniques, allowing the user to quickly implement classification, prediction or clustering models, statistical evaluation, and feature analysis of different attributes in diverse datasets, without requiring any specific programming knowledge. DMAKit is especially useful for users who have large volumes of data to be analyzed but do not have the informatics, mathematical, or statistical knowledge to implement models. We expect this platform to provide a way to extract information and analyze patterns through data mining techniques for anyone interested in applying them, with no specific knowledge required. In particular, we present several case studies in the areas of biology, biotechnology, and biomedicine, where we highlight the applicability of our tool in easing the labor of non-specialist users applying data analysis and pattern recognition techniques. DMAKit is available for non-commercial use as an open-access library, licensed under the GNU General Public License, version GPL 3.0. The web platform is publicly available at https://pesb2.cl/dmakitWeb. Demonstration and tutorial videos for the web platform are available at https://pesb2.cl/dmakittutorials/. Complete URLs for relevant content are listed in the Data Availability section.

8.
9.
In this work, we address the challenging case of answering count queries in web search, such as the number of songs by John Lennon. Prior methods merely answer these with a single, sometimes puzzling, number, or return a ranked list of text snippets with different numbers. This paper proposes a methodology for answering count queries with inference, contextualization and explanatory evidence. Unlike previous systems, our method infers final answers from multiple observations, supports semantic qualifiers for the counts, and provides evidence by enumerating representative instances. Experiments with a wide variety of queries, including existing benchmarks, show the benefits of our method and the influence of specific parameter settings. Our code, data and an interactive system demonstration are publicly available at https://github.com/ghoshs/CoQEx and https://nlcounqer.mpi-inf.mpg.de/.
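A toy sketch of the inference step, aggregating several noisy extracted counts into one answer via a confidence-weighted median, is given below; it illustrates the general idea rather than CoQEx's actual procedure.

```python
import numpy as np

def aggregate_count(candidates):
    """Infer a single count from multiple noisy observations via a
    confidence-weighted median. Illustrative only, not the CoQEx algorithm.
    candidates: list of (count, confidence) pairs extracted from snippets."""
    counts = np.array([c for c, _ in candidates], dtype=float)
    weights = np.array([w for _, w in candidates], dtype=float)
    order = np.argsort(counts)
    counts, weights = counts[order], weights[order]
    cumulative = np.cumsum(weights) / weights.sum()
    return counts[np.searchsorted(cumulative, 0.5)]

# e.g. snippets reporting different numbers with varying extraction confidence
print(aggregate_count([(150, 0.4), (229, 0.9), (180, 0.7), (230, 0.8)]))  # -> 229.0
```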

10.
Convolutional Neural Networks have dominated the field of computer vision for the last ten years, exhibiting extremely powerful feature extraction capabilities and outstanding classification performance. The main strategy to prolong this trend in the state-of-the-art literature relies on further upscaling networks in size. However, costs increase rapidly while performance improvements may be marginal. Our main hypothesis is that adding additional sources of information can help to increase performance, and that this approach is more cost-effective than building bigger networks, which involve higher training time, a larger parametrisation space and higher computational resource requirements. In this paper, an ensemble method for accurate image classification is proposed, fusing automatically detected features obtained through a Convolutional Neural Network with a set of manually defined statistical indicators. Through a combination of the predictions of a CNN and a secondary classifier trained on statistical features, a better classification performance can be achieved cheaply. We test five different CNN architectures and multiple learning algorithms on a diverse set of datasets to validate our proposal. According to the results, the inclusion of additional indicators and an ensemble classification approach helps to increase the performance on all datasets. Both code and datasets are publicly available via GitHub at: https://github.com/jahuerta92/cnn-prob-ensemble.
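A compact sketch of this kind of ensemble, combining (stand-in) CNN class probabilities with a secondary classifier trained on simple per-channel statistics, is shown below; the chosen indicators, the random-forest secondary classifier and the averaging weight are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def statistical_indicators(images):
    """Per-channel mean, std, min and max as simple hand-defined indicators
    (an illustrative choice, not the paper's exact feature set).
    images: (N, H, W, C) array in [0, 1]."""
    feats = [images.mean(axis=(1, 2)), images.std(axis=(1, 2)),
             images.min(axis=(1, 2)), images.max(axis=(1, 2))]
    return np.concatenate(feats, axis=1)

def ensemble_predict(cnn_probs, images, secondary, weight=0.5):
    """Average the CNN's class probabilities with those of a secondary
    classifier trained on the statistical indicators."""
    stat_probs = secondary.predict_proba(statistical_indicators(images))
    return (weight * cnn_probs + (1 - weight) * stat_probs).argmax(axis=1)

# toy usage with synthetic data; cnn_probs would come from a trained CNN
rng = np.random.default_rng(0)
train_imgs, train_y = rng.random((200, 32, 32, 3)), rng.integers(0, 3, 200)
test_imgs, cnn_probs = rng.random((20, 32, 32, 3)), rng.dirichlet(np.ones(3), 20)
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(statistical_indicators(train_imgs), train_y)
print(ensemble_predict(cnn_probs, test_imgs, clf))
```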

11.
Despite the tremendous achievements of deep convolutional neural networks (CNNs) in many computer vision tasks, understanding how they actually work remains a significant challenge. In this paper, we propose a novel two-step understanding method, namely the Salient Relevance (SR) map, which aims to shed light on how deep CNNs recognize images and learn features from attention areas. Our proposed method starts with a layer-wise relevance propagation (LRP) step, which estimates a pixel-wise relevance map over the input image. We then construct a context-aware saliency map, the SR map, from the LRP-generated map, which highlights areas close to the foci of attention instead of the isolated pixels that LRP reveals. In the human visual system, region-level information is more important than pixel-level information for recognition; consequently, our proposed approach more closely simulates human recognition. Experimental results on the ILSVRC2012 validation dataset, in conjunction with two well-established deep CNN models, AlexNet and VGG-16, clearly demonstrate that our proposed approach concisely identifies not only key pixels but also the attention areas that contribute to the underlying neural network's comprehension of the given images. As such, our proposed SR map constitutes a convenient visual interface that unveils the visual attention of the network and reveals which types of objects the model has learned to recognize after training. The source code is available at https://github.com/Hey1Li/Salient-Relevance-Propagation.
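The step from isolated relevant pixels to attention areas can be illustrated with the following sketch, which smooths a pixel-wise relevance map and keeps its strongest regions; the Gaussian kernel and quantile threshold are illustrative choices, not the paper's SR-map construction.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def region_saliency(relevance, sigma=8.0, keep=0.2):
    """Turn a pixel-wise relevance map (e.g. from LRP) into a region-level map:
    smooth it so isolated pixels merge into contiguous areas, then keep the top
    fraction. Kernel width and threshold are illustrative assumptions."""
    smoothed = gaussian_filter(relevance.astype(float), sigma=sigma)
    threshold = np.quantile(smoothed, 1.0 - keep)
    return np.where(smoothed >= threshold, smoothed, 0.0)

# toy usage: a noisy "relevance map" with one bright blob
relevance = np.random.rand(224, 224) * 0.1
relevance[80:120, 90:140] += 1.0
salient = region_saliency(relevance)
print(salient.shape, (salient > 0).mean())  # roughly the kept fraction
```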

12.
In the literature on classification problems, it is widely discussed how the presence of label noise can bring about severe degradation in performance. Several works have applied Prototype Selection techniques, Ensemble Methods, or both, in an attempt to alleviate this issue. Nevertheless, these methods are not always able to sufficiently counteract the effects of noise. In this work, we investigate the effects of noise on a particular class of Ensemble Methods, that of Dynamic Selection algorithms, and we are especially interested in the behavior of the Fire-DES++ algorithm, a state-of-the-art algorithm which applies the Edited Nearest Neighbors (ENN) algorithm to deal with the effects of noise and imbalance. We propose a method which employs multiple Dynamic Selection sets, based on the Bagging-IH algorithm, which we dub Multiple-Set Dynamic Selection (MSDS), in an attempt to supplant the ENN algorithm in the filtering step. We find that almost all methods based on Dynamic Selection are severely affected by the presence of label noise, with the exception of the K-Nearest Oracles-Union algorithm. We also find that our proposed method can alleviate the issues caused by noise in some scenarios. The code for our method is available at https://github.com/fnw/baggingds.
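For readers unfamiliar with Dynamic Selection, the sketch below shows a bare-bones local-accuracy selection loop over a bagged ensemble; it illustrates the setting only and is neither Fire-DES++ nor the proposed MSDS method.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import NearestNeighbors
from sklearn.tree import DecisionTreeClassifier

# Bare-bones dynamic selection: for each query, pick the bagged base classifier
# that is most accurate on the query's k nearest neighbours in a held-out
# dynamic-selection set (overall-local-accuracy style).
X, y = make_classification(n_samples=600, n_features=10, random_state=0)
X_train, X_dsel, y_train, y_dsel = train_test_split(X, y, test_size=0.3, random_state=0)

bag = BaggingClassifier(DecisionTreeClassifier(max_depth=3),
                        n_estimators=15, random_state=0).fit(X_train, y_train)
knn = NearestNeighbors(n_neighbors=7).fit(X_dsel)

def predict_dynamic(x):
    _, idx = knn.kneighbors(x.reshape(1, -1))
    region_X, region_y = X_dsel[idx[0]], y_dsel[idx[0]]
    accuracies = [est.score(region_X, region_y) for est in bag.estimators_]
    return bag.estimators_[int(np.argmax(accuracies))].predict(x.reshape(1, -1))[0]

print(predict_dynamic(X_dsel[0]), y_dsel[0])
```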

13.
This paper proposes a template-based approach to semi-automatically create contextualized learning tasks out of several sources from the Web of Data. The contextualization of learning tasks opens the possibility of bridging the formal learning that happens in a classroom and the informal learning that happens in other physical spaces, such as squares or historical buildings. The tasks created cover different cognitive levels and are contextualized by their location and the topics covered. We applied this approach to the domain of History of Art in the Spanish region of Castile and Leon. We gathered data from DBpedia, Wikidata and the Open Data published by the regional government, and applied 32 templates to obtain 16K learning tasks. An evaluation with 8 teachers shows that teachers would let their students carry out the tasks generated. Teachers also considered that 85% of the generated tasks are aligned with the content taught in the classroom and are relevant for learning in other, informal spaces. The tasks created are available at https://casuallearn.gsic.uva.es/sparql.
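The published endpoint can be queried directly, for example with SPARQLWrapper; since the abstract does not describe the vocabulary, the sketch below simply fetches a few arbitrary triples rather than assuming specific predicates.

```python
from SPARQLWrapper import SPARQLWrapper, JSON

# Minimal sketch of querying the public Casual Learn SPARQL endpoint.
sparql = SPARQLWrapper("https://casuallearn.gsic.uva.es/sparql")
sparql.setQuery("""
    SELECT ?s ?p ?o
    WHERE { ?s ?p ?o }
    LIMIT 10
""")
sparql.setReturnFormat(JSON)

for row in sparql.query().convert()["results"]["bindings"]:
    print(row["s"]["value"], row["p"]["value"], row["o"]["value"])
```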

14.
Recently, Graph Convolutional Networks (GCNs) and their variants have become popular for learning graph-related tasks, including link prediction, node classification, and node embedding, among many others. In the node classification problem, the input is a graph with some labeled nodes and the features associated with these nodes, and the objective is to predict the labels of the unlabeled nodes. While GCNs have been successfully applied to this problem, some caveats inherited from classical deep learning remain unsolved. One such inherited caveat is that, during classification, GCNs only consider nodes that are a few hops away from the labeled nodes. However, considering only nodes a few hops away cannot effectively exploit the underlying graph topological information. To remedy this problem, state-of-the-art methods leverage network diffusion approaches, such as personalized PageRank and its variants, to fully account for the graph topology. However, these approaches overlook the fact that network diffusion methods favour high-degree nodes in the graph, resulting in the propagation of labels to unlabeled hub nodes. To overcome this bias, in this paper we propose to utilize a dimensionality reduction technique that is conjugate with personalized PageRank. Testing on four real-world networks that are commonly used in benchmarking GCNs' performance on the node classification task, we systematically evaluate the proposed methodology and show that our approach outperforms existing methods for wide ranges of parameter values. Since our method requires only a few training epochs, it relieves the heavy training burden of GCNs. The source code of the proposed method is freely available at https://github.com/mustafaCoskunAgu/ScNP/blob/master/TRJMain.m.
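The diffusion step referred to above can be sketched with the closed-form personalized-PageRank propagation used by APPNP-style models; the sketch below shows only this propagation, not the paper's dimensionality-reduction step.

```python
import numpy as np

def ppr_propagation(adjacency, features, alpha=0.1):
    """Personalized-PageRank feature propagation (APPNP-style closed form):
    Z = alpha * (I - (1 - alpha) * A_hat)^(-1) @ H, where A_hat is the
    symmetrically normalized adjacency with self-loops."""
    n = adjacency.shape[0]
    a_tilde = adjacency + np.eye(n)                       # add self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(a_tilde.sum(axis=1)))
    a_hat = d_inv_sqrt @ a_tilde @ d_inv_sqrt             # D^-1/2 (A+I) D^-1/2
    diffusion = alpha * np.linalg.inv(np.eye(n) - (1 - alpha) * a_hat)
    return diffusion @ features

# toy 4-node path graph with 2-dimensional node features
A = np.array([[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]], dtype=float)
H = np.array([[1, 0], [0, 0], [0, 0], [0, 1]], dtype=float)
print(ppr_propagation(A, H).round(3))
```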

15.
In this paper, an unsupervised learning-based approach is presented for fusing bracketed exposures into high-quality images, avoiding the need for interim conversion to intermediate high dynamic range (HDR) images. An objective quality measure, the colored multi-exposure fusion structural similarity index measure (MEF-SSIMc), is optimized to update the network parameters, so unsupervised learning can be realized without any ground truth (GT) images. Furthermore, an unreferenced gradient fidelity term is added to the loss function to recover and supplement image information in the fused image. As shown in the experiments, the proposed algorithm performs well in terms of structure, texture, and color. In particular, it maintains the order of variations in the original image brightness and suppresses edge blurring and halo effects, and it produces good visual results backed by good quantitative evaluation indicators. Our code will be publicly available at https://github.com/cathying-cq/UMEF.
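One way such an unreferenced gradient-fidelity term can be written is sketched below: the fused image's Sobel gradient magnitude is pulled toward the strongest gradient across the exposure stack. The Sobel formulation and L1 penalty are assumptions for illustration, not necessarily the paper's definition.

```python
import torch
import torch.nn.functional as F

def gradient_fidelity_loss(fused, exposures):
    """Illustrative unreferenced gradient-fidelity term: match the fused image's
    gradient magnitude to the maximum gradient magnitude over the exposure stack.
    fused: (B, 1, H, W); exposures: (B, K, H, W) grayscale exposure stack."""
    sobel_x = torch.tensor([[[[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]]]])
    sobel_y = sobel_x.transpose(2, 3)

    def grad_mag(img):                      # img: (B, 1, H, W)
        gx = F.conv2d(img, sobel_x, padding=1)
        gy = F.conv2d(img, sobel_y, padding=1)
        return torch.sqrt(gx ** 2 + gy ** 2 + 1e-8)

    mags = [grad_mag(exposures[:, k:k + 1]) for k in range(exposures.shape[1])]
    target = torch.stack(mags, dim=0).max(dim=0).values
    return F.l1_loss(grad_mag(fused), target)

print(gradient_fidelity_loss(torch.rand(2, 1, 64, 64), torch.rand(2, 3, 64, 64)).item())
```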

16.
Curated collections of models are essential for the success of Machine Learning (ML) and Data Analytics in Model-Driven Engineering (MDE). However, current datasets are either too small or not properly curated. In this paper, we present ModelSet, a dataset composed of 5,466 Ecore models and 5,120 UML models which have been manually labelled to support ML tasks. We describe the structure of the dataset and explain how to use the associated library to develop ML applications in Python. Finally, we present some applications which can be addressed using ModelSet. Tool website: https://github.com/modelset

17.
External knowledge (a.k.a. side information) plays a critical role in zero-shot learning (ZSL), which aims to predict unseen classes that have never appeared in the training data. Several kinds of external knowledge, such as text and attributes, have been widely investigated, but on their own they are limited by incomplete semantics. Some very recent studies therefore propose to use Knowledge Graphs (KGs), due to their high expressivity and suitability for representing many kinds of knowledge. However, the ZSL community is still short of standard benchmarks for studying and comparing different external knowledge settings and different KG-based ZSL methods. In this paper, we propose six resources covering three tasks, i.e., zero-shot image classification (ZS-IMGC), zero-shot relation extraction (ZS-RE), and zero-shot KG completion (ZS-KGC). Each resource has a normal ZSL benchmark and a KG containing semantics ranging from text to attributes, and from relational knowledge to logical expressions. We clearly present these resources, including their construction, statistics, data formats and usage cases w.r.t. different ZSL methods. More importantly, we have conducted a comprehensive benchmarking study, with a few classic and state-of-the-art methods for each task, including a method with KG-augmented explanation. We discuss and compare different ZSL paradigms w.r.t. different external knowledge settings, and find that our resources have great potential for developing more advanced ZSL methods and more solutions for applying KGs to augment machine learning. All the resources are available at https://github.com/China-UK-ZSL/Resources_for_KZSL.
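To make the common ZSL setting concrete, the sketch below performs embedding-based zero-shot prediction by projecting visual features into a semantic space and picking the nearest unseen-class embedding; the linear projection is a stand-in for whatever compatibility model is learned on seen classes, not a specific method from the benchmark.

```python
import numpy as np

def zero_shot_predict(image_features, class_embeddings, projection):
    """Embedding-based ZSL sketch: project image features into the semantic space
    defined by external knowledge (attributes, text or KG embeddings) and pick
    the most similar unseen-class embedding (cosine similarity)."""
    projected = image_features @ projection                        # (N, d_sem)
    projected /= np.linalg.norm(projected, axis=1, keepdims=True)
    classes = class_embeddings / np.linalg.norm(class_embeddings, axis=1, keepdims=True)
    return (projected @ classes.T).argmax(axis=1)

rng = np.random.default_rng(0)
preds = zero_shot_predict(rng.random((5, 2048)),    # visual features
                          rng.random((10, 300)),    # embeddings of 10 unseen classes
                          rng.random((2048, 300)))  # learned projection (stand-in)
print(preds)
```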

18.
19.
Entity Resolution (ER) is the task of detecting different entity profiles that describe the same real-world objects. To facilitate its execution, we have developed JedAI, an open-source system that puts together a series of state-of-the-art ER techniques that have been proposed and examined independently, each targeting part of the end-to-end ER pipeline. This is a unique approach, as no other ER tool brings together so many established techniques; instead, most ER tools offer only a few techniques, primarily those developed by their creators. In addition to democratizing ER techniques, JedAI goes beyond other ER tools by offering a series of unique characteristics: (i) it allows for building and benchmarking millions of ER pipelines; (ii) it is the only ER system that applies seamlessly to any combination of structured and/or semi-structured data; (iii) it is the only ER system that runs seamlessly both on stand-alone computers and on clusters of computers, through the parallel implementation of all algorithms in Apache Spark; (iv) it supports two different end-to-end workflows for carrying out batch (i.e., budget-agnostic) ER, a schema-agnostic one based on blocks and a schema-based one relying on similarity joins; (v) it adapts both end-to-end workflows to budget-aware (i.e., progressive) ER. We present all features of JedAI in detail, stressing the core characteristics that enhance its usability and boost its versatility and effectiveness. We also compare it to the state of the art in the field, qualitatively and quantitatively, demonstrating its state-of-the-art performance over a variety of large-scale datasets from different domains. The central repository of the JedAI code base is at https://github.com/scify/JedAIToolkit. A video demonstrating the JedAI web application is available at https://www.youtube.com/watch?v=OJY1DUrUAe8.
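As an illustration of the schema-agnostic, block-based workflow style, the sketch below implements plain token blocking in Python; it is a generic example of the underlying technique, not JedAI's Java/Spark API.

```python
from collections import defaultdict
from itertools import combinations

def token_blocking(profiles):
    """Schema-agnostic token blocking: every token of every attribute value
    becomes a block, and profiles sharing a block become candidate pairs.
    A generic illustration of the technique, not JedAI's implementation."""
    blocks = defaultdict(set)
    for pid, attributes in profiles.items():
        for value in attributes.values():
            for token in str(value).lower().split():
                blocks[token.strip(".,")].add(pid)
    candidates = set()
    for ids in blocks.values():
        candidates.update(combinations(sorted(ids), 2))
    return candidates

profiles = {
    1: {"name": "J. Smith", "city": "Athens"},
    2: {"fullName": "John Smith", "location": "Athens, Greece"},
    3: {"name": "Jane Doe", "city": "Patras"},
}
print(token_blocking(profiles))  # {(1, 2)}: profiles 1 and 2 share 'smith' and 'athens'
```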

20.
We present a novel strategy for approximate furthest neighbor search that selects a set of candidate points using the data distribution. This strategy leads to an algorithm, which we call DrusillaSelect, that is able to outperform existing approximate furthest neighbor strategies. Our strategy is motivated by a study of the behavior of the furthest neighbor search problem, which has significantly different structure than the nearest neighbor search problem, and can be understood with the help of an information-theoretic hardness measure that we introduce. We also present a variant of the algorithm that gives an absolute approximation guarantee; under some assumptions, the guaranteed approximation can be achieved in provably less time than brute-force search. Performance studies indicate that DrusillaSelect can achieve comparable levels of approximation to other algorithms, even on the hardest datasets, while giving up to an order of magnitude speedup. An implementation is available in the mlpack machine learning library (found at http://www.mlpack.org).
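The candidate-set idea can be illustrated as follows; note that the selection rule used here (points farthest from the centroid) is a crude stand-in, not DrusillaSelect's actual heuristic, and the sketch is not the mlpack implementation.

```python
import numpy as np

def candidate_furthest_search(data, queries, n_candidates=50):
    """Approximate furthest-neighbour search via a pre-selected candidate set:
    pick a small set of points from the data (here, simply those farthest from
    the centroid, a crude stand-in for DrusillaSelect's selection rule) and
    answer each query by scanning only the candidates."""
    centroid = data.mean(axis=0)
    order = np.argsort(-np.linalg.norm(data - centroid, axis=1))
    candidates = data[order[:n_candidates]]
    # for each query, return the candidate at the largest distance
    dists = np.linalg.norm(queries[:, None, :] - candidates[None, :, :], axis=2)
    return candidates[dists.argmax(axis=1)]

rng = np.random.default_rng(0)
data, queries = rng.standard_normal((10_000, 8)), rng.standard_normal((3, 8))
print(candidate_furthest_search(data, queries).shape)  # (3, 8)
```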
