Similar literature
 Found 20 similar documents; search took 813 ms
1.
This paper describes our bio‐event extraction system developed for the BioNLP 2009 Shared Task 2, with focus on its capability of extracting secondary biological event arguments from literature. Shared Task 2 is particularly interesting because when browsing literature, biologists often need to understand conditions surrounding biological events, which are usually expressed by secondary event arguments (e.g., binding sites). To achieve our goal, we take an approach that extracts n‐ary relations from text using event extraction constraints automatically generated from a training corpus. Event constraints consist of sequences of trigger words and semantic roles which we automatically identify using Conditional Random Fields (CRFs). Unlike most other systems participating in this shared task, our system is lightweight and relies on neither external resources (e.g., ontologies and dictionaries) nor natural language processing software (e.g., POS taggers and parsers). The official test results show that our approach performed well on extracting secondary arguments in Task 2, yielding the highest precision at 76.62% and the second highest F‐measure at 43.22%.

2.
The scientific literature is the main source for comprehensive, up‐to‐date biological knowledge. Automatic extraction of this knowledge facilitates core biological tasks, such as database curation and knowledge discovery. We present here a linguistically inspired, rule‐based and syntax‐driven methodology for biological event extraction. We rely on a dictionary of trigger words to detect and characterize event expressions and syntactic dependency based heuristics to extract their event arguments. We refine and extend our prior work to recognize speculated and negated events. We show that heuristics based on syntactic dependencies, used to identify event arguments, extend naturally to also identify speculation and negation scope. In the BioNLP’09 Shared Task on Event Extraction, our system placed third in the Core Event Extraction Task (F‐score of 0.4462), and first in the Speculation and Negation Task (F‐score of 0.4252). Of particular interest is the extraction of complex regulatory events, where it scored second place. Our system significantly outperformed other participating systems in detecting speculation and negation. These results demonstrate the utility of a syntax‐driven approach. In this article, we also report on our more recent work on supervised learning of event trigger expressions and discuss event annotation issues, based on our corpus analysis.

3.
This article presents a novel approach to event extraction from biological text using Markov Logic. It can be described by three design decisions: (1) instead of building a pipeline using local classifiers, we design and learn a joint probabilistic model over events in a sentence; (2) instead of developing specific inference and learning algorithms for our joint model, we apply Markov Logic, a general purpose Statistical Relational Learning language, for this task; (3) we represent events as relations over the token indices of a sentence, as opposed to structures that relate event entities to gene or protein mentions. In this article, we extend our original work by providing an error analysis for binding events. Moreover, we investigate the impact of different loss functions on precision, recall and F‐measure. Finally, we show how to extract events of different types that share the same event clue. This extension allowed us to improve our performance even further, leading to the third best scores for task 1 (in close range to the second place) and the best results for task 2 with a 14% point margin.

4.
In our approach to event extraction, dependency graphs constitute the fundamental data structure for knowledge capture. Two types of trimming operations pave the way to more effective relation extraction. First, we simplify the syntactic representation structures resulting from parsing by pruning informationally irrelevant lexical material from dependency graphs. Second, we enrich informationally relevant lexical material in the simplified dependency graphs with additional semantic metadata at several layers of conceptual granularity. These two aggregation operations on linguistic representation structures are intended to avoid overfitting of machine learning‐based classifiers which we use for event extraction (besides manually curated dictionaries). Given this methodological framework, the corresponding JReX system developed by the Julie Lab Team from Friedrich‐Schiller‐Universität Jena (Germany) ranked 2nd among 24 competing teams for Task 1 in the “BioNLP’09 Shared Task on Event Extraction,” with 45.8% recall, 47.5% precision and 46.7% F1‐score on all 3,182 events. In more recent experiments, based on slight modifications of JReX and using the same data sets, we were able to achieve 45.9% recall, 57.7% precision, and 51.1% F1‐score.

5.
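Event extraction aims to present unstructured text containing event information in a structured form. Existing supervised event extraction methods are often limited by data sparsity and imbalanced distributions, and therefore have low recall. To address this problem, this paper proposes a method that uses frame semantics to improve event extraction: it introduces frame types as generalization features, maps frame types to event types on that basis, and then combines a frame type recognition model with an event type recognition model to make a joint decision, thereby improving the recall of event extraction. Experimental results show that, on the trigger word (event type) identification task, compared with using only the event type recognition model, the proposed frame‐semantics‐assisted event type recognition model improves recall by 6.44% (5.74%) and F‐score by 1.45% (0.83%).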

6.
李劲, 张华, 辜希武. 《计算机科学》 (Computer Science), 2012, 39(7): 154-160
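A curriculum vitae (CV, vita) usually contains rich data such as personal information, educational background, and work experience. Extracting useful information from large numbers of CVs and offering retrieval services over it can provide more comprehensive and complete personal profiles. The information in a CV can be viewed as a sequence of events ordered by time. Furthermore, associations between events can be mined from the events contained in different CVs. This paper proposes a framework for extracting and retrieving events from CVs. It automatically searches for and downloads CV documents from the Internet, extracts the events of interest from them, and stores those events in a database for further querying and retrieval. The work comprises: (1) an event representation model for describing the basic attributes of events and for retrieving events; (2) a probabilistic model based on conditional random fields for automatically extracting events from CVs; (3) an event‐based retrieval method that mines the co‐occurrence of event attributes.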

7.
Hierarchical visual event pattern mining and its applications   Total citations: 1, self-citations: 0, citations by others: 1
In this paper, we propose a hierarchical visual event pattern mining approach and utilize the patterns to address key problems in the field of video mining and understanding. We classify events into primitive events (PEs) and compound events (CEs), where PEs are the units of CEs, and CEs serve as smooth priors and rules for PEs. We first propose a tensor-based video representation and Joint Matrix Factorization (JMF) for unsupervised primitive event categorization. Then we apply frequent pattern mining techniques to discover compound event pattern structures. After that, we utilize the two kinds of event patterns to address the applications of event recognition and anomaly detection. First we extend the Sequential Monte Carlo (SMC) method to recognition of live, sequential visual events. To accomplish this task we present a scheme that alternately recognizes primitive and compound events in one framework. Then, we categorize the anomalies into abnormal events (never seen events) and abnormal contexts (rule breakers), and the two kinds of anomalies are detected simultaneously by embedding a deviation criterion into the SMC framework. Extensive experiments have been conducted which demonstrate that the proposed approach is effective as compared to other major approaches.

8.
9.
The BioNLP’09 Shared Task deals with extracting information on molecular events, such as gene expression and protein localization, from natural language text. Information in this benchmark is given as tuples including protein names, trigger terms for each event, and possible other participants such as binding sites. We address all three tasks of BioNLP’09: event detection, event enrichment, and recognition of negation and speculation. Our method for the first two tasks is based on a deep parser; we store the parse tree of each sentence in a relational database schema. From the training data, we collect the dependencies connecting any two relevant terms of a known tuple, that is, the shortest paths linking these two constituents. We encode all such linkages in a query language to retrieve similar linkages from unseen text. For the third task, we rely on a hierarchy of hand‐crafted regular expressions to recognize speculation and negated events. In this paper, we added extensions regarding a post‐processing step that handles ambiguous event trigger terms, as well as an extension of the query language to relax linkage constraints. On the BioNLP Shared Task test data, we achieve an overall F1‐measure of 32%, 29%, and 30% for the successive Tasks 1, 2, and 3, respectively.
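
The shortest-path step in the abstract above can be illustrated with a minimal sketch. It assumes the dependency parse is available as a list of (head, dependent) pairs and treats it as an undirected graph; the function name `shortest_path` and the toy sentence are hypothetical, and the authors' system stores parses in a relational database and queries them rather than running code like this.

```python
from collections import deque


def shortest_path(edges, source, target):
    """Breadth-first search for the shortest path between two tokens.

    edges: iterable of (head, dependent) pairs from a dependency parse.
    Returns the list of tokens on the path, or None if no path exists.
    """
    # Build an undirected adjacency list: paths may go up or down the tree.
    graph = {}
    for head, dep in edges:
        graph.setdefault(head, []).append(dep)
        graph.setdefault(dep, []).append(head)

    parent = {source: None}  # also serves as the visited set
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk the parent links back to the source.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbour in graph.get(node, []):
            if neighbour not in parent:
                parent[neighbour] = node
                queue.append(neighbour)
    return None


# Hypothetical parse of "IL-4 induces expression of STAT6".
edges = [("induces", "IL-4"), ("induces", "expression"), ("expression", "STAT6")]
print(shortest_path(edges, "IL-4", "STAT6"))
```

Here the path from the protein mention to the second argument runs through the trigger term, which is the kind of linkage the abstract describes collecting from training data.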

10.
The prediction of future events has great importance in many applications. The prediction is based on episode rules which are composed of events and two time constraints which require all the events in the episode rule and in the predicate of the rule to occur in a time interval, respectively. In an event stream, a sequence of events which matches the predicate of the rule satisfying the specified time constraint is called an occurrence of the predicate. After finding the occurrence, the consequent event which will occur in a time interval can be predicted. However, the time intervals computed from some occurrences for predicting the event can be contained in the time intervals computed from other occurrences and become redundant. As a result, how to design an efficient and effective event predictor in a stream environment is challenging. In this paper, an effective scheme is proposed to avoid matching the predicate events corresponding to redundant time intervals for prediction. Based on the scheme, we respectively consider two methodologies, forward retrieval and backward retrieval, for the efficient matching of predicate events over event streams. The approach based on forward retrieval constructs a queue structure to incrementally maintain parts of the matched results as events arrive, and thus it avoids backward scans of the event stream. On the other hand, the approach based on backward retrieval maintains the recently arrived events in a tree structure. The matching of predicate events is triggered by identifiable events and achieved by an efficient retrieval on the tree structure, which avoids exhaustive scans of the arrived events. By running a series of experiments, we show that each of the proposed approaches has its advantages on particular data distributions and parameter settings.

11.
We describe a system for extracting complex events among genes and proteins from biomedical literature, developed in the context of the BioNLP’09 Shared Task on Event Extraction. For each event, the system extracts its text trigger, class, and arguments. In contrast to the approaches prevailing prior to the shared task, events can be arguments of other events, resulting in a nested structure that better captures the underlying biological statements. We divide the task into independent steps which we approach as machine learning problems. We define a wide array of features and in particular make extensive use of dependency parse graphs. A rule‐based postprocessing step is used to refine the output in accordance with the restrictions of the extraction task. In the shared task evaluation, the system achieved an F‐score of 51.95% on the primary task, the best performance among the participants. Currently, with modifications and improvements described in this article, the system achieves 52.86% F‐score on Task 1, the primary task, improving on its original performance. In addition, we extend the system also to Tasks 2 and 3, gaining F‐scores of 51.28% and 50.18%, respectively. The system thus addresses the BioNLP’09 Shared Task in its entirety and achieves the best performance on all three subtasks.

12.
Due to the explosive growth of social-media applications, enhancing event-awareness by social mining has become extremely important. The contents of microblogs preserve valuable information associated with past disastrous events and stories. To learn the experiences from past events for tackling emerging real-world events, in this work we utilize the social-media messages to characterize real-world events through mining their contents and extracting essential features for relatedness analysis. On one hand, we established an online clustering approach on Twitter microblogs for detecting emerging events, and meanwhile we performed event relatedness evaluation using an unsupervised clustering approach. On the other hand, we developed a supervised learning model to create extensible measure metrics for offline evaluation of event relatedness. By means of supervised learning, our developed measure metrics are able to compute relatedness of various historical events, allowing the event impacts on specified domains to be quantitatively measured for event comparison. By combining the strengths of both methods, the experimental results showed that the combined framework in our system is sensible for discovering more unknown knowledge about event impacts and enhancing event awareness.

13.
Event detection is a crucial task for wireless sensor network applications, especially environment monitoring. Existing approaches for event detection are mainly based on some predefined threshold values, and thus are often inaccurate and incapable of capturing complex events. For example, in coal mine monitoring scenarios, gas leakage or water osmosis can hardly be described by the overrun of specified attribute thresholds, but rather by some complex pattern in the full-scale view of the environmental data. To address this issue, we propose a non-threshold based approach for the real 3D sensor monitoring environment. We employ energy-efficient methods to collect a time series of data maps from the sensor network and detect complex events through matching the gathered data to spatio-temporal data patterns. Finally, we conduct trace driven simulations to prove the efficacy and efficiency of this approach on detecting events of complex phenomena from real-life records.

14.
In this paper we present a framework for building policy‐based autonomic distributed agent systems. The autonomic mechanisms of configuration and recovery are supported through a distributed event processing model and a set of policy enforcement mechanisms embedded in an agent framework. Policies are event‐driven rules derived from the system's functional and non‐functional requirements. Agents in the network monitor the system state for policy violation conditions, generate appropriate events, and communicate them to other agents for cooperative filtering, aggregation, and handling. A set of agents perform policy enforcement actions whenever events signifying any policy violation conditions occur. Policies are defined using a specification framework based on XML. The policy enforcement agents interpret the policies given in XML. We illustrate the utility of this framework in the context of an agent‐based distributed network monitoring application. We also present an experimental evaluation of our approach. Copyright © 2006 John Wiley & Sons, Ltd.

15.
Virologists are not only interested in point mutations in a genome, but also in relationships between mutations. In this work, we present a design study to support the discovery of correlated mutation events (called co‐occurrences) in populations of viral genomes. The key challenge is to identify potentially interesting pairs of events within the vast space of event combinations. In our work, we identify analyst requirements and develop a prototype through a participatory process. The key ideas of our approach are to use interest metrics to create dynamic filtering that guides the viewer to interesting and relevant correlations of genome mutations, and to provide visual encodings designed to fit scientists' mental map of the data, along with dynamic filtering techniques. We demonstrate the strength of our approach in virology‐situated case studies, and offer suggestions for extending our strategy to other sequence‐based domains.

16.
An increasing number of data applications, such as weather data monitoring, data streaming, web data logs, and cloud data, are going online and playing a vital role in our everyday life. The underlying data of such applications change very frequently, especially in the cloud environment. Many interesting events can be detected by discovering such data from different distributed sources and analyzing it for specific purposes (e.g., car accident detection or market analysis). However, several isolated events could be erroneous due to the fact that important data sets are either discarded or improperly analyzed as they contain missing data. Such events therefore need to be monitored globally and be detected jointly in order to understand their patterns and correlated relationships. In the context of current cloud computing infrastructure, no solutions exist for enabling the correlations between multi-source events in the presence of missing data. This paper addresses the problem of capturing the underlying latent structure of the data with missing entries based on association rules, which requires factorizing the data set with missing data. The paper proposes a novel model to handle large amounts of data in a cloud environment. It is a model of aggregated data that are confidences of association rules. We first propose a method to discover the association rules locally on each node of a cloud in the presence of missing rules. Afterward, we provide a tensor-based model to perform a global correlation between all the local models of each node of the network. The proposed approach, based on tensor decomposition, deals with a multi-modal network where missing association rules are detected and their confidences are approximated. The approach is scalable in terms of factorizing multi-way arrays (i.e., tensors) in the presence of missing association rules. It is validated through experimental results which show its significance and viability in terms of detecting missing rules.
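
Since the model above aggregates confidences of association rules, a small sketch of how a rule's confidence is computed on one node may help. This is the standard definition (support of antecedent and consequent together, divided by support of the antecedent); the function name `confidence`, the choice of `None` to mark a rule that cannot be evaluated locally, and the toy data are assumptions for illustration, not the paper's code.

```python
def confidence(transactions, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent on one node's data.

    Returns None when the antecedent never occurs locally, which is one
    simple way to represent a rule that is "missing" on this node.
    """
    antecedent, consequent = set(antecedent), set(consequent)
    # Number of transactions containing the antecedent.
    n_antecedent = sum(1 for t in transactions if antecedent <= set(t))
    # Number of transactions containing antecedent and consequent together.
    n_both = sum(1 for t in transactions if (antecedent | consequent) <= set(t))
    return n_both / n_antecedent if n_antecedent else None


# Hypothetical local event log on one cloud node.
local_log = [["rain", "accident"], ["rain"], ["sun"]]
print(confidence(local_log, ["rain"], ["accident"]))
print(confidence(local_log, ["snow"], ["accident"]))
```

Collecting such values per rule and per node gives a multi-way array with gaps, which is the kind of input a tensor factorization with missing entries would then approximate.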

17.
Geometric scaling transformations do not respect the biological processes which govern the size and shape of living creatures. In this paper, we describe an approach to scaling which can be related to biological function. We use known biological laws of allometry which are expressed as power laws to control the mesh deformation in the frequency domain. This approach is motivated by the relation between fractal biological systems and their underlying power‐law spectra. We demonstrate our approach to biology‐aware character scaling on triangle meshes representing quadrupedal mammals.
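
As a hedged illustration of the power-law form mentioned above, an allometric law relates a trait $y$ to body size $x$ through a coefficient $a$ and an exponent $b$. The symbols $x$, $y$, $a$, $b$, and the scale factor $s$ are notation assumed here, not taken from the paper.

```latex
\begin{align}
  y &= a\,x^{b}, \\
  \log y &= \log a + b \log x, \\
  y' &= a\,(s\,x)^{b} = s^{b}\,y .
\end{align}
```

The last line shows why uniform geometric scaling differs from allometric scaling: multiplying size by $s$ changes the trait by $s^{b}$ rather than by $s$, unless $b = 1$. A well-known example is Kleiber's law, under which metabolic rate grows roughly as body mass to the power $3/4$.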

18.
Yin Zikai, Xu Tong, Zhu Hengshu, Zhu Chen, Chen Enhong, Xiong Hui. World Wide Web, 2020, 23(2): 853-871

Recent years have witnessed a growing trend that offline social events are organized via online platforms. Along this line, large efforts have been devoted to recommending appropriate social events for users. However, most prior work only pays attention to the selections of users, while the selections of events (organizers), which lead to the “two-way selection” process, are usually ignored. Intuitively, distinguishing the two-way selections in historical attendances can help us better understand the social event participation and decision making process in a holistic manner. To that end, in this paper, we propose a novel two-stage framework for social event participation analysis. To be specific, by adapting the classic Gale-Shapley algorithm for stable matching, we design utility functions for both users and event organizers, and then solve two layers of optimization tasks to estimate parameters, i.e., capturing user profiling for event selection, as well as event rules for attendee selection. Experimental results on a real-world data set validate that our method can effectively predict the event invitation and acceptance, compared with the combinations of one-way-selection baselines. This result supports the hypothesis that a two-way selection process could better reflect the decision making of social event participation.
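
For readers unfamiliar with the classic Gale-Shapley algorithm that the abstract above builds on, here is a minimal sketch of the textbook one-to-one version with users proposing to events. It is not the authors' adapted framework: the utility functions and parameter estimation are omitted, and the function name `gale_shapley`, the input format, and the toy data are assumptions for illustration.

```python
from collections import deque


def gale_shapley(user_prefs, event_prefs):
    """Return a stable one-to-one matching {user: event}.

    user_prefs:  {user: [events, most preferred first]}
    event_prefs: {event: [users, most preferred first]}
    Assumes equally many users and events and complete preference lists.
    """
    # rank[e][u] is the position of user u in event e's list (lower is better).
    rank = {e: {u: i for i, u in enumerate(prefs)} for e, prefs in event_prefs.items()}
    next_choice = {u: 0 for u in user_prefs}  # index of the next event to propose to
    holder = {}                               # event -> user it currently holds
    free = deque(user_prefs)                  # users without a tentative match

    while free:
        user = free.popleft()
        event = user_prefs[user][next_choice[user]]
        next_choice[user] += 1
        if event not in holder:
            holder[event] = user              # event was unmatched: accept
        elif rank[event][user] < rank[event][holder[event]]:
            free.append(holder[event])        # event prefers the new proposer
            holder[event] = user
        else:
            free.append(user)                 # rejected: user tries the next event
    return {user: event for event, user in holder.items()}


# Hypothetical preferences: both users want the concert, which prefers bob.
users = {"ann": ["concert", "hike"], "bob": ["concert", "hike"]}
events = {"concert": ["bob", "ann"], "hike": ["ann", "bob"]}
print(gale_shapley(users, events))
```

The example shows the two-way aspect in miniature: ann's first choice is the concert, but the concert's own preference decides the outcome, so ann ends up matched to the hike.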


19.
Extracting event information from sentences using rules   Total citations: 2, self-citations: 0, citations by others: 2
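Information extraction is an important topic in data mining. Current research extracts information mainly through machine learning. However, machine learning places high demands on the quality of training data, and parameter setting during learning is complex. In contrast, rules constructed in advance can extract event information from text simply and effectively. This paper proposes a method that extracts event information from sentences based on extraction rules, avoiding the complicated machine learning process. The method uses an ontology to define verb-to-event-role matching rules, event role extraction rules, time information extraction rules, and location information extraction rules, and describes these rules in OWL. It then applies the rules to extract verb sense information, event role information, time information, and location information from sentences, and evaluates the extracted event information with a new evaluation metric proposed in this paper. Experiments show that the method is effective at extracting event information from sentences.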

20.
Nowadays, there is an increasing demand to monitor, analyze, and control large scale distributed systems. Events detected during monitoring are temporally correlated, which is helpful to resource allocation, job scheduling, and failure prediction. To discover the correlations among detected events, many existing approaches concentrate detected events into an event database and perform data mining on it. We argue that these approaches are not scalable to large scale distributed systems as monitored events grow so fast that event correlation discovering can hardly be done with the power of a single computer. In this paper, we present a decentralized approach to efficiently detect events, filter irrelevant events, and discover their temporal correlations. We propose a MapReduce-based algorithm, MapReduce-Apriori, to mine event association rules, which utilizes the computational resource of multiple dedicated nodes of the system. Experimental results show that our decentralized event correlation mining algorithm achieves nearly ideal speedup compared to centralized mining approaches.
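
The general idea of combining Apriori with a map/reduce split can be sketched as follows. This is a single-process illustration of level-wise frequent itemset mining in which each partition is counted independently (the map step) and the counts are then summed and filtered (the reduce step). It is not the MapReduce-Apriori algorithm from the paper, whose details the abstract does not give; the function names, the partition format, and the toy data are assumptions.

```python
from collections import Counter
from itertools import combinations


def map_count(partition, k, prev_frequent=None):
    """Map step: count candidate k-itemsets in one partition of transactions."""
    counts = Counter()
    for transaction in partition:
        for itemset in combinations(sorted(set(transaction)), k):
            # Apriori pruning: every (k-1)-subset must already be frequent.
            if prev_frequent is None or all(
                subset in prev_frequent for subset in combinations(itemset, k - 1)
            ):
                counts[itemset] += 1
    return counts


def reduce_counts(partial_counts, min_support):
    """Reduce step: sum per-partition counts and keep the frequent itemsets."""
    total = Counter()
    for counts in partial_counts:
        total.update(counts)
    return {itemset: n for itemset, n in total.items() if n >= min_support}


def apriori(partitions, min_support):
    """Level-wise mining: repeat map and reduce for k = 1, 2, ... until empty."""
    frequent, prev, k = {}, None, 1
    while True:
        # In a real deployment each map_count call would run on its own node.
        partial = [map_count(p, k, prev) for p in partitions]
        level = reduce_counts(partial, min_support)
        if not level:
            return frequent
        frequent.update(level)
        prev, k = set(level), k + 1


# Hypothetical event logs from two monitoring nodes.
partitions = [[["a", "b", "c"], ["a", "b"]], [["a", "c"], ["b"]]]
print(apriori(partitions, min_support=2))
```

Because each partition is counted without reference to the others, the map calls can run in parallel, which is where a decentralized version would get its speedup; association rules are then derived from the frequent itemsets.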
