Similar Documents
A total of 20 similar documents were retrieved.
1.

Background and objective

In order for computers to extract useful information from unstructured text, a concept normalization system is needed to link relevant concepts in a text to sources that contain further information about the concept. Popular concept normalization tools in the biomedical field are dictionary-based. In this study we investigate the usefulness of natural language processing (NLP) as an adjunct to dictionary-based concept normalization.

Methods

We compared the performance of two biomedical concept normalization systems, MetaMap and Peregrine, on the Arizona Disease Corpus, with and without the use of a rule-based NLP module. Performance was assessed for exact and inexact boundary matching of the system annotations with those of the gold standard and for concept identifier matching.

Results

Without the NLP module, MetaMap and Peregrine attained F-scores of 61.0% and 63.9%, respectively, for exact boundary matching, and 55.1% and 56.9% for concept identifier matching. With the aid of the NLP module, the F-scores of MetaMap and Peregrine improved to 73.3% and 78.0% for boundary matching, and to 66.2% and 69.8% for concept identifier matching. For inexact boundary matching, performances further increased to 85.5% and 85.4%, and to 73.6% and 73.3% for concept identifier matching.

Conclusions

We have shown the added value of NLP for the recognition and normalization of diseases with MetaMap and Peregrine. The NLP module is general and can be applied in combination with any concept normalization system. Whether its use for concept types other than disease is equally advantageous remains to be investigated.
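For readers unfamiliar with how dictionary-based concept normalization works at its core, the following minimal Python sketch illustrates longest-match lookup against a toy term dictionary. The dictionary entries and concept identifiers are invented for illustration; they are not drawn from MetaMap, Peregrine, or the rule-based NLP module evaluated above.

```python
# Minimal sketch of dictionary-based concept normalization: map the longest
# matching term from a small illustrative dictionary to a concept identifier.
# Entries and concept IDs below are made up for illustration only.

ILLUSTRATIVE_DICTIONARY = {
    "type 2 diabetes mellitus": "C0011860",
    "diabetes mellitus": "C0011849",
    "myocardial infarction": "C0027051",
}

def normalize(text: str) -> list[tuple[str, str, int, int]]:
    """Return (matched term, concept id, start, end) tuples, preferring longer matches."""
    text_lower = text.lower()
    matches = []
    # Try longer dictionary terms first so "type 2 diabetes mellitus"
    # wins over the shorter "diabetes mellitus".
    for term in sorted(ILLUSTRATIVE_DICTIONARY, key=len, reverse=True):
        start = text_lower.find(term)
        while start != -1:
            end = start + len(term)
            # Keep a match only if it does not overlap an earlier (longer) one.
            if not any(s < end and start < e for _, _, s, e in matches):
                matches.append((term, ILLUSTRATIVE_DICTIONARY[term], start, end))
            start = text_lower.find(term, start + 1)
    return sorted(matches, key=lambda m: m[2])

if __name__ == "__main__":
    sentence = "Patient with type 2 diabetes mellitus and prior myocardial infarction."
    for term, cui, start, end in normalize(sentence):
        print(f"{term!r} -> {cui} [{start}:{end}]")
```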

2.
Objectives: Medical knowledge extraction (MKE) plays a key role in natural language processing (NLP) research on electronic medical records (EMR), which are important digital carriers for recording the medical activities of patients. Named entity recognition (NER) and medical relation extraction (MRE) are two basic tasks of MKE. This study aims to improve the recognition accuracy of these two tasks by exploring deep learning methods. Methods: This study discussed and built two application scenarios of the bidirectional long short-term memory combined with conditional random field (BiLSTM-CRF) model for the NER and MRE tasks. In the data preprocessing of both tasks, a GloVe word embedding model was used to vectorize words. In the NER task, a sequence labeling strategy was used to classify each word tag by the joint probability distribution through the CRF layer. In the MRE task, the medical entity relation category was predicted by transforming the classification problem of a single entity into a sequence classification problem and linking the feature combinations between entities, also through the CRF layer. Results: Through validation on the i2b2 2010 public dataset, the BiLSTM-CRF models built in this study achieved much better results than the baseline methods in the two tasks, with F1-measures of up to 0.88 in the NER task and 0.78 in the MRE task. Moreover, the models converged faster and avoided problems such as overfitting. Conclusion: This study demonstrated the good performance of deep learning on medical knowledge extraction. It also verified the feasibility of the BiLSTM-CRF model in different application scenarios, laying the foundation for subsequent work in the EMR field.
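The BiLSTM-CRF architecture described above can be sketched in a few lines of PyTorch. The sketch below assumes the third-party pytorch-crf package for the CRF layer; the vocabulary size, tag set, and dimensions are placeholders, and the original study additionally initialized embeddings with GloVe vectors.

```python
# Minimal BiLSTM-CRF tagger sketch in PyTorch, assuming the third-party
# pytorch-crf package (pip install pytorch-crf) for the CRF layer.
import torch
import torch.nn as nn
from torchcrf import CRF

class BiLSTMCRF(nn.Module):
    def __init__(self, vocab_size: int, num_tags: int, emb_dim: int = 100, hidden: int = 128):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.lstm = nn.LSTM(emb_dim, hidden // 2, bidirectional=True, batch_first=True)
        self.emissions = nn.Linear(hidden, num_tags)
        self.crf = CRF(num_tags, batch_first=True)

    def forward(self, tokens, tags, mask):
        feats = self.emissions(self.lstm(self.embedding(tokens))[0])
        # The CRF returns the log-likelihood; negate it for a loss to minimise.
        return -self.crf(feats, tags, mask=mask, reduction="mean")

    def decode(self, tokens, mask):
        feats = self.emissions(self.lstm(self.embedding(tokens))[0])
        return self.crf.decode(feats, mask=mask)

if __name__ == "__main__":
    model = BiLSTMCRF(vocab_size=5000, num_tags=7)
    tokens = torch.randint(1, 5000, (2, 12))        # batch of 2 sentences, 12 tokens each
    tags = torch.randint(0, 7, (2, 12))             # gold BIO-style tag ids (placeholders)
    mask = torch.ones(2, 12, dtype=torch.bool)
    loss = model(tokens, tags, mask)
    loss.backward()
    print(loss.item(), model.decode(tokens, mask)[0])
```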

3.

Background and objective

As people increasingly engage in online health-seeking behavior and contribute to health-oriented websites, the volume of medical text authored by patients and other medical novices grows rapidly. However, we lack an effective method for automatically identifying medical terms in patient-authored text (PAT). We demonstrate that crowdsourcing PAT medical term identification tasks to non-experts is a viable method for creating large, accurately-labeled PAT datasets; moreover, such datasets can be used to train classifiers that outperform existing medical term identification tools.

Materials and methods

To evaluate the viability of using non-expert crowds to label PAT, we compare expert (registered nurses) and non-expert (Amazon Mechanical Turk workers; Turkers) responses to a PAT medical term identification task. Next, we build a crowd-labeled dataset comprising 10 000 sentences from MedHelp. We train two models on this dataset and evaluate their performance, as well as that of MetaMap, Open Biomedical Annotator (OBA), and NaCTeM's TerMINE, against two gold standard datasets: one from MedHelp and the other from CureTogether.

Results

When aggregated according to a corroborative voting policy, Turker responses predict expert responses with an F1 score of 84%. A conditional random field (CRF) trained on 10 000 crowd-labeled MedHelp sentences achieves an F1 score of 78% against the CureTogether gold standard, widely outperforming OBA (47%), TerMINE (43%), and MetaMap (39%). A failure analysis of the CRF suggests that misclassified terms are likely to be either generic or rare.

Conclusions

Our results show that combining statistical models sensitive to sentence-level context with crowd-labeled data is a scalable and effective technique for automatically identifying medical terms in PAT.
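As a rough illustration of how crowd labels can be aggregated before training such a model, the sketch below applies simple threshold voting over worker annotations. The exact corroborative voting policy used in the study may differ, and the worker responses shown are invented.

```python
# Illustrative sketch of aggregating crowd labels by threshold voting:
# a token counts as a medical term if at least `min_votes` workers marked it.
from collections import Counter

def aggregate_votes(worker_annotations: list[set[int]], min_votes: int = 2) -> set[int]:
    """Each worker supplies the set of token indices marked as medical terms."""
    counts = Counter(idx for ann in worker_annotations for idx in ann)
    return {idx for idx, votes in counts.items() if votes >= min_votes}

if __name__ == "__main__":
    tokens = "my doctor started me on metformin for type 2 diabetes".split()
    workers = [{5, 7, 8, 9}, {5, 9}, {2, 5, 7, 8, 9}]   # three workers' picks (made up)
    for idx in sorted(aggregate_votes(workers)):
        print(idx, tokens[idx])
```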

4.
Objective: The goal of this study is to explore transformer-based models (eg, Bidirectional Encoder Representations from Transformers [BERT]) for clinical concept extraction and develop an open-source package with pretrained clinical models to facilitate concept extraction and other downstream natural language processing (NLP) tasks in the medical domain. Methods: We systematically explored 4 widely used transformer-based architectures, including BERT, RoBERTa, ALBERT, and ELECTRA, for extracting various types of clinical concepts using 3 public datasets from the 2010 and 2012 i2b2 challenges and the 2018 n2c2 challenge. We examined general transformer models pretrained using general English corpora as well as clinical transformer models pretrained using a clinical corpus and compared them with a long short-term memory conditional random fields (LSTM-CRFs) model as a baseline. Furthermore, we integrated the 4 clinical transformer-based models into an open-source package. Results and Conclusion: The RoBERTa-MIMIC model achieved state-of-the-art performance on 3 public clinical concept extraction datasets with F1-scores of 0.8994, 0.8053, and 0.8907, respectively. Compared to the baseline LSTM-CRFs model, RoBERTa-MIMIC remarkably improved the F1-score by approximately 4% and 6% on the 2010 and 2012 i2b2 datasets. This study demonstrated the efficiency of transformer-based models for clinical concept extraction. Our methods and systems can be applied to other clinical tasks. The clinical transformer package with 4 pretrained clinical models is publicly available at https://github.com/uf-hobi-informatics-lab/ClinicalTransformerNER. We believe this package will improve current practice on clinical concept extraction and other tasks in the medical domain.
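A minimal sketch of transformer-based concept extraction framed as token classification with the Hugging Face transformers library is shown below. The model name and BIO label set are placeholders, not the study's pretrained clinical checkpoints, which are distributed through the ClinicalTransformerNER package linked above.

```python
# Sketch of clinical concept extraction as token classification with a
# transformer encoder. The checkpoint and label set are placeholders.
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

model_name = "bert-base-uncased"  # placeholder; swap in a clinical checkpoint
labels = ["O", "B-problem", "I-problem", "B-test", "I-test", "B-treatment", "I-treatment"]

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(model_name, num_labels=len(labels))

text = "Patient was started on metformin for poorly controlled diabetes."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits            # (1, seq_len, num_labels)
pred_ids = logits.argmax(dim=-1)[0].tolist()

for token, pred in zip(tokenizer.convert_ids_to_tokens(inputs["input_ids"][0]), pred_ids):
    print(f"{token:15s} {labels[pred]}")       # untrained head: predictions are random
```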

5.

Objective

Natural language processing (NLP) tasks are commonly decomposed into subtasks, chained together to form processing pipelines. The residual error produced in these subtasks propagates, adversely affecting the end objectives. Limited availability of annotated clinical data remains a barrier to reaching state-of-the-art operating characteristics using statistically based NLP tools in the clinical domain. Here we explore the unique linguistic constructions of clinical texts and demonstrate the loss in operating characteristics when out-of-the-box part-of-speech (POS) tagging tools are applied to the clinical domain. We test a domain adaptation approach integrating a novel lexical-generation probability rule used in a transformation-based learner to boost POS performance on clinical narratives.

Methods

Two target corpora from independent healthcare institutions were constructed from high frequency clinical narratives. Four leading POS taggers with their out-of-the-box models trained from general English and biomedical abstracts were evaluated against these clinical corpora. A high performing domain adaptation method, Easy Adapt, was compared to our newly proposed method ClinAdapt.

Results

The evaluated POS taggers drop in accuracy by 8.5–15% when tested on clinical narratives. The highest performing tagger reports an accuracy of 88.6%. Domain adaptation with Easy Adapt reports accuracies of 88.3–91.0% on clinical texts. ClinAdapt reports 93.2–93.9%.

Conclusions

ClinAdapt successfully boosts POS tagging performance through domain adaptation requiring a modest amount of annotated clinical data. Improving the performance of critical NLP subtasks is expected to reduce pipeline error propagation leading to better overall results on complex processing tasks.
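The Easy Adapt baseline mentioned above rests on a simple feature-augmentation trick: every feature is duplicated into a shared copy and a domain-specific copy, so a single model can learn which feature weights transfer across domains. A minimal sketch follows; it illustrates Easy Adapt only, not the ClinAdapt method proposed in the paper, and the feature names are invented.

```python
# Sketch of the feature-augmentation idea behind Easy Adapt (frustratingly
# easy domain adaptation): each feature gets a shared copy and a copy tagged
# with the domain it came from.

def easy_adapt(features: dict[str, float], domain: str) -> dict[str, float]:
    """Duplicate each feature into a shared copy and a domain-tagged copy."""
    augmented = {}
    for name, value in features.items():
        augmented[f"shared::{name}"] = value
        augmented[f"{domain}::{name}"] = value
    return augmented

if __name__ == "__main__":
    feats = {"word=status": 1.0, "suffix=tus": 1.0, "prev=mental": 1.0}
    print(easy_adapt(feats, domain="clinical"))
```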

6.
Objectives: Normalizing mentions of medical concepts to standardized vocabularies is a fundamental component of clinical text analysis. Ambiguity—words or phrases that may refer to different concepts—has been extensively researched as part of information extraction from biomedical literature, but less is known about the types and frequency of ambiguity in clinical text. This study characterizes the distribution and distinct types of ambiguity exhibited by benchmark clinical concept normalization datasets, in order to identify directions for advancing medical concept normalization research. Materials and Methods: We identified ambiguous strings in datasets derived from the 2 available clinical corpora for concept normalization and categorized the distinct types of ambiguity they exhibited. We then compared observed string ambiguity in the datasets with potential ambiguity in the Unified Medical Language System (UMLS) to assess how representative available datasets are of ambiguity in clinical language. Results: We found that <15% of strings were ambiguous within the datasets, while over 50% were ambiguous in the UMLS, indicating only partial coverage of clinical ambiguity. The percentage of strings in common between any pair of datasets ranged from 2% to only 36%; of these, 40% were annotated with different sets of concepts, severely limiting generalization. Finally, we observed 12 distinct types of ambiguity, distributed unequally across the available datasets, reflecting diverse linguistic and medical phenomena. Discussion: Existing datasets are not sufficient to cover the diversity of clinical concept ambiguity, limiting both training and evaluation of normalization methods for clinical text. Additionally, the UMLS offers important semantic information for building and evaluating normalization methods. Conclusions: Our findings identify 3 opportunities for concept normalization research, including a need for ambiguity-specific clinical datasets and leveraging the rich semantics of the UMLS in new methods and evaluation measures for normalization.

7.
8.
In drug development, compound-protein interaction (CPI) prediction is a key technique for hit compound discovery, drug repositioning, and related research. In recent years, deep learning has been widely applied to CPI research, accelerating the development of CPI prediction in drug discovery. This review focuses on feature-based CPI prediction models. It first introduces the databases commonly used in CPI prediction and the typical feature representations of compounds and proteins. Guided by the key issues in modeling, feature-based CPI prediction models are then discussed from two perspectives: multimodality and attention mechanisms. On this basis, 12 of these models are selected and evaluated on 3 classic datasets for both classification and regression tasks. Finally, the current challenges in this field are summarized and future directions are discussed, providing ideas for further research on CPI prediction methods.

9.
In an ideal clinical Natural Language Processing (NLP) ecosystem, researchers and developers would be able to collaborate with others, undertake validation of NLP systems, components, and related resources, and disseminate them. We captured requirements and formative evaluation data from the Veterans Affairs (VA) Clinical NLP Ecosystem stakeholders using semi-structured interviews and meeting discussions. We developed a coding rubric to code interviews. We assessed inter-coder reliability using percent agreement and the kappa statistic. We undertook 15 interviews and held two workshop discussions. The main areas of requirements related to design and functionality, resources, and information. Stakeholders also confirmed the vision of the second generation of the Ecosystem, and recommendations included adding mechanisms to better understand terms, measuring collaboration to demonstrate value, and datasets/tools to navigate spelling errors in consumer language, among others. Stakeholders also recommended the capability to communicate with developers working on the next version of the VA electronic health record (VistA Evolution), a mechanism to automatically monitor downloads of tools, and automatic summaries of the downloads for Ecosystem contributors and funders. After three rounds of coding and discussion, we determined the percent agreement of the two coders to be 97.2% and the kappa to be 0.7851. The vision of the VA Clinical NLP Ecosystem met stakeholder needs. The interviews and discussions provided key requirements that inform the design of the VA Clinical NLP Ecosystem.

10.
Taking hospital information systems as the research basis and clinical medical text classification as the guide, this study analyzes clinical research data extraction methods and processes from the perspective of data sources, and applies natural language processing and database retrieval techniques to extract and mine medical text, achieving good results and supporting clinical research work.

11.
Undergraduate clinical medical education in China can be divided into three stages: the preclinical stage, the clinical clerkship stage, and the clinical internship stage, each with its own characteristics and teaching tasks. The teaching methods applied in clinical medical education mainly include lecture-based learning (LBL), team-based learning (TBL), case-based learning (CBL), and problem-based learning (PBL). LBL is teacher-centered and emphasizes the accuracy, systematization, and coherence of knowledge; TBL, CBL, and PBL are student-centered and emphasize cultivating students' initiative and motivation to learn. Each teaching method has its own advantages and disadvantages. Based on the different characteristics of the three stages of undergraduate medical education, this study briefly analyzes the rational application of the different teaching methods across the three stages: in the preclinical stage, LBL combined with TBL; in the clinical clerkship stage, LBL combined with TBL and CBL; and in the clinical internship stage, CBL combined with PBL.

12.

Objective

To specify the problem of patient-level temporal aggregation from clinical text and introduce several probabilistic methods for addressing that problem. The patient-level perspective differs from the prevailing natural language processing (NLP) practice of evaluating at the term, event, sentence, document, or visit level.

Methods

We utilized an existing pediatric asthma cohort with manual annotations. After generating a basic feature set via standard clinical NLP methods, we introduce six methods of aggregating time-distributed features from the document level to the patient level. These aggregation methods are used to classify patients according to their asthma status in two hypothetical settings: retrospective epidemiology and clinical decision support.

Results

In both settings, solid patient classification performance was obtained with machine learning algorithms on a number of evidence aggregation methods, with Sum aggregation obtaining the highest F1 score of 85.71% on the retrospective epidemiological setting, and a probability density function-based method obtaining the highest F1 score of 74.63% on the clinical decision support setting. Multiple techniques also estimated the diagnosis date (index date) of asthma with promising accuracy.

Discussion

The clinical decision support setting is a more difficult problem. We rule out some aggregation methods rather than determining the best overall aggregation method, since our preliminary data set represented a practical setting in which manually annotated data were limited.

Conclusion

Results contrasted the strengths of several aggregation algorithms in different settings. Multiple approaches exhibited good patient classification performance, and also predicted the timing of estimates with reasonable accuracy.
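Of the aggregation methods compared above, Sum aggregation is the simplest to illustrate: per-document NLP feature vectors for one patient are summed into a single patient-level vector before classification. The feature names and counts in the sketch below are invented.

```python
# Sketch of "Sum" aggregation: collapse document-level feature counts for one
# patient into a single patient-level feature vector.
from collections import Counter

def sum_aggregate(document_features: list[Counter]) -> Counter:
    """Sum per-document feature counts into one patient-level vector."""
    patient_vector = Counter()
    for doc in document_features:
        patient_vector.update(doc)
    return patient_vector

if __name__ == "__main__":
    docs = [
        Counter({"mention:wheezing": 2, "mention:albuterol": 1}),
        Counter({"mention:wheezing": 1, "mention:asthma": 1}),
    ]
    print(sum_aggregate(docs))   # e.g. Counter({'mention:wheezing': 3, ...})
```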

13.
Implementing clinical trials with large multicenter samples is an important way to scientifically evaluate and demonstrate the curative effect of moxibustion. At present, clinical trials on moxibustion with large multicenter samples are prospering in China. It is necessary for research units to have good research professionals and technical platforms as well as a highly standardized and scientifically feasible research methodology. Taking tasks in the ongoing national 973 project and in the sci-tech support program of the "11th 5-year plan" as examples, this research captures the characteristics of moxibustion, carries out in-depth analysis, and introduces the specific methods and significance of clinical research tasks on moxibustion in designing multicenter plans, implementing experiments, supervising quality, and strengthening compliance.

14.

Objective

Despite at least 40 years of promising empirical performance, very few clinical natural language processing (NLP) or information extraction systems currently contribute to medical science or care. The authors address this gap by reducing the need for custom software and rules development with a graphical user interface-driven, highly generalizable approach to concept-level retrieval.

Materials and methods

A ‘learn by example’ approach combines features derived from open-source NLP pipelines with open-source machine learning classifiers to automatically and iteratively evaluate top-performing configurations. The Fourth i2b2/VA Shared Task Challenge's concept extraction task provided the data sets and metrics used to evaluate performance.

Results

Top F-measure scores for each of the tasks were medical problems (0.83), treatments (0.82), and tests (0.83). Recall lagged precision in all experiments. Precision was near or above 0.90 in all tasks.

Discussion

With no customization for the tasks and less than 5 min of end-user time to configure and launch each experiment, the average F-measure was 0.83, one point behind the mean F-measure of the 22 entrants in the competition. Strong precision scores indicate the potential of applying the approach for more specific clinical information extraction tasks. There was not one best configuration, supporting an iterative approach to model creation.

Conclusion

Acceptable levels of performance can be achieved using fully automated and generalizable approaches to concept-level information extraction. The described implementation and related documentation are available for download.

15.

Background

Word sense disambiguation (WSD) methods automatically assign an unambiguous concept to an ambiguous term based on context, and are important to many text-processing tasks. In this study we developed and evaluated a knowledge-based WSD method that uses semantic similarity measures derived from the Unified Medical Language System (UMLS) and evaluated the contribution of WSD to clinical text classification.

Methods

We evaluated our system on biomedical WSD datasets and determined the contribution of our WSD system to clinical document classification on the 2007 Computational Medicine Challenge corpus.

Results

Our system compared favorably with other knowledge-based methods. Machine learning classifiers trained on disambiguated concepts significantly outperformed those trained using all concepts.

Conclusions

We developed a WSD system that achieves high disambiguation accuracy on standard biomedical WSD datasets and showed that our WSD system improves clinical document classification.

Data sharing

We integrated our WSD system with MetaMap and the clinical Text Analysis and Knowledge Extraction System, two popular biomedical natural language processing systems. All code required to reproduce our results and all tools developed as part of this study are released as open source, available at http://code.google.com/p/ytex.
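As a generic illustration of knowledge-based word sense disambiguation (not the UMLS-derived similarity measures used in this study), the sketch below selects, among the candidate concepts for an ambiguous term, the one whose vector is most similar to the surrounding context. The vectors and concept identifiers are invented placeholders.

```python
# Generic WSD sketch: pick the candidate concept most similar to the context.
import numpy as np

def disambiguate(context_vec: np.ndarray, candidates: dict[str, np.ndarray]) -> str:
    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    return max(candidates, key=lambda cui: cosine(context_vec, candidates[cui]))

if __name__ == "__main__":
    # "cold": common cold vs. cold temperature (placeholder vectors and IDs)
    candidates = {"C0009443": np.array([0.9, 0.1, 0.2]),
                  "C0009264": np.array([0.1, 0.8, 0.3])}
    context = np.array([0.8, 0.2, 0.1])   # context resembling an illness mention
    print(disambiguate(context, candidates))   # -> C0009443
```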

16.

Objective

As clinical text mining continues to mature, its potential as an enabling technology for innovations in patient care and clinical research is becoming a reality. A critical part of that process is rigorous benchmark testing of natural language processing methods on realistic clinical narrative. In this paper, the authors describe the design and performance of three state-of-the-art text-mining applications from the National Research Council of Canada on evaluations within the 2010 i2b2 challenge.

Design

The three systems perform three key steps in clinical information extraction: (1) extraction of medical problems, tests, and treatments, from discharge summaries and progress notes; (2) classification of assertions made on the medical problems; (3) classification of relations between medical concepts. Machine learning systems performed these tasks using large-dimensional bags of features, as derived from both the text itself and from external sources: UMLS, cTAKES, and Medline.

Measurements

Performance was measured per subtask, using micro-averaged F-scores, as calculated by comparing system annotations with ground-truth annotations on a test set.

Results

The systems ranked high among all submitted systems in the competition, with the following F-scores: concept extraction 0.8523 (ranked first); assertion detection 0.9362 (ranked first); relationship detection 0.7313 (ranked second).

Conclusion

For all tasks, we found that the introduction of a wide range of features was crucial to success. Importantly, our choice of machine learning algorithms allowed us to be versatile in our feature design, and to introduce a large number of features without overfitting and without encountering computing-resource bottlenecks.

17.

Objective

This study explores active learning algorithms as a way to reduce the requirements for large training sets in medical text classification tasks.

Design

Three existing active learning algorithms (distance-based (DIST), diversity-based (DIV), and a combination of both (CMB)) were used to classify text from five datasets. The performance of these algorithms was compared to that of passive learning on the five datasets. We then conducted a novel investigation of the interaction between dataset characteristics and the performance results.

Measurements

Classification accuracy and area under receiver operating characteristics (ROC) curves for each algorithm at different sample sizes were generated. The performance of active learning algorithms was compared with that of passive learning using a weighted mean of paired differences. To determine why the performance varies on different datasets, we measured the diversity and uncertainty of each dataset using relative entropy and correlated the results with the performance differences.

Results

The DIST and CMB algorithms performed better than passive learning. With a statistical significance level set at 0.05, DIST outperformed passive learning in all five datasets, while CMB was found to be better than passive learning in four datasets. We found strong correlations between the dataset diversity and the DIV performance, as well as the dataset uncertainty and the performance of the DIST algorithm.

Conclusion

For medical text classification, appropriate active learning algorithms can yield performance comparable to that of passive learning with considerably smaller training sets. In particular, our results suggest that DIV performs better on data with higher diversity and DIST on data with lower uncertainty.
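A rough sketch of distance-based sample selection, the idea behind the DIST strategy, is shown below: the unlabeled documents closest to the current decision boundary are queried for annotation. The data are synthetic and the paper's exact algorithm may differ.

```python
# Sketch of distance-based active learning: query the pool documents with the
# smallest absolute decision-function value (closest to the boundary).
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
X_labeled = rng.normal(size=(40, 5))
y_labeled = (X_labeled[:, 0] + 0.3 * rng.normal(size=40) > 0).astype(int)
X_pool = rng.normal(size=(200, 5))          # unlabeled pool (synthetic)

clf = LinearSVC().fit(X_labeled, y_labeled)
distances = np.abs(clf.decision_function(X_pool))
query_idx = np.argsort(distances)[:10]      # 10 most boundary-adjacent documents
print("Indices to send for annotation:", query_idx)
```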

18.
Objective: To propose a deep-learning-based method for automatically recognizing and localizing diatom targets against complex backgrounds in forensic autopsy. Methods: The method consists of two modules, a coarse localization module and a precise localization module. In the coarse localization module, the convolutional and pooling layers of ZFNet extract high-level diatom features, and a Region Proposal Network (RPN) then generates candidate regions that may contain diatoms and produces an initial localization of the diatom targets. In the precise localization module, Fast R-CNN refines the diatom position information and identifies the diatom categories. Results: The traditional machine learning method and the proposed method were validated on a self-built image library containing simple, partially complex, and complex backgrounds. The traditional recognition method achieved a recognition rate of about 60% on diatom images with partial background interference and could not recognize or localize diatom images with complex background interference, whereas the proposed method effectively recognized and localized multiple targets in diatom images with complex backgrounds, with an average recognition rate of 85%. Conclusion: The proposed method can be applied in forensic autopsy to recognize and localize targets in diatom images with complex background interference.
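The two-stage pipeline described above (an RPN for candidate regions followed by a Fast R-CNN head for refinement and classification) can be sketched with torchvision's Faster R-CNN implementation. The ResNet-50 backbone and two-class label set below are stand-ins for illustration; the paper uses a ZFNet backbone trained on its own diatom image library.

```python
# Sketch of a two-stage detector (RPN + Fast R-CNN head) configured for a
# small label set, using torchvision's Faster R-CNN implementation.
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

num_classes = 2  # background + diatom (placeholder label set)
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights=None)
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)

model.eval()
with torch.no_grad():
    dummy_image = [torch.rand(3, 512, 512)]          # one RGB image tensor
    detections = model(dummy_image)[0]               # dict with boxes/labels/scores
print(detections["boxes"].shape, detections["scores"].shape)
```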

19.
Brain region-of-interest (ROI) segmentation is an important prerequisite step for many computer-aided brain disease analyses. However, the human brain has a complicated anatomical structure, and brain MR images often suffer from low intensity contrast around ROI boundaries, large inter-subject variance, and large intra-subject variance. To address these issues, many multi-atlas based segmentation methods have been proposed for brain ROI segmentation over the last decade. In this paper, multi-atlas based methods for brain MR image segmentation are reviewed with respect to the registration toolboxes widely used in multi-atlas methods, conventional methods for label fusion, the datasets that have been used for evaluating multi-atlas methods, and the applications of multi-atlas based segmentation in clinical research. We propose that incorporating anatomical priors into end-to-end deep learning architectures for brain ROI segmentation is an important direction for future work.
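The simplest label-fusion step in multi-atlas segmentation, a voxel-wise majority vote over atlas label maps that have already been registered to the target image, can be sketched as follows. The random label maps below stand in for warped atlas segmentations.

```python
# Sketch of majority-vote label fusion over warped atlas label maps.
import numpy as np

def majority_vote(warped_labels: np.ndarray, num_labels: int) -> np.ndarray:
    """warped_labels: (n_atlases, ...) integer label maps; returns the fused map."""
    votes = np.stack([(warped_labels == lab).sum(axis=0) for lab in range(num_labels)])
    return votes.argmax(axis=0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    warped_atlas_labels = rng.integers(0, 4, size=(5, 32, 32, 32))   # 5 atlases, 4 ROI labels
    fused = majority_vote(warped_atlas_labels, num_labels=4)
    print(fused.shape)        # (32, 32, 32) fused segmentation
```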

20.