Systematic Review of Approaches to Preserve Machine Learning Performance in the Presence of Temporal Dataset Shift in Clinical Medicine

Authors:	Lin Lawrence Guo, Stephen R Pfohl, Jason Fries, Jose Posada, Scott Lanyon Fleming, Catherine Aftandilian, Nigam Shah, Lillian Sung

Affiliation:	1. Program in Child Health Evaluative Sciences, The Hospital for Sick Children, Toronto, Canada; 2. Biomedical Informatics Research, Stanford University, Palo Alto, California, United States; 3. Division of Haematology/Oncology, The Hospital for Sick Children, Toronto, Canada; 4. Division of Pediatric Hematology/Oncology, Stanford University, Palo Alto, United States

Abstract:	Objective: The change in performance of machine learning models over time as a result of temporal dataset shift is a barrier to machine learning-derived models facilitating decision-making in clinical practice. Our aim was to describe technical procedures used to preserve the performance of machine learning models in the presence of temporal dataset shift. Methods: Studies were included if they were fully published articles that used machine learning and implemented a procedure to mitigate the effects of temporal dataset shift in a clinical setting. We described how dataset shift was measured, the procedures used to preserve model performance, and their effects. Results: Of 4,457 potentially relevant publications identified, 15 were included. The impact of temporal dataset shift was primarily quantified using changes, usually deterioration, in calibration or discrimination. Calibration deterioration was more common (n = 11) than discrimination deterioration (n = 3). Mitigation strategies were categorized as model level or feature level. Model-level approaches (n = 15) were more common than feature-level approaches (n = 2), with the most common approaches being model refitting (n = 12), probability calibration (n = 7), model updating (n = 6), and model selection (n = 6). In general, all mitigation strategies were successful at preserving calibration but not uniformly successful in preserving discrimination. Conclusion: There was limited research in preserving the performance of machine learning models in the presence of temporal dataset shift in clinical medicine. Future research could focus on the impact of dataset shift on clinical decision-making, benchmark the mitigation strategies on a wider range of datasets and tasks, and identify optimal strategies for specific settings.

Keywords:	dataset shift; machine learning; clinical data; systematic review
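
Illustrative sketch (not taken from the reviewed studies): the abstract refers to calibration, discrimination, and probability calibration as a mitigation strategy. The short Python example below shows one simple form of recalibration, an intercept shift on the logit scale, applied to a hypothetical later-period cohort in which an older model over-predicts risk. The cohort data and all function names (auroc, calibration_gap, fit_intercept_shift, recalibrate) are assumptions made for illustration. For brevity the shift is fitted and evaluated on the same ten records; in practice it would be fitted on a separate recent window.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def auroc(y_true, y_prob):
    """Discrimination: chance that a random positive is ranked above a random negative."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def calibration_gap(y_true, y_prob):
    """Calibration-in-the-large: mean predicted risk minus observed event rate."""
    return sum(y_prob) / len(y_prob) - sum(y_true) / len(y_true)


def fit_intercept_shift(y_true, y_prob, iters=60):
    """Bisection for the logit-scale shift that makes mean predicted risk
    equal the observed event rate."""
    target = sum(y_true) / len(y_true)
    lo, hi = -10.0, 10.0
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        mean_pred = sum(sigmoid(logit(p) + mid) for p in y_prob) / len(y_prob)
        if mean_pred < target:
            lo = mid  # predictions still too low: shift upward
        else:
            hi = mid  # predictions too high: shift downward
    return (lo + hi) / 2.0


def recalibrate(y_prob, delta):
    """Apply the fitted shift; the transform is monotone, so rankings are unchanged."""
    return [sigmoid(logit(p) + delta) for p in y_prob]


# Hypothetical later-period cohort (made-up values): the event rate has fallen,
# so the older model's predicted risks are too high on average.
y_recent = [0, 0, 0, 0, 0, 0, 1, 0, 1, 0]
p_recent = [0.10, 0.20, 0.25, 0.30, 0.35, 0.40, 0.55, 0.60, 0.70, 0.80]

delta = fit_intercept_shift(y_recent, p_recent)
p_fixed = recalibrate(p_recent, delta)

print("gap before:", round(calibration_gap(y_recent, p_recent), 3))
print("gap after: ", round(calibration_gap(y_recent, p_fixed), 3))
print("AUROC before/after:", auroc(y_recent, p_recent), auroc(y_recent, p_fixed))
```

Because the intercept shift is a monotone transform of the predicted risks, it can close the calibration gap but cannot change the ranking of patients, so AUROC is identical before and after. Improving discrimination would need a strategy that changes the model itself, such as refitting or updating.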
