Optimizing a medical image registration algorithm based on profiling data for real-time performance

Authors:	Gulo Carlos A S J; Sementille Antonio C; Tavares João Manuel R S

Affiliation:	1. CNPq National Scientific and Technological Development Council, Research Group PIXEL - UNEMAT, Brasilia, Brazil; 2. Programa Doutoral em Engenharia Informática, Instituto de Ciência e Inovação em Engenharia Mecânica e Engenharia Industrial, Faculdade de Engenharia, Universidade do Porto, Porto, Portugal; 3. Departamento de Ciências da Computação, Faculdade de Ciências, Universidade Estadual Paulista-UNESP, Prudente, Brazil; 4. Instituto de Ciência e Inovação em Engenharia Mecânica e Engenharia Industrial, Departamento de Engenharia Mecânica, Faculdade de Engenharia, Universidade do Porto, Porto, Portugal

Abstract:	Image registration is a common task in medical image analysis, and a significant number of algorithms have been developed to perform rigid and non-rigid image registration. In particular, the free-form deformation algorithm is frequently used for non-rigid registration; however, it is computationally very intensive. In this work, we describe an approach based on profiling data to identify the parts of this algorithm for which parallel implementations can be developed. The proposed approach assesses the efficiency of the algorithm by applying performance analysis techniques commonly available in traditional computer operating systems. Hence, this article provides guidelines to help researchers working on medical image processing and analysis achieve real-time non-rigid image registration using common computing systems. According to our experimental findings, significant speedups can be accomplished by parallelizing sequential snippets, i.e., code regions that are executed more than once. For the costly functions identified in the studied free-form deformation algorithm, the developed parallelization decreased the runtime by up to seven times relative to the corresponding single-threaded implementation. The implementations were developed using the Open Multi-Processing (OpenMP) application programming interface. In conclusion, this study confirms that, based on call graph visualization and detected performance bottlenecks, one can easily find and evaluate snippets that are potential optimization targets, in addition to throughput in memory accesses.
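
As a rough illustration of the kind of change the abstract describes, the sketch below parallelizes a repeatedly executed loop with OpenMP. It is not the authors' code: the function name `ssd`, the choice of a sum-of-squared-differences similarity measure, and the flat-array image layout are all assumptions made for this example, standing in for whatever costly function a profiler would flag.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hot spot (an assumption, not taken from the paper):
   sum of squared differences between two images stored as flat
   arrays of n intensity values. */
double ssd(const float *fixed, const float *moving, size_t n)
{
    assert(fixed != NULL && moving != NULL);

    double sum = 0.0;

    /* Iterations are independent of each other. The reduction clause
       gives every thread a private partial sum and adds the partial
       sums together when the loop finishes. */
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < (long)n; ++i) {
        double d = (double)fixed[i] - (double)moving[i];
        sum += d * d;
    }
    return sum;
}
```

With GCC, building with `-fopenmp` enables the pragma; without that flag the pragma is ignored and the same loop runs on a single thread, which makes it easy to compare the two runtimes for a function like this one.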

Indexed in:	SpringerLink and other databases
