首页 | 官方网站   微博 | 高级检索  
     


Application kernels: HPC resources performance monitoring and variance analysis
Authors:Nikolay A Simakov  Joseph P White  Robert L DeLeon  Amin Ghadersohi  Thomas R Furlani  Matthew D Jones  Steven M Gallo  Abani K Patra
Abstract:Application kernels are computationally lightweight benchmarks or applications run repeatedly on high performance computing (HPC) clusters in order to track the Quality of Service (QoS) provided to the users. They have been successful in detecting a variety of hardware and software issues, some severe, that have subsequently been corrected, resulting in improved system performance and throughput. In this work, the application kernels performance monitoring module of eXtreme Data Metrics on Demand (XDMoD) is described. Through the XDMoD framework, the application kernels have been run repetitively on the Texas Advanced Computing Center's Stampede and Lonestar4 clusters for a total of over 14,000 jobs. This provides a body of data on the HPC clusters operation that can be used to statistically analyze how the application performance, as measured by metrics such as execution time and communication bandwidth, is affected by the cluster's workload. We discuss metric distributions, carry out regression and correlation analyses, and use a PCA study to describe the variance and relate the variance to factors such as the spatial distribution of the application in the cluster. Ultimately, these types of analyses can be used to improve the application kernel mechanism, which in turn results in improved QoS of the HPC infrastructure that is delivered to the end users. Copyright © 2015 John Wiley & Sons, Ltd.
Keywords:performance monitoring  HPC  application kernels
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司    京ICP备09084417号-23

京公网安备 11010802026262号