A one-class peeling method for multivariate outlier detection with applications in phase I SPC |
| |
Authors: | Waldyn G. Martinez, Maria L. Weese, L. Allison Jones-Farmer |
| |
Affiliation: | Department of Information Systems & Analytics, Miami University, Oxford, Ohio |
| |
Abstract: | In phase I of statistical process control (SPC), control charts are often used as outlier detection methods to assess process stability. Many of these methods require estimation of the covariance matrix, are computationally infeasible, or have not been studied when the dimension of the data is large. We propose the one-class peeling (OCP) method, a flexible framework that combines statistical and machine learning methods to detect multiple outliers in multivariate data. The OCP method can be applied to phase I of SPC, does not require covariance estimation, and is well suited to high-dimensional data sets with a high percentage of outliers. Our empirical evaluation suggests that the OCP method performs well in high dimensions and is computationally more efficient and more robust than existing methodologies. We motivate and illustrate the use of the OCP method in a phase I SPC application on a data set containing Wikipedia search results for National Football League (NFL) players, teams, coaches, and managers. The example data set and the R functions OCP.R and OCPLimit.R, which compute the OCP distances and thresholds, respectively, are available in the supplementary materials. |
| |