Skip to main content
Log in

Parallel Computation of High-Dimensional Robust Correlation and Covariance Matrices

  • Published:
Algorithmica Aims and scope Submit manuscript

Abstract

The computation of covariance and correlation matrices are critical to many data mining applications and processes. Unfortunately the classical covariance and correlation matrices are very sensitive to outliers. Robust methods, such as Quadrant Correlation (QC) and the Maronna method, have been proposed. However, existing algorithms for QC only give acceptable performance when the dimensionality of the matrix is in the hundreds; and the Maronna method is rarely used in practice because of its high computational cost. In this paper we develop parallel algorithms for both QC and the Maronna method. We evaluate these parallel algorithms using a real data set of the gene expression of over 6000 genes, giving rise to a matrix of over 18 million entries. In our experimental evaluation, we explore scalability in dimensionality and in the number of processors, and the trade-offs between accuracy and computational efficiency. We also compare the parallel behaviours of the two methods. From a statistical standpoint, the Maronna method is more robust than QC. From a computational standpoint, while QC requires less computation, interestingly the Maronna method is much more parallelizable than QC. After a thorough experimentation, we conclude that for many data mining applications, both QC and Maronna are viable options. Less robust, but faster, QC is the recommended choice for small parallel platforms. On the other hand, the Maronna method is the recommended choice when a high degree of robustness is required, or when the parallel platform features a large number of processors (e.g., 32).

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

Author information

Authors and Affiliations

Authors

Corresponding authors

Correspondence to James Chilson, Raymond Ng, Alan Wagner or Ruben Zamar.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Chilson, J., Ng, R., Wagner, A. et al. Parallel Computation of High-Dimensional Robust Correlation and Covariance Matrices. Algorithmica 45, 403–431 (2006). https://doi.org/10.1007/s00453-006-1219-9

Download citation

  • Received:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00453-006-1219-9

Keywords

Navigation