Abstract
We present and evaluate novel parallel implementations of Self-Organizing Maps for the Cell Broadband Engine Architecture. Motivated by the interactive nature of the data-mining process, we evaluate the scalability of the implementations on two clusters using different network characteristics and incarnations (PS3TMconsole and PowerXCell 8i) of the architecture. Our implementations use varying combinations of the Power Processing Elements (PPEs) and Synergistic Processing Elements (SPEs) found in the Cell architecture. For a single processor, our implementation scaled well with the number of SPEs regardless of the incarnation. When combining multiple PS3TMconsoles, the synchronization over the slower network resulted in poor speedups and demonstrated that the use of such a low-cost cluster may be severely restricted, even without the use of SPEs. When using multiple SPEs for the PowerXCell 8i cluster, the speedup grew linearly with increasing number of SPEs for a given number of processors, and linear up to a maximum with the number of processors for a given number of SPEs. Our implementation achieved a worst-case efficiency of 67% for the maximum number of processing elements involved in the computation, but consistently higher values for smaller numbers of processing elements with speedups of up to 70.