Parallel pipelined histogram architectures
Proposed is a unique cell histogram architecture which will process k data items in parallel to compute 2q histogram bins per time step. An array of m/2q cells computes an m-bin histogram with a speed-up factor of k; k≥2 makes it faster than current dual-ported memory implementations. Furthermore, simple mechanisms for conflict-free storing of the histogram bins into an external memory array are discussed.