Abstract
Per-flow network measurement at Internet backbone links requires the efficient maintenance of large arrays of statistics counters at very high speeds (e.g., 40 Gb/s). The prevailing view is that SRAM is too expensive for implementing large counter arrays, while DRAM is too slow to provide wirespeed updates. This view is the main premise of a number of hybrid SRAM/DRAM architectural proposals [2, 3, 4, 5] that still require substantial amounts of SRAM for large arrays. In this paper, we present a contrarian view: modern commodity DRAM architectures, driven by aggressive performance roadmaps for consumer applications (e.g., video games), have advanced architectural features that can be exploited to make DRAM solutions practical. We describe two schemes that harness the performance of these DRAM offerings by interleaving counter updates across multiple memory banks. These counter schemes are the first to support arbitrary increments and decrements for either integer or floating-point number representations at wirespeed. We believe our preliminary success with DRAM schemes for wirespeed statistics counting opens up broader research opportunities to generalize the proposed ideas to other network measurement functions.
- G. Varghese, C. Estan, "The measurement manifesto," The Second Workshop on Hot Topics in Networks (HotNets-II), November 20-21, 2003, Cambridge, MA.
- D. Shah, S. Iyer, B. Prabhakar, N. McKeown, "Maintaining statistics counters in router line cards," IEEE MICRO, 2002.
- S. Ramabhadran, G. Varghese, "Efficient implementation of a statistics counter architecture," ACM SIGMETRICS, 2003.
- M. Roeder, B. Lin, "Maintaining exact statistics counters with a multi-level counter memory," IEEE GLOBECOM, 2004.
- Q. Zhao, J. Xu, Z. Liu, "Design of a novel statistics counter architecture with optimal space and time efficiency," ACM SIGMETRICS, 2006.
- M. Gschwind, H. P. Hofstee, B. Flachs, M. Hopkins, Y. Watanabe, T. Yamazaki, "Synergistic processing in Cell's multicore architecture," IEEE MICRO, 2006.
- Intel IXP 2855 network processor product brief. Intel Corporation, Copyright 2005.
- S. I. Hong, S. A. McKee, M. H. Salinas, R. H. Klenke, J. H. Aylor, W. A. Wulf, "Access order and effective bandwidth for streams on a direct rambus memory," International Symposium on High-Performance Computer Architecture, pp. 80--89, January 1999.
- W. Lin, S. Reinhardt, D. Burger, "Reducing DRAM latencies with an integrated memory hierarchy design," International Symposium on High-Performance Computer Architecture, January 2001.
- F. A. Ware, C. Hampel, "Improving power and data efficiency with threaded memory modules," International Conference on Computer Design, October 2006.
- F. A. Ware, C. Hampel, "Micro-threaded row and column operations in a DRAM core," Rambus White Paper, March 2005.
- XDR datasheet. Rambus, Inc., Copyright 2002-2003.
- XDR-2 datasheet. Rambus, Inc., Copyright 2004-2005.
- D. Patterson, J. Hennessy, Computer Architecture: A Quantitative Approach, 2nd ed., San Francisco: Morgan Kaufmann Publishers, 1996.
- G. Shrimali, N. McKeown, "Building packet buffers using interleaved memories," 2005 Workshop on High Performance Switching and Routing (HPSR), May 2005.
- P. Indyk, "Stable distributions, pseudorandom generators, embeddings, and data stream computation," IEEE-FOCS, 2000.
- H. Zhao, A. Lall, M. Ogihara, O. Spatscheck, J. Wang, J. Xu, "A data streaming algorithm for estimating entropies of OD flows," ACM Internet Measurement Conference, 2007.
- B. Vocking, "How asymmetry helps load balancing," IEEE-FOCS, pp. 131--140, 1999.
- A. Broder, M. Mitzenmacher, "Using multiple hash functions to improve IP lookups," IEEE INFOCOM, pp. 1454--1463, 2001.
- F. Bonomi, M. Mitzenmacher, R. Panigrahy, S. Singh, G. Varghese, "Beyond Bloom filters: From approximate membership checks to approximate state machines," ACM SIGCOMM, August 2006.
Index Terms
- DRAM is plenty fast for wirespeed statistics counting