Abstract
Continued scaling of transistor dimensions and operating voltages has increased their sensitivity to radiation, making soft errors an important challenge for future chip multiprocessors (CMPs). New techniques are therefore required for detecting errors in logic and memories while meeting the desired failures-in-time (FIT) budget of a CMP.
This paper proposes a low-cost mechanism that dynamically detects particle strikes using acoustic wave detectors. Our results show that the mechanism can protect both the logic and the memory arrays. As a case study, we also show how the technique can be combined with error codes to protect the last-level cache at low cost.
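For readers unfamiliar with the unit, a FIT is one failure per 10^9 device-hours, and the FIT rates of independent components add. The short sketch below only illustrates that arithmetic. The function names and the example numbers are hypothetical and do not come from the paper.

```python
HOURS_PER_FIT_WINDOW = 1e9   # 1 FIT = 1 failure per 10^9 device-hours
HOURS_PER_YEAR = 24 * 365


def fit_to_mttf_years(fit: float) -> float:
    """Mean time to failure, in years, implied by a FIT rate."""
    if fit <= 0:
        raise ValueError("FIT rate must be positive")
    return HOURS_PER_FIT_WINDOW / fit / HOURS_PER_YEAR


def chip_fit(component_fits: list[float]) -> float:
    """Total FIT of a chip, assuming independent components whose rates add."""
    return sum(component_fits)


def meets_budget(component_fits: list[float], budget_fit: float) -> bool:
    """True if the summed FIT rate stays within the given budget."""
    return chip_fit(component_fits) <= budget_fit


if __name__ == "__main__":
    # Hypothetical per-structure soft-error FIT rates (illustrative only).
    example = [400.0, 350.0, 250.0]
    total = chip_fit(example)
    print(f"total FIT: {total:.0f}")
    print(f"MTTF: {fit_to_mttf_years(total):.1f} years")
    print(f"within a 1000-FIT budget: {meets_budget(example, 1000.0)}")
```

Under these made-up numbers the chip sits exactly at a 1000-FIT budget, so any structure left unprotected would push it over.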
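The abstract does not describe how the detectors and the error codes interact. As one way to picture such a combination, the sketch below assumes a single parity bit per cache word plus a hypothetical detector that narrows a strike down to a set of suspect bit positions. This is an illustrative assumption, not the paper's design, and every name in it is invented for the example.

```python
def parity(word: int) -> int:
    """Even-parity bit: 1 if the word has an odd number of set bits."""
    return bin(word).count("1") & 1


def recover(word: int, stored_parity: int, suspect_bits: set[int]) -> tuple[int, str]:
    """Try to repair a word using its parity bit and detector output.

    `suspect_bits` is the (hypothetical) set of bit positions that the
    acoustic detectors localize the strike to. Returns (word, status).
    """
    if parity(word) == stored_parity:
        # Parity alone cannot see an even number of flipped bits.
        return word, "clean"
    if len(suspect_bits) == 1:
        # Parity says one bit is wrong and the detector says which one.
        (bit,) = suspect_bits
        return word ^ (1 << bit), "corrected"
    # Error detected, but the location is ambiguous or unknown.
    return word, "detected"


if __name__ == "__main__":
    original = 0b10110010
    p = parity(original)
    struck = original ^ (1 << 5)      # simulate a single-bit upset
    print(recover(struck, p, {5}))    # location known: word is repaired
    print(recover(struck, p, {4, 5})) # location ambiguous: detection only
```

The point of the sketch is that location information can turn a detect-only code into a correcting one in the single-bit case, which is one reason such a pairing could be cheaper than a full error-correcting code.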
Setting an error detection infrastructure with low cost acoustic wave detectors. In ISCA '12: Proceedings of the 39th Annual International Symposium on Computer Architecture.