Multi-Threaded Mitigation of Radiation-Induced Soft Errors in Bare-Metal Embedded Systems

Journal of Electronic Testing

Abstract

This article presents a software technique for protecting against radiation-induced faults, based on a multi-threaded strategy. Data triplication is combined with duplication or triplication of the instruction flow to improve system reliability and thus ensure correct system operation. To this end, a relaxed lockstep model is defined that synchronizes the execution of both the redundant threads and the protected variables across different processing units. The technique was evaluated through simulated fault-injection campaigns on a multi-core ARM system. Although the underlying techniques (Duplication With Comparison and Re-Execution, DWC-R, and Triple Modular Redundancy, TMR) imply an evident overhead in memory and instructions, the results show that spreading the replicas across different instruction flows not only produces results similar to those of the classic techniques but also improves computation and recovery time in the presence of soft errors. In addition, the paper highlights the importance of protecting memory-allocated data, since instruction-flow triplication alone is not enough to improve overall system reliability.
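
The data triplication mentioned in the abstract can be illustrated with a minimal C sketch. It is not the authors' implementation and omits the multi-threaded, relaxed-lockstep part entirely; it only shows how a variable held in three copies can be read through a bitwise majority vote and repaired when one copy is corrupted. The names `tmr_u32`, `tmr_write`, `tmr_vote`, and `tmr_read` are hypothetical, chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container: three copies of one protected 32-bit variable. */
typedef struct {
    volatile uint32_t copy[3];
} tmr_u32;

/* Store the same value in all three copies. */
static void tmr_write(tmr_u32 *v, uint32_t value)
{
    assert(v);
    v->copy[0] = value;
    v->copy[1] = value;
    v->copy[2] = value;
}

/* Bitwise majority: each result bit takes the value held by at least two inputs. */
static uint32_t tmr_vote(uint32_t a, uint32_t b, uint32_t c)
{
    return (a & b) | (a & c) | (b & c);
}

/*
 * Read the protected variable through the voter.
 * Returns true if any copy disagreed with the voted value; in that case
 * all three copies are rewritten with the voted value (scrubbing).
 */
static bool tmr_read(tmr_u32 *v, uint32_t *out)
{
    assert(v && out);
    uint32_t a = v->copy[0];
    uint32_t b = v->copy[1];
    uint32_t c = v->copy[2];
    uint32_t voted = tmr_vote(a, b, c);
    bool mismatch = (a != voted) || (b != voted) || (c != voted);

    if (mismatch) {
        tmr_write(v, voted);
    }
    *out = voted;
    return mismatch;
}
```

A caller would write a value with `tmr_write` and later fetch it with `tmr_read`, treating a `true` return as a detected and corrected upset. The vote masks a fault confined to one copy; if two copies are corrupted in the same bit position, the voted value is wrong, which is a limit of this simple scheme rather than a statement about the article's results.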

Acknowledgements

This work was funded by the Spanish Ministry of Economy and Competitiveness and the European Regional Development Fund through the following project: ‘Evaluación temprana de los efectos de radiación mediante simulación y virtualización. Estrategias de mitigación en arquitecturas de microprocesadores avanzados’ (Ref: ESP2015-68245-C4-3-P, MINECO/FEDER, UE).

Author information

Corresponding author

Correspondence to Antonio Martínez-Álvarez.

Additional information

Responsible Editor: L.M.B. Pöhls

Cite this article

Serrano-Cases, A., Restrepo-Calle, F., Cuenca-Asensi, S. et al. Multi-Threaded Mitigation of Radiation-Induced Soft Errors in Bare-Metal Embedded Systems. J Electron Test 36, 47–57 (2020). https://doi.org/10.1007/s10836-019-05846-4

  • DOI: https://doi.org/10.1007/s10836-019-05846-4
