research-article

System Resilience through Health Monitoring and Reconfiguration

Published: 14 January 2024

Abstract

We demonstrate an end-to-end framework for improving the resilience of man-made systems to unforeseen events. The framework combines a physics-based digital twin model with three modules that perform real-time fault diagnosis, prognostics, and reconfiguration. The fault diagnosis module uses model-based diagnosis algorithms to detect and isolate faults, and it generates interventions in the system to disambiguate uncertain diagnosis solutions. We scale the fault diagnosis algorithm to the required real-time performance through parallelization and surrogate models of the physics-based digital twin. The prognostics module tracks fault progression and trains degradation models online to compute the remaining useful life of system components. In addition, we use the degradation models to assess the impact of fault progression on the operational requirements. The reconfiguration module uses PDDL-based planning with semantic attachments to adjust the system controls so that the impact of the fault on system operation is minimized. We define a resilience metric and use a fuel system example to demonstrate how the metric improves with our framework.
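The abstract does not state which form the degradation models take, so the following is only an illustrative sketch of the general idea behind computing remaining useful life from a degradation model, not the authors' method. It assumes a scalar health indicator that degrades roughly linearly over time and a fixed failure threshold; the function names, the linear model, and the threshold are all hypothetical.

```python
from typing import Optional, Sequence, Tuple


def fit_linear_degradation(
    times: Sequence[float], health: Sequence[float]
) -> Tuple[float, float]:
    """Least-squares fit of health ~ intercept + slope * time.

    Assumption: a linear trend is adequate; the paper may use other models.
    """
    n = len(times)
    if n < 2 or n != len(health):
        raise ValueError("need at least two paired samples")
    mean_t = sum(times) / n
    mean_h = sum(health) / n
    var_t = sum((t - mean_t) ** 2 for t in times)
    if var_t == 0.0:
        raise ValueError("time samples must not all be identical")
    cov = sum((t - mean_t) * (h - mean_h) for t, h in zip(times, health))
    slope = cov / var_t
    intercept = mean_h - slope * mean_t
    return slope, intercept


def remaining_useful_life(
    times: Sequence[float],
    health: Sequence[float],
    failure_threshold: float,
) -> Optional[float]:
    """Extrapolate the fitted trend to the (hypothetical) failure threshold.

    Returns the time left after the last sample, or None when the fitted
    trend is not decreasing and so never reaches the threshold.
    """
    slope, intercept = fit_linear_degradation(times, health)
    if slope >= 0.0:
        return None
    t_fail = (failure_threshold - intercept) / slope
    return max(0.0, t_fail - times[-1])


if __name__ == "__main__":
    # Hypothetical health readings for one component.
    t = [0.0, 1.0, 2.0, 3.0]
    h = [1.0, 0.9, 0.8, 0.7]
    print(remaining_useful_life(t, h, failure_threshold=0.5))
```

In an online setting, the fit would be repeated as new samples arrive, so the estimate tracks the fault progression rather than relying on a single offline fit.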


          • Published in

            ACM Transactions on Cyber-Physical Systems, Volume 8, Issue 1
            January 2024
            225 pages
            ISSN: 2378-962X
            EISSN: 2378-9638
            DOI: 10.1145/3613531
            • Editor: Chenyang Lu

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Publication History

            • Published: 14 January 2024
            • Online AM: 3 November 2023
            • Accepted: 20 October 2023
            • Revised: 6 September 2023
            • Received: 8 August 2022