Safety Verification of Neural Network Controllers
Project Overview
Although effective, deep neural networks (DNNs) are widely regarded as black boxes within the scientific community [1]. As a result, Explainable Artificial Intelligence has become an established field of research in recent years, aiming to explain and interpret the predictions of complex machine learning models [2,3].
In astrodynamics, the standard methodology for testing both the performance and behaviour of trained regression DNNs is Monte Carlo simulation, which has serious limitations. For problems with a high-dimensional input space, extensive simulations are required to cover the wide array of possible inputs, and even large campaigns tend to miss edge cases. More crucially, because Monte Carlo simulation relies on random sampling of a finite set of points, only a subset of the state space is ever tested: however extensive the campaign, inputs lying between the sampled points are never assessed. This is of critical importance given the recent discovery that DNNs are vulnerable to adversarial examples, particular inputs that cause the network to fail in unexpected ways, a vulnerability of major concern when applying DNNs in safety-critical environments [4].
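To make the limitation concrete, the sketch below shows the kind of Monte Carlo campaign described above. The controller, dynamics, and miss-distance metric are hypothetical stand-ins, not part of this project's code: the point is that only the randomly sampled initial states are ever evaluated.

```python
# Minimal sketch of Monte Carlo testing of a DNN controller.  The functions
# `controller` and `propagate` are illustrative placeholders, not this
# project's implementation.
import numpy as np

rng = np.random.default_rng(seed=0)


def controller(state: np.ndarray) -> np.ndarray:
    """Stand-in for a trained DNN controller: maps state -> control command."""
    return -0.01 * state[:3]  # toy feedback law


def propagate(state: np.ndarray, steps: int = 100, dt: float = 1.0) -> float:
    """Toy closed-loop propagation returning the distance of closest approach."""
    pos, vel = state[:3].copy(), state[3:].copy()
    closest = np.linalg.norm(pos)
    for _ in range(steps):
        vel += controller(np.concatenate([pos, vel])) * dt
        pos += vel * dt
        closest = min(closest, np.linalg.norm(pos))
    return closest


# Random sampling probes only a finite set of initial states: inputs lying
# between the samples (where adversarial behaviour may hide) are never tested.
initial_states = rng.uniform(-1.0, 1.0, size=(10_000, 6))
closest_approaches = np.array([propagate(s) for s in initial_states])
print(f"worst sampled closest approach: {closest_approaches.min():.4f}")
```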
In this project, we utilise sets of Taylor polynomials, generated via Automatic Domain Splitting, to expand the trajectories obtained from embedding DNN controllers in the ordinary differential equations of the dynamical system. These polynomials provide a continuous description of the system's evolution, in contrast to the discrete samples obtained from Monte Carlo simulations. Using Differential Algebraic techniques, we can also produce a Taylor polynomial map to an event of interest, for instance the distance of closest approach. We then apply interval arithmetic to rigorously enclose the resulting polynomials, combining this enclosure with an over-estimate of the truncation remainder. This allows the event to be bounded rigorously, yielding a validated safety certificate for a given DNN controller.
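The sketch below illustrates the bounding step under simplifying assumptions: it takes a univariate polynomial expansion of the event (e.g. the distance of closest approach) together with an interval remainder, and encloses it over the initial-condition domain with interval arithmetic, splitting the domain when the bound is too wide. The coefficients, remainder, and splitting rule are illustrative placeholders, not outputs of the project's differential algebra pipeline.

```python
# Minimal sketch of rigorously bounding an event map p(x) + remainder with
# interval arithmetic and simple domain splitting.  All numerical values are
# placeholders for illustration only.
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float

    def __add__(self, other: "Interval") -> "Interval":
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __mul__(self, other: "Interval") -> "Interval":
        products = (self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi)
        return Interval(min(products), max(products))

    @property
    def width(self) -> float:
        return self.hi - self.lo


def poly_bound(coeffs: list[float], x: Interval) -> Interval:
    """Interval evaluation of a univariate polynomial via Horner's scheme."""
    acc = Interval(coeffs[-1], coeffs[-1])
    for c in reversed(coeffs[:-1]):
        acc = acc * x + Interval(c, c)
    return acc


def bound_event(coeffs: list[float], remainder: Interval, x: Interval,
                tol: float = 1e-3) -> Interval:
    """Enclose p(x) + remainder over x, splitting the domain while the
    interval bound is too wide (a simple form of domain splitting)."""
    bound = poly_bound(coeffs, x) + remainder
    if bound.width <= tol or x.width <= 1e-6:
        return bound
    mid = 0.5 * (x.lo + x.hi)
    left = bound_event(coeffs, remainder, Interval(x.lo, mid), tol)
    right = bound_event(coeffs, remainder, Interval(mid, x.hi), tol)
    return Interval(min(left.lo, right.lo), max(left.hi, right.hi))


# Certify the event over the whole domain, not just at sampled points.
coeffs = [2.0, -0.5, 0.1]          # placeholder Taylor coefficients
remainder = Interval(-1e-3, 1e-3)  # placeholder remainder over-estimate
domain = Interval(-1.0, 1.0)
enclosure = bound_event(coeffs, remainder, domain)
print(f"event enclosed in [{enclosure.lo:.4f}, {enclosure.hi:.4f}]")
```

If the lower bound of the resulting enclosure stays above the safety threshold, every initial condition in the domain is covered, which is the sense in which the certificate is validated rather than sampled.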
References
[1] Ribeiro, Marco, Singh, Sameer, and Guestrin, Carlos, “‘Why Should I Trust You?’: Explaining the Predictions of Any Classifier”, Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations, Association for Computational Linguistics, 2016, pp. 97–101, doi:10.18653/v1/N16-3020.
[2] Xu, Feiyu, Uszkoreit, Hans, Du, Yangzhou, et al., “Explainable AI: A Brief Survey on History, Research Areas, Approaches and Challenges”, Natural Language Processing and Chinese Computing, Springer International Publishing, 2019, pp. 563–574, doi:10.1007/978-3-030-32236-6_51.
[3] Holzinger, Andreas, Saranti, Anna, Molnar, Christoph, et al., “Explainable AI Methods - A Brief Overview”, xxAI - Beyond Explainable AI, Springer International Publishing, 2022, pp. 13–38, doi:10.1007/978-3-031-04083-2_2.
[4] Yuan, Xiaoyong, He, Pan, Zhu, Qile, and Li, Xiaolin, “Adversarial Examples: Attacks and Defenses for Deep Learning”, IEEE Transactions on Neural Networks and Learning Systems 30.9 (2019), pp. 2805–2824, doi:10.1109/TNNLS.2018.2886017.