Mission Analysis
2 Oct 2023

Comparing Reinforcement Learning with Imitation Learning for Guidance and Control in Space

Project overview

Backward propagation of optimal samples (rotating frame) used to train a G&CNET with imitation learning [1]

Reinforcement learning (RL) and imitation learning (IL) are two fundamentally different machine learning paradigms which can be used to train Guidance & Control Networks (G&CNETs) [1] to control a system.
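To make the distinction concrete, imitation learning reduces spacecraft control to supervised regression: given (state, optimal control) pairs sampled from precomputed optimal trajectories, the network is fit to reproduce the expert's control. The sketch below is purely illustrative and not from any of the cited papers: the "expert" is a hypothetical linear feedback law, and the policy is linear rather than a deep G&CNET.

```python
import numpy as np

# Illustrative imitation-learning sketch: fit a policy to expert
# (state, optimal control) pairs by minimizing mean-squared error.
rng = np.random.default_rng(0)

# Hypothetical "expert": the optimal control is a fixed linear feedback law.
K_true = np.array([[1.0, -0.5],
                   [0.3,  2.0]])
states = rng.normal(size=(500, 2))    # states sampled along optimal trajectories
controls = states @ K_true.T          # corresponding expert (optimal) controls

K = np.zeros((2, 2))                  # policy parameters to learn
lr = 0.1
for _ in range(200):
    pred = states @ K.T
    grad = (pred - controls).T @ states / len(states)  # gradient of MSE loss
    K -= lr * grad                                     # gradient descent step

print(np.allclose(K, K_true, atol=1e-3))  # → True: policy recovers the expert law
```

RL, by contrast, would dispense with the expert dataset entirely and update the policy from a reward signal collected by interacting with a simulator of the dynamics.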

A recent success by the University of Zurich suggests that RL is better suited than imitation learning to guide and control drones [2]. The group used a small 2x128 MLP trained with RL to map the current state directly to the control action, and managed to beat three professional human pilots in several races through a series of gates.
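A "2x128 MLP" here means a network with two hidden layers of 128 units each, mapping the state vector to a control command. A minimal forward-pass sketch follows; the layer sizes match the text, while the tanh activations and the 12-dimensional state / 4-dimensional control are assumptions made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def mlp_policy(state, params):
    """Map a state vector to a control command through the MLP."""
    h = state
    for W, b in params[:-1]:
        h = np.tanh(W @ h + b)   # hidden layers (activation is an assumption)
    W, b = params[-1]
    return W @ h + b             # linear output layer: control command

# state -> 128 -> 128 -> control (dimensions 12 and 4 are illustrative)
sizes = [12, 128, 128, 4]
params = [(rng.normal(scale=0.1, size=(m, n)), np.zeros(m))
          for n, m in zip(sizes[:-1], sizes[1:])]

u = mlp_policy(rng.normal(size=12), params)
print(u.shape)  # → (4,)
```

Such a network is tiny by deep-learning standards, which is part of its appeal for onboard guidance: inference is a handful of matrix-vector products.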

Space, in contrast to drones, constitutes a far more deterministic environment. The absence of contact dynamics and aerodynamic effects, as well as very precise system identification for actuators used in space, allows us to create high-fidelity models of the dynamics. Since such models cannot be obtained for drones, this raises the question of whether RL is also the best-suited learning paradigm for G&CNETs in space.

Both RL and IL have been successfully applied to train G&CNETs for space applications [1,3]. However, the ideal learning paradigm is likely dependent on the specific task. In this project, we focus on spacecraft guidance and control tasks, such as landing on an asteroid or interplanetary transfers. Our aim is to determine, considering factors such as the complexity of the optimal control problem and sensor noise levels, which learning paradigm achieves mission-compatible accuracy in the most time-efficient manner.

References

  1. D. Izzo, S. Origer, 2022. "Neural representation of a time optimal, constant acceleration rendezvous." Acta Astronautica, https://www.sciencedirect.com/science/article/pii/S0094576522004581
  2. E. Kaufmann, L. Bauersfeld, A. Loquercio, M. Müller, V. Koltun, D. Scaramuzza, 2023. "Champion-level drone racing using deep reinforcement learning." Nature, https://www.nature.com/articles/s41586-023-06419-4
  3. A. Zavoli, L. Federici, 2020. "Reinforcement Learning for Low-Thrust Trajectory Design of Interplanetary Missions." https://arxiv.org/abs/2008.08501
Advanced Concepts Team