Deep deterministic policy gradient with symmetric data augmentation for lateral attitude tracking control of a fixed-wing aircraft
Yifei Li, Erik-Jan van Kampen
arXiv:2407.11077·cs.LG·Published 2024-07-13·Updated 2026-04-13
The symmetry of dynamical systems can be exploited for state-transition prediction and to facilitate control policy optimization. This paper leverages system symmetry to develop sample-efficient offline reinforcement learning (RL) approaches. Under the symmetry assumption for a Markov Decision Process (MDP), a symmetric data augmentation method is proposed. The augmented samples are integrated into the dataset of Deep Deterministic Policy Gradient (DDPG) to enhance its coverage rate of the state-action space. Furthermore, sample utilization efficiency is improved by introducing a second critic trained on the augmented samples, resulting in a dual-critic structure. The aircraft's model is verified to be symmetric, and flight control simulations demonstrate accelerated policy convergence when augmented samples are employed.
TopicsDynamical Systems & PDE Learning
Tagsdynamical-systems
arXiv categoriescs.LG, cs.AI
arXiv abstract pagePDF