HypeRL: Hypernetwork-Based Reinforcement Learning for Control of Parametrized Dynamical Systems
Nicolò Botteghi, Stefania Fresca, Mengwu Guo, Andrea Manzoni
arXiv:2501.04538 · cs.LG · Published 2025-01-08 · Updated 2026-02-10
In this work, we devise a new, general-purpose reinforcement learning strategy for the optimal control of parametric dynamical systems. Such problems frequently arise in applied sciences and engineering and entail significant complexity when control and/or state variables are distributed in high-dimensional space or depend on varying parameters. Traditional numerical methods rely either on iterative minimization algorithms (exploiting, e.g., the solution of the adjoint problem) or on dynamic programming, which involves solving the Hamilton-Jacobi-Bellman (HJB) equation. While reliable, these methods often become computationally infeasible. In this paper, we propose HypeRL, a deep reinforcement learning (DRL) framework that overcomes the limitations of traditional methods. HypeRL aims at approximating the optimal control policy directly. Specifically, we employ an actor-critic DRL approach to learn an optimal feedback control strategy that can generalize across the range of variation of the parameters. To effectively learn such optimal control laws for different instances of the parameters, encoding the parameter information into the DRL policy and value function neural networks (NNs) is essential. HypeRL uses two additional NNs, called hypernetworks, to learn the weights and biases of the value function and the policy NNs. In this way, HypeRL effectively embeds the parametric information into the value function and policy. We validate the proposed approach on two parametric control problems, namely (i) a 1D parametric Kuramoto-Sivashinsky equation with in-domain control, and (ii) a navigation problem of particle dynamics in a parametric 2D gyre flow. We show that the knowledge of physical and task-dependent information and the encoding of this information via a hypernetwork are essential ingredients for learning parameter-dependent control policies.
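To make the hypernetwork idea concrete, the following is a minimal sketch of a policy whose weights and biases are produced by a second network from the system parameters. It is an illustration only, not the authors' architecture: the class name `HyperPolicy`, the layer sizes, the `tanh` activations, and the random initialization are all assumptions, and no training loop is shown. A critic (value function) could be built in the same way with its own hypernetwork.

```python
import math
import random


def init_linear(rng, n_out, n_in, scale=0.5):
    """Random weight matrix (list of rows) and zero bias for an affine layer."""
    W = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    return W, [0.0] * n_out


def linear(layer, x):
    """Apply the affine map W x + b."""
    W, b = layer
    return [sum(w * v for w, v in zip(row, x)) + b_i for row, b_i in zip(W, b)]


class HyperPolicy:
    """Hypothetical policy whose parameters are generated by a hypernetwork.

    The only learnable quantities are the hypernetwork layers h1 and h2;
    the policy network itself has no parameters of its own.
    """

    def __init__(self, state_dim, param_dim, action_dim,
                 hidden=8, hyper_hidden=16, seed=0):
        rng = random.Random(seed)
        self.state_dim, self.hidden, self.action_dim = state_dim, hidden, action_dim
        # Total number of weights and biases in the one-hidden-layer policy.
        self.n_main = hidden * state_dim + hidden + action_dim * hidden + action_dim
        # Hypernetwork: system parameters mu -> flat vector of policy parameters.
        self.h1 = init_linear(rng, hyper_hidden, param_dim)
        self.h2 = init_linear(rng, self.n_main, hyper_hidden)

    def generate(self, mu):
        """Map the system parameters mu to the two layers of the policy network."""
        z = [math.tanh(v) for v in linear(self.h1, mu)]
        flat = linear(self.h2, z)
        pos = 0

        def take(n):
            # Consume the next n entries of the flat parameter vector.
            nonlocal pos
            chunk = flat[pos:pos + n]
            pos += n
            return chunk

        W1 = [take(self.state_dim) for _ in range(self.hidden)]
        b1 = take(self.hidden)
        W2 = [take(self.hidden) for _ in range(self.action_dim)]
        b2 = take(self.action_dim)
        return (W1, b1), (W2, b2)

    def act(self, state, mu):
        """Feedback control: the action depends on the state and on mu."""
        layer1, layer2 = self.generate(mu)
        h = [math.tanh(v) for v in linear(layer1, state)]
        return [math.tanh(v) for v in linear(layer2, h)]


policy = HyperPolicy(state_dim=3, param_dim=2, action_dim=1)
state = [0.1, -0.2, 0.3]
print(policy.act(state, mu=[0.5, 1.0]))
print(policy.act(state, mu=[2.0, -1.0]))
```

In this sketch the same state yields different actions for different parameter values, because the parameters enter through the generated weights rather than as an extra input concatenated to the state.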
Topics: Dynamical Systems & PDE Learning
Tags: dynamical-systems
arXiv categories: cs.LG