Project data
TP3: Active Learning in the Framework of Koopman Theory
Faculty/Institution
Mathematics and Natural Sciences
Funding body
Deutsche Forschungsgemeinschaft
Approved amount, contract amount
€486,463.00
Abstract:
This project is part of the research unit "Active Learning for Systems and Control (ALeSCo) - Data
Informativity, Uncertainty, and Guarantees". We are going to use the well-established extended
dynamic mode decomposition (EDMD) within the Koopman framework to generate data-driven models of
dynamical control systems. Then, we leverage the prediction capacity of the derived surrogate
models to incorporate active learning in alignment with novel data-informativity notions in model
predictive control (MPC). To this end, the assessment of retrieved data is conditioned on the
control objective to properly balance exploration and exploitation.
The underlying idea of the Koopman framework is to lift a complex, highly nonlinear dynamical
system to a space of observable functions in which its dynamics become linear. While the
associated Koopman operator is linear, it is, in
general, infinite-dimensional. Here, EDMD yields a finite-dimensional approximation, which can
also be extended to control systems. To this end, the dynamics are restricted to a subspace spanned
by a finite number of observable functions (projection), before a regression problem is solved to
approximately compute the so-called compression (estimation). While EDMD is widely applied in
practice, even for extremely challenging applications like molecular or fluid dynamics,
mathematically sound finite-data error bounds are much scarcer and typically not uniform in
the state. Recently, we proposed the first uniform bounds using kernel EDMD. The two key steps were
Koopman invariance of the imposed reproducing kernel Hilbert space (RKHS) and the choice of the kernel
as radial basis functions with compact support and an adjustable degree of smoothness. Building
upon promising results on control-Koopman operator regression (cKor) and our prior work, we will
extend these uniform error bounds to control systems and derive, in addition, task-specific bounds,
e.g., proportional error estimates for set-point stabilization. Then, we encode prior knowledge and
incorporate active learning in our predictive-control algorithms. For the latter, we make use of
information-theoretic concepts like data informativity and develop tailored
persistency-of-excitation conditions. Further, we set up a bi-objective optimal control problem to
properly assess the exploration/exploitation trade-off. Then, goal-oriented adaptive refinements,
consistency projectors, and diffusion-based exploration are additionally leveraged to foster
active learning while taking the geometry of the retrieved data into account.
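The projection and estimation steps of EDMD described above can be illustrated with a minimal sketch. It is not the project's method: it covers only plain EDMD for an autonomous system, without control inputs, kernels, or error bounds. The toy dynamics in step, the dictionary in lift, and all parameter values are assumptions chosen so that the span of the observables is exactly Koopman-invariant and the regression recovers the compression exactly.

```python
import numpy as np


def lift(x):
    """Hypothetical dictionary of observables: psi(x) = [x1, x2, x1^2]."""
    x = np.atleast_2d(x)
    return np.column_stack([x[:, 0], x[:, 1], x[:, 0] ** 2])


def step(x, a=0.9, b=0.5, c=1.0):
    """Assumed toy dynamics: x1+ = a*x1, x2+ = b*x2 + c*x1^2 (nonlinear in x)."""
    x = np.atleast_2d(x)
    return np.column_stack([a * x[:, 0], b * x[:, 1] + c * x[:, 0] ** 2])


def edmd(X, Y):
    """Least-squares estimate of K such that lift(Y) is approximately lift(X) @ K."""
    Psi_X, Psi_Y = lift(X), lift(Y)  # projection onto the dictionary
    K, *_ = np.linalg.lstsq(Psi_X, Psi_Y, rcond=None)  # estimation by regression
    return K


rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 2))  # sampled states
Y = step(X)                                # their successors
K = edmd(X, Y)

# One-step prediction: lift, apply the linear map, read off the state coordinates.
x0 = np.array([[0.3, -0.2]])
predicted = (lift(x0) @ K)[:, :2]
exact = step(x0)
```

For this particular toy system the lifted dynamics are exactly linear, so predicted matches exact up to floating-point error. For a general nonlinear system the dictionary is not invariant, and the approximation error is what the finite-data bounds discussed above aim to quantify.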
Throughout the project, we apply the developed algorithms to two representative
benchmark applications, namely robotics and multi-energy systems, to validate the employed
information-theoretic measures like data informativity and the inferred active-learning components.
Overall, we develop a novel active-learning MPC scheme within the Koopman framework.