# Yibei Li: Inverse and Forward Approaches for Optimal Control and Estimation in Agent-Based Systems

## In this pre-defense seminar, Yibei will present selected parts of her upcoming thesis.

Date of defence: Thursday, 2 June 2022 14:00

Link to the thesis: http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-311742

**Time:**
Wed 2022-05-25 11.00 - 12.00

**Location:**
3721

**Language:**
English

**Lecturer:**
Yibei Li

**Abstract**

This dissertation is concerned with three topics within the field of optimal control and estimation in dynamical agent-based systems, with potential applications that meet both engineering and societal needs. Firstly, the inverse optimal control problem is studied. Given a dynamical system, the goal is to recover the underlying cost function from observations of the optimal state trajectories. Recovering such cost functions not only helps us develop a better understanding of natural and societal phenomena, but also provides a criterion for designing optimal controllers in similar contexts. Secondly, we study the synthesis of collective emergence in multi-agent systems. The problem is cast in a game-theoretic framework that models the strategic interactions among self-oriented agents. In this thesis the specific topic of intrinsic formation control is addressed, in which designing individual cost functions that realize the desired optimal emergence is a critical issue. Finally, distributed coordination is considered for societal systems, more specifically in mathematical finance. The credit scoring problem is studied by incorporating dynamical networked information.

Specifically, in Paper A and Paper C, the finite-horizon inverse optimal control problem is studied for continuous-time systems, with full or partial state observations. Although the infinite-horizon inverse linear quadratic problem is well studied with numerous results, the finite-horizon case is still an open problem. To the best of our knowledge, our result is the first complete result on necessary and sufficient conditions for the solvability of such an inverse problem. The uniqueness of solutions is studied and the equivalence class of cost functions is derived. In addition, based on system invertibility, a well-posed inverse problem is formulated even for the case in which the optimal synthesis can only be partially observed. For suboptimal observations, residual optimization problems are solved to obtain a best-fit approximate cost function.
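To make the idea of an equivalence class of cost functions concrete, the following is a minimal sketch of the simplest source of non-uniqueness in a finite-horizon linear quadratic problem. It is not taken from the thesis. It assumes a scalar system dx/dt = a x + b u with cost ∫ (q x² + r u²) dt + f x(T)², and the function name `riccati_gains` and all numerical values are illustrative choices. Scaling (q, r, f) by a common positive factor leaves the optimal feedback gain unchanged, so observed optimal trajectories cannot distinguish the two costs.

```python
def riccati_gains(a, b, q, r, f, horizon=1.0, steps=1000):
    """Return the feedback gains k(t) = b * p(t) / r on a uniform time grid.

    p(t) solves the scalar Riccati ODE
        -dp/dt = 2*a*p - (b**2 / r) * p**2 + q,   p(T) = f,
    integrated backward from t = T with explicit Euler steps.
    The optimal control is u(t) = -k(t) * x(t).
    """
    dt = horizon / steps
    p = f
    gains = [b * p / r]
    for _ in range(steps):
        # One Euler step backward in time.
        p = p + dt * (2 * a * p - (b ** 2 / r) * p ** 2 + q)
        gains.append(b * p / r)
    gains.reverse()  # index 0 now corresponds to t = 0
    return gains


# Hypothetical parameters, chosen only for illustration.
base = riccati_gains(a=-0.5, b=1.0, q=2.0, r=1.0, f=0.5)
scaled = riccati_gains(a=-0.5, b=1.0, q=6.0, r=3.0, f=1.5)   # (q, r, f) times 3
other = riccati_gains(a=-0.5, b=1.0, q=4.0, r=1.0, f=0.5)    # only q changed

gap_scaled = max(abs(k1 - k2) for k1, k2 in zip(base, scaled))
gap_other = max(abs(k1 - k2) for k1, k2 in zip(base, other))
print("max gain gap, scaled cost:", gap_scaled)
print("max gain gap, different cost:", gap_other)
```

The scaled cost reproduces the same gains up to floating-point error, while changing only q produces a visibly different controller. The equivalence class derived in the thesis may be larger than this scaling family; the sketch only shows why uniqueness has to be understood up to an equivalence.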

Paper B further studies the inverse optimal control problem in a stochastic set-up, where partial state observations of a discrete-time system are available under measurement noise. Firstly, by formulating the problem as a system identification task with the exact initial states as model excitations, its identifiability is justified under the relative degree assumption and statistical consistency is shown for the empirical estimation. Furthermore, for more practical scenarios in which the initial states are also imperfectly known, the problem is cast in the framework of maximum likelihood estimation and is solved by the Expectation-Maximization algorithm under Gaussian assumptions.

In Paper D, the intrinsic formation control problem of a multi-agent system is formulated as both finite- and infinite-horizon noncooperative differential games. The manifold of all equivalent configurations of the desired formation is studied by considering all orientations and agent permutations, and its convergence and stability are analyzed in both cases. The main novelty of our work is that the desired relative pattern is not predefined in the game. Instead, it is achieved intrinsically, only via different choices of the communication topology of the multi-agent system and without using formation errors in the controller, which can be hard to obtain in practice. Patterns of regular polyhedra and antipodal formations are achieved by Nash equilibria while inter-agent collisions are naturally avoided.

Paper E concerns the network-based credit scoring problem, and the advantages of incorporating network information are studied in two scenarios. Firstly, when the score publishing is merely individual-dependent, an optimal Bayesian filter is designed for risk prediction, which serves as a reference for the lender in future financial decisions. Secondly, a recursive Bayes estimator is proposed to further improve the accuracy of score publishing by also incorporating the dynamical network topology. It is shown that under the proposed evolution framework, the designed biased estimator has a higher precision than any efficient estimator, and the mean square errors are strictly smaller than the Cramér-Rao lower bound for clients within a certain range of scores.
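The claim that a biased estimator can have a mean square error below the Cramér-Rao lower bound on a certain range of the parameter can be illustrated with a textbook example. The sketch below is not the estimator from Paper E. It assumes n independent Gaussian samples with unknown mean `theta` and known standard deviation `sigma`, and compares the shrinkage estimator c times the sample mean (with 0 < c < 1) against the bound sigma²/n that applies to unbiased estimators. All function names and numbers are illustrative.

```python
def mse_shrinkage(theta, c, sigma, n):
    """Exact MSE of the estimator c * sample_mean: variance plus squared bias."""
    variance = c ** 2 * sigma ** 2 / n
    bias_squared = (1 - c) ** 2 * theta ** 2
    return variance + bias_squared


def crlb_unbiased(sigma, n):
    """Cramer-Rao lower bound for unbiased estimators of a Gaussian mean."""
    return sigma ** 2 / n


def threshold(c, sigma, n):
    """The shrinkage estimator beats the bound exactly when |theta| is below this."""
    return ((1 + c) / (1 - c) * sigma ** 2 / n) ** 0.5


# Hypothetical values, chosen only for illustration.
sigma, n, c = 1.0, 10, 0.8
bound = crlb_unbiased(sigma, n)
limit = threshold(c, sigma, n)

for theta in (0.0, 0.5, 2.0):
    mse = mse_shrinkage(theta, c, sigma, n)
    print(f"theta={theta}: MSE={mse:.4f}, bound={bound:.4f}, below bound: {mse < bound}")
```

The trade-off is visible in the two terms of the MSE: shrinking reduces variance everywhere but adds a bias that grows with the magnitude of `theta`, so the improvement over the bound holds only on a bounded range of the parameter.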