2018 Soumyajit Paul’s PhD

In the RoboCup competition, a team of robots competes against another team in a specified physical environment. Control of the team may be centralized (e.g. in the RoboCup small-size league) or distributed (in the simulation league).

A problem solver has been developed at LaBRI by the Rhoban team. It combines several reinforcement learning (RL) techniques to compute good strategies in Markov decision processes. These strategies are precomputed and represented by regression forests, which allows a mobile robot to make decisions in real time at very low computational cost. These techniques scale to computing cooperation strategies among a small number of robots, but not to a complete strategy for a whole team.
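To illustrate why a policy stored as a regression forest is cheap to query, here is a minimal Python sketch. It is not the Rhoban problem solver: the tree layout, the names `Leaf`, `Split`, `predict_tree` and `predict_forest`, the two-variable state and the leaf values are all hypothetical, chosen only to show that a decision reduces to a few comparisons per tree followed by an average.

```python
from dataclasses import dataclass
from typing import Sequence, Union


@dataclass
class Leaf:
    value: float  # action stored at the leaf, e.g. a kick direction in radians


@dataclass
class Split:
    feature: int      # index of the state variable tested at this node
    threshold: float  # go left when state[feature] <= threshold
    left: "Node"
    right: "Node"


Node = Union[Leaf, Split]


def predict_tree(node: Node, state: Sequence[float]) -> float:
    """Walk from the root to a leaf; the cost is proportional to the tree depth."""
    while isinstance(node, Split):
        node = node.left if state[node.feature] <= node.threshold else node.right
    return node.value


def predict_forest(trees: Sequence[Node], state: Sequence[float]) -> float:
    """Average the predictions of all trees in the forest."""
    return sum(predict_tree(tree, state) for tree in trees) / len(trees)


# Two hand-written trees over a hypothetical state (ball_x, ball_y).
# In practice the trees would be fitted offline, not written by hand.
forest = [
    Split(feature=0, threshold=0.0, left=Leaf(0.5), right=Leaf(-0.5)),
    Split(feature=1, threshold=1.0, left=Leaf(0.3), right=Leaf(-0.1)),
]
action = predict_forest(forest, (-1.0, 2.0))
```

All the expensive computation happens when the trees are built. At run time the robot only traverses each tree once, which is what makes this representation suitable for real-time use on limited hardware.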

Layered learning (LL) combines expert knowledge with reinforcement learning (RL) techniques to decompose a complex task hierarchically: a series of increasingly complex sub-behaviors is learned incrementally, and these sub-behaviors are then combined into complex behaviors.
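The layering idea can be sketched in a few lines of Python. This is a toy under strong assumptions, not the method of MacAlpine and Stone: an exhaustive search (`best_candidate`) stands in for a real RL routine, and the accuracy model, the reward values and every name below are hypothetical. The point is only the order of operations: a low-level skill is learned first and frozen, then a higher-level decision is learned on top of it.

```python
def best_candidate(candidates, score):
    """Stand-in for an RL routine: exhaustively pick the best-scoring value."""
    return max(candidates, key=score)


# Layer 1: learn the sub-behavior "kick" in isolation.
def kick_accuracy(power):
    """Hypothetical toy model: accuracy peaks at power 0.7."""
    return 1.0 - (power - 0.7) ** 2


kick_power = best_candidate([i / 10 for i in range(11)], kick_accuracy)
frozen_accuracy = kick_accuracy(kick_power)  # layer 1 is frozen from here on


# Layer 2: learn "shoot or pass" on top of the frozen kick.
PASS_REWARD = 0.45  # hypothetical expected reward of passing instead of shooting


def team_reward(max_shot_distance):
    """Expected reward summed over distances 1..10 to the goal."""
    total = 0.0
    for distance in range(1, 11):
        if distance <= max_shot_distance:
            total += frozen_accuracy * (1.0 - distance / 10)  # shoot
        else:
            total += PASS_REWARD  # pass
    return total


max_shot_distance = best_candidate(range(11), team_reward)
```

Because layer 2 is evaluated with the frozen kick, its search space stays small: it only has to choose one threshold instead of re-learning the kick at the same time. A weaker kick (a lower `frozen_accuracy`) would lead the same procedure to a shorter shooting range.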

The thesis will focus on using layered learning to address a broad range of problems effectively, including whole-team strategies in the simulation and small-size leagues. A rigorous game-theoretic framework will be used to minimize the predictability of the computed strategies. The goal is both to publish high-quality papers and to continue developing the Rhoban team's problem solver.
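One standard way in which game theory formalizes unpredictability is the mixed strategy of a zero-sum game: the player randomizes so that the opponent gains nothing from knowing the probabilities. Whether the thesis adopts this exact notion is an assumption. The Python sketch below solves a 2x2 zero-sum game in closed form; the function name `solve_2x2_zero_sum` and the penalty-kick payoffs are hypothetical illustration values.

```python
from fractions import Fraction


def solve_2x2_zero_sum(payoff):
    """Return (probability of playing row 0, game value) for the row player.

    payoff[i][j] is the gain of the row player (the maximizer) when the row
    player picks row i and the column player picks column j.
    """
    (a, b), (c, d) = payoff
    row_mins = [min(a, b), min(c, d)]
    col_maxs = [max(a, c), max(b, d)]
    maximin, minimax = max(row_mins), min(col_maxs)
    if maximin == minimax:
        # A pure saddle point exists, so no randomization is needed.
        best_row = row_mins.index(maximin)
        return Fraction(1 - best_row), maximin
    # No saddle point: mix so that both columns give the same expected payoff.
    denom = a - b - c + d
    p_row0 = (d - c) / denom
    value = (a * d - b * c) / denom
    return p_row0, value


# Hypothetical scoring probabilities for a penalty kick.
# Rows: kicker aims left / right. Columns: keeper dives left / right.
penalty = [
    [Fraction(3, 10), Fraction(9, 10)],
    [Fraction(8, 10), Fraction(4, 10)],
]
p_left, value = solve_2x2_zero_sum(penalty)
```

With these payoffs the kicker aims left with probability 2/5 and scores with probability 3/5 whichever side the keeper chooses, so a keeper who knows the kicker's probabilities still cannot exploit them. Larger games need linear programming instead of this closed form.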

References:

Patrick MacAlpine, Peter Stone. Overlapping layered learning. Artif. Intell. 254: 21-43 (2018).

www.robocup.org

Ludovic Hofer. Decision-making algorithms for autonomous robots, PhD Thesis, LaBRI, Univ. Bordeaux, 2017.

Ludovic Hofer, Quentin Rouxel. An Operational Method Toward Efficient Walk Control Policies for Humanoid Robots. ICAPS 2017: 489-497.