| old_uid | 9028 |
|---|
| title | From UAV mission planning to Reinforcement Learning in large datasets. Some lessons learnt and perspectives |
|---|
| start_date | 2010/07/05 |
|---|
| schedule | 10h |
|---|
| online | no |
|---|
| summary | Consider the question of autonomous planning for an unmanned air vehicle (UAV). Tackling such a problem requires modeling the possibly non-deterministic and time-dependent evolution of the environment. It is also desirable that its resolution rapidly provide close-to-optimal plans, and that the agent be able to adapt an existing plan when new (possibly inconsistent) data appears. I will open this presentation with the example of a helicopter UAV patrol problem. My goal is to draw lessons from the experience of solving this problem and transfer them to a more general framework. As this problem admits an intuitive representation as a Time-dependent Markov Decision Process (TiMDP; see the sketches after this record), we shall study the TiMDPpoly algorithm designed to solve it and analyze how well it meets the above requirements. We will then focus on a current attempt at transposing the asynchronous Dynamic Programming scheme of TiMDPpoly into the model-free context of Reinforcement Learning, in order to handle very large experience databases under limited memory constraints and without the burden of building a model. We will discuss the benefits of this extension in terms of genericity (the class of problems we can address), efficiency (the size and complexity of these problems), and adaptability (the UAV's capacity to react quickly to a new environment). Our conclusion will bring us back to the original problem of mission planning for UAVs. |
|---|
| responsibles | <not specified> |
|---|
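
To make the summary's reference to Time-dependent MDPs concrete: in the standard TiMDP formulation (Boyan and Littman), the value of a state depends on continuous time, and each action triggers one of several outcomes μ leading to a result state s'_μ after a stochastic duration. Roughly, and glossing over the distinction between absolute and relative duration distributions, the backup reads:

$$
V(s, t) = \max_{a} Q(s, a, t), \qquad
Q(s, a, t) = \sum_{\mu} L(\mu \mid s, t, a) \int_{t'} P_{\mu}(t') \big[ R_{\mu}(t, t') + V(s'_{\mu}, t') \big] \, dt'
$$

Here $L(\mu \mid s, t, a)$ is the likelihood of outcome μ and $P_{\mu}$ its duration distribution; TiMDPpoly exploits piecewise polynomial representations of these time-dependent functions to compute the backups exactly.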
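
As a purely illustrative sketch of the model-free transposition mentioned in the summary (hypothetical code, not the speaker's implementation), asynchronous backups over a large experience database can be pictured as tabular Q-learning streamed over chunks of logged transitions: each chunk is loaded, swept in arbitrary order, and discarded, so memory stays bounded and no transition model is ever estimated. The function name `async_q_updates` and the `(state, action, reward, next_state)` tuple format are assumptions for the example.

```python
import random
from collections import defaultdict

def async_q_updates(transition_batches, alpha=0.1, gamma=0.95):
    """Tabular Q-learning over batches of (s, a, r, s_next) transitions.

    Batches are consumed one at a time, so only one chunk of the
    experience database lives in memory at once, and no transition
    model is ever built (model-free).
    """
    Q = defaultdict(float)           # Q[(s, a)] -> current value estimate
    seen_actions = defaultdict(set)  # actions observed from each state
    for batch in transition_batches:
        random.shuffle(batch)        # asynchronous: arbitrary update order
        for s, a, r, s_next in batch:
            seen_actions[s].add(a)
            best_next = max(
                (Q[(s_next, a2)] for a2 in seen_actions[s_next]),
                default=0.0,
            )
            # One-step backup using only the sampled transition.
            Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
    return Q

# Toy usage: two chunks of logged experience for a small patrol task.
batches = [
    [("base", "takeoff", 0.0, "patrol"), ("patrol", "scan", 1.0, "patrol")],
    [("patrol", "return", 0.5, "base"), ("base", "takeoff", 0.0, "patrol")],
]
Q = async_q_updates(batches)
print(max(Q, key=Q.get))  # state-action pair with the highest estimate
```

The key design point the talk alludes to is that, unlike synchronous value iteration, nothing in this scheme requires visiting states in a fixed order or holding the whole dataset in memory; convergence of asynchronous backups tolerates arbitrary update orderings as long as every state-action pair keeps being updated.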