COMPLEX SYSTEMS
Current Research - MULTI-ROBOT LEARNING

One of the most desirable missions for robots is to reduce human exposure in dangerous tasks, for example planetary exploration or hazardous waste cleanup. These application domains are characterized by the impossibility of obtaining a model good enough to program the robots with a step-by-step problem-solving procedure. Therefore, learning (i.e., a way to automatically improve the robot's performance in the environment) is mandatory. In the case of cooperative robots, group learning is also mandatory to achieve redundancy (a major reason for cooperation) through on-the-fly reconfiguration of the group's tasks.

Cooperative robot learning raises, at least, all the issues attached to robot learning [3]: the huge size of the search space, the limited number of examples, and the necessity for generalization. Answers (biases) have been sought in improved exploration of the search space, reduction of the search space size, and reduction of the required number of samples.

Over the years, reinforcement learning has emerged as the main learning approach in autonomous robotics, and lazy learning has become the leading bias, reducing the time required by an experiment to the time needed to test the quality of the learned behavior. Reinforcement learning allows us to synthesize a robot behavior using (only) a qualitative measure of the performance of the desired behavior [2]. Q-learning, because it is a model-free learning method, is certainly the most widely used [1]. Lazy learning, also called instance-based learning, builds a non-explicit model of the robot-environment relation. Because there is no explicit model, sub-symbolic learning techniques must be used.

Cooperative robot learning adds at least two issues:

References
[1] Claude Touzet, "Neural Reinforcement Learning for Behaviour Synthesis," Robotics and Autonomous Systems, special issue on Learning Robot: The New Wave, N. Sharkey, guest editor, vol. 22, no. 3-4, December 1997, pp. 251-281.
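To make the model-free character of Q-learning concrete, here is a minimal tabular sketch in Python. The corridor world, the reward, and all parameter values (ALPHA, GAMMA, EPSILON, the `step` dynamics) are illustrative assumptions, not taken from the cited work; the point is only that the update rule uses a scalar reward signal and never builds a model of the environment.

```python
import random

# Toy 1-D corridor: states 0..4, reward 1 only on reaching state 4.
# All names and values here are illustrative assumptions.
N_STATES = 5
ACTIONS = [-1, +1]          # move left or right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(state, action):
    """Assumed deterministic toy dynamics; reward is purely qualitative."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    return nxt, (1.0 if nxt == N_STATES - 1 else 0.0)

def choose(state):
    """Epsilon-greedy exploration, with random tie-breaking."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    best = max(Q[(state, a)] for a in ACTIONS)
    return random.choice([a for a in ACTIONS if Q[(state, a)] == best])

random.seed(0)
for episode in range(200):
    s = 0
    while s != N_STATES - 1:
        a = choose(s)
        s2, r = step(s, a)
        # Model-free update: only the observed reward and the stored
        # Q-values are used; no model of the environment is ever built.
        Q[(s, a)] += ALPHA * (r + GAMMA * max(Q[(s2, b)] for b in ACTIONS) - Q[(s, a)])
        s = s2

# Greedy policy after learning: move right from every non-goal state.
policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES - 1)}
print(policy)
```

The same update rule applies unchanged to a real robot: only the sensor-defined state space, the action set, and the qualitative reward signal need to be supplied.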
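The lazy (instance-based) idea described above can be sketched in a few lines: the robot stores raw (situation, action, performance) samples and defers generalization to query time, so no explicit model of the robot-environment relation is ever constructed. The sensor vectors, actions, and quality values below are invented for illustration.

```python
import math

# Assumed memory of raw experiments: (sensor reading, action, observed quality).
# Nothing is compiled into a model; the samples themselves are the "model".
memory = [
    ((0.9, 0.1), "turn_left",  0.8),
    ((0.1, 0.9), "turn_right", 0.7),
    ((0.5, 0.5), "forward",    0.9),
]

def best_action(situation):
    """Lazy lookup: reuse the action of the nearest good-quality sample."""
    good = [(s, a) for (s, a, q) in memory if q > 0.5]   # keep good behaviors only
    nearest, action = min(good, key=lambda sa: math.dist(sa[0], situation))
    return action

print(best_action((0.8, 0.2)))
```

Because an experiment only appends a sample to memory, evaluating a candidate behavior costs no more than the time needed to test it, which is the time saving attributed to lazy learning in the text.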