Imitation Learning

date
Nov 21, 2024


Imitation learning, in which decision-making behavior is programmed by demonstration, has led to state-of-the-art performance in a variety of applications, including outdoor mobile robot navigation (Silver 2008), legged locomotion (Ratliff 2006), advanced manipulation (Schaal 1999), and electronic games. A common approach is to train a classifier or regressor to replicate an expert's policy from training data consisting of the observations the expert encountered and the actions the expert took.
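The classifier-or-regressor approach described above is often called behavioral cloning. The following is a minimal sketch of it, not a reference implementation: the linear `expert_policy`, its gain values, the two-dimensional observations, and the noise level are all assumptions made up for illustration, and an ordinary least-squares fit stands in for whatever regressor one would actually use.

```python
import numpy as np

rng = np.random.default_rng(0)

def expert_policy(obs):
    """Hypothetical expert: a fixed linear controller (assumed for this sketch)."""
    gains = np.array([[-1.0, -0.5]])  # made-up gains, shape (1, 2)
    return obs @ gains.T              # actions, shape (n, 1)

# 1. Record demonstrations: observations the expert saw and the actions it took.
observations = rng.normal(size=(500, 2))
actions = expert_policy(observations) + 0.01 * rng.normal(size=(500, 1))

# 2. Fit a regressor from observations to actions (ordinary least squares).
W, *_ = np.linalg.lstsq(observations, actions, rcond=None)

def cloned_policy(obs):
    """Learned policy: predicts the expert's action for each observation."""
    return obs @ W

# 3. Compare the cloned policy with the expert on fresh observations.
test_obs = rng.normal(size=(100, 2))
max_err = float(np.max(np.abs(cloned_policy(test_obs) - expert_policy(test_obs))))
print("learned gains:", W.ravel())
print("max action error on held-out observations:", max_err)
```

Note that the held-out observations here are sampled from the same distribution as the training data, which is the usual supervised-learning setup; a policy deployed in an environment would instead see the states its own actions lead to.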
Given access to a planner, techniques based on Inverse Optimal Control (IOC) (Abbeel 2004 [1], Ratliff 2006) achieve this indirectly: they learn, from the observed behavior, the cost function the expert is optimizing, and the learner then uses the planner to minimize the long-term cost.
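To make the idea of learning a cost function and then planning with it concrete, here is a deliberately tiny sketch. It is not the algorithm of Abbeel and Ng or of Ratliff; it uses a simple perceptron-style update, and everything in it is hypothetical: the three candidate routes, their `[road, grass]` feature counts, the assumption that the expert chose route `"A"`, and the brute-force `plan` function standing in for a real planner.

```python
import numpy as np

# Hypothetical candidate routes from start to goal, each summarized by how many
# cells of each terrain type it crosses: [road cells, grass cells].
ROUTES = {
    "A": np.array([6.0, 0.0]),  # long detour that stays on the road
    "B": np.array([2.0, 2.0]),  # mixed route
    "C": np.array([0.0, 3.0]),  # short cut straight across the grass
}
EXPERT_ROUTE = "A"  # assumed demonstration: the expert stays on the road

def plan(weights):
    """Stand-in planner: return the candidate route with the lowest total cost."""
    return min(ROUTES, key=lambda name: float(weights @ ROUTES[name]))

def learn_cost(learning_rate=0.1, max_iters=100):
    """Adjust per-terrain costs until the planner reproduces the expert's route."""
    weights = np.ones(2)  # initial guess: every terrain type costs the same
    for _ in range(max_iters):
        planned = plan(weights)
        if planned == EXPERT_ROUTE:
            break
        # Raise the cost of terrain the planner over-used relative to the expert,
        # and lower the cost of terrain the expert used more.
        weights += learning_rate * (ROUTES[planned] - ROUTES[EXPERT_ROUTE])
        weights = np.maximum(weights, 0.0)  # keep per-cell costs non-negative
    return weights

weights = learn_cost()
print("learned [road, grass] costs:", weights)
print("route chosen by the planner:", plan(weights))
```

With equal initial costs the planner takes the short grass route `"C"`; after the update, grass is more expensive than road and the planner selects the expert's route `"A"`. The learner never copies the expert's actions directly; it only changes the costs that the planner minimizes.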
More recently, the field has shifted towards generative models, such as Diffusion Policy, which are now becoming state-of-the-art for imitation learning.
IOC techniques can also be thought of as training a classifier (the planner) that is parametrized by the cost function. This often has the advantage that the learned cost function generalizes better over the state space or across similar tasks. A broad spectrum of learning techniques has been applied to imitation learning (Argall 2009; Chernova 2009); however, these applications all violate the crucial assumption made by statistical learning approaches that a learner's predictions do not influence the distribution of examples on which it will be tested. In imitation learning, the learner's actions determine which states it visits next, so its own mistakes can lead it to states the expert never demonstrated.
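The violated assumption can be written out as follows. The notation is introduced here for illustration and does not come from the text above: $\pi^*$ is the expert policy, $\hat{\pi}$ the learned policy, $d_{\pi}$ the distribution of states visited when executing $\pi$ over a horizon of $T$ steps, $\ell(s, \pi)$ the 0-1 loss of $\pi$ disagreeing with the expert in state $s$, and $J(\pi)$ the total $T$-step cost of $\pi$, with per-step costs assumed to lie in $[0, 1]$.

```latex
% Training: the loss is averaged over states visited by the expert.
\hat{\pi} = \arg\min_{\pi \in \Pi} \; \mathbb{E}_{s \sim d_{\pi^*}} \big[ \ell(s, \pi) \big]

% Testing: the learner is evaluated on the states it visits itself.
\mathbb{E}_{s \sim d_{\hat{\pi}}} \big[ \ell(s, \hat{\pi}) \big],
\qquad d_{\hat{\pi}} \neq d_{\pi^*} \text{ in general}

% Consequence reported by Ross and Bagnell (2010) for this supervised approach:
\mathbb{E}_{s \sim d_{\pi^*}} \big[ \ell(s, \hat{\pi}) \big] \le \epsilon
\;\;\Longrightarrow\;\;
J(\hat{\pi}) \le J(\pi^*) + T^2 \epsilon
```

The quadratic dependence on $T$ reflects compounding errors: a single mistake can move the learner into states outside the expert's distribution, where further mistakes become more likely.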
 
[Figure: Imitation Learning]
 
 

References

  1. Abbeel, Pieter, and Andrew Y. Ng. "Apprenticeship Learning via Inverse Reinforcement Learning." Proceedings of the Twenty-First International Conference on Machine Learning. 2004.
  2. Ross, Stéphane, and Drew Bagnell. "Efficient Reductions for Imitation Learning." Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics. JMLR Workshop and Conference Proceedings, 2010.
 
 

© Jinzhou Li 2024