|
|
Policy Gradient Bayesian Robust Optimization for Imitation Learning
Zaynah Javed*,
Daniel Brown*,
Satvik Sharma,
Jerry Zhu,
Ashwin Balakrishna,
Marek Petrik,
Anca D. Dragan,
Ken Goldberg
International Conference on Machine Learning (ICML), 2021
Website /
PDF
A scalable and robust RL algorithm that optimizes a combination of expected performance and tail risk under a distribution over learned reward functions.
|
|
|
Recovery RL: Safe Reinforcement Learning with Learned Recovery Zones
Brijen Thananjeyan*,
Ashwin Balakrishna*,
Suraj Nair,
Michael Luo,
Krishnan Srinivasan,
Minho Hwang,
Joseph E. Gonzalez,
Julian Ibarz,
Chelsea Finn,
Ken Goldberg
Robotics and Automation Letters (RA-L) Journal and International Conference on Robotics and Automation (ICRA), 2021 - Mentioned in Google AI Year in Review
Website /
PDF
An algorithm for safe reinforcement learning that uses a set of offline data to learn about constraints before policy learning, and a pair of policies that separate the often conflicting objectives of task-directed exploration and constraint satisfaction, enabling learning of contact-rich and visuomotor control tasks.
|
|
|
ABC-LMPC: Safe Sample-Based Learning MPC for Stochastic Nonlinear Dynamical Systems with Adjustable Boundary Conditions
Brijen Thananjeyan*,
Ashwin Balakrishna*,
Ugo Rosolia,
Joseph E. Gonzalez,
Aaron Ames,
Ken Goldberg
Algorithmic Foundations of Robotics (WAFR), 2020 - Invited to IJRR Special Issue
Website /
PDF
An MPC-based algorithm for robotic control (ABC-LMPC) with (1) performance and safety guarantees for stochastic nonlinear systems and (2) the ability to continuously explore the environment and expand the controller domain.
|
|
|
Safety Augmented Value Estimation from Demonstrations (SAVED): Safe Deep Model-Based RL for Sparse Cost Robotic Tasks
Brijen Thananjeyan*,
Ashwin Balakrishna*,
Ugo Rosolia,
Felix Li,
Rowan McAllister,
Joseph E. Gonzalez,
Sergey Levine,
Francesco Borrelli,
Ken Goldberg
Robotics and Automation Letters (RA-L) Journal and International Conference on Robotics and Automation (ICRA), 2020
Website /
PDF
A new algorithm for safe and efficient reinforcement learning (SAVED) that leverages a small set of suboptimal demonstrations and prior task successes to structure exploration. SAVED also provides a mechanism for handling state-space constraints by leveraging probabilistic estimates of system dynamics.
|
|
|
On-Policy Robot Imitation Learning from a Converging Supervisor
Ashwin Balakrishna*,
Brijen Thananjeyan*,
Jonathan Lee,
Felix Li,
Arsh Zahed,
Joseph E. Gonzalez,
Ken Goldberg
Conference on Robot Learning (CoRL), 2019 - Oral Presentation
PDF
A new formulation of imitation learning from a non-stationary supervisor, associated theoretical analysis, and a practical algorithm that applies this formulation to develop an RL algorithm combining the sample efficiency of model-based RL with the fast policy evaluation enabled by model-free policies.
|