Twin delayed hierarchical actor-critic

Anca, Mihai; Studley, Matthew

Authors

Mihai Anca

Dr Matthew Studley Matthew2.Studley@uwe.ac.uk
Professor of Ethics & Technology/School Director (Research & Enterprise)



Abstract

Hierarchical Reinforcement Learning (HRL) addresses a common problem in sparse-reward environments: the need to manually craft a reward function. We present a modified version of the Hierarchical Actor-Critic (HAC) architecture called Twin Delayed HAC (TDHAC), a method capable of sample-efficient learning in environments requiring object interaction. The vanilla algorithm fails to converge on this type of environment, while our method matches the best results reported in the literature so far. We carefully consider each feature added to the original architecture and demonstrate the abilities of TDHAC on the sparse-reward Pick-and-Place environment. To the best of our knowledge, this is the first HRL algorithm successfully applied to an environment requiring object interaction without external enhancements such as demonstrations.

Citation

Anca, M., & Studley, M. (2021). Twin delayed hierarchical actor-critic. In 2021 7th International Conference on Automation, Robotics and Applications (ICARA) (pp. 221-225). https://doi.org/10.1109/icara51699.2021.9376459

Conference Name: 2021 International Conference on Automation, Robotics and Applications, ICARA 2021
Conference Location: Prague, Czech Republic
Start Date: Feb 4, 2021
End Date: Feb 6, 2021
Acceptance Date: Dec 25, 2020
Publication Date: Mar 17, 2021
Deposit Date: Jun 22, 2021
Pages: 221-225
Book Title: 2021 7th International Conference on Automation, Robotics and Applications (ICARA)
ISBN: 9780738142906
DOI: https://doi.org/10.1109/icara51699.2021.9376459
Public URL: https://uwe-repository.worktribe.com/output/7229966