Twin delayed hierarchical actor-critic

Anca, Mihai; Studley, Matthew

doi:10.1109/icara51699.2021.9376459

Authors

Mihai Anca

Professor Matthew Studley Matthew2.Studley@uwe.ac.uk
Professor in Ethics & Technology/School Director (Research & Enterprise)

Abstract

Hierarchical Reinforcement Learning (HRL) addresses a common problem in sparse-reward environments: having to manually craft a reward function. We present a modified version of the Hierarchical Actor-Critic (HAC) architecture called Twin Delayed HAC (TDHAC), a method capable of sample-efficient learning in environments requiring object interaction. The vanilla HAC algorithm fails to converge on this type of environment, while our method matches the best results so far reported in the literature. We carefully consider each feature added to the original architecture and demonstrate the abilities of TDHAC on the sparse-reward Pick-and-Place environment. To the best of our knowledge, this is the first HRL algorithm successfully applied to an environment requiring object interaction without external enhancements such as demonstrations.

Presentation Conference Type	Conference Paper (published)
Conference Name	2021 International Conference on Automation, Robotics and Applications, ICARA 2021
Start Date	Feb 4, 2021
End Date	Feb 6, 2021
Acceptance Date	Dec 25, 2020
Publication Date	Mar 17, 2021
Deposit Date	Jun 22, 2021
Pages	221-225
Book Title	2021 7th International Conference on Automation, Robotics and Applications (ICARA)
ISBN	9780738142906
DOI	https://doi.org/10.1109/icara51699.2021.9376459
Public URL	https://uwe-repository.worktribe.com/output/7229966

You might also like

Consulting an oracle; repurposing robots for the circular economy (2024)
Presentation / Conference Contribution

Understanding consumer attitudes towards second-hand robots for the home (2024)
Journal Article

On the relationship between benchmarking, standards and certification in robotics and AI (2024)
Book Chapter

Trusted research environments for health data (2024)
Digital Artefact

Introducing the concept of repurposing robots; to increase their useful life, reduce waste, and improve sustainability in the robotics industry (2023)
Presentation / Conference Contribution
