Achieving goals using reward shaping and curriculum learning

Studley, Matthew; Hansen, Mark; Anca, Mihai; Thomas, Johnathan; Pedamonti, Dabal

Authors

Professor Matthew Studley Matthew2.Studley@uwe.ac.uk
Professor in Ethics & Technology/School Director (Research & Enterprise)

Mark Hansen Mark.Hansen@uwe.ac.uk
Professor in Machine Vision and Machine Learning

Mihai Anca

Johnathan Thomas

Dabal Pedamonti

Abstract

Real-time control for robotics is a popular research area in the reinforcement learning community. Through the use of techniques such as reward shaping, researchers have managed to train online agents across a multitude of domains. Despite these advances, solving goal-oriented tasks still requires complex architectural changes or hard constraints to be placed on the problem. In this article, we solve the problem of stacking multiple cubes by combining curriculum learning, reward shaping, and a large number of efficiently parallelized environments. We introduce two curriculum learning settings that allow us to separate the complex task into sequential sub-goals, enabling the learning of a problem that may otherwise be too difficult. We focus on discussing the challenges encountered while implementing them in a goal-conditioned environment. Finally, we extend the best configuration identified to a higher-complexity environment with differently shaped objects.

Presentation Conference Type	Conference Paper (unpublished)
Conference Name	Future Technologies Conference
Start Date	Nov 2, 2023
End Date	Nov 3, 2023
Deposit Date	May 16, 2023
Publicly Available Date	May 16, 2023
Series Title	Lecture Notes in Networks and Systems
Keywords	reinforcement learning, curriculum learning, reward shaping, robotics
Public URL	https://uwe-repository.worktribe.com/output/10792709