A vision-guided deep learning framework for dexterous robotic grasping using Gaussian processes and transformers
Kadalagere Sampath, Suhas; Wang, Ning; Yang, Chenguang; Wu, Howard; Liu, Cunjia; Pearson, Martin
Authors
Dr. Ning Wang Ning2.Wang@uwe.ac.uk
Senior Lecturer in Robotics
Charlie Yang Charlie.Yang@uwe.ac.uk
Professor in Robotics
Howard Wu
Cunjia Liu
Martin Pearson Martin.Pearson@uwe.ac.uk
Senior Lecturer
Abstract
Robotic manipulation of objects with diverse shapes, sizes, and properties, especially deformable ones, remains a significant challenge in automation, necessitating human-like dexterity through the integration of perception, learning, and control. This study enhances a previous framework combining YOLOv8 for object detection and LSTM networks for adaptive grasping by introducing Gaussian Processes (GPs) for robust grasp predictions and Transformer models for efficient multi-modal sensory data integration. A Random Forest classifier also selects optimal grasp configurations based on object-specific features such as geometry and stability. The proposed grasping framework achieved a 95.6% grasp success rate using Transformer-based force modulation, surpassing LSTM (91.3%) and GP (91.3%) models. Evaluation on a diverse dataset showed significant improvements in grasp force modulation, adaptability, and robustness for two- and three-finger grasps. However, limitations were observed in five-finger grasps for certain objects, and some classification failures occurred in the vision system. Overall, this combination of vision-based detection and advanced learning techniques offers a scalable solution for flexible robotic manipulation.
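The abstract describes Gaussian Processes being used for robust grasp predictions. As an illustrative sketch only (not the authors' implementation), a zero-mean GP regressor with an RBF kernel can map an object feature to a predicted grasp force, with posterior variance serving as an uncertainty estimate; the object widths and forces below are hypothetical example data:

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential (RBF) kernel between 1-D input arrays a and b."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * d2 / length_scale**2)

def gp_predict(x_train, y_train, x_test, noise=1e-3):
    """Posterior mean and pointwise variance of a zero-mean GP at x_test."""
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_test)          # train-vs-test covariances
    K_ss = rbf_kernel(x_test, x_test)          # test-vs-test covariances
    alpha = np.linalg.solve(K, y_train)        # K^{-1} y without explicit inverse
    mean = K_s.T @ alpha
    cov = K_ss - K_s.T @ np.linalg.solve(K, K_s)
    return mean, np.diag(cov)

# Hypothetical training data: object width (cm) -> measured grasp force (N)
x = np.array([2.0, 4.0, 6.0, 8.0])
y = np.array([1.5, 2.8, 4.1, 5.0])

# Predict force (with uncertainty) for an unseen 5 cm object
mu, var = gp_predict(x, y, np.array([5.0]))
```

The posterior mean at 5 cm falls between the forces observed for the neighbouring 4 cm and 6 cm objects, and the variance quantifies how much the prediction should be trusted, which is the property that makes GPs attractive for grasp-force modulation.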
| Field | Value |
| --- | --- |
| Journal Article Type | Article |
| Acceptance Date | Feb 24, 2025 |
| Online Publication Date | Feb 28, 2025 |
| Publication Date | Feb 28, 2025 |
| Deposit Date | Feb 28, 2025 |
| Publicly Available Date | Mar 4, 2025 |
| Journal | Applied Sciences |
| Electronic ISSN | 2076-3417 |
| Publisher | MDPI |
| Peer Reviewed | Peer Reviewed |
| Volume | 15 |
| Issue | 5 |
| Article Number | 2615 |
| DOI | https://doi.org/10.3390/app15052615 |
| Keywords | dexterous robotic grasping; adaptive grasping; deep learning in robotics; transformer networks; Gaussian processes; vision-based force modulation |
| Public URL | https://uwe-repository.worktribe.com/output/13826594 |
Sustainable Development Goals
End hunger, achieve food security and improved nutrition and promote sustainable agriculture
Build resilient infrastructure, promote inclusive and sustainable industrialisation and foster innovation
Files
A Vision-Guided Deep Learning Framework for Dexterous Robotic Grasping Using Gaussian Processes and Transformers
(31.1 MB)
PDF
Licence
http://creativecommons.org/licenses/by/4.0/
Publisher Licence URL
http://creativecommons.org/licenses/by/4.0/