Publications

Guardian: Detecting Robotic Planning and Execution Errors with Vision-Language Models

Published in CoRL 2025 Workshop Robot Data, 2025

TLDR; Guardian introduces an automatic robot failure synthesis approach that generates diverse planning and execution failures with fine-grained reasoning traces. We train a VLM that achieves state-of-the-art performance on failure detection benchmarks and improves task success rates both in simulation and on real robots.

Recommended citation: Paul Pacaud, Ricardo Garcia, Shizhe Chen, and Cordelia Schmid https://arxiv.org/abs/2512.01946

Gondola: Grounded Vision Language Planning for Generalizable Robotic Manipulation

Published in CoRL 2025 LEAP Workshop, 2025

TLDR; Gondola introduces a grounded vision-language planning model that uses multi-view images to generate precise action plans with segmentation masks for generalizable robotic manipulation, achieving state-of-the-art performance on the GemBench benchmark.

Recommended citation: Shizhe Chen, Ricardo Garcia, Paul Pacaud, and Cordelia Schmid https://arxiv.org/abs/2506.11261

Identifying Human Grasp Properties During Robot-to-Human Handover

Published in IEEE World Haptics Conference, 2023

TLDR; An efficient system identification method for robot-to-human handover that gives the robot quantitative information it can use to decide when to safely let go of the object.

Recommended citation: Paul Pacaud, Etienne Chassaing, Yilin Cai, Connor Yako, and Kenneth Salisbury https://ieeexplore.ieee.org/abstract/document/10224405