Zhengtong Xu 徐政通

Email: xu1703 AT purdue.edu

I'm a third-year PhD candidate at Purdue University, advised by Professor Yu She. I was a recipient of the 2025 Magoon Graduate Student Research Excellence Award at Purdue University (awarded to only 25 PhD students across the entire Purdue College of Engineering).

I received my Bachelor's degree in mechanical engineering from Huazhong University of Science and Technology.

G. Scholar  /  Twitter  /  Github  /  LinkedIn


News

  • [May 2025] I'll spend the rest of 2025 doing two exciting research internships: this summer at MERL, and this fall at Meta Reality Labs. Looking forward to connecting with folks in Cambridge and Redmond!
  • [Nov. 2024] I passed my PhD preliminary exam and officially became a PhD candidate.
  • [Sep. 2024] One paper accepted by IEEE T-RO.

Research

* indicates equal contribution.

My research aims to design learning algorithms for robotic agents, enabling them to perform everyday manipulation tasks with human-level proficiency. To this end, I am currently focusing on hierarchical multimodal robot learning. Specifically, my research explores:

1. Integrating visual, 3D, and tactile modalities to enable multimodal robot learning.
2. Developing interpretable neural-symbolic low-level policies through differentiable optimization.
3. Deploying pretrained vision-language models for high-level, open-world reasoning and planning.

LeTac-MPC: Learning Model Predictive Control for Tactile-reactive Grasping
Zhengtong Xu, Yu She
IEEE Transactions on Robotics (T-RO), 2024

arXiv / video / code / bibtex

A generalizable end-to-end tactile-reactive grasping controller with differentiable MPC, combining learning and model-based approaches.

DiffOG: Differentiable Policy Trajectory Optimization with Generalizability
Zhengtong Xu, Zichen Miao, Qiang Qiu, Zhe Zhang, Yu She
Under Review, 2025

website / arXiv / video / code(soon) / bibtex

DiffOG introduces a transformer-based differentiable trajectory optimization framework for action refinement in imitation learning.

Canonical Policy: Learning Canonical 3D Representation for Equivariant Policy
Zhiyuan Zhang*, Zhengtong Xu*, Jai Nanda Lakamsani, Yu She
Under Review, 2025

website / arXiv(soon) / video(soon) / code(soon)

Canonical Policy enables equivariant observation-to-action mappings by mapping both in-distribution and out-of-distribution point clouds to a canonical 3D representation.

ManiFeel: Benchmarking and Understanding Visuotactile Manipulation Policy Learning
Quan Khanh Luu*, Pokuang Zhou*, Zhengtong Xu*, Zhiyuan Zhang, Qiang Qiu, Yu She
Under Review, 2025

website / arXiv(soon) / video(soon) / code(soon)

ManiFeel is a reproducible and scalable simulation benchmark for studying supervised visuotactile policy learning.

UniT: Data Efficient Tactile Representation with Generalization to Unseen Objects
Zhengtong Xu, Raghava Uppuluri, Xinwei Zhang, Cael Fitch, Philip Glen Crandall, Wan Shou, Dongyi Wang, Yu She
IEEE Robotics and Automation Letters (RA-L), 2025

website / arXiv / video / code / bibtex

UniT learns a generalizable tactile representation from only a single simple object.

VILP: Imitation Learning with Latent Video Planning
Zhengtong Xu, Qiang Qiu, Yu She
IEEE Robotics and Automation Letters (RA-L), 2025

arXiv / video / code / bibtex

VILP integrates video generation models into policies, enabling the representation of multimodal action distributions while reducing reliance on extensive high-quality robot action data.

Safe Human-Robot Collaboration with Risk-tunable Control Barrier Functions
Vipul K. Sharma*, Pokuang Zhou*, Zhengtong Xu*, Yu She, S. Sivaranjani
IEEE/ASME Transactions on Mechatronics (TMECH), 2025

arXiv(soon) / video

We address safety in human-robot collaboration with uncertain human positions by formulating a chance-constrained problem using uncertain control barrier functions.

LeTO: Learning Constrained Visuomotor Policy with Differentiable Trajectory Optimization
Zhengtong Xu, Yu She
IEEE Transactions on Automation Science and Engineering (T-ASE), 2024

arXiv / video / code / bibtex

LeTO is a "gray box" method that marries optimization-based safety and interpretability with the representational abilities of neural networks.


VisTac: Toward a Unified Multimodal Sensing Finger for Robotic Manipulation
Sheeraz Athar*, Gaurav Patel*, Zhengtong Xu, Qiang Qiu, Yu She
IEEE Sensors Journal, 2023

paper / video / bibtex

VisTac seamlessly combines high-resolution tactile and visual perception in a single unified device.

Awards

  • Magoon Graduate Student Research Excellence Award, Purdue University, 2025
  • Dr. Theodore J. and Isabel M. Williams Fellowship, Purdue University, 2022
  • Chinese National Scholarship, Ministry of Education of China, 2017

Reviewer Service

  • Conference on Robot Learning (CoRL), 2025
  • IEEE Robotics and Automation Letters (RA-L), 2025
  • IEEE Transactions on Robotics (T-RO), 2024
  • IEEE International Conference on Robotics and Automation (ICRA), 2024

Teaching

  • Vertically Integrated Projects (VIP)-GE Robotics and Autonomous Systems, Grad Mentor, Spring 2024/Fall 2023/Summer 2023
  • IE 474-Industrial Control Systems, Teaching Assistant, Fall 2022

Website template from Jon Barron's website.