Shengjie Wang | 王圣杰

I am a PhD student in Computer Science at the Institute for Interdisciplinary Information Sciences (IIIS), Tsinghua University, working with Prof. Yang Gao. Previously, I was a Master's student in the Department of Automation at Tsinghua University, advised by Prof. Tao Zhang (IEEE Fellow & IET Fellow, Head of Department). I completed my Bachelor's in Robotics at Beijing Institute of Technology (BIT), supervised by Prof. Qing Shi (IEEE Senior Member) and Prof. Toshio Fukuda (2020 IEEE President).

Email / CV / Google Scholar / Twitter / GitHub / WeChat


Research Interests

My research interests lie at the intersection of Robotics and Reinforcement Learning. I care about efficient, stable, and safe robot performance in the unstructured open world.

My collaborators and I are also passionate about releasing interesting robotic environments, such as Bi-DexHands and SpaceRobotEnv. We hope everyone enjoys our work!


Selected Publications
SKIL: Semantic Keypoint Imitation Learning for Generalizable Data-efficient Manipulation
Shengjie Wang, Jiacheng You, Yihang Hu, Jiongye Li, Yang Gao
RSS, 2025
Project Page / ArXiv / Code

We propose Semantic Keypoint Imitation Learning (SKIL), a framework that automatically obtains semantic keypoints with the help of vision foundation models and forms descriptors of those keypoints, enabling efficient imitation learning of complex robotic tasks with significantly lower sample complexity.

RoboMetaverse: A Large-scale Robotic Metaverse For Reinforcement Learning
Fengbo Lan*, Chu Zhang*, Ziyan Zhang*, Shengjie Wang*, Haotian Xu, Yunzhe Zhang, Tao Zhang
A Cool Work, 2024
Project website / Video

RoboMetaverse is founded by a team of robotics enthusiasts, dedicated to creating a highly realistic embodied intelligence simulation platform and a cost-effective embodied intelligence hardware platform, aiming to contribute to the realization of the AGI (Artificial General Intelligence) era in embodied intelligence.

DexCatch: Learning to Catch Arbitrary Objects with Dexterous Hands
Fengbo Lan*, Shengjie Wang*, Yunzhe Zhang, Haotian Xu, Oluwatosin Oseni, Yang Gao, Tao Zhang
CoRL, 2024
Project Page / ArXiv / Code

We present a learning-based catching strategy that can catch diverse everyday objects with dexterous hands. The learned policies show strong zero-shot transfer performance on unseen objects.

CoPa: General Robotic Manipulation through Spatial Constraints of Parts with Foundation Models
Haoxu Huang*, Fanqi Lin*, Yingdong Hu, Shengjie Wang, Yang Gao
IROS Oral, 2024
Project website / Twitter / ArXiv / Code

We propose Robotic Manipulation through Spatial Constraints of Parts (CoPa), a novel framework that incorporates the common sense knowledge embedded within foundation vision-language models (VLMs), such as GPT-4V, into low-level robotic manipulation tasks.

EfficientZero V2: Mastering Discrete and Continuous Control with Limited Data
Shengjie Wang*, Shaohuai Liu*, Weirui Ye*, Jiacheng You, Yang Gao
ICML Spotlight (Top 3.5%), 2024
Twitter / ArXiv / Code

We introduce EfficientZero V2, a general framework for sample-efficient RL algorithms that outperforms the state-of-the-art methods and DreamerV3 across diverse domains.

I-Octree: A Fast, Lightweight, and Dynamic Octree for Proximity Search
Jun Zhu, Hongyi Li, Zhepeng Wang, Shengjie Wang, Tao Zhang
ICRA, 2024
Project website / ArXiv / Code

We present the i-Octree, a dynamic octree data structure that supports both fast nearest neighbor search and real-time dynamic updates, outperforming contemporary state-of-the-art approaches by achieving, on average, a 19% reduction in runtime on real-world open datasets.

A Policy Optimization Method Towards Optimal-time Stability
Shengjie Wang, Fengbo Lan, Xiang Zheng, Yuxue Cao, Oluwatosin Oseni, Haotian Xu, Tao Zhang, Yang Gao
CoRL, 2023
Project Page / ArXiv / Code

Our approach enables the system's state to reach an equilibrium point within an optimal time and maintain stability thereafter, referred to as "optimal-time stability". To achieve this, we integrate the optimization method into the Actor-Critic framework, resulting in the Adaptive Lyapunov-based Actor-Critic (ALAC) algorithm.

Efficient Exploration Using Extra Safety Budget in Constrained Policy Optimization
Haotian Xu*, Shengjie Wang*, Zhaolei Wang, Yunzhe Zhang, Qing Zhuo, Yang Gao, Tao Zhang
IROS, 2023
Project Page / ArXiv / Code

Our algorithm (ESB-CPO) improves upon the trade-off between reducing constraint violations and improving expected returns in Safe Reinforcement Learning.

A Learning-based Adaptive Compliance Method for Symmetric Bi-manual Manipulation
Yuxue Cao*, Shengjie Wang*, Xiang Zheng, Wenke Ma, Tao Zhang
T-ASE (Top Journal in Automation), Under review
Project Page / ArXiv / Code

We propose a novel Learning-based Adaptive Compliance (LAC) algorithm to improve the efficiency and adaptability of symmetric bi-manual manipulation.

IMAP: Intrinsically Motivated Adversarial Policy
Xiang Zheng, Xingjun Ma, Shengjie Wang, Xinyu Wang, Chao Shen, Cong Wang
ACM CCS (Top Conference in Security), Under review
Project Page / ArXiv / Code

We propose the Intrinsically Motivated Adversarial Policy (IMAP) for efficient black-box evasion attacks in single- and multi-agent environments without any knowledge of the victim policy.

Development of a small-sized quadruped robotic rat capable of multimodal motions
Shengjie Wang, Qing Shi, Junhui Gao, Yuxuan Wang, Fansheng Meng, Chang Li, Qiang Huang, Toshio Fukuda
Journal: T-RO (Top journal in Robotics), 2023
Conference: Advanced Intelligent Mechatronics (AIM), Best Student Paper Award (Top 0.5%), 2019
IEEE Spectrum / EurekAlert / Science Times

We developed a small-sized quadruped robotic rat (SQuRo), which includes four limbs and one flexible spine. On the basis of the extracted key movement joints, SQuRo was subtly designed with a relatively elongated slim body (aspect ratio: 3.42) and smaller weight (220 g) compared with quadruped robots of the same scale.

Reinforcement learning with prior policy guidance for motion planning of dual-arm free-floating space robot
Yuxue Cao*, Shengjie Wang*, Xiang Zheng, Wenke Ma, Xinru Xie, Lei Liu
AST (Top journal in Astronautics), 2023
ArXiv / Code

We propose a novel algorithm, EfficientLPT, that helps RL-based methods improve planning accuracy efficiently. Our core contributions are constructing a mixed policy guided by prior knowledge and introducing the infinity norm to build a more reasonable reward function.

Towards Human-Level Bimanual Dexterous Manipulation with Reinforcement Learning
Yuanpei Chen, Tianhao Wu, Shengjie Wang, Xidong Feng, Jiechuang Jiang, Stephen Marcus McAleer, Hao Dong, Zongqing Lu, Song-chun Zhu, Yaodong Yang
NeurIPS, 2022
Project Page / ArXiv / Code

We propose a bimanual dexterous manipulation benchmark (Bi-DexHands) according to literature from cognitive science for comprehensive reinforcement learning research.

Collision-Free Trajectory Planning for a 6-DoF Free Floating Space Robot via Hierarchical Decoupling Optimization
Shengjie Wang, Yuxue Cao, Xiang Zheng, Tao Zhang
IEEE RA-L, 2022
Project Page / Paper / Code

We developed a model-free Hierarchical Decoupling Optimization (HDO) algorithm to realize 6D-pose multi-target trajectory planning for the free-floating space robot.

Experience
Tsinghua University, China
2019.09 - Present

Master's Student and PhD Student
Advisors: Prof. Yang Gao and Prof. Tao Zhang (IET Fellow, Head of Department)
Beijing Institute of Technology, China
2015.09 - 2019.07

Undergraduate Student
Advisors: Prof. Qing Shi (IEEE Senior Member) and Prof. Toshio Fukuda (2020 IEEE President)

Template stolen from Jon Barron.
Last updated: Oct 15, 2023