Shengjie Wang

Shengjie Wang | 王圣杰

I am a PhD student in Computer Science at the Institute for Interdisciplinary Information Sciences (IIIS), Tsinghua University, working with Prof. Yang Gao. Previously, I was a Master's student in the Department of Automation at Tsinghua University, advised by Prof. Tao Zhang (IEEE Fellow & IET Fellow, Head of Department). I completed my Bachelor's in Robotics at BIT, supervised by Prof. Qing Shi (IEEE Senior Member) and Prof. Toshio Fukuda (2020 IEEE President).

Email / CV / Google Scholar / Twitter / GitHub / WeChat

Research Interests

My research interests lie at the intersection of Robotics and Reinforcement Learning. I care about efficient, stable, and safe robot performance in the unstructured open world.

My collaborators and I are also passionate about releasing interesting robotic environments, such as Bi-DexHands and SpaceRobotEnv. We hope everyone enjoys our work!

Selected Publications

	SKIL: Semantic Keypoint Imitation Learning for Generalizable Data-efficient Manipulation
	Shengjie Wang, Jiacheng You, Yihang Hu, Jiongye Li, Yang Gao
	RSS, 2025
	Project Page / ArXiv / Code
	We propose Semantic Keypoint Imitation Learning (SKIL), a framework that automatically obtains semantic keypoints with the help of vision foundation models and forms descriptors of those keypoints, enabling efficient imitation learning of complex robotic tasks with significantly lower sample complexity.

	RoboMetaverse: A Large-scale Robotic Metaverse For Reinforcement Learning
	Fengbo Lan, Chu Zhang, Ziyan Zhang, Shengjie Wang, Haotian Xu, Yunzhe Zhang, Tao Zhang
	A Cool Work, 2024
	Project website / Video
	RoboMetaverse was founded by a team of robotics enthusiasts dedicated to creating a highly realistic embodied intelligence simulation platform and a cost-effective embodied intelligence hardware platform, aiming to contribute to the realization of the AGI (Artificial General Intelligence) era in embodied intelligence.

	DexCatch: Learning to Catch Arbitrary Objects with Dexterous Hands
	Fengbo Lan, Shengjie Wang, Yunzhe Zhang, Haotian Xu, Oluwatosin Oseni, Yang Gao, Tao Zhang
	CoRL, 2024
	Project Page / ArXiv / Code
	We present a learning-based catching strategy that can catch diverse everyday objects with dexterous hands. The learned policies show strong zero-shot transfer performance on unseen objects.

	CoPa: General Robotic Manipulation through Spatial Constraints of Parts with Foundation Models
	Haoxu Huang, Fanqi Lin, Yingdong Hu, Shengjie Wang, Yang Gao
	IROS Oral, 2024
	Project website / Twitter / ArXiv / Code
	We propose Robotic Manipulation through Spatial Constraints of Parts (CoPa), a novel framework that incorporates common sense knowledge embedded within foundation vision-language models (VLMs), such as GPT-4V, into low-level robotic manipulation tasks.

	EfficientZero V2: Mastering Discrete and Continuous Control with Limited Data
	Shengjie Wang, Shaohuai Liu, Weirui Ye, Jiacheng You, Yang Gao
	ICML Spotlight (Top 3.5%), 2024
	Twitter / ArXiv / Code
	We introduce EfficientZero V2, a general framework designed for sample-efficient RL algorithms, which outperforms the SotA methods and DreamerV3 across diverse domains.

	I-Octree: A Fast, Lightweight, and Dynamic Octree for Proximity Search
	Jun Zhu, Hongyi Li, Zhepeng Wang, Shengjie Wang, Tao Zhang
	ICRA, 2024
	Project website / ArXiv / Code
	We present the i-Octree, a dynamic octree data structure that supports both fast nearest neighbor search and real-time dynamic updates, outperforming contemporary state-of-the-art approaches by achieving, on average, a 19% reduction in runtime on real-world open datasets.

	A Policy Optimization Method Towards Optimal-time Stability
	Shengjie Wang, Fengbo Lan, Xiang Zheng, Yuxue Cao, Oluwatosin Oseni, Haotian Xu, Tao Zhang, Yang Gao
	CoRL, 2023
	Project Page / ArXiv / Code
	Our approach enables the system's state to reach an equilibrium point within an optimal time and maintain stability thereafter, referred to as "optimal-time stability". To achieve this, we integrate the optimization method into the Actor-Critic framework, resulting in the Adaptive Lyapunov-based Actor-Critic (ALAC) algorithm.

	Efficient Exploration Using Extra Safety Budget in Constrained Policy Optimization
	Haotian Xu, Shengjie Wang, Zhaolei Wang, Yunzhe Zhang, Qing Zhuo, Yang Gao, Tao Zhang
	IROS, 2023
	Project Page / ArXiv / Code
	Our algorithm (ESB-CPO) improves upon the trade-off between reducing constraint violations and improving expected returns in Safe Reinforcement Learning.

	A Learning-based Adaptive Compliance Method for Symmetric Bi-manual Manipulation
	Yuxue Cao, Shengjie Wang, Xiang Zheng, Wenke Ma, Tao Zhang
	T-ASE (Top Journal in Automation), Under review
	Project Page / ArXiv / Code
	We propose a novel Learning-based Adaptive Compliance (LAC) algorithm to improve the efficiency and adaptability of symmetric bi-manual manipulation.

	IMAP: Intrinsically Motivated Adversarial Policy
	Xiang Zheng, Xingjun Ma, Shengjie Wang, Xinyu Wang, Chao Shen, Cong Wang
	ACM CCS (Top Conference in Security), Under review
	Project Page / ArXiv / Code
	We propose the Intrinsically Motivated Adversarial Policy (IMAP) for efficient black-box evasion attacks in single- and multi-agent environments without any knowledge of the victim policy.

	Development of a small-sized quadruped robotic rat capable of multimodal motions
	Shengjie Wang, Qing Shi, Junhui Gao, Yuxuan Wang, Fansheng Meng, Chang Li, Qiang Huang, Toshio Fukuda
	Journal: T-RO (Top journal in Robotics), 2023
	Conference: Advanced Intelligent Mechatronics (AIM), 2019 (Best Student Paper Award, Top 0.5%)
	IEEE Spectrum / EurekAlert / Science Times
	We developed a small-sized quadruped robotic rat (SQuRo), which includes four limbs and one flexible spine. On the basis of the extracted key movement joints, SQuRo was subtly designed with a relatively elongated slim body (aspect ratio: 3.42) and smaller weight (220 g) compared with quadruped robots of the same scale.

	Reinforcement learning with prior policy guidance for motion planning of dual-arm free-floating space robot
	Yuxue Cao, Shengjie Wang, Xiang Zheng, Wenke Ma, Xinru Xie, Lei Liu
	AST (Top journal in Astronautics), 2023
	ArXiv / Code
	We propose a novel algorithm, EfficientLPT, to help RL-based methods improve planning accuracy efficiently. Our core contributions are constructing a mixed policy with prior knowledge guidance and introducing the infinity norm to build a more reasonable reward function.

	Towards Human-Level Bimanual Dexterous Manipulation with Reinforcement Learning
	Yuanpei Chen, Tianhao Wu, Shengjie Wang, Xidong Feng, Jiechuang Jiang, Stephen Marcus McAleer, Hao Dong, Zongqing Lu, Song-chun Zhu, Yaodong Yang
	NeurIPS, 2022
	Project Page / ArXiv / Code
	We propose a bimanual dexterous manipulation benchmark (Bi-DexHands) according to literature from cognitive science for comprehensive reinforcement learning research.

	Collision-Free Trajectory Planning for a 6-DoF Free Floating Space Robot via Hierarchical Decoupling Optimization
	Shengjie Wang, Yuxue Cao, Xiang Zheng, Tao Zhang
	IEEE RA-L, 2022
	Project Page / Paper / Code
	We developed a model-free Hierarchical Decoupling Optimization (HDO) algorithm to realize 6D-pose multi-target trajectory planning for the free-floating space robot.

Experience

	Tsinghua University, China
	2019.09 - Present
	Master's Student and PhD Student
	Advisors: Prof. Yang Gao and Prof. Tao Zhang (IET Fellow, Head of Department)

	Beijing Institute of Technology, China
	2015.09 - 2019.07
	Undergraduate Student
	Advisors: Prof. Qing Shi (IEEE Senior Member) and Prof. Toshio Fukuda (2020 IEEE President)

Template stolen from Jon Barron.
Last updated: Oct 15, 2023