Research Interests
My research interests lie at the intersection of Robotics and Reinforcement Learning. I care about efficient, stable, and safe robot performance in the unstructured open world.
We are also passionate about releasing robotic environments, such as Bi-DexHands and SpaceRobotEnv. We hope everyone enjoys our work!
|
|
RoboMetaverse: A Large-scale Robotic Metaverse For Reinforcement Learning
Fengbo Lan*,
Chu Zhang*,
Ziyan Zhang*,
Shengjie Wang*,
Haotian Xu,
Yunzhe Zhang,
Tao Zhang
A Cool Work, 2024
Project website
/
Video
RoboMetaverse was founded by a team of robotics enthusiasts dedicated to building a highly realistic simulation platform and a cost-effective hardware platform for embodied intelligence, with the aim of contributing to the realization of AGI (Artificial General Intelligence) in embodied systems.
|
|
DexCatch: Learning to Catch Arbitrary Objects with Dexterous Hands
Fengbo Lan*,
Shengjie Wang*,
Yunzhe Zhang,
Haotian Xu,
Oluwatosin Oseni,
Yang Gao,
Tao Zhang
CoRL, 2024
Project Page
/
ArXiv
/
Code
We present a learning-based strategy for catching diverse everyday objects with dexterous hands. The learned policies show strong zero-shot transfer performance on unseen objects.
|
|
CoPa: General Robotic Manipulation through Spatial Constraints of Parts with Foundation Models
Haoxu Huang*,
Fanqi Lin*,
Yingdong Hu,
Shengjie Wang,
Yang Gao
IROS Oral, 2024
Project website
/
Twitter
/
ArXiv
/
Code
We propose Robotic Manipulation through Spatial Constraints of Parts (CoPa), a novel framework that incorporates the commonsense knowledge embedded within foundation vision-language models (VLMs), such as GPT-4V, into low-level robotic manipulation tasks.
|
|
EfficientZero V2: Mastering Discrete and Continuous Control with Limited Data
Shengjie Wang*,
Shaohuai Liu*,
Weirui Ye*,
Jiacheng You,
Yang Gao
ICML Spotlight (Top 3.5%), 2024
Twitter
/
ArXiv
/
Code
We introduce EfficientZero V2, a general framework for sample-efficient RL that outperforms state-of-the-art methods and DreamerV3 across diverse domains.
|
|
i-Octree: A Fast, Lightweight, and Dynamic Octree for Proximity Search
Jun Zhu,
Hongyi Li,
Zhepeng Wang,
Shengjie Wang,
Tao Zhang
ICRA, 2024
Project website
/
ArXiv
/
Code
We present the i-Octree, a dynamic octree data structure that supports both fast nearest neighbor search and real-time dynamic updates. It outperforms contemporary state-of-the-art approaches, achieving on average a 19% reduction in runtime on real-world open datasets.
|
|
A Policy Optimization Method Towards Optimal-time Stability
Shengjie Wang,
Fengbo Lan,
Xiang Zheng,
Yuxue Cao,
Oluwatosin Oseni,
Haotian Xu,
Tao Zhang,
Yang Gao
CoRL, 2023
Project Page
/
ArXiv
/
Code
Our approach enables the system's state to reach an equilibrium point within an optimal time and maintain stability thereafter, referred to as "optimal-time stability". To achieve this, we integrate the optimization method into the Actor-Critic framework, resulting in the Adaptive Lyapunov-based Actor-Critic (ALAC) algorithm.
|
|
Efficient Exploration Using Extra Safety Budget in Constrained Policy Optimization
Haotian Xu*,
Shengjie Wang*,
Zhaolei Wang,
Yunzhe Zhang,
Qing Zhuo,
Yang Gao,
Tao Zhang
IROS, 2023
Project Page
/
ArXiv
/
Code
Our algorithm (ESB-CPO) improves upon the trade-off between reducing constraint violations and improving expected returns in Safe Reinforcement Learning.
|
|
A Learning-based Adaptive Compliance Method for Symmetric Bi-manual Manipulation
Yuxue Cao*,
Shengjie Wang*,
Xiang Zheng,
Wenke Ma,
Tao Zhang
T-ASE (Top Journal in Automation), Under review
Project Page
/
ArXiv
/
Code
We propose a novel Learning-based Adaptive Compliance (LAC) algorithm to improve the efficiency and adaptability of symmetric bi-manual manipulation.
|
|
IMAP: Intrinsically Motivated Adversarial Policy
Xiang Zheng,
Xingjun Ma,
Shengjie Wang,
Xinyu Wang,
Chao Shen,
Cong Wang
ACM CCS (Top Conference in Security), Under review
Project Page
/
ArXiv
/
Code
We propose the Intrinsically Motivated Adversarial Policy (IMAP) for efficient black-box evasion attacks in single- and multi-agent environments without any knowledge of the victim policy.
|
|
Development of a small-sized quadruped robotic rat capable of multimodal motions
Shengjie Wang,
Qing Shi,
Junhui Gao,
Yuxuan Wang,
Fansheng Meng,
Chang Li,
Qiang Huang,
Toshio Fukuda
Journal: T-RO (Top journal in Robotics), 2023
Conference: Advanced Intelligent Mechatronics (AIM), Best Student Paper Award (Top 0.5%), 2019
IEEE Spectrum
/
EurekAlert
/
Science Times
We developed a small-sized quadruped robotic rat (SQuRo), which includes four limbs and one flexible spine. On the basis of the extracted key movement joints, SQuRo was subtly designed with a relatively elongated slim body (aspect ratio: 3.42) and smaller weight (220 g) compared with quadruped robots of the same scale.
|
|
Reinforcement learning with prior policy guidance for motion planning of dual-arm free-floating space robot
Yuxue Cao*,
Shengjie Wang*,
Xiang Zheng,
Wenke Ma,
Xinru Xie,
Lei Liu
AST (Top journal in Astronautics), 2023
ArXiv
/
Code
We propose a novel algorithm, EfficientLPT, that helps RL-based methods improve planning accuracy efficiently. Our core contributions are constructing a mixed policy guided by prior knowledge and introducing the infinity norm to build a more reasonable reward function.
|
|
Towards Human-Level Bimanual Dexterous Manipulation with Reinforcement Learning
Yuanpei Chen,
Tianhao Wu,
Shengjie Wang,
Xidong Feng,
Jiechuang Jiang,
Stephen Marcus McAleer,
Hao Dong,
Zongqing Lu,
Song-chun Zhu,
Yaodong Yang
NeurIPS, 2022
Project Page
/
ArXiv
/
Code
We propose a bimanual dexterous manipulation benchmark (Bi-DexHands), designed according to the cognitive science literature, for comprehensive reinforcement learning research.
|
|
Collision-Free Trajectory Planning for a 6-DoF Free Floating Space Robot via Hierarchical Decoupling Optimization
Shengjie Wang,
Yuxue Cao,
Xiang Zheng,
Tao Zhang
IEEE RA-L, 2022
Project Page
/
Paper
/
Code
We developed a model-free Hierarchical Decoupling Optimization (HDO) algorithm to realize 6D-pose multi-target trajectory planning for the free-floating space robot.
|
|
Tsinghua University, China
2019.09 - Present
Master's Student and PhD Student
Advisors: Prof. Yang Gao and Prof. Tao Zhang (IET Fellow, Head of Department)
|
|
Beijing Institute of Technology, China
2015.09 - 2019.07
Undergraduate Student
Advisors: Prof. Qing Shi (IEEE Senior Member) and Prof. Toshio Fukuda (2020 IEEE President)
|
Template stolen from Jon Barron.
Last updated: Oct 15, 2023