Hello! I’m Guanwen Xie, a second-year graduate student at Tsinghua University. Before this, I obtained a B.Eng. in Ocean Technology from Zhejiang University.

Driven by rapid advances in multimodal large language models (LLMs) and large-scale reinforcement learning, embodied intelligence has made significant progress. However, underwater robotics remains relatively underexplored and lacks generalizable, unified solutions. In my prior work, I have investigated the application of reinforcement learning, learning from demonstrations (LfD), and multimodal LLMs to both terrestrial and underwater robotic systems, with the goal of developing methods that are robust, adaptable, and generalizable across diverse tasks and dynamic environments. I continue to explore new research frontiers and am committed to making my research outcomes as broadly applicable and open-source as possible, to benefit the wider research community.

Feel free to email me at xgw24@mails.tsinghua.edu.cn or gwxie360@outlook.com!

📝 Publications

(* denotes equal contribution)


ICRA 2026 (in submission)

EasyUUV: An LLM-Enhanced Universal and Lightweight Sim-to-Real Reinforcement Learning Framework for UUV Attitude Control

Guanwen Xie*, Jingzehua Xu*, Jiwei Tang, Yubo Huang, Shuai Zhang and Xiaofan Li

[Website&Video&Code] [PDF] [BibTeX]

TLDR: EasyUUV is an LLM-enhanced, lightweight RL framework for robust UUV attitude control. It combines parallel RL training, an adaptive S-Surface controller, and multimodal LLM-based real-time parameter tuning using visual/text feedback, enabling zero-shot adaptation. Validated on a low-cost (<1000 USD) 6-DoF platform, it excels in diverse underwater conditions, including tank experiments and sea trials in Shenzhen Bay Park.

ICRA 2026 (in submission)

Ocean Diviner: A Diffusion-Augmented Reinforcement Learning for AUV Robust Control in the Underwater Tasks

Weiyi Liu*, Jingzehua Xu*, Guanwen Xie* and Yi Li

[Arxiv] [BibTeX]

TLDR: This paper introduces a diffusion-augmented RL method for robust AUV control, combining diffusion-based trajectory planning and hybrid learning for improved adaptability, efficiency, and performance in dynamic underwater environments.

IEEE Transactions on Mobile Computing (Major Revision)

Never too Cocky to Cooperate: An FIM and RL-based USV-AUV Collaborative System for Underwater Tasks in Extreme Sea Conditions

Jingzehua Xu*, Guanwen Xie*, Jiwei Tang, Yimian Ding, Weiyi Liu, Junhao Huang, Shuai Zhang and Yi Li

[Arxiv] [BibTeX]

TLDR: This paper presents a USV–AUV collaborative system that enhances underwater task performance under extreme sea conditions through optimized USV path planning and reinforcement learning–based multi-AUV coordination, showing clear performance gains in experiments.

IEEE/RSJ IROS 2025

Never too Prim to Swim: An LLM-Enhanced RL-based Adaptive S-Surface Controller for AUVs under Extreme Sea Conditions

Guanwen Xie*, Jingzehua Xu*, Yimian Ding, Zhi Zhang, Shuai Zhang, Yi Li

[Website&Video&Code] [Arxiv] [BibTeX]

TLDR: This paper introduces an LLM-enhanced, RL-based adaptive S-Surface controller for AUVs, with parameters and rewards optimized via LLMs. It handles nonlinearities and disturbances in extreme sea conditions by converting high-level RL commands through the S-Surface controller, giving robust, multi-objective adaptability.

IEEE Journal of Biomedical and Health Informatics

Leveraging LLMs for Personalized Parkinson’s Disease Treatment

Rongqian Zhang*, Guanwen Xie*, Jie Ying, Zhongsheng Hua

[Code&SUPP PDF] [TechRxiv] [BibTeX]

TLDR: Our study designs and validates a decision-making system for Parkinson’s Disease (PD) treatment based on large language models (LLMs), addressing the challenges of effectiveness, reliability, and interpretability.

AAAI 2025 (student abstract)

LLMs as Efficient Reward Function Searchers for Custom-Environment MORL

Guanwen Xie, Jingzehua Xu, Yiyuan Yang, Yimian Ding, Shuai Zhang

[Website&Code] [Online] [Arxiv] [BibTeX]

TLDR: An efficient reward function searcher using LLMs (ERFSL) is achieved by decomposing multi-objective tasks to provide clear textual task feedback, utilizing the strong semantic understanding capabilities of LLMs, and incorporating versatile search strategies.

IEEE Transactions on Mobile Computing

Is FISHER All You Need in The Multi-AUV Underwater Target Tracking Task?

Guanwen Xie*, Jingzehua Xu*, Ziqi Zhang, Xiangwang Hou, Dongfang Ma, Shuai Zhang, Yong Ren, Dusit Niyato

[IEEE Xplore] [Arxiv] [BibTeX]

TLDR: A sim2sim expert-trajectory transformation process facilitates the generation of demonstrations. Building on this, a two-stage framework called FISHER, which combines MADAC imitation learning (IL) and MAIGDT offline reinforcement learning (ORL), is employed to obtain highly generalizable and applicable multi-AUV target tracking policies without designing reward functions.

IEEE Internet of Things Journal

UPEGSim: An RL-Enabled Simulator for Unmanned Underwater Vehicles Dedicated in the Underwater Pursuit-Evasion Game

Jingzehua Xu*, Guanwen Xie*, Zekai Zhang, Xiangwang Hou, Shuai Zhang, Yong Ren, Dusit Niyato

[IEEE Xplore] [BibTeX]

TLDR: A highly customizable, high-precision physics simulation platform called UPEGSim has been developed based on ROS and Gazebo. Combined with an RL compatibility layer based on the RoboEnv class, it can be easily applied to a variety of RL tasks. On this platform, an efficient training framework combining scenario-transfer RL training and offline reinforcement learning is used for the underwater pursuit-evasion game.

Arxiv

USV-AUV Collaboration Framework for Underwater Tasks under Extreme Sea Conditions

Jingzehua Xu*, Guanwen Xie*, Xinqi Wang, Yimian Ding, Shuai Zhang

[Code] [Arxiv] [BibTeX]

TLDR: The USV-AUV collaborative framework improves AUV positioning accuracy in extreme sea conditions. The USV and AUVs cooperatively execute a data collection task for the Internet of Underwater Things (IoUT).

  • Environment and Energy-Aware AUV-Assisted Data Collection for the Internet of Underwater Things, Zekai Zhang*, Jingzehua Xu*, Guanwen Xie, Jingjing Wang, Zhu Han and Yong Ren. IEEE Internet of Things Journal [IEEE Xplore] [PDF] [BibTeX]

  • Multi-AUV Assisted Seamless Underwater Target Tracking Relying on Deep Learning and Reinforcement Learning, Jingzehua Xu*, Yimian Ding*, Zekai Zhang, Guanwen Xie, Ziyuan Wang, Yongming Zeng and Gang Li. IEEE WCCI 2024 [PDF] [BibTeX] [News Coverage (in Chinese)]

  • Fisher-Information-Matrix-Based USBL Cooperative Location in USV–AUV Networks, Ziyuan Wang, Jingzehua Xu, Yuanzhe Feng, Yijing Wang, Guanwen Xie, Xiangwang Hou, Wei Men, and Yong Ren. Sensors 2023 [Online Publication] [PDF] [BibTeX]

🥇 Selected Honors and Awards

Scholarship

  • 2024 Outstanding Graduate (awarded to top undergraduates in Zhejiang University)
  • 2023 Runhe Scholarship (top 1% at ZJU)
  • 2022 Zhejiang Provincial Government Scholarship (top 3% at ZJU)

Experience

  • 2023 First Prize of “Aotuo Cup” National Underwater Robot Designing Competition (top 5% of all finalists)
  • 2022 First Prize of Zhejiang Provincial Student Physics Innovation Competition (Theory)

Others

  • 2025 IROS IEEE RAS Travel Support

📖 Education & Skills

2024.09 - present, Tsinghua University, M.S. in Electronic Information

  • GPA: 3.90 / 4.00
  • Relevant courses: Artificial Intelligence in Ocean Engineering (4.0), Analysis and Processing of Measured Signals (4.0), Guidance, Navigation and Control of Dynamic Positioning Systems (4.0), Modern Sensing Technology (4.0), Technical Foundation of Intelligent Manufacturing (3.6), Frontiers and Instruments of Marine Observation Technology (4.0), Marine Optical Technology (4.0)

2020.09 - 2024.06, Zhejiang University, B. Eng. in Ocean Technology

  • GPA: 3.99 / 4.00 or score 92.5 / 100 (rank 1st / 143)
  • Relevant courses: Calculus (94), Linear Algebra (88), Probability Theory and Statistics (96), Partial Differential Equations (99), Fundamentals of Ocean Engineering Modeling (ML-related, 94), Software Development and Applications (95), Introduction to Computer Systems (93), Embedded Systems (96), Signals and Systems (97), Digital Signal Processing (98), Automatic Control Theory (95), Underwater Robot Design (96)

Programming 💻 & Debugging 🐛🔧: Python/PyTorch, C/C++, ROS, Linux (Arch Linux/Ubuntu…), $\LaTeX$

Hardware ⚙️: STM32 (HAL/Standard Peripheral Library), Raspberry Pi, ESP32 (Arduino)

Languages: English (CET-6: 600)

📊 Selected Experiences

2024.9 - present Tsinghua University, Shenzhen International Graduate School, Graduate Researcher

  • Cross-Embodiment Generalization Research for UUV 6-DoF Control: We propose a dual-layer framework combining RL (as planner) and adaptive S-Surface controllers (as executor), which directly optimizes end goals and cancels nonlinear effects and external disturbances. Multimodal LLMs enable real-time, feedback-driven tuning of controller parameters for dynamic tasks, while Isaac Lab supports parallelized RL training that completes within 2 minutes. Real-world validation on a custom UUV testbed, including tank experiments and sea trials in Shenzhen Bay Park, confirms its effectiveness. An earlier version was accepted by IROS 2025, and the latest results have been submitted to ICRA 2026.

2024.7 - 2025.4 New Jersey Institute of Technology, Department of Data Science, Research Assistant (Advisor - Prof. Shuai Zhang)

  • USV–AUV Collaborative Framework and Task Execution under Extreme Sea Conditions: The USV localizes AUVs using a cross-shaped USBL sonar array. In this context, we leverage the Fisher information matrix to derive the theoretically optimal USV position, thereby maximizing multi-AUV localization accuracy. Furthermore, we employ reinforcement learning (RL) to enable efficient, localization-based task execution while ensuring continuous communication. Preliminary work has been accepted by ICASSP 2025; the latest work is currently under major revision at IEEE TMC. The open-source simulation repository has received 30+ ⭐ on GitHub.

  • Efficient LLM Reward Searcher for MORL: An efficient reward function searcher using LLMs (ERFSL) is achieved by decomposing multi-objective tasks to provide clear textual task feedback, utilizing the strong semantic understanding capabilities of LLMs, and incorporating versatile search strategies.

2023.10 - 2024.6 Zhejiang University - Undergraduate Dissertation (Advisor - Prof. Dongfang Ma)

  • AUV Target Tracking via Learning from Demonstration on a High-Precision Simulation Platform: We built an underwater robot simulation platform based on ROS and Gazebo, featuring high-fidelity simulation and customizable modules. On this platform, we built an expert-driven, reward-function-free simulation-to-simulation (sim2sim) training framework. It uses multi-agent discriminator-actor-critic (MADAC) to align policies with demonstrations and a transformer-based offline RL (ORL) algorithm (MAIGDT) to generalize policies across diverse scenarios without reward dependencies. Experiments demonstrated superior performance in dense-obstacle environments, achieving near-expert tracking accuracy and collision avoidance.

2023.2 - 2023.6 Zhejiang University, Ocean College, Undergraduate Researcher (Advisor - Prof. Yulin Si)

  • Underwater Robot Design: Developed a compact, energy-efficient, and easy-to-control underwater robot based on a Raspberry Pi and an STM32, with navigation, obstacle avoidance, and letter and color recognition functions. A BAGAN-based data augmentation procedure was used to improve the accuracy of the recognition task. We participated in the “Aotuo Cup” National Underwater Robot Designing Competition and won the first prize.

2025.5 - 2025.8 Shenzhen Glitech Technology Co., Ltd., Intern (a subsidiary controlled by FRD Science & Technology Co., Ltd. (300602.SZ))

  • Developed the communication SDK for the Xiaoyao dexterous robotic hand (EtherCAT protocol, PDO/SDO), and deployed and improved the point-cloud-based D(R,O)-Grasp five-finger grasping algorithm in the company’s laboratory.

📑 Professional Activities

Journal Reviewer

  • 2025 IEEE Journal of Oceanic Engineering (JOE)

Conference Reviewer

  • 2026 IEEE International Conference on Robotics and Automation (ICRA)
  • 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
  • 2025 International Joint Conference on Neural Networks (IJCNN)

🧩 Miscellaneous

I have a passion for computer-related knowledge and a wide range of interests. I enjoy video editing and used to be a video uploader on bilibili, where I posted fun videos about computer knowledge and computer viruses (such as MEMZ 🤐). By June 2019, I had gained 20k+ subscribers, ranking within the top 15,000 on bilibili. During university, I accumulated two years of computer repair experience with the Electrical Volunteer Association (EVA) of Zhejiang University. I also love running: during the winter vacation of my final undergraduate year, I ran 10 kilometers every day. These experiences have taught me that no matter how difficult the situation, I have the courage to persevere!

This page was last updated on 2025/10/11 at 14:30 (UTC+0).