I’m currently a junior undergraduate student (since Fall 2023) at IIIS (Yao Class), Tsinghua University, pursuing a Bachelor’s degree in Computer Science and Technology.
I welcome any collaboration or discussion, whether with seniors or peers. Please feel free to reach out!
Some picture options (I'll try to keep these up to date):
Inspired by Pieter Abbeel's homepage. Photos were taken within the past year.
Giving a talk on my recent work (first from the right) 🗣️
Eating 😋
Hanging out with friends (second from left) 🤣
Cuddling my dog at home 🐕
Research Interests
My research goal is to develop fundamental models with an intrinsic understanding of the world and apply them to achieve general decision intelligence. Currently, my research interests include:
World Models: Visual World Models, Object-Centric World Models, Grounding Foundation Models (e.g., Video Diffusion Models, LLMs) to World Models.
Recently, I have become super interested in understanding the theoretical foundations of machine learning and robotics, especially for generative modeling, sequence prediction, and robot learning.
News
[Aug. 2025] 🐋 SURGE is accepted to EMNLP 2025 Main, with a top 0.3% meta score!
[May 2025] 🔥 I became a member of the Sparking Program, the most prestigious and selective academic organization for students at Tsinghua University (top 1%).
[May 2025] 📈 TrajWorld is accepted to ICML 2025.
[Nov. 2024] 🏆 Honored to receive the Comprehensive Excellence Award of Tsinghua.
[Nov. 2024] 🏆 Glad to receive the Outstanding Sports Scholarship of Tsinghua.
Education
B.S. in Computer Science, Tsinghua University, 2023-2027 (expected). Institute for Interdisciplinary Information Sciences (Yao Class), Tsinghua University.
GPA: 3.93/4.00, Rank: 10/92.
Selected Courses: Natural Language Processing (A+), Algebra and Computation (A+, Top 1), Fundamentals of Programming (A+), Multi-modal Machine Learning (A), Deep Learning (A), Computer Vision (A), Introduction to Computer Systems (A).
More Selected Courses: Basic Principles of Marxism (A+), The History of Western Music (A+), Discrete Mathematics II (A), Fundamentals of Computer Science (A), Advanced Topics in Linear Algebra (A), Calculus-A II (A), Physics I (A).
We investigated the feasibility of using language models as text-based world models. Through an empirical study, we found that performance is greatly hindered by overly long chains of thought (CoTs), and we proposed DreamFactory, a novel architecture that addresses this issue.
We introduce ManiGen, a generative simulation pipeline that uses ManiSkill to automate task creation. It uses LLMs to propose tasks, generate scenes, and produce task-specific code for rewards, parameters, and metrics.
1. Designed and implemented a PostgreSQL-based course-sharing platform, using Scala for the backend and React for the frontend.
2. Utilized Stable Diffusion 2 and Llama 2 API to enhance the user experience.
A 2D Stickman vs CAD-themed game, developed using Unity. In this game, players take the form of stick figures and explore a world inside a CAD program through movement, skills, and various interactions.
We propose “Watch-and-Learn”, a multimodal framework that efficiently enhances MLLMs' reasoning abilities in counting tasks by integrating function calls.