Chenxin Li | 李宸鑫
Hi! I'm Chenxin "Jason" Li, a final-year Ph.D. candidate at The Chinese University of Hong Kong (CUHK).
I work on scaling the agentic stack of LLMs/VLMs for digital automation:
(i) CLI agents that operate creative software (HuggingFace Spaces, Photoshop, Blender, etc.) and automate model-training pipelines (IR3D-Bench, JarvisArt, JarvisIR, Doubao RSI Flywheel); (ii) visual coding agents that generate charts, web pages, and structured visualizations, i.e., code-to-chart/web (Seed 1.8); (iii) GUI agents for screen-level software automation (UI-TARS-2). Incidentally, these directions parallel Anthropic's bets on Claude Code/Cowork, Claude Design, and Claude Computer Use.
Throughout my Master's and Ph.D. years, I have gained extensive industry exposure through internships at ByteDance Seed, Tencent AI, Ant Ling, Giga AI, AMD, Hedra AI, JoinQuant, and others. Spanning agents, world models, and quant, this trajectory has trained me to stay open to change and to spot emerging opportunities from a cross-disciplinary perspective. I have also been a visiting researcher at UT Austin and UMD.
I anticipate graduating in the summer of 2026 and am interested in industry positions (Profile). Please feel free to reach out via email (chenxinli@link.cuhk.edu.hk) or WeChat (jasonchenxinli).
LinkedIn |
Scholar |
GitHub |
X
Selected Work
* Equal contribution, † Project Leader, ‡ Corresponding author
IR3D-Bench: Evaluating Vision-Language Model Scene Understanding as Agentic Inverse Rendering
Parker Liu*, Chenxin Li*, Zhengxin Li, Yipeng Wu, Wuyang Li, Zhiqin Yang, Zhenyuan Zhang, Yunlong Lin, Sirui Han, Brandon Y. Feng
NeurIPS 2025
[Project] [Paper] [Code]
An agentic inverse-rendering framework that closes the loop from visual understanding to structured code generation, Blender execution, and environment feedback.
Selected Experience
LLM Agent
- ByteDance Seed: Agent for CLI, visual coding and GUI (Seed-1.6 / Seed-1.8 / UI-TARS-2)
- Tencent AI Lab: Agent execution loop, code generation and environment-grounded RL (IR3D-Bench)
- Ant Ling: Agent memory, context compression and output verification (Ling-Pilot)
- AMD: Visual token compression for efficient VLMs
World Model
- Giga AI: World-model agent for 3D environments and tool-augmented training (Giga Brain-0)
- Hedra AI: Omnimodal attention architecture for commercial digital-human generation (Hedra Character-3)
Quant, Real-world Utility
- JoinQuant: RL post-training of LLMs for quant alpha-factor mining; agent evaluation on financial data
ScholaGO (Co-founder): LLM-backed Education Startup
Co-founded ScholaGO Education Technology Company Limited (学旅通教育科技有限公司) to build LLM-powered education products that turn static content into immersive, interactive, multimodal learning experiences. Grateful for funding from HKSTP, HK Tech 300, and Alibaba Cloud.
Professional Activities
- Workshop Organizer: AIM-FM: Advancements In Foundation Models Towards Intelligent Agents (NeurIPS 2024)
- Talks: "UKAN" at VALSE Summit (Jun 2025) and DAMTP, University of Cambridge (Jul 2024)
- Conference Reviewer: ICLR, NeurIPS, ICML, CVPR, ICCV, ECCV, EMNLP, AAAI, ACM MM, MICCAI, BIBM
- Journal Reviewer: Nature Machine Intelligence, PAMI, TIP, DMLR, PR, TNNLS
Beyond Work
Reading: I dedicate substantial time to reading, especially history, philosophy, and sociology, which shapes my first-principles perspective on what AGI should be.
Investment: Investment is real-world RL: returns provide fast feedback for iteratively improving one's decision policy. Recently, I have been fascinated by (i) building benchmarks for LLMs that quantify real-world investment utility (in a spirit similar to GPT-5.2's gdpeval benchmark), and (ii) extending quantitative financial metrics to more general event and trend forecasting.
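The "investment as RL" framing above can be sketched as a toy bandit loop, where each realized return acts as a reward that updates a running value estimate per asset. Everything here is hypothetical for illustration: the asset names, the fixed per-period returns, and the epsilon-greedy policy are assumptions, not a description of any actual strategy.

```python
import random

def update_policy(values, counts, asset, reward):
    # Incremental-mean update: fold a realized return (the "reward")
    # into the running value estimate for the chosen asset.
    counts[asset] += 1
    values[asset] += (reward - values[asset]) / counts[asset]

def choose(values, epsilon=0.1):
    # Epsilon-greedy policy: occasionally explore a random asset,
    # otherwise exploit the current best estimate.
    if random.random() < epsilon:
        return random.choice(list(values))
    return max(values, key=values.get)

# Toy loop: hypothetical fixed per-period returns stand in for market feedback.
values = {"A": 0.0, "B": 0.0}
counts = {"A": 0, "B": 0}
returns = {"A": 0.01, "B": 0.03}  # assumed average returns, not real data
random.seed(0)
for _ in range(200):
    asset = choose(values)
    update_policy(values, counts, asset, returns[asset])
```

The fast-feedback point is the loop itself: each period's return immediately nudges the value estimates, so the policy self-corrects without waiting for a long evaluation cycle.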