
Chenxin Li | 李宸鑫

Hi! I'm Chenxin "Jason" Li, a final-year Ph.D. candidate at The Chinese University of Hong Kong (CUHK).

My recent interest lies in scaling LLM/VLM agents for digital automation, including: (i) coding agents for computer use (Claw-Eval-Live, IRBlender-Bench, JarvisArt, Seed Auto R&D Agent) and (ii) visual coding agents for design artifacts (JarvisArt, JarvisIR, SAM-Agent, Seed code-to-chart/web). These experiences span building agent scaffolds/harnesses, task and evaluation design, trajectory rollout and distillation, as well as mid/post-training for agents.

Throughout my fulfilling years as a Master's and Ph.D. student, I have gained extensive industry exposure through internships at ByteDance Seed, Tencent AI Lab, Ant Ling, Giga AI, AMD, Hedra AI, JoinQuant, etc. Across agents, world models, and quantitative finance, I've learned to stay open and to spot cross-disciplinary opportunities. I also visited UT Austin and UMD for research.

I anticipate graduating in the summer of 2026 and am interested in industrial positions (Profile). Please feel free to reach out via email (chenxinli@link.cuhk.edu.hk) or WeChat (jasonchenxinli).

LinkedIn | Google Scholar | GitHub | X | Blogs

Selected Work
* Equal contribution, † Project Leader, ‡ Corresponding author
Claw-Eval-Live: A Live Agent Benchmark for Evolving Real-World Workflows
Chenxin Li†, Zhengyang Tang, Huangxin Lin, Yunlong Lin, Shijue Huang, Shengyuan Liu, Bowen Ye, Rang Li, Lei Li, Benyou Wang, Yixuan Yuan
Preprint

[Project] [Paper] [Code]

A live workflow-agent benchmark with refreshable demand signals and verifiable execution traces: 105 tasks across 22 categories, evaluated on 13 frontier models; the top model passes only 66.7% of tasks.

Seed-1.8: Towards Generalized Real-World Agency
ByteDance Seed Team

[Project] [Model Card]

Contributed to agent post-training for visual coding and agentic tool-use.

UI-TARS-2: Advancing GUI Agent with Multi-Turn Reinforcement Learning
ByteDance Seed Team

[Project] [Report]

Contributed to agent post-training.

Ling: Open-sourced LLM with MoE Architecture by InclusionAI
Ant Group InclusionAI Team

[Project]

Contributed to agentic memory.

Open Source for Agent Tools

I enjoy crafting agent tools that make my workflows more efficient and AI-native.

IRBlender-Bench: Evaluating Vision-Language Model Scene Understanding as Agentic Inverse Rendering
Parker Liu*, Chenxin Li*, Zhengxin Li, Yipeng Wu, Wuyang Li, Zhiqin Yang, Zhenyuan Zhang, Yunlong Lin, Sirui Han, Brandon Y. Feng
NeurIPS 2025

[Project] [Paper] [Code]

An agentic inverse-rendering framework that closes the loop from visual understanding to structured code generation, Blender execution, and environment feedback.

SAM-Agent: Empowering Interactive Image Segmentation with Multi-turn Agentic Reinforcement Learning
Shengyuan Liu*, Chenxin Li*, Liuxin Bao, Qi Yang, Wanting Geng, Boyun Zheng, Wenting Chen, Houwen Peng, Yixuan Yuan
Preprint

[Paper] [Code]

An interactive segmentation agent system that learns multi-turn correction actions with process rewards for iterative image refinement.

U-KAN Makes Strong Backbone for Image Segmentation and Generation
Chenxin Li*, Xinyu Liu*, Wuyang Li*, Cheng Wang*, Hengyu Liu, Yixuan Yuan
AAAI 2025

[Project] [Paper] [Code] Top-1 most influential paper at AAAI 2025

Integrates KAN layers into a U-shaped vision backbone for both image segmentation and generation.

JarvisIR: Elevating Autonomous Driving Perception with Intelligent Image Restoration
Yunlong Lin*, Zixu Lin*, Haoyu Chen*, Chenxin Li*, Sixiang Chen, Kairun Wen, Yeying Jin, Wenbo Li, Xinghao Ding‡
CVPR 2025

[Project] [Paper] [Code]

A restoration agent system that schedules structured recovery steps across expert modules and optimizes execution outcomes.

JarvisArt: Liberating Human Artistic Creativity via an Intelligent Photo Retouching Agent
Yunlong Lin*, Zixu Lin*, Kunjie Lin*, Chenxin Li*, Haoyu Chen, Zhongdao Wang, Xinghao Ding†, Wenbo Li, Shuicheng Yan†
NeurIPS 2025

[Project] [Paper] [Code]

An artistic editing agent system that plans multi-step retouching commands and coordinates expert models for execution-quality image refinement.

Visual Large Language Model Fine-Tuning via Simple Parameter-Efficient Modification
Mengjiao Li, Zhiyuan Ji, Chenxin Li†, Lianliang Nie, Zhiyang Li, Masashi Sugiyama
EMNLP 2024

[Project] [Paper] [Code]

A simple yet effective parameter-efficient fine-tuning method for VLM alignment.

Selected Experience
LLM Agent
  • ByteDance Seed: Agent for CLI, visual coding and GUI (Seed-1.6 / Seed-1.8 / UI-TARS-2)
  • Tencent AI Lab: Agent execution loop, code generation and environment-grounded RL (IRBlender-Bench)
  • Ant Ling: Agent memory, context compression and output verification (Ling-Pilot)
  • AMD: Visual token compression for efficient VLMs
World Model
  • Giga AI: World-model agent for 3D environments and tool-augmented training (Giga Brain-0)
  • Hedra AI: Omnimodal attention architecture for commercial digital-human generation (Hedra Character-3)
Quant, Real-world Utility
  • JoinQuant: RL post-training of LLMs for quant alpha-factor mining; agent evaluation on financial data

ScholaGO (Co-founder): LLM-backend Education Startup

Co-founded ScholaGO Education Technology Company Limited (ScholaGO Education Technology Co., Ltd.) to build LLM-powered education products that turn static content into immersive, interactive, multimodal learning experiences. Grateful to have received funding from HKSTP, HK Tech 300, and Alibaba Cloud.

Professional Activities
  • Workshop Organizer: AIM-FM: Advancements In Foundation Models Towards Intelligent Agents (NeurIPS 2024)
  • Talks: "U-KAN" at VALSE Summit (Jun 2025) and DAMTP, University of Cambridge (Jul 2024)
  • Conference Reviewer: ICLR, NeurIPS, ICML, ACL, CVPR, ICCV, ECCV, EMNLP, AAAI, ACM MM
  • Journal Reviewer: Nature Machine Intelligence, PAMI, TIP, DMLR, PR, TNNLS
Beyond Work
Reading: I dedicate substantial time to reading, especially history, philosophy, and sociology, which shapes my perspective on what AGI should be from first principles.

Investment: Investment is real-world RL: returns provide fast feedback for iteratively improving one's decision policy. Recently, I have been fascinated by (i) building benchmarks for LLMs that quantify real-world investment utility (in a spirit similar to GPT-5.2's gdpeval benchmark), and (ii) extending quantitative financial metrics to more general event and trend forecasting.