Chenxin Li | 李宸鑫

Hi! I'm Chenxin "Jason" Li, a final-year Ph.D. candidate at The Chinese University of Hong Kong (CUHK). I work on multimodal LLMs, reasoning and agents via RL, and world models.

I am currently interning at ByteDance Seed, scaling VLMs via reasoning/agentic RL. I have built hands-on experience in (i) scaling multimodal models (data, architecture, training, benchmarking) and (ii) post-training via RL (reasoning, multi-turn agents, reward modeling and shaping). Previously, I interned at Tencent AI, Ant Ling, and Hedra AI, among others, and did research visits at UT Austin and UMD.

I anticipate graduating in the summer of 2026 and am interested in industry positions (Profile). Please feel free to reach out via email (chenxinli@link.cuhk.edu.hk) or WeChat (jasonchenxinli).

LinkedIn | Google Scholar | GitHub | X

Selected Publications
* Equal contribution, † Project Leader, ‡ Corresponding author
IR3D-Bench: Evaluating Vision-Language Model Scene Understanding as Agentic Inverse Rendering
Parker Liu*, Chenxin Li*, Zhengxin Li, Yipeng Wu, Wuyang Li, Zhiqin Yang, Zhenyuan Zhang, Yunlong Lin, Sirui Han, Brandon Y. Feng
NeurIPS 2025

[Project] [Paper] [Code]

Evaluating the scene understanding capabilities of VLMs via inverse rendering tasks.

InfoBridge: Balanced Multimodal Alignment by Maximizing Cross-modal Conditional Mutual Information
Chenxin Li, Yifan Liu, Xinyu Liu, Wuyang Li, Hengyu Liu, Cheng Wang, Weihao Yu, Yunlong Lin, Yixuan Yuan
ICCV 2025

[Project] [Paper] [Code]

Enhancing multimodal alignment by maximizing cross-modal conditional mutual information.

U-KAN Makes Strong Backbone for Image Segmentation and Generation
Chenxin Li*, Xinyu Liu*, Wuyang Li*, Cheng Wang*, Hengyu Liu, Yixuan Yuan
AAAI 2025

[Project] [Paper] [Code] Top-1 most influential paper of AAAI 2025

Integrating Kolmogorov-Arnold Network (KAN) layers into vision backbones for segmentation and generation.

JarvisIR: Elevating Autonomous Driving Perception with Intelligent Image Restoration
Yunlong Lin*, Zixu Lin*, Haoyu Chen*, Chenxin Li*, Sixiang Chen, Kairun Wen, Yeying Jin, Wenbo Li, Xinghao Ding‡
CVPR 2025

[Project] [Paper] [Code]

JarvisIR is a VLM-powered intelligent system that dynamically schedules expert models for restoration.

JarvisArt: Liberating Human Artistic Creativity via an Intelligent Photo Retouching Agent
Yunlong Lin*, Zixu Lin*, Kunjie Lin*, Chenxin Li*, Haoyu Chen, Zhongdao Wang, Xinghao Ding†, Wenbo Li, Shuicheng Yan†
Preprint 2025

[Project] [Paper] [Code]

VLM-powered agentic photo retouching system that orchestrates expert models for professional-grade image editing.

Visual Large Language Model Fine-Tuning via Simple Parameter-Efficient Modification
Mengjiao Li, Zhiyuan Ji, Chenxin Li†, Lianliang Nie, Zhiyang Li, Masashi Sugiyama
EMNLP 2024

[Project] [Paper] [Code]

A simple yet efficient fine-tuning strategy for VLMs.

Selected Experience
  • ByteDance Seed: VLM scaling via reasoning/agentic RL
  • Tencent AI: World model simulation via Blender agent
  • Ant Ling: Long-context memory RL, hallucination verifiers
  • Hedra AI: Omnimodal (audio, image, pose) injection for video generation

ScholaGO (Co-founder): LLM-powered Education Startup

Co-founded ScholaGO Education Technology Company Limited (学旅通教育科技有限公司) to build LLM-powered education products that turn static content into immersive, interactive, multimodal learning experiences. Grateful to have received funding from HKSTP, HK Tech 300, and Alibaba Cloud.

Professional Activities
  • Workshop Organizer: AIM-FM: Advancements In Foundation Models Towards Intelligent Agents (NeurIPS 2024)
  • Talks: "U-KAN" at the VALSE Summit (Jun 2025) and at DAMTP, University of Cambridge (Jul 2024)
  • Conference Reviewer: ICLR, NeurIPS, ICML, CVPR, ICCV, ECCV, EMNLP, AAAI, ACM MM, MICCAI, BIBM
  • Journal Reviewer: Nature Machine Intelligence, PAMI, TIP, DMLR, PR, TNNLS
Beyond Work
Reading: I dedicate substantial time to reading, especially history, philosophy, and sociology, which shapes my first-principles view of what AGI should be.

Investment: Investment is real-world RL: returns provide fast feedback for iteratively improving one's decision policy. Recently, I have been fascinated by (i) building benchmarks for LLMs that quantify real-world investment utility (in a spirit similar to OpenAI's GDPval benchmark), and (ii) extending quantitative financial metrics to more general event and trend forecasting.
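
As a toy illustration of (ii), the sketch below scores hypothetical LLM event forecasts with a Brier score and applies a Sharpe ratio to a naive bet-sizing policy derived from them; the data, the policy, and the function names are all made up for illustration, not a published method.

```python
# Toy sketch (hypothetical data and functions): scoring LLM event forecasts
# with metrics borrowed from quantitative finance.
import numpy as np

def brier_score(probs, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes (lower is better)."""
    probs, outcomes = np.asarray(probs, float), np.asarray(outcomes, float)
    return float(np.mean((probs - outcomes) ** 2))

def sharpe_ratio(returns, eps=1e-9):
    """Per-period mean return divided by its standard deviation (higher is better)."""
    returns = np.asarray(returns, float)
    return float(returns.mean() / (returns.std() + eps))

# Hypothetical forecasts P(event) from an LLM, and the realized outcomes.
forecasts = [0.9, 0.2, 0.7, 0.4]
outcomes = [1, 0, 1, 1]

# Naive bet-sizing policy: stake (2p - 1) on each event; win the stake when
# the directional call is right, lose it when wrong.
stakes = 2 * np.asarray(forecasts) - 1
returns = stakes * (2 * np.asarray(outcomes) - 1)

print(f"Brier score:  {brier_score(forecasts, outcomes):.3f}")  # 0.125
print(f"Sharpe ratio: {sharpe_ratio(returns):.2f}")             # ~1.07
```

The pairing is the point: the Brier score measures the calibration of the forecasts themselves, while the Sharpe ratio measures the risk-adjusted utility of acting on them, which is closer to the real-world investment utility such a benchmark would want to capture.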