My research interests include: (i) 🧠 Multimodal Reasoning and Alignment:
Multimodal LLM understanding and reasoning | Agent and RL | Backbone for multimodal alignment.
(ii) 🌍 Multimodal World Modeling:
Multimodal content generation | Video diffusion and 3DGS.
[Pinned] Looking for industry positions and internship opportunities.
A granular video anomaly detection framework that unifies the detection of multiple fine-grained anomalous objects, achieving state-of-the-art performance.
A pioneering foray into holistically embedding, relating, and perceiving heterogeneous patterns from various biomedical modalities via graph theory.
A novel approach that leverages the ambiguity and uncertainty in object boundaries to improve segmentation performance, turning traditional segmentation "flaws" into advantages.
An initial exploration into embedding customizable, imperceptible, and recoverable information within the renders produced by off-the-shelf 3D generative models, with minimal impact on the quality of the rendered content.
Co-founder & Tech Lead, ScholaGO Education Technology Company Limited
I co-founded ScholaGO Education Technology Company Limited (学旅通教育科技有限公司) to develop innovative LLM-backed educational products that transform static knowledge into immersive, interactive, multimodal experiences. We secured funding from HKSTP, HK Tech 300 government entrepreneurship funds, and Alibaba Cloud, aiming to create impactful technologies that enhance education and societal well-being.
🎨 Personal Interests
📚 Reading: I dedicate substantial time each week to reading and reflection. I am particularly passionate about history, philosophy, and sociology, which I believe have significantly broadened my perspective.
📈 Investment:
From a reinforcement learning perspective, nothing provides more verifiable rewards than precise return figures, which lets me continuously refine my decision-making policy.
From the perspective of synergy between perception and generation, I wonder whether training investment decision-making abilities (perception tasks) may benefit future entrepreneurial ventures (generation tasks). Maybe I can answer that in the future.