I am a research scientist on the pre-training team at Qwen (Alibaba), focusing on improved model architectures and training algorithms for large language models.

I received my Ph.D. in computer science from the Institute for Interdisciplinary Information Sciences at Tsinghua University in 2026, advised by Professor Andrew Chi-Chih Yao, recipient of the 2000 A.M. Turing Award. I received my B.S. degree in artificial intelligence from Peking University in 2021, advised by Professor Liwei Wang.

My research lies at the intersection of theoretical and applied machine learning. On the theoretical side, I am interested in establishing provable guarantees for the generalization and optimization of machine learning algorithms. On the empirical side, I have hands-on experience with large-scale LLM pre-training and am committed to designing efficient optimization algorithms that improve scalability and performance in pre-training.

I also have in-depth practical experience in quantitative research, including internships at Citadel Securities and Jump Trading.

My previous work includes:

  • Efficient and stable optimizers for LLM pre-training.
  • Adaptation of LLMs, e.g., parameter-efficient fine-tuning and scalable model merging.
  • Generalization guarantees, implicit bias, and corresponding empirical signals in machine learning.
  • Upper and lower convergence bounds for optimization algorithms on structured problems.

Experience

  • Qwen (Alibaba) (2026 – present)
    Research Scientist, Pre-training Team
    Working on improved model architectures and training algorithms for large language models.

  • Citadel Securities (Jun. 2025 – Sept. 2025)
    Quantitative Research Intern
    Built LLM pipelines to extract signals and construct alphas from text-based alternative data. Received a return offer.

  • Moonshot AI (Feb. 2025 – Jun. 2025)
    Machine Learning Intern, Pre-training Team
    Developed efficient and stable optimization algorithms (e.g., Muon and its variants) for LLM pre-training.

  • Jump Trading (Jun. 2024 – Aug. 2024)
    Quantitative Research Intern
    Conducted alpha analysis for China’s stock market.

Publications