Yiqing Xie's Personal Webpage
Language Technologies Institute, CMU. yiqingxi@andrew.cmu.edu
Rm 6413, Gates and Hillman Centers
4902 Forbes Ave
Pittsburgh, PA 15213, USA
I am a fourth-year Ph.D. student at the Language Technologies Institute of Carnegie Mellon University, where I work with Carolyn Rosé and Daniel Fried on coding agent training and evaluation. Previously, I obtained my Master's degree in the data mining group at the University of Illinois Urbana-Champaign, supervised by Jiawei Han, and my Bachelor's degree at the Hong Kong University of Science and Technology, where I received the Academic Achievement Medal.
My research mainly focuses on synthetic training data construction and automatic evaluation, especially for coding agents. The topics include: (i) constructing scalable synthetic training data for coding agents; (ii) training coding agents to generalize across diverse tasks; (iii) improving coding agents with auxiliary models and evaluation benchmarks.
Scalable Synthetic Training Data
- Scalable training environment construction [RepoST, COLM’25] [Hybrid-Gym, ICML’26]
- Model-generated training signals [FenCE, ACL’25]
- Pretraining & continuous pretraining [Anchor-DR, SIGIR’23] [METRO-T0, ACL’23]
- Data augmentation [Eider, ACL’22] [Anchor-DR, SIGIR’23] [CMTrans, EMNLP’23] [FenCE, ACL’25]
- Guidance under heuristic metrics or prior knowledge [RL-MMR, EMNLP’20] [AlaGCN, IJCAI’20] [KoMen, WWW’22]
Training Models for Task Generalization
- Training coding agents for task generalization [Hybrid-Gym, ICML’26]
- Training NLP and graph-based methods for task generalization and easy-to-hard generalization [MetaLint] [Anchor-DR, SIGIR’23] [METRO-T0, ACL’23] [KoMen, WWW’22]
Auxiliary Model Training and Benchmark Construction
- Evaluation benchmarks and frameworks [DocLens, ACL’24] [CodeRAG-Bench, NAACL’25] [TheAgentCompany, NeurIPS’25] [RepoST, COLM’25]
- Auxiliary models in training and inference [FenCE, ACL’25] [SACL, EMNLP’25] [Strong-Weak-Colab, EMNLP’25]
News
| May 15, 2026 | Proposed my PhD thesis on “Synthetic Training Data Construction for Coding Agents” 🦸🏻♀️!!! |
|---|---|
| Apr 30, 2026 | Hybrid-Gym got accepted to ICML 2026! (Hybrid-Gym) |
| Feb 19, 2026 | New preprint on coding agent training for task generalization! (Hybrid-Gym) |
| Nov 11, 2025 | Gave a guest lecture in CMU 11-891 Neural Code Generation! (slides) |
| Aug 26, 2025 | Started TA-ing for 11-891 Neural Code Generation!! |
| Aug 20, 2025 | Two papers on repo-level code generation got accepted to EMNLP 2025!! (SACL, Strong-Weak-Colab) |
| Jul 7, 2025 | Our paper on synthetic coding environment construction for repo-level code generation got accepted to COLM 2025! (RepoST) |
| Jun 25, 2025 | Really excited about our two new preprints on analysis for code generation! (SACL, Strong-Weak-Colab) |
| May 15, 2025 | One paper on factuality evaluator training got accepted to ACL 2025! (FenCE 🚧) |
| Apr 9, 2025 | Gave a talk about repo-level coding environment construction at the EFML Reading Group (Stanford / UW)! |
Education
Work experience
- Meta AI
2024.05 - 2024.10 - Research Intern, GenAI
- Worked on training a fine-grained critique-based evaluator model and using it to improve generators' factuality [FenCE, ACL'25]
- Manager: Hejia Zhang; Peers: Di Jin, Sinong Wang
- Microsoft Research Redmond
2023.06 - 2023.08 - Research Intern, Health Futures
- Worked on a multi-aspect fine-grained evaluation framework for medical text generation [DocLens, ACL'24]
- Managers: Sheng Zhang, Hao Cheng, Hoifung Poon
- Microsoft Research Redmond
2022.05 - 2022.08 - Research Intern, Productivity and Intelligence group
- Worked on continuously pre-trained models for zero-shot dense retrieval [Anchor-DR, SIGIR'23]
- Manager: Chenyan Xiong
- Alibaba DAMO Academy
2020.07 - 2021.02 - Research Intern, Data Analytics and Intelligence Lab
- Worked on few-shot interaction recommendation under multiple scenarios [KoMen, WWW'22]
- Managers: Yaliang Li, Bolin Ding
Honors and Awards
Additional Information
Selected publications
- [ICML] Hybrid-Gym: Training Coding Agents to Generalize Across Tasks. In Forty-third International Conference on Machine Learning (ICML, 2026)
- [COLM] RepoST: Scalable Repository-Level Coding Environment Construction with Sandbox Testing. In Proceedings of the 2nd Conference on Language Modeling (COLM, 2025)
- [ACL] Improving Model Factuality with Fine-grained Critique-based Evaluator. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL, 2025)
- [ACL] DocLens: Multi-aspect Fine-grained Evaluation for Medical Text Generation. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL, 2024)
- [EMNLP Findings] Data Augmentation for Code Translation with Comparable Corpora and Multiple References. In Findings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP Findings, 2023)
- [SIGIR] Unsupervised Dense Retrieval Training with Web Anchors. In Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR, 2023)
- [ACL Findings] Eider: Evidence-enhanced Document-level Relation Extraction. In Findings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL Findings, 2022)
- [WWW] KoMen: Domain Knowledge-Guided Few-Shot Interaction Recommendation on Multiplex Networks. In Proceedings of the Web Conference (WWW, 2022)
- [IJCAI] When Do GNNs Work: Understanding and Improving Neighborhood Aggregation. In Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence (IJCAI, 2020)
- [WWW] Guiding Corpus-Based Set Expansion by Auxiliary Sets Generation and Co-Expansion. In Proceedings of The Web Conference (WWW, 2020)