Yiqing Xie's Personal Webpage
Language Technologies Institute, CMU. yiqingxi@andrew.cmu.edu

Rm 6607, Gates and Hillman Centers
4902 Forbes Ave
Pittsburgh, PA 15213, USA
I am a third-year Ph.D. student at the Language Technologies Institute of Carnegie Mellon University, working with Carolyn Rosé and Daniel Fried. Previously, I obtained my Master's degree in the data mining group at the University of Illinois Urbana-Champaign, supervised by Jiawei Han, and my Bachelor's degree at the Hong Kong University of Science and Technology, where I received the Academic Achievement Medal.
My research mainly focuses on annotation-efficient generation and evaluation systems, especially for code generation. The topics include (i) building generalizable and annotation-efficient NLP systems to assist humans with practical tasks, and (ii) building reliable and automatic evaluation systems for NLP methods.
I'm looking for a summer 2025 internship / collaboration on code generation or coding agents. Feel free to reach out if you have an opportunity that aligns!
Annotation-efficient NLP Systems
- Pretraining & continuous pretraining (Anchor-DR, METRO-T0)
- Training environment (RepoST)
- Model-generated reward signals (FenCE)
- Data augmentation (FenCE, Anchor-DR, CMTrans, Eider)
- Guidance under heuristic metrics or prior knowledge (AlaGCN, RL-MMR, KoMen)
- Unsupervised or semi-supervised methods (Set-CoExpan, CoRel)
Reliable and Automatic Evaluation Systems
- Evaluation benchmarks (RepoST, TheAgentCompany, CodeRAG-Bench, CodeBenchGen)
- Evaluation frameworks (DocLens)
- Evaluator models (FenCE)
LLMs for Code Generation and Evaluation
- Code generation training (RepoST, CMTrans)
- Evaluation for code generation (RepoST, TheAgentCompany, CodeRAG-Bench, CodeBenchGen)
News
Date | News |
---|---|
Feb 19, 2025 | Really excited about our new preprint: RepoST: Scalable Repository-Level Coding Environment Construction with Sandbox Testing! (RepoST) |
Jan 22, 2025 | One paper on RAG-for-code benchmark got accepted to NAACL-Findings 2025! (CodeRAG-Bench) |
Dec 18, 2024 | Really excited about our new preprint on a benchmark for LLM agents! (TheAgentCompany) |
May 16, 2024 | One paper on medical evaluation got accepted to ACL 2024! (DocLens 🔍) |
Apr 5, 2024 | Gave a talk on our recent work on evaluation of medical text at Microsoft! |
Dec 10, 2023 | Presented our work on code translation at EMNLP 2023! (CMTrans 🔄💻) |
Education
Work experience
- Meta AI
2024.05 - 2024.10 - Research Intern, GenAI
- Worked on training a fine-grained critique-based evaluator model and using it to improve generators' factuality [FenCE]
- Manager: Hejia Zhang; Peers: Di Jin, Sinong Wang
- Microsoft Research Redmond
2023.06 - 2023.08 - Research Intern, Health Futures
- Worked on a multi-aspect fine-grained evaluation framework for medical text generation [DocLens]
- Managers: Sheng Zhang, Hao Cheng, Hoifung Poon
- Microsoft Research Redmond
2022.05 - 2022.08 - Research Intern, Productivity and Intelligence group
- Worked on continuously pre-trained models for zero-shot dense retrieval [Anchor-DR]
- Manager: Chenyan Xiong
- Alibaba DAMO Academy
2020.07 - 2021.02 - Research Intern, Data Analytics and Intelligence Lab
- Worked on few-shot interaction recommendation under multiple scenarios [KoMen]
- Managers: Yaliang Li, Bolin Ding
Honors and Awards
Additional Information
Selected publications
- [preprint] RepoST: Scalable Repository-Level Coding Environment Construction with Sandbox Testing (preprint, 2025)
- [preprint] Improving Model Factuality with Fine-grained Critique-based Evaluator (preprint, 2024)
- [ACL] DocLens: Multi-aspect Fine-grained Evaluation for Medical Text Generation. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL, 2024)
- [EMNLP Findings] Data Augmentation for Code Translation with Comparable Corpora and Multiple References. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP Findings, 2023)
- [SIGIR] Unsupervised Dense Retrieval Training with Web Anchors. In Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR, 2023)
- [ACL Findings] Eider: Evidence-enhanced Document-level Relation Extraction. In Findings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL Findings, 2022)
- [WWW] KoMen: Domain Knowledge-Guided Few-Shot Interaction Recommendation on Multiplex Networks. In Proceedings of the Web Conference (WWW, 2022)
- [IJCAI] When Do GNNs Work: Understanding and Improving Neighborhood Aggregation. In Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence (IJCAI, 2020)
- [WWW] Guiding Corpus-Based Set Expansion by Auxiliary Sets Generation and Co-Expansion. In Proceedings of The Web Conference (WWW, 2020)