Hi, I am Sitao Cheng. I am an incoming Ph.D. student at the R2L Lab, University of Waterloo, advised by Prof. Victor Zhong. Previously, I was a research scholar at the UCSB NLP Group, advised by Prof. William Wang. I also work closely with Prof. Liangming Pan and Prof. Jie Fu. I obtained my Master’s degree from Nanjing University, advised by Prof. Yuzhong Qu. I also worked as a research intern at the Microsoft DKI Group.

My research interests lie in advancing the knowledge-intensive reasoning capabilities of language models (LMs). I have experience with Language Agents, RAG, and Neural-Symbolic Reasoning. Currently, I am doing research on the following topics:

  1. Reward modeling and model generalizability (e.g., exploring reward functioning or reasoning frameworks via Reinforcement Learning).
  2. Understanding and improving reasoning capabilities (e.g., how LLMs adopt parametric and contextual knowledge for reasoning).
  3. Language Agents (e.g., reasoning in real-world environments via information retrieval and semantic parsing).

Please feel free to reach out to discuss research! Please check out my CV.

Publications

  • [ACL’25 Workshop] Understanding the Interplay between Parametric and Contextual Knowledge for Large Language Models
    Sitao Cheng, Liangming Pan, Xunjian Yin, Xinyi Wang, William Yang Wang
    [paper] [code] [homepage]

  • [ACL’24 Findings] Call Me When Necessary: LLMs can Efficiently and Faithfully Reason over Structured Environments
    Sitao Cheng, Ziyuan Zhuang, Yong Xu, Fangkai Yang, Chaoyun Zhang, Xiaoting Qin, Xiang Huang, Ling Chen, Qingwei Lin, Dongmei Zhang, Saravan Rajmohan, Qi Zhang
    [paper] [code]

  • [ACL’24 Oral] QueryAgent: A Reliable and Efficient Reasoning Framework with Environmental Feedback based Self-Correction
    Xiang Huang*, Sitao Cheng*, Shanshan Huang, Jiayu Shen, Yong Xu, Chaoyun Zhang, Yuzhong Qu
    [paper] [code]

  • [ACL’25] Disentangling Memory and Reasoning Ability in Large Language Models
    Mingyu Jin, Weidi Luo, Sitao Cheng, Xinyi Wang, Wenyue Hua, Ruixiang Tang, William Yang Wang, Yongfeng Zhang
    [paper] [code]

  • [ACL’25] RuleArena: A Benchmark for Rule-Guided Reasoning with LLMs in Real-World Scenarios
    Ruiwen Zhou, Wenyue Hua, Liangming Pan, Sitao Cheng, Xiaobao Wu, En Yu, William Yang Wang
    [paper] [code]

  • [ACL’25] TARGA: Targeted Synthetic Data Generation for Practical Reasoning over Structured Data
    Xiang Huang, Jiayu Shen, Shanshan Huang, Sitao Cheng, Xiaxia Wang, Yuzhong Qu
    [paper]

  • [EMNLP’25] Thread: A Logic-Based Data Organization Paradigm for How-To Question Answering with Retrieval Augmented Generation
    Kaikai An, Fangkai Yang, Liqun Li, Junting Lu, Sitao Cheng, Shuzheng Si, Lu Wang, Pu Zhao, Lele Cao, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang, Qi Zhang, Baobao Chang
    [paper]

  • [EMNLP’24] EfficientRAG: Efficient Retriever for Multi-Hop Question Answering
    Ziyuan Zhuang, Zhiyang Zhang, Sitao Cheng, Fangkai Yang, Jia Liu, Shujian Huang, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang, Qi Zhang
    [paper] [code]

  • [EMNLP’23] MarkQA: A Large Scale KBQA Dataset with Numerical Reasoning
    Xiang Huang, Sitao Cheng, Yuheng Bao, Shanshan Huang, Yuzhong Qu
    [paper] [code] [homepage]

  • [AAAI’23 Oral] Question Decomposition Tree for Answering Complex Questions over Knowledge Bases
    Xiang Huang, Sitao Cheng, Yiheng Shu, Yuheng Bao, Yuzhong Qu
    [paper] [code]

Recent News

  • 2024-11: Attending SoCal NLP 2024, San Diego.
  • 2024-11: Attending EMNLP 2024, Miami.
  • 2024-09: One paper accepted by EMNLP 2024.
  • 2024-08: Attending and volunteering at ACL 2024, Bangkok.
  • 2024-07: Joining UC Santa Barbara, NLP Group.
  • 2024-05: Two papers accepted by ACL 2024 (one Main + one Findings).
  • 2023-12: Attending EMNLP 2023, Singapore.
  • 2023-10: One paper accepted by EMNLP 2023.
  • 2022-11: One paper accepted by AAAI 2023.

Services

  • Reviewer: ARR, ICLR 2024
  • ACL 2024 Volunteer