Tongshuang Wu
28 papers · 2019–2026 · 7 conferences · across top CS/AI conferences
Achievements
Academic Marathon (6)
Interdisciplinary Bridge
Keyword Pioneer
Conference Polyglot (7)
Cross-Pollinator (12)
Hot Topic Early Bird
Keyword Champion (2)
Topic Evolution
Unstoppable (7)
Prolific Year (8)
Century Club (27)
The Questioner (2)
Keyword Collector (118)
Conferences
ACL (12)
EMNLP (8)
IJCAI (2)
IJCNLP (2)
NAACL (2)
AAAI (1)
AACL (1)
Keywords
large language model (6)
data augmentation (4)
model evaluation (4)
question answering (4)
language model (4)
text perturbation (3)
nlp model (3)
sentiment analysis (3)
information retrieval (3)
question generation (3)
counterfactual generation (2)
natural language processing (2)
human-ai interaction (2)
retrieval-augmented generation (2)
behavioral testing (2)
human-computer interaction (2)
synthetic data generation (2)
software engineering (2)
code generation (2)
retrieval augmentation (2)
Papers
RECAP: An End-to-End Platform for Capturing, Replaying, and Analyzing AI-Assisted Programming Interactions
ACL 2026
Evaluating Mathematical Reasoning Beyond Accuracy
AAAI 2025
SPHERE: An Evaluation Card for Human-AI Systems
ACL 2025
MoR: Better Handling Diverse Queries with a Mixture of Sparse, Dense, and Human Retrievers
EMNLP 2025
cAST: Enhancing Code Retrieval-Augmented Generation with Structural Chunking via Abstract Syntax Tree
EMNLP 2025
How to Teach Programming in the AI Era? Using LLMs as a Teachable Agent for Debugging (Extended Abstract)
IJCAI 2025
SOTOPIA-S4: a user-friendly system for flexible, customizable, and large-scale social simulation
NAACL 2025
Large Language Models Help Humans Verify Truthfulness – Except When They Are Convincingly Wrong
NAACL 2024
Better Synthetic Data by Retrieving and Transforming Existing Datasets
ACL 2024
Fact-and-Reflection (FaR) Improves Confidence Calibration of Large Language Models
ACL 2024
Synthetic Multimodal Question Generation
EMNLP 2024
Designing, Evaluating, and Learning from Humans Interacting with NLP Models
EMNLP 2023
Prompt2Model: Generating Deployable Models from Natural Language Instructions
EMNLP 2023
NewsSense: Reference-free Verification via Cross-document Comparison
EMNLP 2023
Beyond Testers' Biases: Guiding Model Testing with Knowledge Bases using LLMs
EMNLP 2023
BiasX: "Thinking Slow" in Toxic Content Moderation with Explanations of Implied Social Biases
EMNLP 2023
DataFinder: Scientific Dataset Recommendation from Natural Language Descriptions
ACL 2023
Measuring Adversarial Datasets
AACL 2023
Measuring Adversarial Datasets
IJCNLP 2023
It is AI's Turn to Ask Humans a Question: Question-Answer Pair Generation for Children's Story Books
ACL 2022
Are Shortest Rationales the Best Explanations for Human Understanding?
ACL 2022
Fantastic Questions and Where to Find Them: FairytaleQA – An Authentic Dataset for Narrative Comprehension
ACL 2022
Tailor: Generating and Perturbing Text with Semantic Controls
ACL 2022
Polyjuice: Generating Counterfactuals for Explaining, Evaluating, and Improving Models
ACL 2021
Beyond Accuracy: Behavioral Testing of NLP Models with CheckList (Extended Abstract)
IJCAI 2021
Polyjuice: Generating Counterfactuals for Explaining, Evaluating, and Improving Models
IJCNLP 2021
Beyond Accuracy: Behavioral Testing of NLP Models with CheckList
ACL 2020
Errudite: Scalable, Reproducible, and Testable Error Analysis
ACL 2019