Yizhong Wang

32 papers · 2016–2025 · 9 conferences · across top CS/AI conferences

Achievements

🌍 Conference Polyglot (9) 🐣 Hot Topic Early Bird 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🏃 Academic Marathon (9)

🐝 Cross-Pollinator (14) 🤝 Dynamic Duo (12) 👥 Mega-Team (43) 🧬 Topic Evolution 💎 Century Club (32) ⚡ Prolific Year (5) 📈 Trend Setter 🗃️ Keyword Collector (143) 🔥 Unstoppable (10) ❓ The Questioner (4)

Conferences

ACL (13) EMNLP (5) ICLR (4) COLING (2) IJCNLP (2) NAACL (2) NIPS (2) AAAI (1) ICCV (1)

Top co-authors

Hannaneh Hajishirzi (12) Noah A. Smith (8) Sujian Li (6) Hamish Ivison (4) Pradeep Dasigi (4) Matt Gardner (3) Jungo Kasai (3) Valentina Pyatkin (3) Yeganeh Kordi (3) Sameer Singh (3)

Research topics

Reinforcement Learning (1)

Keywords

language model (6) large language model (5) instruction tuning (4) numerical reasoning (3) instruction following (3) zero-shot learning (3) natural language processing (2) character-level embedding (2) token embedding (2) synthetic data (2) information retrieval (2) data augmentation (2) in-context learning (2) question answering (2) reward model (2) machine reading comprehension (2) probing analysis (2) model evaluation (2) factual knowledge (2) parameter-efficient fine-tuning (2)

Papers

Evaluating Language Models as Synthetic Data Generators · ACL 2025
Hybrid Preferences: Learning to Route Instances for Human vs. AI Feedback · ACL 2025
Synthetic Data in the Era of Large Language Models · ACL 2025
Packing Analysis: Packing Is More Appropriate for Large Models or Datasets in Supervised Fine-tuning · ACL 2025
Language Models over Large-Scale Knowledge Base: on Capacity, Flexibility and Reasoning for New Facts · COLING 2025
TurkingBench: A Challenge Benchmark for Web Agents · NAACL 2025
Retrieval Head Mechanistically Explains Long-Context Factuality · ICLR 2025
Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback · NIPS 2024
OLMo: Accelerating the Science of Language Models · ACL 2024
BTR: Binary Token Representations for Efficient Retrieval Augmented Language Models · ICLR 2024
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection · ICLR 2024
Set the Clock: Temporal Alignment of Pretrained Language Models · ACL 2024
How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources · NIPS 2023
HINT: Hypernetwork Instruction Tuning for Efficient Zero- and Few-Shot Generalisation · ACL 2023
Self-Instruct: Aligning Language Models with Self-Generated Instructions · ACL 2023
One Embedder, Any Task: Instruction-Finetuned Text Embeddings · ACL 2023
TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering · ICCV 2023
Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks · EMNLP 2022
MultiModalQA: complex question answering over text, tables and images · ICLR 2021
Probing Across Time: What Does RoBERTa Know and When? · EMNLP 2021
Automated Lay Language Summarization of Biomedical Scientific Reviews · AAAI 2021
Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics · EMNLP 2020
Do NLP Models Know Numbers? Probing Numeracy in Embeddings · EMNLP 2019
DROP: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs · NAACL 2019
Do NLP Models Know Numbers? Probing Numeracy in Embeddings · IJCNLP 2019
Bag-of-Words as Target for Neural Machine Translation · ACL 2018
Toward Fast and Accurate Neural Discourse Segmentation · EMNLP 2018
Multi-Passage Machine Reading Comprehension with Cross-Passage Answer Verification · ACL 2018
DuReader: a Chinese Machine Reading Comprehension Dataset from Real-world Applications · ACL 2018
Tag-Enhanced Tree-Structured Neural Networks for Implicit Discourse Relation Classification · IJCNLP 2017
A Two-Stage Parsing Method for Text-Level Discourse Analysis · ACL 2017
Towards Non-projective High-Order Dependency Parser · COLING 2016