conftrace_

Saku Sugawara

39 papers · 2017–2026 · 8 conferences · across top CS/AI conferences

Achievements


🏃 Academic Marathon (8) 🌍 Conference Polyglot (8) 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🐝 Cross-Pollinator (9) 🌈 Renaissance Researcher (6) 🗺️ Taxonomy Completionist (60) 🤝 Dynamic Duo (20) 🏆 Keyword Champion (2) 🔬 Deep Specialist (10) 🧬 Topic Evolution ⚡ Prolific Year (8) 🔥 Unstoppable (6) 📈 Trend Setter 💎 Century Club (35) 🗃️ Keyword Collector (154) ❓ The Questioner (12)

Conferences

ACL (14) EMNLP (12) IJCNLP (4) COLING (3) AAAI (2) EACL (2) AACL (1) CONLL (1)

Top co-authors

Akiko Aizawa (20) Kazutoshi Shinoda (5) Xanh Ho (4) Nikita Nangia (3) Johannes Mario Meissner (3) Alex Warstadt (3) Daiki Asami (3) Akira Kawabata (3) Miyu Oba (3) Hiroki Ouchi (2)

Keywords

question answering (12) language model (7) reading comprehension (7) natural language understanding (6) large language model (5) natural language inference (4) data augmentation (3) representation learning (3) spurious correlation (3) benchmark dataset (3) benchmark evaluation (3) language model evaluation (3) machine reading comprehension (3) dataset evaluation (3) data quality (2) multi-hop question answering (2) generalization ability (2) bias mitigation (2) variational inference (2) shortcut learning (2)

Papers

CxMP: A Linguistic Minimal-Pair Benchmark for Evaluating Constructional Understanding in Language Models · ACL 2026
Label Effects: Shared Heuristic Reliance in Trust Assessment by Humans and LLM-as-a-Judge · ACL 2026
C2: Scalable Rubric-Augmented Reward Modeling from Binary Preferences · ACL 2026
A Dual-Task Paradigm to Investigate Sentence Comprehension Strategies in Language Models · ACL 2026
Development of Numerical Error Detection Tasks to Analyze the Numerical Capabilities of Language Models · COLING 2025
Are Checklists Really Useful for Automatic Evaluation of Generative Tasks? · EMNLP 2025
Specification-Aware Machine Translation and Evaluation for Purpose Alignment · EMNLP 2025
TactfulToM: Do LLMs have the Theory of Mind ability to understand White Lies? · EMNLP 2025
MCQFormatBench: Robustness Tests for Multiple-Choice Questions · ACL 2025
Modeling Overregularization in Children with Small Language Models · ACL 2024
Rationale-Aware Answer Verification by Pairwise Self-Evaluation · EMNLP 2024
Can Language Models Induce Grammatical Knowledge from Indirect Evidence? · EMNLP 2024
What Makes Language Models Good-enough? · ACL 2024
Which Shortcut Solution Do Question Answering Models Prefer to Learn? · AAAI 2023
On Degrees of Freedom in Defining and Testing Natural Language Understanding · ACL 2023
Probing Physical Reasoning with Counter-Commonsense Context · ACL 2023
PROPRES: Investigating the Projectivity of Presupposition with Various Triggers and Environments · CONLL 2023
PROPRES: Investigating the Projectivity of Presupposition with Various Triggers and Environments · EMNLP 2023
Analyzing the Effectiveness of the Underlying Reasoning Tasks in Multi-hop Question Answering · EACL 2023
Evaluating the Rationale Understanding of Critical Reasoning in Logical Reading Comprehension · EMNLP 2023
How Well Do Multi-hop Reading Comprehension Models Understand Date Information? · IJCNLP 2022
How Well Do Multi-hop Reading Comprehension Models Understand Date Information? · AACL 2022
What Makes Reading Comprehension Questions Difficult? · ACL 2022
Possible Stories: Evaluating Situated Commonsense Reasoning under Multiple Possible Scenarios · COLING 2022
Cross-Modal Similarity-Based Curriculum Learning for Image Captioning · EMNLP 2022
Debiasing Masks: A New Framework for Shortcut Mitigation in NLU · EMNLP 2022
Look to the Right: Mitigating Relative Position Bias in Extractive Question Answering · EMNLP 2022
Can Question Generation Debias Question Answering Models? A Case Study on Question–Context Lexical Overlap · EMNLP 2021
Benchmarking Machine Reading Comprehension: A Psychological Perspective · EACL 2021
Improving the Robustness of QA Models to Challenge Sets with Variational Question-Answer Pair Generation · ACL 2021
Embracing Ambiguity: Shifting the Training Target of NLI Models · ACL 2021
What Ingredients Make for an Effective Crowdsourcing Protocol for Difficult NLU Data Collection Tasks? · IJCNLP 2021
Embracing Ambiguity: Shifting the Training Target of NLI Models · IJCNLP 2021
Improving the Robustness of QA Models to Challenge Sets with Variational Question-Answer Pair Generation · IJCNLP 2021
What Ingredients Make for an Effective Crowdsourcing Protocol for Difficult NLU Data Collection Tasks? · ACL 2021
Assessing the Benchmarking Capacity of Machine Reading Comprehension Datasets · AAAI 2020
Constructing A Multi-hop QA Dataset for Comprehensive Evaluation of Reasoning Steps · COLING 2020
What Makes Reading Comprehension Questions Easier? · EMNLP 2018
Evaluation Metrics for Machine Reading Comprehension: Prerequisite Skills and Readability · ACL 2017