Saku Sugawara
39 papers · 2017–2026 · 8 conferences · across top CS/AI conferences
Achievements
Academic Marathon (8)
Conference Polyglot (8)
Interdisciplinary Bridge
Keyword Pioneer
Cross-Pollinator (9)
Renaissance Researcher (6)
Taxonomy Completionist (60)
Dynamic Duo (20)
Keyword Champion (2)
Deep Specialist (10)
Topic Evolution
Prolific Year (8)
Unstoppable (6)
Trend Setter
Century Club (35)
Keyword Collector (154)
The Questioner (12)
Conferences
ACL (14)
EMNLP (12)
IJCNLP (4)
COLING (3)
AAAI (2)
EACL (2)
AACL (1)
CONLL (1)
Keywords
question answering (12)
language model (7)
reading comprehension (7)
natural language understanding (6)
large language model (5)
natural language inference (4)
data augmentation (3)
representation learning (3)
spurious correlation (3)
benchmark dataset (3)
benchmark evaluation (3)
language model evaluation (3)
machine reading comprehension (3)
dataset evaluation (3)
data quality (2)
multi-hop question answering (2)
generalization ability (2)
bias mitigation (2)
variational inference (2)
shortcut learning (2)
Papers
CxMP: A Linguistic Minimal-Pair Benchmark for Evaluating Constructional Understanding in Language Models
ACL 2026
Label Effects: Shared Heuristic Reliance in Trust Assessment by Humans and LLM-as-a-Judge
ACL 2026
C2: Scalable Rubric-Augmented Reward Modeling from Binary Preferences
ACL 2026
A Dual-Task Paradigm to Investigate Sentence Comprehension Strategies in Language Models
ACL 2026
Development of Numerical Error Detection Tasks to Analyze the Numerical Capabilities of Language Models
COLING 2025
Are Checklists Really Useful for Automatic Evaluation of Generative Tasks?
EMNLP 2025
Specification-Aware Machine Translation and Evaluation for Purpose Alignment
EMNLP 2025
TactfulToM: Do LLMs have the Theory of Mind ability to understand White Lies?
EMNLP 2025
MCQFormatBench: Robustness Tests for Multiple-Choice Questions
ACL 2025
Modeling Overregularization in Children with Small Language Models
ACL 2024
Rationale-Aware Answer Verification by Pairwise Self-Evaluation
EMNLP 2024
Can Language Models Induce Grammatical Knowledge from Indirect Evidence?
EMNLP 2024
What Makes Language Models Good-enough?
ACL 2024
Which Shortcut Solution Do Question Answering Models Prefer to Learn?
AAAI 2023
On Degrees of Freedom in Defining and Testing Natural Language Understanding
ACL 2023
Probing Physical Reasoning with Counter-Commonsense Context
ACL 2023
PROPRES: Investigating the Projectivity of Presupposition with Various Triggers and Environments
CONLL 2023
PROPRES: Investigating the Projectivity of Presupposition with Various Triggers and Environments
EMNLP 2023
Analyzing the Effectiveness of the Underlying Reasoning Tasks in Multi-hop Question Answering
EACL 2023
Evaluating the Rationale Understanding of Critical Reasoning in Logical Reading Comprehension
EMNLP 2023
How Well Do Multi-hop Reading Comprehension Models Understand Date Information?
IJCNLP 2022
How Well Do Multi-hop Reading Comprehension Models Understand Date Information?
AACL 2022
What Makes Reading Comprehension Questions Difficult?
ACL 2022
Possible Stories: Evaluating Situated Commonsense Reasoning under Multiple Possible Scenarios
COLING 2022
Cross-Modal Similarity-Based Curriculum Learning for Image Captioning
EMNLP 2022
Debiasing Masks: A New Framework for Shortcut Mitigation in NLU
EMNLP 2022
Look to the Right: Mitigating Relative Position Bias in Extractive Question Answering
EMNLP 2022
Can Question Generation Debias Question Answering Models? A Case Study on Question–Context Lexical Overlap
EMNLP 2021
Benchmarking Machine Reading Comprehension: A Psychological Perspective
EACL 2021
Improving the Robustness of QA Models to Challenge Sets with Variational Question-Answer Pair Generation
ACL 2021
Embracing Ambiguity: Shifting the Training Target of NLI Models
ACL 2021
What Ingredients Make for an Effective Crowdsourcing Protocol for Difficult NLU Data Collection Tasks?
IJCNLP 2021
Embracing Ambiguity: Shifting the Training Target of NLI Models
IJCNLP 2021
Improving the Robustness of QA Models to Challenge Sets with Variational Question-Answer Pair Generation
IJCNLP 2021
What Ingredients Make for an Effective Crowdsourcing Protocol for Difficult NLU Data Collection Tasks?
ACL 2021
Assessing the Benchmarking Capacity of Machine Reading Comprehension Datasets
AAAI 2020
Constructing A Multi-hop QA Dataset for Comprehensive Evaluation of Reasoning Steps
COLING 2020
What Makes Reading Comprehension Questions Easier?
EMNLP 2018
Evaluation Metrics for Machine Reading Comprehension: Prerequisite Skills and Readability
ACL 2017