Robin Jia
59 papers · 2016–2026 · 8 top CS/AI conferences
Achievements
Conference Polyglot (8)
Academic Marathon (9)
Interdisciplinary Bridge
Keyword Pioneer
Cross-Pollinator (9)
Renaissance Researcher (7)
Hot Topic Early Bird
Keyword Trendsetter Combo (4)
Conference Loyalist (20)
Dynamic Duo (12)
Topic Pioneer
Deep Specialist (12)
Topic Evolution
Trend Setter
Unstoppable (10)
Prolific Year (9)
Keyword Collector (249)
Century Club (58)
The Questioner (10)
Conferences
ACL (20)
EMNLP (20)
NAACL (9)
IJCNLP (4)
NIPS (3)
EACL (1)
ICLR (1)
ICML (1)
Research topics
Keywords
question answering
(11)
large language model
(8)
model robustness
(6)
natural language inference
(5)
language model
(5)
benchmark evaluation
(5)
adversarial training
(4)
model evaluation
(4)
in-context learning
(4)
few-shot learning
(4)
training data
(3)
out-of-distribution detection
(3)
reading comprehension
(3)
adversarial robustness
(3)
adversarial example
(3)
natural language generation
(3)
model interpretability
(3)
natural language processing
(3)
machine reading comprehension
(3)
certified robustness
(2)
Papers
Textual Steering Vectors Can Improve Visual Understanding in Multimodal Large Language Models
ACL 2026
Robust Data Watermarking in Language Models by Injecting Fictitious Knowledge
ACL 2025
Language Models Can Infer Action Semantics for Symbolic Planners from Environment Feedback
NAACL 2025
Mechanistic Interpretability of Emotion Inference in Large Language Models
ACL 2025
Why Do Some Inputs Break Low-Bit LLM Quantization?
EMNLP 2025
Verify with Caution: The Pitfalls of Relying on Imperfect Factuality Metrics
ACL 2025
TLDR: Token-Level Detective Reward Model for Large Vision Language Models
ICLR 2025
TokenSmith: Streamlining Data Editing, Search, and Inspection for Large-Scale Language Model Training and Interpretability
EMNLP 2025
Rethinking Backdoor Detection Evaluation for Language Models
EMNLP 2025
Promote, Suppress, Iterate: How Language Models Answer One-to-Many Factual Queries
EMNLP 2025
When Parts Are Greater Than Sums: Individual LLM Components Can Outperform Full Models
EMNLP 2024
Efficient End-to-End Visual Document Understanding with Rationale Distillation
NAACL 2024
Do Localization Methods Actually Localize Memorized Data in LLMs? A Tale of Two Benchmarks
NAACL 2024
Pre-trained Large Language Models Use Fourier Features to Compute Addition
NIPS 2024
Transformers Learn to Achieve Second-Order Convergence Rates for In-Context Linear Regression
NIPS 2024
Proving membership in LLM pretraining data via data watermarks
ACL 2024
Do Question Answering Modeling Improvements Hold Across Benchmarks?
ACL 2023
Are Sample-Efficient NLP Models More Robust?
ACL 2023
Estimating Large Language Model Capabilities without Labeled Test Data
EMNLP 2023
Data Curation Alone Can Stabilize In-context Learning
ACL 2023
SCENE: Self-Labeled Counterfactuals for Extrapolating to Negative Examples
EMNLP 2023
Chain-of-Questions Training with Latent Answers for Robust Multistep Question Answering
EMNLP 2023
Benchmarking Long-tail Generalization with Likelihood Splits
EACL 2023
How Predictable Are Large Language Model Capabilities? A Case Study on BIG-bench
EMNLP 2023
Contrastive Novelty-Augmented Learning: Anticipating Outliers with Large Language Models
ACL 2023
Knowledge Base Question Answering by Case-based Reasoning over Subgraphs
ICML 2022
On Continual Model Refinement in Out-of-Distribution Data Streams
ACL 2022
Analyzing Dynamic Adversarial Training Data in the Limit
ACL 2022
Question Answering Infused Pre-training of General-Purpose Contextualized Representations
ACL 2022
Generalization Differences between End-to-End and Neuro-Symbolic Vision-Language Reasoning Systems
EMNLP 2022
On the Robustness of Reading Comprehension Models to Entity Renaming
NAACL 2022
Models in the Loop: Aiding Crowdworkers with Generative Annotation Assistants
NAACL 2022
To what extent do human explanations of model behavior align with actual model behavior?
EMNLP 2021
Do Explanations Help Users Detect Errors in Open-Domain QA? An Evaluation of Spoken vs. Visual Explanations
ACL 2021
The statistical advantage of automatic NLG metrics at the system level
ACL 2021
Evaluation Examples are not Equally Informative: How should that change NLP Leaderboards?
ACL 2021
Dynaboard: An Evaluation-As-A-Service Platform for Holistic Next-Generation Benchmarking
NIPS 2021
Evaluation Examples are not Equally Informative: How should that change NLP Leaderboards?
IJCNLP 2021
The statistical advantage of automatic NLG metrics at the system level
IJCNLP 2021
Do Explanations Help Users Detect Errors in Open-Domain QA? An Evaluation of Spoken vs. Visual Explanations
IJCNLP 2021
Dynabench: Rethinking Benchmarking in NLP
NAACL 2021
Swords: A Benchmark for Lexical Substitution with Improved Data Coverage and Quality
NAACL 2021
Masked Language Modeling and the Distributional Hypothesis: Order Word Matters Pre-training for Little
EMNLP 2021
Improving Question Answering Model Robustness with Synthetic Adversarial Data Generation
EMNLP 2021
Robustness and Adversarial Examples in Natural Language Processing
EMNLP 2021
With Little Power Comes Great Responsibility
EMNLP 2020
Selective Question Answering under Domain Shift
ACL 2020
Robust Encodings: A Framework for Combating Adversarial Typos
ACL 2020
On the Importance of Adaptive Data Collection for Extremely Imbalanced Pairwise Tasks
EMNLP 2020
Document-Level N-ary Relation Extraction with Multiscale Representation Learning
NAACL 2019
MRQA 2019 Shared Task: Evaluating Generalization in Reading Comprehension
EMNLP 2019
Proceedings of the 2nd Workshop on Machine Reading for Question Answering
EMNLP 2019
Certified Robustness to Adversarial Word Substitutions
EMNLP 2019
Certified Robustness to Adversarial Word Substitutions
IJCNLP 2019
Know What You Don't Know: Unanswerable Questions for SQuAD
ACL 2018
Proceedings of the Workshop on Machine Reading for Question Answering
ACL 2018
Delete, Retrieve, Generate: a Simple Approach to Sentiment and Style Transfer
NAACL 2018
Adversarial Examples for Evaluating Reading Comprehension Systems
EMNLP 2017
Data Recombination for Neural Semantic Parsing
ACL 2016