Bill Yuchen Lin
65 papers · 2018–2026 · 11 conferences · across top CS/AI conferences
Achievements
Keyword Pioneer
Conference Polyglot (11)
Interdisciplinary Bridge
Taxonomy Completionist (11)
Academic Marathon (7)
Cross-Pollinator (12)
Renaissance Researcher (8)
Conference Loyalist (20)
Grand Slam
Keyword Champion (3)
Mega-Team (32)
Deep Specialist (19)
Topic Evolution
Dynamic Duo (31)
Century Club (64)
Unstoppable (8)
The Questioner
Prolific Year (6)
Keyword Collector (260)
Conferences
ACL (21)
EMNLP (14)
NAACL (10)
NIPS (5)
ICLR (4)
IJCNLP (4)
AAAI (3)
AACL (1)
CVPR (1)
EACL (1)
ICML (1)
Keywords
large language model (17)
commonsense reasoning (10)
named entity recognition (7)
language model (7)
benchmark evaluation (5)
pre-trained language model (5)
knowledge distillation (4)
knowledge graph (4)
transfer learning (3)
few-shot learning (3)
contrastive learning (3)
chain-of-thought reasoning (3)
sequence labeling (3)
vision-language model (3)
commonsense knowledge (3)
continual learning (2)
unsupervised learning (2)
text generation (2)
question answering (2)
relation extraction (2)
Papers
Temporal Sampling for Forgotten Reasoning in LLMs
ACL 2026
VL-RewardBench: A Challenging Benchmark for Vision-Language Generative Reward Models
CVPR 2025
WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild
ICLR 2025
ChatBug: A Common Vulnerability of Aligned LLMs Induced by Chat Templates
AAAI 2025
On Memorization of Large Language Models in Logical Reasoning
AACL 2025
SimulBench: Evaluating Language Models with Creative Simulation Tasks
NAACL 2025
CulturalBench: A Robust, Diverse and Challenging Benchmark for Measuring LMs' Cultural Knowledge Through Human-AI Red-Teaming
ACL 2025
SafeChain: Safety of Language Models with Long Chain-of-Thought Reasoning Capabilities
ACL 2025
Small Models Struggle to Learn from Strong Reasoners
ACL 2025
RewardBench: Evaluating Reward Models for Language Modeling
NAACL 2025
L3GO: Language Agents with Chain-of-3D-Thoughts for Generating Unconventional Objects
NAACL 2025
The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models
NAACL 2025
Stronger Models are Not Always Stronger Teachers for Instruction Tuning
NAACL 2025
The Good, The Bad, and The Greedy: Evaluation of LLMs Should Not Ignore Non-Determinism
NAACL 2025
Information-Guided Identification of Training Data Imprint in (Proprietary) Large Language Models
NAACL 2025
On Memorization of Large Language Models in Logical Reasoning
IJCNLP 2025
Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing
ICLR 2025
ZebraLogic: On the Scaling Limits of LLMs for Logical Reasoning
ICML 2025
Latent Action Pretraining from Videos
ICLR 2025
The Unlocking Spell on Base LLMs: Rethinking Alignment via In-Context Learning
ICLR 2024
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models
EMNLP 2024
VideoScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation
EMNLP 2024
WildGuard: Open One-stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs
NIPS 2024
WildVision: Evaluating Vision-Language Models in the Wild with Human Preferences
NIPS 2024
SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding
ACL 2024
Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents
ACL 2024
Agent Lumos: Unified and Modular Training for Open-Source Language Agents
ACL 2024
OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement
ACL 2024
Selective "Selective Prediction": Reducing Unnecessary Abstention in Vision-Language Reasoning
ACL 2024
Complex Reasoning in Natural Language
ACL 2023
AutoTriggER: Label-Efficient and Robust Named Entity Recognition with Auxiliary Trigger Extraction
EACL 2023
LLM-Blender: Ensembling Large Language Models with Pairwise Ranking and Generative Fusion
ACL 2023
Faith and Fate: Limits of Transformers on Compositionality
NIPS 2023
SwiftSage: A Generative Agent with Fast and Slow Thinking for Complex Interactive Tasks
NIPS 2023
On Grounded Planning for Embodied Tasks with Language Models
AAAI 2023
Unsupervised Cross-Task Generalization via Retrieval Augmentation
NIPS 2022
On Continual Model Refinement in Out-of-Distribution Data Streams
ACL 2022
Knowledge-Augmented Methods for Natural Language Processing
ACL 2022
Reflect, Not Reflex: Inference-Based Common Ground Improves Dialogue Response Quality
EMNLP 2022
On the Robustness of Reading Comprehension Models to Entity Renaming
NAACL 2022
FedNLP: Benchmarking Federated Learning Methods for Natural Language Processing Tasks
NAACL 2022
Learn Continually, Generalize Rapidly: Lifelong Knowledge Accumulation for Few-shot Learning
EMNLP 2021
RockNER: A Simple Method to Create Adversarial Examples for Evaluating the Robustness of Named Entity Recognition Models
EMNLP 2021
Common Sense Beyond English: Evaluating and Improving Multilingual Language Models for Commonsense Reasoning
IJCNLP 2021
RiddleSense: Reasoning about Riddle Questions Featuring Linguistic Creativity and Commonsense Knowledge
IJCNLP 2021
RiddleSense: Reasoning about Riddle Questions Featuring Linguistic Creativity and Commonsense Knowledge
ACL 2021
Differentiable Open-Ended Commonsense Reasoning
NAACL 2021
IsoBN: Fine-Tuning BERT with Isotropic Batch Normalization
AAAI 2021
Common Sense Beyond English: Evaluating and Improving Multilingual Language Models for Commonsense Reasoning
ACL 2021
CrossFit: A Few-shot Learning Challenge for Cross-task Generalization in NLP
EMNLP 2021
RICA: Evaluating Robust Inference Capabilities Based on Commonsense Axioms
EMNLP 2021
Probing Commonsense Explanation in Dialogue Response Generation
EMNLP 2021
Learning to Contextually Aggregate Multi-Source Supervision for Sequence Labeling
ACL 2020
LEAN-LIFE: A Label-Efficient Annotation Framework Towards Learning from Explanation
ACL 2020
TriggerNER: Learning with Entity Triggers as Explanations for Named Entity Recognition
ACL 2020
CommonGen: A Constrained Text Generation Challenge for Generative Commonsense Reasoning
EMNLP 2020
Birds have four legs?! NumerSense: Probing Numerical Commonsense Knowledge of Pre-Trained Language Models
EMNLP 2020
Scalable Multi-Hop Relational Reasoning for Knowledge-Aware Question Answering
EMNLP 2020
KagNet: Knowledge-Aware Graph Networks for Commonsense Reasoning
IJCNLP 2019
AlpacaTag: An Active Learning-based Crowd Annotation Framework for Sequence Tagging
ACL 2019
KagNet: Knowledge-Aware Graph Networks for Commonsense Reasoning
EMNLP 2019
ExtRA: Extracting Prominent Review Aspects from Customer Feedback
EMNLP 2018
Automatic Extraction of Commonsense LocatedNear Knowledge
ACL 2018
Mining Cross-Cultural Differences and Similarities in Social Media
ACL 2018
Neural Adaptation Layers for Cross-domain Named Entity Recognition
EMNLP 2018