Percy Liang

211 papers · 2006–2026 · 18 top CS/AI conferences

Achievements

🗺️ Taxonomy Completionist (37) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (7) 🐣 Hot Topic Early Bird

🌟 Keyword Trendsetter Combo (28) 🏠 Conference Loyalist (42) 🏆 Keyword Champion 🤝 Dynamic Duo (18) 🏆 Grand Slam 👑 Triple Crown 👥 Mega-Team (34) 🌱 Topic Pioneer 🔬 Deep Specialist (20) 🧬 Topic Evolution 🗃️ Keyword Collector (109) 🚀 Conference Pioneer 💎 Century Club (210) 🔥 Unstoppable (20) 📈 Trend Setter ⚡ Prolific Year (6) ❓ The Questioner (8)

Conferences

ICML (46) NIPS (42) ACL (34) EMNLP (27) ICLR (25) NAACL (9) IJCNLP (8) AISTATS (5) CORL (4) RSS (2) CONLL (2) UAI (1) JMLR (1) ICCV (1) EACL (1) COLT (1) COLING (1) AAAI (1)

Top co-authors

Dan Klein (18) Tatsunori Hashimoto (17) Jure Leskovec (16) Aditi Raghunathan (16) Michihiro Yasunaga (15) Tengyu Ma (12) Jacob Steinhardt (12) Robin Jia (12) Panupong Pasupat (11) Ananya Kumar (11)

Research topics

Natural Language Processing (1) Optimization (1) Education (1)

Keywords

language model (15) large language model (10) representation learning (10) question answering (9) unsupervised learning (7) distribution shift (7) distributionally robust optimization (5) parameter estimation (5) adversarial robustness (5) text generation (5) domain adaptation (5) self-supervised learning (4) approximate inference (4) semantic parsing (4) language modeling (4) reinforcement learning (4) transfer learning (4) structured prediction (4) active learning (4) in-context learning (4)

Papers

Beyond a Single Extractor: Re-thinking HTML-to-Text Extraction for LLM Pre-training EACL 2026
Position: In-House Evaluation Is Not Enough. Towards Robust Third-Party Evaluation and Flaw Disclosure for General-Purpose AI ICML 2025
Independence Tests for Language Models ICML 2025
Reliable and Efficient Amortized Model-based Evaluation ICML 2025
Language Models May Verbatim Complete Text They Were Not Explicitly Trained On ICML 2025
LawInstruct: A Resource for Studying Language Model Adaptation to the Legal Domain NAACL 2025
Eliciting Language Model Behaviors with Investigator Agents ICML 2025
Auditing Prompt Caching in Language Model APIs ICML 2025
Cybench: A Framework for Evaluating Cybersecurity Capabilities and Risks of Language Models ICLR 2025
Model Equality Testing: Which Model is this API Serving? ICLR 2025
AIR-BENCH 2024: A Safety Benchmark based on Regulation and Policies Specified Risk Categories ICLR 2025
BioDiscoveryAgent: An AI Agent for Designing Genetic Perturbation Experiments ICLR 2025
AutoBencher: Towards Declarative Benchmark Construction ICLR 2025
Understanding Warmup-Stable-Decay Learning Rates: A River Valley Loss Landscape View ICLR 2025
s1: Simple test-time scaling EMNLP 2025
RoboArena: Distributed Real-World Evaluation of Generalist Robot Policies CORL 2025
Fine-Tuning Vision-Language-Action Models: Optimizing Speed and Success RSS 2025
Position: Language model developers should report train-test overlap ICML 2025
MedAlign: A Clinician-Generated Dataset for Instruction Following with Electronic Medical Records AAAI 2024
Embodied Agent Interface: Benchmarking LLMs for Embodied Decision Making NIPS 2024
OpenVLA: An Open-Source Vision-Language-Action Model CORL 2024
Image2Struct: Benchmarking Structure Extraction for Vision-Language Models NIPS 2024
RedPajama: an Open Dataset for Training Large Language Models NIPS 2024
VHELM: A Holistic Evaluation of Vision Language Models NIPS 2024
Large Language Models as Analogical Reasoners ICLR 2024
Benchmarking and Improving Generator-Validator Consistency of Language Models ICLR 2024
Position: A Safe Harbor for AI Evaluation and Red Teaming ICML 2024
Prismatic VLMs: Investigating the Design Space of Visually-Conditioned Language Models ICML 2024
Position: On the Societal Impact of Open Foundation Models ICML 2024
MLAgentBench: Evaluating Language Agents on Machine Learning Experimentation ICML 2024
Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training ICLR 2024
On the Learnability of Watermarks for Language Models ICLR 2024
Vocal Sandbox: Continual Learning and Adaptation for Situated Human-Robot Collaboration CORL 2024
Are Sample-Efficient NLP Models More Robust? ACL 2023
PRODIGY: Enabling In-context Learning Over Graphs NIPS 2023
Lexinvariant Language Models NIPS 2023
AlpacaFarm: A Simulation Framework for Methods that Learn from Human Feedback NIPS 2023
Data Selection for Language Models via Importance Resampling NIPS 2023
Ecosystem-level Analysis of Deployed Machine Learning Reveals Homogeneous Outcomes NIPS 2023
Cheaply Estimating Inference Efficiency Metrics for Autoregressive Transformer Models NIPS 2023
DoReMi: Optimizing Data Mixtures Speeds Up Language Model Pretraining NIPS 2023
Holistic Evaluation of Text-to-Image Models NIPS 2023
Backpack Language Models ACL 2023
Contrastive Decoding: Open-ended Text Generation as Optimization ACL 2023
Do Question Answering Modeling Improvements Hold Across Benchmarks? ACL 2023
Beyond Positive Scaling: How Negation Impacts Scaling Trends of Language Models ACL 2023
Evaluating Verifiability in Generative Search Engines EMNLP 2023
Is a Caption Worth a Thousand Images? A Study on Representation Learning ICLR 2023
Surgical Fine-Tuning Improves Adaptation to Distribution Shifts ICLR 2023
One-sided Matrix Completion from Two Observations Per Row ICML 2023
Evaluating Self-Supervised Learning via Risk Decomposition ICML 2023
Out-of-Domain Robustness via Targeted Augmentations ICML 2023
Whose Opinions Do Language Models Reflect? ICML 2023
FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU ICML 2023
CocktailSGD: Fine-tuning Foundation Models over 500Mbps Networks ICML 2023
Retrieval-Augmented Multimodal Language Modeling ICML 2023
Foundation Models and Fair Use JMLR 2023
Language-Driven Representation Learning for Robotics RSS 2023
Improving Self-Supervised Learning by Characterizing Idealized Representations NIPS 2022
Picking on the Same Person: Does Algorithmic Monoculture lead to Outcome Homogenization? NIPS 2022
Diffusion-LM Improves Controllable Text Generation NIPS 2022
GreaseLM: Graph REASoning Enhanced Language Models ICLR 2022
Large Language Models Can Be Strong Differentially Private Learners ICLR 2022
Fine-Tuning can Distort Pretrained Features and Underperform Out-of-Distribution ICLR 2022
Extending the WILDS Benchmark for Unsupervised Adaptation ICLR 2022
An Explanation of In-context Learning as Implicit Bayesian Inference ICLR 2022
LinkBERT: Pretraining Language Models with Document Links ACL 2022
Calibrated ensembles can mitigate accuracy tradeoffs under distribution shift UAI 2022
Insights into Pre-training via Simpler Synthetic Tasks NIPS 2022
Decentralized Training of Foundation Models in Heterogeneous Environments NIPS 2022
What Can Transformers Learn In-Context? A Case Study of Simple Function Classes NIPS 2022
Deep Bidirectional Language-Knowledge Graph Pretraining NIPS 2022
Truncation Sampling as Language Model Desmoothing EMNLP 2022
Connect, Not Collapse: Explaining Contrastive Learning for Unsupervised Domain Adaptation ICML 2022
Accuracy on the Line: on the Strong Correlation Between Out-of-Distribution and In-Distribution Generalization ICML 2021
WILDS: A Benchmark of in-the-Wild Distribution Shifts ICML 2021
LM-Critic: Language Models for Unsupervised Grammatical Error Correction EMNLP 2021
Swords: A Benchmark for Lexical Substitution with Improved Data Coverage and Quality NAACL 2021
QA-GNN: Reasoning with Language Models and Knowledge Graphs for Question Answering NAACL 2021
Conditional probing: measuring usable information beyond a baseline EMNLP 2021
Prefix-Tuning: Optimizing Continuous Prompts for Generation ACL 2021
Composed Fine-Tuning: Freezing Pre-Trained Denoising Autoencoders for Improved Generalization ICML 2021
Prefix-Tuning: Optimizing Continuous Prompts for Generation IJCNLP 2021
Decoupling Exploration and Exploitation for Meta-Reinforcement Learning without Sacrifices ICML 2021
Just Train Twice: Improving Group Robustness without Training Group Information ICML 2021
Break-It-Fix-It: Unsupervised Learning for Program Repair ICML 2021
LILA: Language-Informed Latent Actions CORL 2021
In-N-Out: Pre-Training and Self-Training using Auxiliary Information for Out-of-Distribution Robustness ICLR 2021
Selective Classification Can Magnify Disparities Across Groups ICLR 2021
Catformer: Designing Stable Transformers via Sensitivity Analysis ICML 2021
Enabling certification of verification-agnostic networks via memory-efficient semidefinite programming NIPS 2020
Learning Adaptive Language Interfaces through Decomposition EMNLP 2020
Strategies for Pre-training Graph Neural Networks ICLR 2020
Distributionally Robust Neural Networks ICLR 2020
ExpBERT: Representation Engineering with Natural Language Explanations ACL 2020
Enabling Language Models to Fill in the Blanks ACL 2020
Robust Encodings: A Framework for Combating Adversarial Typos ACL 2020
Shaping Visual Representations with Language for Few-Shot Classification ACL 2020
Selective Question Answering under Domain Shift ACL 2020
Feature Noise Induces Loss Discrepancy Across Groups ICML 2020
Concept Bottleneck Models ICML 2020
Understanding Self-Training for Gradual Domain Adaptation ICML 2020
Understanding and Mitigating the Tradeoff between Robustness and Accuracy ICML 2020
An Investigation of Why Overparameterization Exacerbates Spurious Correlations ICML 2020
The EOS Decision and Length Extrapolation EMNLP 2020
Robustness to Spurious Correlations via Human Annotations ICML 2020
Graph-based, Self-Supervised Program Repair from Diagnostic Feedback ICML 2020
Selection via Proxy: Efficient Data Selection for Deep Learning ICLR 2020
RNNs can generate bounded hierarchical languages with optimal memory EMNLP 2020
On the Importance of Adaptive Data Collection for Extremely Imbalanced Pairwise Tasks EMNLP 2020
Certified Robustness to Adversarial Word Substitutions EMNLP 2019
Distributionally Robust Language Modeling IJCNLP 2019
Inferring Multidimensional Rates of Aging from Cross-Sectional Data AISTATS 2019
Defending against Whitebox Adversarial Attacks via Randomized Discretization AISTATS 2019
Certified Robustness to Adversarial Word Substitutions IJCNLP 2019
Designing and Interpreting Probes with Control Tasks IJCNLP 2019
Pun Generation with Surprise NAACL 2019
Unifying Human and Statistical Evaluation for Natural Language Generation NAACL 2019
Designing and Interpreting Probes with Control Tasks EMNLP 2019
Distributionally Robust Language Modeling EMNLP 2019
Learning a SAT Solver from Single-Bit Supervision ICLR 2019
Verified Uncertainty Calibration NIPS 2019
On the Accuracy of Influence Functions for Measuring Group Effects NIPS 2019
SPoC: Search-based Pseudocode to Code NIPS 2019
Unlabeled Data Improves Adversarial Robustness NIPS 2019
Mapping natural language commands to web elements EMNLP 2018
QuAC: Question Answering in Context EMNLP 2018
Textual Analogy Parsing: What’s Shared and What’s Compared among Analogous Facts EMNLP 2018
Delete, Retrieve, Generate: a Simple Approach to Sentiment and Style Transfer NAACL 2018
Fairness Without Demographics in Repeated Loss Minimization ICML 2018
A Retrieve-and-Edit Framework for Predicting Structured Outputs NIPS 2018
Uncertainty Sampling is Preconditioned Stochastic Gradient Descent on Zero-One Loss NIPS 2018
Semidefinite relaxations for certifying robustness to adversarial examples NIPS 2018
Generalized Binary Search For Split-Neighborly Problems AISTATS 2018
Know What You Don’t Know: Unanswerable Questions for SQuAD ACL 2018
Training Classifiers with Natural Language Explanations ACL 2018
The price of debiasing automatic metrics in natural language evalaution ACL 2018
Reinforcement Learning on Web Interfaces using Workflow-Guided Exploration ICLR 2018
Certified Defenses against Adversarial Examples ICLR 2018
On the Relationship between Data Efficiency and Error for Uncertainty Sampling ICML 2018
Decoupling Strategy and Generation in Negotiation Dialogues EMNLP 2018
Learning Symmetric Collaborative Dialogue Agents with Dynamic Knowledge Graph Embeddings ACL 2017
Adversarial Examples for Evaluating Reading Comprehension Systems EMNLP 2017
Macro Grammars and Holistic Triggering for Efficient Semantic Parsing EMNLP 2017
Importance sampling for unbiased on-demand evaluation of knowledge base population EMNLP 2017
Naturalizing a Programming Language via Interactive Learning ACL 2017
From Language to Programs: Bridging Reinforcement Learning and Maximum Marginal Likelihood ACL 2017
Certified Defenses for Data Poisoning Attacks NIPS 2017
Unsupervised Transformation Learning via Convex Relaxations NIPS 2017
Learning Overcomplete HMMs NIPS 2017
Developing Bug-Free Machine Learning Systems With Formal Mathematics ICML 2017
World of Bits: An Open-Domain Platform for Web-Based Agents ICML 2017
Convexified Convolutional Neural Networks ICML 2017
A Hitting Time Analysis of Stochastic Gradient Langevin Dynamics COLT 2017
Understanding Black-box Predictions via Influence Functions ICML 2017
Data Recombination for Neural Semantic Parsing ACL 2016
How Much is 131 Million Dollars? Putting Numbers in Perspective with Compositional Descriptions ACL 2016
Unanimous Prediction for 100% Precision with Application to Learning Semantic Mappings ACL 2016
Simpler Context-Dependent Logical Forms via Model Projections ACL 2016
Learning Language Games through Interaction ACL 2016
SQuAD: 100,000+ Questions for Machine Comprehension of Text EMNLP 2016
Estimation from Indirect Supervision with Linear Moments ICML 2016
Unsupervised Risk Estimation Using Only Conditional Independence Structure NIPS 2016
Inferring Logical Forms From Denotations ACL 2016
Estimating Mixture Models via Mixtures of Polynomials NIPS 2015
Learning with Relaxed Supervision NIPS 2015
Compositional Semantic Parsing on Semi-Structured Tables ACL 2015
Reified Context Models ICML 2015
Learning Fast-Mixing Models for Structured Prediction ICML 2015
Building a Semantic Parser Overnight ACL 2015
Environment-Driven Lexicon Induction for High-Level Instructions ACL 2015
Traversing Knowledge Graphs in Vector Space EMNLP 2015
Tensor Factorization via Matrix Factorization AISTATS 2015
Learning Where to Sample in Structured Prediction AISTATS 2015
Compositional Semantic Parsing on Semi-Structured Tables IJCNLP 2015
Building a Semantic Parser Overnight IJCNLP 2015
Environment-Driven Lexicon Induction for High-Level Instructions IJCNLP 2015
On-the-Job Learning with Bayesian Decision Theory NIPS 2015
Calibrated Structured Prediction NIPS 2015
Zero-shot Entity Extraction from Web Pages ACL 2014
Adaptivity and Optimism: An Improved Exponentiated Gradient Algorithm ICML 2014
Filtering with Abstract Particles ICML 2014
Simple MAP Inference via Low-Rank Relaxations NIPS 2014
Altitude Training: Strong Bounds for Single-Layer Dropout NIPS 2014
Estimating Latent-Variable Graphical Models using Moments and Likelihoods ICML 2014
Semantic Parsing via Paraphrasing ACL 2014
Feature Noising for Log-Linear Structured Prediction EMNLP 2013
Semantic Parsing on Freebase from Question-Answer Pairs EMNLP 2013
Dropout Training as Adaptive Regularization NIPS 2013
Video Event Understanding Using Natural Language Descriptions ICCV 2013
Spectral Experts for Estimating Mixtures of Linear Regressions ICML 2013
Identifiability and Unmixing of Latent Parse Trees NIPS 2012
Learning Dependency-Based Compositional Semantics ACL 2011
A Simple Domain-Independent Probabilistic Approach to Generation EMNLP 2010
A Game-Theoretic Approach to Generating Spatial Descriptions EMNLP 2010
Type-Based MCMC NAACL 2010
Online EM for Unsupervised Models NAACL 2009
Learning Semantic Correspondences with Less Supervision IJCNLP 2009
Learning Semantic Correspondences with Less Supervision ACL 2009
Asymptotically Optimal Regularization in Smooth Parametric Models NIPS 2009
Analyzing the Errors of Unsupervised Learning ACL 2008
Learning Bilingual Lexicons from Monolingual Corpora ACL 2008
A Probabilistic Approach to Language Change NIPS 2007
Agreement-Based Learning NIPS 2007
A Probabilistic Approach to Diachronic Phonology CONLL 2007
The Infinite PCFG Using Hierarchical Dirichlet Processes EMNLP 2007
A Probabilistic Approach to Diachronic Phonology EMNLP 2007
The Infinite PCFG Using Hierarchical Dirichlet Processes CONLL 2007
An End-to-End Discriminative Approach to Machine Translation COLING 2006
Alignment by Agreement NAACL 2006
An End-to-End Discriminative Approach to Machine Translation ACL 2006