conftrace_

Noah A. Smith

275 papers · 2002–2025 · 17 conferences · across top CS/AI conferences

Achievements

🗺️ Taxonomy Completionist (29) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (7) 🐣 Hot Topic Early Bird 🌟 Keyword Trendsetter Combo (9) 🏠 Conference Loyalist (46) 🏆 Keyword Champion (4) 🤝 Dynamic Duo (36) 🏆 Grand Slam 👑 Triple Crown 👥 Mega-Team (50) 🔬 Deep Specialist (27) 🧬 Topic Evolution ❓ The Questioner (14) 🗃️ Keyword Collector (65) 🚀 Conference Pioneer 💎 Century Club (275) 🔥 Unstoppable (22) 📈 Trend Setter ⚡ Prolific Year (16)

Conferences

EMNLP (77) ACL (75) NAACL (46) IJCNLP (19) NIPS (14) CONLL (11) ICLR (8) SEMEVAL (5) COLING (4) EACL (3) JMLR (3) INTERSPEECH (2) ICML (2) ICCV (2) CVPR (2) CLEAR (1) AAAI (1)

Papers

LlamaPIE: Proactive In-Ear Conversation Assistants ACL 2025 Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Vision-Language Models CVPR 2025 RewardBench: Evaluating Reward Models for Language Modeling NAACL 2025 ComPO: Community Preferences for Language Model Personalization NAACL 2025 Does Liking Yellow Imply Driving a School Bus? Semantic Leakage in Language Models NAACL 2025 Eval3D: Interpretable and Fine-grained Evaluation for 3D Generation CVPR 2025 DataDecide: How to Predict Best Pretraining Data with Small Experiments ICML 2025 OLMoE: Open Mixture-of-Experts Language Models ICLR 2025 MUSE: Machine Unlearning Six-Way Evaluation for Language Models ICLR 2025 On Linear Representations and Pretraining Data Frequency in Language Models ICLR 2025 Infini-gram mini: Exact n-gram Search at the Internet Scale with FM-Index EMNLP 2025 Hybrid Preferences: Learning to Route Instances for Human vs. AI Feedback ACL 2025 OLMoTrace: Tracing Language Model Outputs Back to Trillions of Training Tokens ACL 2025 How Language Model Hallucinations Can Snowball ICML 2024 Paloma: A Benchmark for Evaluating Language Model Fit NIPS 2024 Evaluating Copyright Takedown Methods for Language Models NIPS 2024 Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models NIPS 2024 The Art of Saying No: Contextual Noncompliance in Language Models NIPS 2024 Decoding-Time Language Model Alignment with Multiple Objectives NIPS 2024 MAGNET: Improving the Multilingual Fairness of Language Models with Adaptive Gradient-Based Tokenization NIPS 2024 Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback NIPS 2024 Data Mixture Inference Attack: BPE Tokenizers Reveal Training Data Compositions NIPS 2024 Voices Unheard: NLP Resources and Models for Yorùbá Regional Dialects EMNLP 2024 Breaking the Curse of Multilinguality with Cross-lingual Expert Language Models EMNLP 2024 Evaluating n-Gram Novelty of Language Models Using
Rusty-DAWG EMNLP 2024 Measuring and Improving Attentiveness to Partial Inputs with Counterfactuals EMNLP 2024 Merge to Learn: Efficiently Adding Skills to Language Models with Model Merging EMNLP 2024 CPS-TaskForge: Generating Collaborative Problem Solving Environments for Diverse Communication Tasks EMNLP 2024 Summarization-Based Document IDs for Generative Retrieval with Language Models EMNLP 2024 In-Context Pretraining: Language Modeling Beyond Document Boundaries ICLR 2024 What's In My Big Data? ICLR 2024 SILO Language Models: Isolating Legal Risk In a Nonparametric Datastore ICLR 2024 A Call for Clarity in Beam Search: How It Works and When It Stops COLING 2024 Estimating the Causal Effect of Early ArXiving on Paper Acceptance CLEAR 2024 Elaboration-Generating Commonsense Question Answering at Scale ACL 2023 LEXPLAIN: Improving Model Explanations via Lexicon Supervision ACL 2023 Reproducibility in NLP: What Have We Learned from the Checklist? ACL 2023 PromptCap: Prompt-Guided Image Captioning for VQA with GPT-3 ICCV 2023 TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering ICCV 2023 Selective Annotation Makes Language Models Better Few-Shot Learners ICLR 2023 Binding Language Models in Symbolic Languages ICLR 2023 RealTime QA: What's the Answer Right Now? NIPS 2023 Fine-Grained Human Feedback Gives Better Rewards for Language Model Training NIPS 2023 Data-Efficient Finetuning Using Cross-Task Nearest Neighbors ACL 2023 Stubborn Lexical Bias in Data and Models ACL 2023 Risks and NLP Design: A Case Study on Procedural Document QA ACL 2023 Self-Instruct: Aligning Language Models with Self-Generated Instructions ACL 2023 How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources NIPS 2023 One Embedder, Any Task: Instruction-Finetuned Text Embeddings ACL 2023 NarrowBERT: Accelerating Masked Language Model Pretraining and Inference ACL 2023 Is GPT-3 Text Indistinguishable from Human Text? 
Scarecrow: A Framework for Scrutinizing Machine Text ACL 2022 How Much Does Attention Actually Attend? Questioning the Importance of Attention in Pretrained Transformers EMNLP 2022 Time Waits for No One! Analysis and Challenges of Temporal Misalignment NAACL 2022 Annotators with Attitudes: How Annotator Beliefs And Identities Bias Toxic Language Detection NAACL 2022 DEMix Layers: Disentangling Domains for Modular Language Modeling NAACL 2022 Modeling Context With Linear Attention for Scalable Document-Level Translation EMNLP 2022 Bidimensional Leaderboards: Generate and Evaluate Language Hand in Hand NAACL 2022 Transparent Human Evaluation for Image Captioning NAACL 2022 Twist Decoding: Diverse Generators Guide Each Other EMNLP 2022 NeuroLogic A*esque Decoding: Constrained Text Generation with Lookahead Heuristics NAACL 2022 UnifiedSKG: Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models EMNLP 2022 Whose Language Counts as High Quality? Measuring Language Ideologies in Text Data Selection EMNLP 2022 WANLI: Worker and AI Collaboration for Natural Language Inference Dataset Creation EMNLP 2022 Unsupervised Learning of Hierarchical Conversation Structure EMNLP 2022 In-Context Learning for Few-Shot Dialogue State Tracking EMNLP 2022 GENIE: Toward Reproducible and Standardized Human Evaluation for Text Generation EMNLP 2022 Generating Scientific Definitions with Controllable Complexity ACL 2022 ABC: Attention with Bounded-memory Control ACL 2022 Expected Validation Performance and Estimation of a Random Variable’s Maximum EMNLP 2021 Specializing Multilingual Language Models: An Empirical Study EMNLP 2021 A Dataset of Information-Seeking Questions and Answers Anchored in Research Papers NAACL 2021 Choose Your Own Adventure: Paired Suggestions in Collaborative Writing for Evaluating Story Generation Models NAACL 2021 Promoting Graph Awareness in Linearized Graph-to-Text Generation IJCNLP 2021 Explaining Relationships Between
Scientific Documents ACL 2021 Shortformer: Better Language Modeling using Shorter Inputs ACL 2021 DExperts: Decoding-Time Controlled Text Generation with Experts and Anti-Experts ACL 2021 All That’s ‘Human’ Is Not Gold: Evaluating Human Evaluation of Generated Text ACL 2021 Promoting Graph Awareness in Linearized Graph-to-Text Generation ACL 2021 All That’s ‘Human’ Is Not Gold: Evaluating Human Evaluation of Generated Text IJCNLP 2021 DExperts: Decoding-Time Controlled Text Generation with Experts and Anti-Experts IJCNLP 2021 Shortformer: Better Language Modeling using Shorter Inputs IJCNLP 2021 Explaining Relationships Between Scientific Documents IJCNLP 2021 Effects of Parameter Norm Growth During Transformer Training: Inductive Bias from Gradient Descent EMNLP 2021 Competency Problems: On Finding and Removing Artifacts in Language Data EMNLP 2021 Sentence Bottleneck Autoencoders from Transformer Language Models EMNLP 2021 Measuring Association Between Labels and Free-Text Rationales EMNLP 2021 Finetuning Pretrained Transformers into RNNs EMNLP 2021 Probing Across Time: What Does RoBERTa Know and When? EMNLP 2021 A Mixture of h - 1 Heads is Better than h Heads ACL 2020 Thinking Like a Skeptic: Defeasible Inference in Natural Language EMNLP 2020 The Right Tool for the Job: Matching Model and Instance Complexities ACL 2020 Don’t Stop Pretraining: Adapt Language Models to Domains and Tasks ACL 2020 Exploring the Effect of Author and Reader Identity in Online Story Writing: the STORIESINTHEWILD Corpus.
ACL 2020 Evaluating Models’ Local Decision Boundaries via Contrast Sets EMNLP 2020 Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics EMNLP 2020 Plug and Play Autoencoders for Conditional Text Generation EMNLP 2020 Writing Strategies for Science Communication: Data and Computational Analysis EMNLP 2020 Multilevel Text Alignment with Cross-Document Attention EMNLP 2020 The Multilingual Amazon Reviews Corpus EMNLP 2020 Grounded Compositional Outputs for Adaptive Language Modeling EMNLP 2020 Natural Language Rationales with Full-Stack Visual Reasoning: From Pixels to Semantic Frames to Commonsense Graphs EMNLP 2020 RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language Models EMNLP 2020 A Formal Hierarchy of RNN Architectures ACL 2020 Recollection versus Imagination: Exploring Human Memory and Cognition via Neural Language Models ACL 2020 Improving Transformer Models by Reordering their Sublayers ACL 2020 Parsing with Multilingual BERT, a Small Corpus, and a Small Treebank EMNLP 2020 Social Bias Frames: Reasoning about Social and Power Implications of Language ACL 2020 Knowledge Enhanced Contextual Word Representations IJCNLP 2019 Sentence Mover’s Similarity: Automatic Evaluation for Multi-Sentence Texts ACL 2019 Evaluating Gender Bias in Machine Translation ACL 2019 The Risk of Racial Bias in Hate Speech Detection ACL 2019 Quoref: A Reading Comprehension Dataset with Questions Requiring Coreferential Reasoning IJCNLP 2019 Topics to Avoid: Demoting Latent Confounds in Text Classification IJCNLP 2019 PaLM: A Hybrid Parser and Language Model IJCNLP 2019 Show Your Work: Improved Reporting of Experimental Results IJCNLP 2019 Robust Navigation with Language Pretraining and Stochastic Sampling IJCNLP 2019 RNN Architecture Learning with Sparse Regularization IJCNLP 2019 Low-Resource Parsing with Crosslingual Contextualized Representations CONLL 2019 Is Attention Interpretable?
ACL 2019 Linguistic Knowledge and Transferability of Contextual Representations NAACL 2019 Inoculation by Fine-Tuning: A Method for Analyzing Challenge Datasets NAACL 2019 Polyglot Contextual Representations Improve Crosslingual Transfer NAACL 2019 ATOMIC: An Atlas of Machine Commonsense for If-Then Reasoning AAAI 2019 Knowledge Enhanced Contextual Word Representations EMNLP 2019 RNN Architecture Learning with Sparse Regularization EMNLP 2019 Robust Navigation with Language Pretraining and Stochastic Sampling EMNLP 2019 Show Your Work: Improved Reporting of Experimental Results EMNLP 2019 PaLM: A Hybrid Parser and Language Model EMNLP 2019 Topics to Avoid: Demoting Latent Confounds in Text Classification EMNLP 2019 Quoref: A Reading Comprehension Dataset with Questions Requiring Coreferential Reasoning EMNLP 2019 To Tune or Not to Tune? Adapting Pretrained Representations to Diverse Tasks ACL 2019 Variational Pretraining for Semi-supervised Text Classification ACL 2019 Polyglot Semantic Role Labeling ACL 2018 Discovering Phonesthemes with Sparse Regularization NAACL 2018 Parsing Tweets into Universal Dependencies NAACL 2018 LSTMs Exploit Linguistic Attributes of Data ACL 2018 Sounding Board: A User-Centric and Content-Driven Social Chatbot NAACL 2018 Annotation Artifacts in Natural Language Inference Data NAACL 2018 Neural Text Generation in Stories Using Entity Representations as Context NAACL 2018 The Importance of Calibration for Estimating Proportions from Annotations NAACL 2018 Learning Joint Semantic Parsers from Disjoint Data NAACL 2018 Bridging CNNs, RNNs, and Weighted Finite-State Machines ACL 2018 Neural Cross-Lingual Named Entity Recognition with Minimal Resources EMNLP 2018 Rational Recurrences EMNLP 2018 Syntactic Scaffolds for Semantic Structures EMNLP 2018 Event2Mind: Commonsense Inference on Events, Intents, and Reactions ACL 2018 Backpropagating through Structured Argmax using a SPIGOT ACL 2018 Neural Models for Documents with Metadata ACL 2018 
Deep Multitask Learning for Semantic Dependency Parsing ACL 2017 Dynamic Entity Representations in Neural Language Models EMNLP 2017 What Do Recurrent Neural Network Grammars Learn About Syntax? EACL 2017 The Effect of Different Writing Tasks on Linguistic Style: A Case Study of the ROC Story Cloze Task CONLL 2017 Friendships, Rivalries, and Trysts: Characterizing Relations between Ideas in Texts ACL 2017 Multitask Learning with CTC and Segmental CRF for Speech Recognition INTERSPEECH 2017 Neural Discourse Structure for Text Categorization ACL 2017 Analyzing Framing through the Casts of Characters in the News EMNLP 2016 CMU at SemEval-2016 Task 8: Graph-based AMR Parsing with Infinite Ramp Loss SEMEVAL 2016 Segmental Recurrent Neural Networks for End-to-End Speech Recognition INTERSPEECH 2016 Semi-Supervised Learning of Sequence Models with Method of Moments EMNLP 2016 Recurrent Neural Network Grammars NAACL 2016 Generation from Abstract Meaning Representation using Tree Transducers NAACL 2016 Greedy, Joint Syntactic-Semantic Parsing with Stack LSTMs CONLL 2016 Training with Exploration Improves a Greedy Stack LSTM Parser EMNLP 2016 Character Sequence Models for Colorful Words EMNLP 2016 Distilling an Ensemble of Greedy Dependency Parsers into One MST Parser EMNLP 2016 Friends with Motives: Using Text to Infer Influence on SCOTUS EMNLP 2016 UW-CSE at SemEval-2016 Task 10: Detecting Multiword Expressions and Supersenses using Double-Chained Conditional Random Fields SEMEVAL 2016 Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) ACL 2016 Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers) ACL 2016 The Media Frames Corpus: Annotations of Frames Across Issues IJCNLP 2015 Transition-Based Dependency Parsing with Stack Long Short-Term Memory ACL 2015 Sparse Overcomplete Word Vector Representations ACL 2015 Frame-Semantic Role Labeling with 
Heterogeneous Annotations ACL 2015 The Media Frames Corpus: Annotations of Frames Across Issues ACL 2015 A Supertag-Context Model for Weakly-Supervised CCG Parser Learning CONLL 2015 Open Extraction of Fine-Grained Political Statements EMNLP 2015 Improved Transition-based Parsing by Modeling Characters instead of Words with LSTMs EMNLP 2015 A Utility Model of Authors in the Scientific Community EMNLP 2015 Extractive Summarization by Maximizing Semantic Volume EMNLP 2015 Bayesian Optimization of Text Representations EMNLP 2015 Transition-Based Dependency Parsing with Stack Long Short-Term Memory IJCNLP 2015 Sparse Overcomplete Word Vector Representations IJCNLP 2015 Frame-Semantic Role Labeling with Heterogeneous Annotations IJCNLP 2015 AD3: Alternating Directions Dual Decomposition for MAP Inference in Graphical Models JMLR 2015 Transforming Dependencies into Phrase Structures NAACL 2015 Toward Abstractive Summarization Using Semantic Representations NAACL 2015 A Corpus and Model Integrating Multiword Expressions and Supersenses NAACL 2015 Retrofitting Word Vectors to Semantic Lexicons NAACL 2015 A Step Towards Usable Privacy Policy: Automatic Alignment of Privacy Statements COLING 2014 Simplified Dependency Annotations with GFL-Web ACL 2014 Distributed Representations of Geographically Situated Language ACL 2014 A Bayesian Mixed Effects Model of Literary Character ACL 2014 Linguistic Structured Sparsity in Text Categorization ACL 2014 A Discriminative Graph-Based Parser for the Abstract Meaning Representation ACL 2014 Conditional Random Field Autoencoders for Unsupervised Structured Prediction NIPS 2014 CMU: Arc-Factored, Discriminative Semantic Dependency Parsing SEMEVAL 2014 A Dependency Parser for Tweets EMNLP 2014 Weakly-Supervised Bayesian Learning of a CCG Supertagger CONLL 2014 Unsupervised Alignment of Privacy Policies using Hidden Markov Models ACL 2014 Measuring Ideological Proportions in Political Speeches EMNLP 2013 Supersense Tagging for Arabic: the 
MT-in-the-Middle Attack NAACL 2013 A Simple, Fast, and Effective Reparameterization of IBM Model 2 NAACL 2013 Improved Part-of-Speech Tagging for Online Conversational Text with Word Clusters NAACL 2013 Translating into Morphologically Rich Languages with Synthetic Phrases EMNLP 2013 Learning Topics and Positions from Debatepedia EMNLP 2013 Learning Latent Personas of Film Characters ACL 2013 Discrete Log-Linear Autoencoders for Unsupervised Learning of Linguistic Structure EMNLP 2013 Learning to Extract International Relations from Political Context ACL 2013 Turning on the Turbo: Fast Third-Order Non-Projective Turbo Parsers ACL 2013 Knowledge-Rich Morphological Priors for Bayesian Language Models NAACL 2013 Concavity and Initialization for Unsupervised Dependency Parsing NAACL 2012 Word Salad: Relating Food Prices and Descriptions EMNLP 2012 Discovering Factions in the Computational Linguistics Community ACL 2012 A Probabilistic Model for Canonicalizing Named Entity Mentions ACL 2012 Coarse Lexical Semantic Annotation with Supersenses: An Arabic Case Study ACL 2012 Recall-Oriented Learning of Named Entities in Arabic Wikipedia EACL 2012 An Exact Dual Decomposition Algorithm for Shallow Semantic Parsing with Constraints SEMEVAL 2012 Word Salad: Relating Food Prices and Descriptions CONLL 2012 Structured Sparsity in Natural Language Processing: Models, Algorithms and Applications NAACL 2012 Textual Predictors of Bill Survival in Congressional Committees NAACL 2012 Graph-Based Lexicon Expansion with Sparsity-Inducing Penalties NAACL 2012 Structured Ramp Loss Minimization for Machine Translation NAACL 2012 Unsupervised Structure Prediction with Non-Parallel Multilingual Guidance EMNLP 2011 Better Hypothesis Testing for Statistical Machine Translation: Controlling for Optimizer Instability ACL 2011 Part-of-Speech Tagging for Twitter: Annotation, Features, and Experiments ACL 2011 Semi-Supervised Frame-Semantic Parsing for Unknown Predicates ACL 2011 Discovering 
Sociolinguistic Associations with Structured Sparsity ACL 2011 Unsupervised Word Alignment with Arbitrary Features ACL 2011 Predicting a Scientific Community’s Response to an Article EMNLP 2011 Quasi-Synchronous Phrase Dependency Grammars for Machine Translation EMNLP 2011 Softmax-Margin CRFs: Training Log-Linear Models with Cost Functions NAACL 2010 Empirical Risk Minimization with Approximations of Probabilistic Grammars NIPS 2010 Probabilistic Frame-Semantic Parsing NAACL 2010 Tree Edit Models for Recognizing Textual Entailments, Paraphrases, and Answers to Questions NAACL 2010 Covariance in Unsupervised Learning of Probabilistic Grammars JMLR 2010 Good Question! Statistical Ranking for Question Generation NAACL 2010 Variational Inference for Adaptor Grammars NAACL 2010 Nonparametric Word Segmentation for Machine Translation COLING 2010 Movie Reviews and Revenues: An Experiment in Text Regression NAACL 2010 SEMAFOR: Frame Argument Resolution with Log-Linear Models SEMEVAL 2010 Viterbi Training for PCFGs: Hardness Results and Competitiveness of Uniform Initialization ACL 2010 A Latent Variable Model for Geographic Lexical Variation EMNLP 2010 Distributed Asynchronous Online Learning for Natural Language Processing CONLL 2010 Nonextensive Information Theoretic Kernels on Measures JMLR 2009 Cube Summing, Approximate Inference with Non-Local Features, and Dynamic Programming without Semirings EACL 2009 Leveraging Structural Relations for Fluent Compressions at Multiple Compression Rates IJCNLP 2009 Variational Inference for Grammar Induction with Prior Knowledge IJCNLP 2009 Paraphrase Identification as Probabilistic Quasi-Synchronous Recognition IJCNLP 2009 Feature-Rich Translation by Quasi-Synchronous Lattice Parsing EMNLP 2009 Leveraging Structural Relations for Fluent Compressions at Multiple Compression Rates ACL 2009 Variational Inference for Grammar Induction with Prior Knowledge ACL 2009 Paraphrase Identification as Probabilistic Quasi-Synchronous Recognition
ACL 2009 Shared Logistic Normal Distributions for Soft Parameter Tying in Unsupervised Grammar Induction NAACL 2009 Preference Grammars: Softening Syntactic Constraints to Improve Statistical Machine Translation NAACL 2009 Predicting Risk from Financial Reports with Regression NAACL 2009 Predicting Response to Political Blog Posts with Topic Models NAACL 2009 Stacking Dependency Parsers EMNLP 2008 Logistic Normal Priors for Unsupervised Probabilistic Grammar Induction NIPS 2008 What is the Jeopardy Model? A Quasi-Synchronous Grammar for QA EMNLP 2007 Joint Morphological and Syntactic Disambiguation CONLL 2007 Probabilistic Models of Nonprojective Dependency Trees CONLL 2007 What is the Jeopardy Model? A Quasi-Synchronous Grammar for QA CONLL 2007 Probabilistic Models of Nonprojective Dependency Trees EMNLP 2007 Joint Morphological and Syntactic Disambiguation EMNLP 2007 Computationally Efficient M-Estimation of Log-Linear Structure Models ACL 2007 Vine Parsing and Minimum Risk Reranking for Speed and Precision CONLL 2006 Annealing Structural Bias in Multilingual Weighted Grammar Induction COLING 2006 Annealing Structural Bias in Multilingual Weighted Grammar Induction ACL 2006 Contrastive Estimation: Training Log-Linear Models on Unlabeled Data ACL 2005 Context-Based Morphological Disambiguation with Random Fields EMNLP 2005 Compiling Comp Ling: Weighted Dynamic Programming and the Dyna Language EMNLP 2005 Annealing Techniques For Unsupervised Statistical Language Learning ACL 2004 Bilingual Parsing with Factored Estimation: Using English to Parse Korean EMNLP 2004 Dyna: A Language for Weighted Dynamic Programming ACL 2004 From Words to Corpora: Recognizing Translation EMNLP 2002