Brian Roark
52 papers · 2000–2025 · 7 conferences · across top CS/AI conferences
Achievements
🐣 Hot Topic Early Bird
🌍 Conference Polyglot (7)
🌉 Interdisciplinary Bridge
🧭 Keyword Pioneer
🏃 Academic Marathon (25)
🗺️ Taxonomy Completionist (34)
🏠 Conference Loyalist (23)
🧬 Topic Evolution
👥 Mega-Team (27)
🏆 Keyword Champion (2)
🗃️ Keyword Collector (78)
⚡ Prolific Year (6)
🚀 Conference Pioneer
💎 Century Club (52)
🔥 Unstoppable (13)
❓ The Questioner (2)
Conferences
ACL (23)
NAACL (10)
EMNLP (9)
COLING (4)
EACL (3)
INTERSPEECH (2)
IJCNLP (1)
Keywords
language modeling (3)
cross-linguistic analysis (3)
text classification (2)
large language model (2)
speech recognition (2)
information theory (2)
morphological complexity (2)
recurrent neural network (2)
text normalization (2)
abbreviation expansion (2)
finite-state transducer (2)
multilingual nlp (2)
script normalization (2)
language identification (2)
natural language processing (1)
speech processing (1)
natural language generation (1)
transfer learning (1)
benchmark evaluation (1)
language model adaptation (1)
Papers
Improving Informally Romanized Language Identification
EMNLP 2025
Abbreviation Across the World’s Languages and Scripts
COLING 2024
Distinguishing Romanized Hindi from Romanized Urdu
ACL 2023
XTREME-UP: A User-Centric Scarce-Data Benchmark for Under-Represented Languages
EMNLP 2023
Spelling convention sensitivity in neural language models
EACL 2023
Beyond Arabic: Software for Perso-Arabic Script Manipulation
EMNLP 2022
Design principles of an open-source language modeling microservice package for AAC text-entry applications
ACL 2022
Finite-state script normalization and processing utilities: The Nisaba Brahmic library
EACL 2021
Finding Concept-specific Biases in Form–Meaning Associations
NAACL 2021
Disambiguatory Signals are Stronger in Word-initial Positions
EACL 2021
Structured abbreviation expansion in context
EMNLP 2021
Rethinking Phonotactic Complexity
ACL 2019
Meaning to Form: Measuring Systematicity as Information
ACL 2019
What Kind of Language Is Hard to Language-Model?
ACL 2019
Are All Languages Equally Hard to Language-Model?
NAACL 2018
Learning N-Gram Language Models from Uncertain Data
INTERSPEECH 2016
Contextual Prediction Models for Speech Recognition
INTERSPEECH 2016
Hippocratic Abbreviation Expansion
ACL 2014
Data Driven Grammatical Error Detection in Transcripts of Children’s Speech
EMNLP 2014
Transforming trees into hedges and parsing with “hedgebank” grammars
ACL 2014
Smoothed marginal distribution constraints for language modeling
ACL 2013
Pair Language Models for Deriving Alternative Pronunciations and Spellings from Pronunciation Dictionaries
EMNLP 2013
Discriminative Joint Modeling of Lexical Variation and Acoustic Confusion for Automated Narrative Retelling Assessment
NAACL 2013
Distributional semantic models for the evaluation of disordered language
NAACL 2013
The OpenGrm open-source finite-state grammar software libraries
ACL 2012
Beam-Width Prediction for Efficient Context-Free Parsing
ACL 2011
Lexicographic Semirings for Exact Automata Encoding of Sequence Models
ACL 2011
An ERP-based Brain-Computer Interface for text entry using Rapid Serial Visual Presentation and Language Modeling
ACL 2011
Unary Constraints for Efficient Context-Free Parsing
ACL 2011
Semi-Supervised Modeling for Prenominal Modifier Ordering
ACL 2011
Minimum Imputed-Risk: Unsupervised Discriminative Training for Machine Translation
EMNLP 2011
Prenominal Modifier Ordering via Multiple Sequence Alignment
NAACL 2010
Linear Complexity Context-Free Parsing Pipelines via Chart Constraints
NAACL 2009
Proceedings of the ACL-IJCNLP 2009 Student Research Workshop
ACL 2009
Deriving lexical and syntactic expectation-based measures for psycholinguistic modeling via incremental top-down parsing
EMNLP 2009
Proceedings of the ACL-IJCNLP 2009 Student Research Workshop
IJCNLP 2009
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Tutorial Abstracts
NAACL 2009
Classifying Chart Cells for Quadratic Complexity Context-Free Inference
COLING 2008
The utility of parse-derived features for automatic discourse segmentation
ACL 2007
Pipeline Iteration
ACL 2007
Probabilistic Context-Free Grammar Induction Based on Structural Zeros
NAACL 2006
PCFGs with Syntactic and Prosodic Indicators of Speech Repairs
COLING 2006
PCFGs with Syntactic and Prosodic Indicators of Speech Repairs
ACL 2006
Discriminative Syntactic Language Modeling for Speech Recognition
ACL 2005
Comparing and Combining Finite-State and Context-Free Parsers
EMNLP 2005
Incremental Parsing with the Perceptron Algorithm
ACL 2004
Language Model Adaptation with MAP Estimation and the Perceptron Algorithm
NAACL 2004
Discriminative Language Modeling with Conditional Random Fields and the Perceptron Algorithm
ACL 2004
Supervised and unsupervised PCFG adaptation to novel domains
NAACL 2003
Generalized Algorithms for Constructing Statistical Language Models
ACL 2003
Markov Parsing: Lattice Rescoring with a Statistical Parser
ACL 2002
Compact non-left-recursive grammars using the selective left-corner transform and factoring
COLING 2000