conftrace_

Catherine Arnett

8 papers · 2023–2026 · 4 top CS/AI conferences

Achievements

🌍 Conference Polyglot (4) · 🌉 Interdisciplinary Bridge · 🗺️ Taxonomy Completionist (16) · 🧭 Keyword Pioneer · 🐝 Cross-Pollinator (15) · ❓ The Questioner (2)

Conferences

EMNLP (3) · ACL (2) · COLING (2) · NAACL (1)

Top co-authors

Tyler A. Chang (5) · Benjamin K. Bergen (2) · James A. Michaelov (2) · Benjamin Bergen (2) · Ahmad Mustapha Wali (1) · Rafael Mosquera (1) · Jean Maillard (1) · Mithil Bangera (1) · Hande Celikkanat (1) · Benjamin L Rice (1)

Keywords

grammatical representation (2) · language modeling (2) · structural priming (2) · multilingual language model (2) · linear regression (1) · low-resource language (1) · subword tokenization (1) · human annotation (1) · multilingual model (1) · syntactic structure (1) · morphological analysis (1) · high-resource language (1) · text compression (1) · multilingual dataset (1) · byte-pair encoding (1) · agglutinative language (1) · model capacity (1) · multilingual capability (1) · multilingual corpus (1) · bilingual language model (1)

Papers

CommonLID: Re-evaluating State-of-the-Art Language Identification Performance on Web Data · ACL 2026
Why do language models perform worse for morphologically complex languages? · COLING 2025
On the Acquisition of Shared Grammatical Representations in Bilingual Language Models · ACL 2025
When Is Multilinguality a Curse? Language Modeling for 250 High- and Low-Resource Languages · EMNLP 2024
BPE Gets Picky: Efficient Vocabulary Refinement During Tokenizer Training · EMNLP 2024
Different Tokenization Schemes Lead to Comparable Performance in Spanish Number Agreement · NAACL 2024
A Bit of a Problem: Measurement Disparities in Dataset Sizes across Languages · COLING 2024
Structural Priming Demonstrates Abstract Grammatical Representations in Multilingual Language Models · EMNLP 2023