Tiago Pimentel

49 papers · 2019–2026 · 6 top CS/AI conferences

Achievements


🌍 Conference Polyglot (6) 🏃 Academic Marathon (6) 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🐣 Hot Topic Early Bird

🐝 Cross-Pollinator (9) 🏠 Conference Loyalist (24) 🤝 Dynamic Duo (36) 👥 Mega-Team (58) 🏆 Keyword Champion (2) 💎 Century Club (48) ⚡ Prolific Year (11) ❓ The Questioner 🗃️ Keyword Collector (206) 🔥 Unstoppable (7)

Conferences

ACL (25) EMNLP (15) NAACL (4) EACL (2) IJCNLP (2) ICLR (1)

Top co-authors

Ryan Cotterell (36) Clara Meister (15) Damian Blasi (7) Josef Valvoda (5) Adina Williams (5) Ethan Wilcox (5) Rowan Hall Maudslay (5) Eleanor Chodroff (5) Alex Warstadt (4) Thomas Hofmann (4)

Research topics

Linguistics (4)

Keywords

information theory (11) language model (10) mutual information (5) representation learning (5) reading time (4) probability distribution (3) transformer model (3) speech prosody (3) text generation (3) formal language (2) syntactic information (2) statistical learning (2) low-resource language (2) training dynamics (2) subword tokenization (2) cross-linguistic analysis (2) probabilistic modeling (2) dependency parsing (2) causal inference (2) language modeling (2)

Papers

What Do Prosody and Text Convey? Characterizing How Meaningful Information is Distributed Across Multiple Channels ACL 2026
Tokenisation is NP-Complete ACL 2025
Convergence and Divergence of Language Models under Different Random Seeds EMNLP 2025
The time scale of redundancy between prosody and linguistic context ACL 2025
Using Information Theory to Characterize Prosodic Typology: The Case of Tone, Pitch-Accent and Stress-Accent ACL 2025
Causal Estimation of Tokenisation Bias ACL 2025
Local and Global Decoding in Text Generation EMNLP 2024
Causal Estimation of Memorisation Profiles ACL 2024
On the Effect of (Near) Duplicate Subwords in Language Modelling ACL 2024
How to Compute the Probability of a Word EMNLP 2024
Towards a Similarity-adjusted Surprisal Theory EMNLP 2024
On the Intersection of Context-Free and Regular Languages EACL 2023
Language Model Quality Correlates with Psychometric Predictive Power in Multiple Languages EMNLP 2023
Revisiting the Optimality of Word Lengths EMNLP 2023
An Exploration of Left-Corner Transformations EMNLP 2023
On the Usefulness of Embeddings, Clusters and Strings for Text Generation Evaluation ICLR 2023
On the Efficacy of Sampling Adapters ACL 2023
A Measure-Theoretic Characterization of Tight Language Models ACL 2023
A Natural Bias for Language Generation Models ACL 2023
Generating Text from Language Models ACL 2023
Few-shot Fine-tuning vs. In-context Learning: A Fair Comparison and Evaluation ACL 2023
Quantifying the redundancy between prosody and text EMNLP 2023
Analyzing Wrap-Up Effects through an Information-Theoretic Lens ACL 2022
Probing for the Usage of Grammatical Number ACL 2022
The Architectural Bottleneck Principle EMNLP 2022
On the probability–quality paradox in language generation ACL 2022
A Bayesian Framework for Information-Theoretic Probing EMNLP 2021
Modeling the Unigram Distribution ACL 2021
SIGMORPHON 2021 Shared Task on Morphological Reinflection: Generalization Across Languages ACL 2021
Disambiguatory Signals are Stronger in Word-initial Positions EACL 2021
A surprisal–duration trade-off across and within the world’s languages EMNLP 2021
Revisiting the Uniform Information Density Hypothesis EMNLP 2021
On Homophony and Rényi Entropy EMNLP 2021
Modeling the Unigram Distribution IJCNLP 2021
SIGMORPHON 2021 Shared Task on Morphological Reinflection: Generalization Across Languages IJCNLP 2021
A Non-Linear Structural Probe NAACL 2021
What About the Precedent: An Information-Theoretic Analysis of Common Law NAACL 2021
Finding Concept-specific Biases in Form–Meaning Associations NAACL 2021
How (Non-)Optimal is the Lexicon? NAACL 2021
A Tale of a Probe and a Parser ACL 2020
Information-Theoretic Probing for Linguistic Structure ACL 2020
A Corpus for Large-Scale Phonetic Typology ACL 2020
Metaphor Detection using Context and Concreteness ACL 2020
Predicting Declension Class from Form and Meaning ACL 2020
Speakers Fill Lexical Semantic Gaps with Context EMNLP 2020
Pareto Probing: Trading Off Accuracy for Complexity EMNLP 2020
SIGMORPHON 2020 Shared Task 0: Typologically Diverse Morphological Inflection ACL 2020
Rethinking Phonotactic Complexity ACL 2019
Meaning to Form: Measuring Systematicity as Information ACL 2019