Valentin Hofmann

21 papers · 2020–2025 · 6 top CS/AI conferences

Achievements

🌍 Conference Polyglot (6) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🗺️ Taxonomy Completionist (13) 🏃 Academic Marathon (5)

🐣 Hot Topic Early Bird 🤝 Dynamic Duo (10) 👥 Mega-Team (36) 🧬 Topic Evolution 💎 Century Club (21) ⚡ Prolific Year (5) ❓ The Questioner (2) 🔥 Unstoppable (6) 🗃️ Keyword Collector (110)

Conferences

ACL (10) EMNLP (4) ICML (2) IJCNLP (2) NIPS (2) NAACL (1)

Top co-authors

Janet Pierrehumbert (10) Hinrich Schütze (9) Hinrich Schuetze (5) Leonie Weissweiler (3) Kyle Richardson (2) Oyvind Tafjord (2) Emanuele La Malfa (2) Dirk Groeneveld (2) Akshita Bhagia (2) Kyle Lo (2)

Research topics

Linguistics (1)

Keywords

pretrained language model (7) large language model (4) derivational morphology (4) contextualized embedding (3) word embedding (3) language model (3) semantic representation (2) semantic variability (2) word formation (2) graph neural network (2) bias detection (2) structured sparsity (2) complex word (2) word segmentation (1) computational linguistics (1) temporal dynamics (1) semantic analysis (1) model evaluation (1) opinion mining (1) benchmark evaluation (1)

Papers

Aligned but Blind: Alignment Increases Implicit Bias by Reducing Awareness of Race ACL 2025
Assessing Dialect Fairness and Robustness of Large Language Models in Reasoning Tasks ACL 2025
Large Language Models Discriminate Against Speakers of German Dialects EMNLP 2025
MAGNET: Improving the Multilingual Fairness of Language Models with Adaptive Gradient-Based Tokenization NIPS 2024
Paloma: A Benchmark for Evaluating Language Model Fit NIPS 2024
Political Compass or Spinning Arrow? Towards More Meaningful Evaluations for Values and Opinions in Large Language Models ACL 2024
Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research ACL 2024
Graph-enhanced Large Language Models in Asynchronous Plan Reasoning ICML 2024
Counting the Bugs in ChatGPT’s Wugs: A Multilingual Investigation into the Morphological Capabilities of a Large Language Model EMNLP 2023
An Embarrassingly Simple Method to Mitigate Undesirable Properties of Pretrained Language Model Tokenizers ACL 2022
The better your Syntax, the better your Semantics? Probing Pretrained Language Models for the English Comparative Correlative EMNLP 2022
Unsupervised Detection of Contextualized Embedding Bias with Application to Ideology ICML 2022
Modeling Ideological Salience and Framing in Polarized Online Groups with Graph Neural Networks and Structured Sparsity NAACL 2022
CaMEL: Case Marker Extraction without Labels ACL 2022
Superbizarre Is Not Superb: Derivational Morphology Improves BERT’s Interpretation of Complex Words ACL 2021
Dynamic Contextualized Word Embeddings ACL 2021
Dynamic Contextualized Word Embeddings IJCNLP 2021
Superbizarre Is Not Superb: Derivational Morphology Improves BERT’s Interpretation of Complex Words IJCNLP 2021
A Graph Auto-encoder Model of Derivational Morphology ACL 2020
Predicting the Growth of Morphological Families from Social and Linguistic Factors ACL 2020
DagoBERT: Generating Derivational Morphology with a Pretrained Language Model EMNLP 2020