Dietrich Klakow

126 papers · 2005–2025 · 12 top CS/AI conferences

Achievements

🗺️ Taxonomy Completionist (21) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (7) 🌍 Conference Polyglot (12)

🐝 Cross-Pollinator (5) 🏠 Conference Loyalist (32) 🤝 Dynamic Duo (20) 🧬 Topic Evolution 👥 Mega-Team (45) 🔬 Deep Specialist (19) 🏆 Keyword Champion (3) 🗃️ Keyword Collector (399) ⚡ Prolific Year (13) ❓ The Questioner (10) 💎 Century Club (126) 🚀 Conference Pioneer 🔥 Unstoppable (10)

Conferences

EMNLP (32) ACL (20) INTERSPEECH (19) NAACL (15) COLING (13) EACL (12) IJCNLP (6) AAAI (3) ICLR (2) SEMEVAL (2) CONLL (1) ICML (1)

Top co-authors

Marius Mosbach (20) Xiaoyu Shen (16) Badr M. Abdullah (15) Michael A. Hedderich (13) David Ifeoluwa Adelani (12) Miaoran Zhang (12) Bernd Möbius (10) Dawei Zhu (9) Dana Ruiter (8) Vagrant Gautam (7)

Research topics

Education (2) Speech & Audio (1) Privacy (1)

Keywords

low-resource language (13) large language model (11) named entity recognition (11) cross-lingual transfer (10) multilingual nlp (7) multilingual model (7) transfer learning (7) text classification (7) machine translation (6) neural network (6) zero-shot learning (6) representation learning (6) african language (6) language model (5) natural language processing (5) domain adaptation (5) data augmentation (4) noisy label (4) speech processing (4) text generation (4)

Papers

Colombian Waitresses y Jueces canadienses: Gender and Country Biases in Occupation Recommendations from LLMs ACL 2025
Evaluating Intermediate Reasoning of Code-Assisted Large Language Models for Mathematics ACL 2025
Saarland-Groningen at NADI 2025 Shared Task: Effective Dialectal Arabic Speech Processing under Data Constraints EMNLP 2025
AFRIDOC-MT: Document-level MT Corpus for African Languages EMNLP 2025
Attention on Multiword Expressions: A Multilingual Study of BERT-based Models with Regard to Idiomaticity and Microsyntax NAACL 2025
PricingLogic: Evaluating LLMs Reasoning on Complex Tourism Pricing Tasks EMNLP 2025
Charting the Landscape of African NLP: Mapping Progress and Shaping the Road Ahead EMNLP 2025
Improving Semantic Understanding in Speech Language Models via Brain-tuning ICLR 2025
ProverbEval: Exploring LLM Evaluation Challenges for Low-resource Language Understanding NAACL 2025
Evaluating the Capabilities of Large Language Models for Multi-label Emotion Understanding COLING 2025
Accept or Deny? Evaluating LLM Fairness and Performance in Loan Approval across Table-to-Text Serialization Approaches EMNLP 2025
INJONGO: A Multicultural Intent Detection and Slot-filling Dataset for 16 African Languages ACL 2025
It’s Not a Walk in the Park! Challenges of Idiom Translation in Speech-to-text Systems ACL 2025
Unveiling the Key Factors for Distilling Chain-of-Thought Reasoning ACL 2025
Joint vs Sequential Speaker-Role Detection and Automatic Speech Recognition for Air-traffic Control INTERSPEECH 2024
On the Encoding of Gender in Transformer-based ASR Representations INTERSPEECH 2024
Wave to Interlingua: Analyzing Representations of Multilingual Speech Transformers for Spoken Language Translation INTERSPEECH 2024
Annotating Customer-Oriented Behaviour in Call Centre Sales Dialogues COLING 2024
EthioLLM: Multilingual Large Language Models for Ethiopian Languages with Task Evaluation COLING 2024
Who Did You Blame When Your Project Failed? Designing a Corpus for Presupposition Generation in Cross-Examination Dialogues COLING 2024
Modeling Diachronic Change in English Scientific Writing over 300+ Years with Transformer-based Language Model Surprisal COLING 2024
Understanding “Democratization” in NLP and ML Research EMNLP 2024
From Insights to Actions: The Impact of Interpretability and Analysis Research on NLP EMNLP 2024
Fine-Tuning Large Language Models to Translate: Will a Touch of Noisy Data in Misaligned Languages Suffice? EMNLP 2024
AAdaM at SemEval-2024 Task 1: Augmentation and Adaptation for Multilingual Semantic Textual Relatedness NAACL 2024
What explains the success of cross-modal fine-tuning with ORCA? NAACL 2024
What Are the Rules? Discovering Constraints from Data AAAI 2024
AAdaM at SemEval-2024 Task 1: Augmentation and Adaptation for Multilingual Semantic Textual Relatedness SEMEVAL 2024
The Hidden Space of Transformer Language Adapters ACL 2024
Exploring the Effectiveness and Consistency of Task Selection in Intermediate-Task Transfer Learning ACL 2024
The Impact of Demonstrations on Multilingual In-Context Learning: A Multidimensional Analysis ACL 2024
Human Speech Perception in Noise: Can Large Language Models Paraphrase to Improve It? ACL 2024
An Interactive Toolkit for Approachable NLP ACL 2024
A Preference-driven Paradigm for Enhanced Translation with Large Language Models NAACL 2024
Cross-Linguistic Intelligibility of Non-Compositional Expressions in Spoken Context INTERSPEECH 2024
WinoPron: Revisiting English Winogender Schemas for Consistency, Coverage, and Grammatical Case EMNLP 2024
Large GPT-like Models are Bad Babies: A Closer Look at the Relationship between Linguistic Competence and Psycholinguistic Measures CONLL 2023
An Information-Theoretic Analysis of Self-supervised Discrete Representations of Speech INTERSPEECH 2023
Few-shot Fine-tuning vs. In-context Learning: A Fair Comparison and Evaluation ACL 2023
Weaker Than You Think: A Critical Look at Weakly Supervised Learning ACL 2023
MasakhaPOS: Part-of-Speech Tagging for Typologically Diverse African languages ACL 2023
A Lightweight Method to Generate Unanswerable Questions in English EMNLP 2023
Large GPT-like Models are Bad Babies: A Closer Look at the Relationship between Linguistic Competence and Psycholinguistic Measures EMNLP 2023
On the Nature of Discrete Speech Representations in Multilingual Self-supervised Models EACL 2023
Meta Self-Refinement for Robust Learning with Weak Supervision EACL 2023
Multilingual Normalization of Temporal Expressions with Masked Language Models EACL 2023
ε kú mask: Integrating Yorùbá cultural greetings into machine translation EACL 2023
Information-Theoretic Characterization of Vowel Harmony: A Cross-Linguistic Study on Word Lists EACL 2023
On the N-gram Approximation of Pre-trained Language Models INTERSPEECH 2023
Mapping Phonology to Semantics: A Computational Model of Cross-Lingual Spoken-Word Recognition COLING 2022
Discovering Interpretable Data-to-Sequence Generators AAAI 2022
Is BERT Robust to Label Noise? A Study on Learning with Noisy Labels in Text Classification ACL 2022
Knowledge Base Index Compression via Dimensionality and Precision Reduction ACL 2022
Adapting Pre-trained Language Models to African Languages via Multilingual Adaptive Fine-Tuning COLING 2022
MasakhaNER 2.0: Africa-centric Transfer Learning for Named Entity Recognition EMNLP 2022
Analyzing the Representational Geometry of Acoustic Word Embeddings EMNLP 2022
Label-Descriptive Patterns and Their Application to Characterizing Classification Errors ICML 2022
Integrating Form and Meaning: A Multi-Task Learning Model for Acoustic Word Embeddings INTERSPEECH 2022
A Few Thousand Translations Go a Long Way! Leveraging Pre-trained Models for African News Translation NAACL 2022
MCSE: Multimodal Contrastive Learning of Sentence Embeddings NAACL 2022
Exploiting Social Media Content for Self-Supervised Style Transfer NAACL 2022
StereoKG: Data-Driven Knowledge Graph Construction For Cultural Knowledge and Stereotypes NAACL 2022
Boosting of Contextual Information in ASR for Air-Traffic Call-Sign Recognition INTERSPEECH 2021
How Familiar Does That Sound? Cross-Lingual Representational Similarity Analysis of Acoustic Word Embeddings EMNLP 2021
A Survey on Recent Approaches for Natural Language Processing in Low-Resource Scenarios NAACL 2021
Emoji-Based Transfer Learning for Sentiment Tasks EACL 2021
Familiar words but strange voices: Modelling the influence of speech variability on word recognition EACL 2021
Do we read what we hear? Modeling orthographic influences on spoken word recognition EACL 2021
Analysing the Noise Model Error for Realistic Noisy Label Data AAAI 2021
On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and Strong Baselines ICLR 2021
Modeling Profanity and Hate Speech in Social Media with Semantic Subspaces ACL 2021
Do Acoustic Word Embeddings Capture Phonological Similarity? An Empirical Study INTERSPEECH 2021
Phonetic Distance and Surprisal in Multilingual Priming: Evidence from Slavic INTERSPEECH 2021
FAME: Feature-Based Adversarial Meta-Embeddings for Robust Input Representations EMNLP 2021
Modeling Profanity and Hate Speech in Social Media with Semantic Subspaces IJCNLP 2021
Exploring the Potential of Lexical Paraphrases for Mitigating Noise-Induced Comprehension Errors INTERSPEECH 2021
Preventing Author Profiling through Zero-Shot Multilingual Back-Translation EMNLP 2021
To Share or not to Share: Predicting Sets of Sources for Model Transfer Learning EMNLP 2021
Privacy Guarantees for De-Identifying Text Transformations INTERSPEECH 2020
Cross-Domain Adaptation of Spoken Language Identification for Related Languages: The Curious Case of Slavic Languages INTERSPEECH 2020
Defining Explanation in an AI Context EMNLP 2020
Neural Data-to-Text Generation via Jointly Learning the Segmentation and Correspondence ACL 2020
CoLi at UdS at SemEval-2020 Task 12: Offensive Tweet Detection with Ensembling COLING 2020
On the Interplay Between Fine-tuning and Sentence-Level Probing for Linguistic Knowledge in Pre-Trained Transformers EMNLP 2020
On the Interplay Between Fine-tuning and Sentence-level Probing for Linguistic Knowledge in Pre-trained Transformers EMNLP 2020
HUMAN: Hierarchical Universal Modular ANnotator EMNLP 2020
Transfer Learning and Distant Supervision for Multilingual Transformer Models: A Study on African Languages EMNLP 2020
CoLi at UdS at SemEval-2020 Task 12: Offensive Tweet Detection with Ensembling SEMEVAL 2020
Rediscovering the Slavic Continuum in Representations Emerging from Neural Models of Spoken Language Identification COLING 2020
A Closer Look at Linguistic Knowledge in Masked Language Models: The Case of Relative Clauses in American English COLING 2020
Label Propagation-Based Semi-Supervised Learning for Hate Speech Classification EMNLP 2020
Select and Attend: Towards Controllable Content Selection in Text Generation IJCNLP 2019
Feature-Dependent Confusion Matrices for Low-Resource NER Labeling with Noisy Labels IJCNLP 2019
Improving Latent Alignment in Text Summarization by Generalizing the Pointer Generator IJCNLP 2019
Cross-lingual Transfer Learning for Japanese Named Entity Recognition NAACL 2019
Handling Noisy Labels for Robustly Learning from Self-Training Data for Low-Resource Sequence Labeling NAACL 2019
Select and Attend: Towards Controllable Content Selection in Text Generation EMNLP 2019
Feature-Dependent Confusion Matrices for Low-Resource NER Labeling with Noisy Labels EMNLP 2019
Improving Latent Alignment in Text Summarization by Generalizing the Pointer Generator EMNLP 2019
Proceedings of the Beyond Vision and LANguage: inTEgrating Real-world kNowledge (LANTERN) EMNLP 2019
Training a Neural Network in a Low-Resource Setting on Automatically Annotated Noisy Data ACL 2018
NEXUS Network: Connecting the Preceding and the Following in Dialogue Generation EMNLP 2018
Closing Brackets with Recurrent Neural Networks EMNLP 2018
A Batch Noise Contrastive Estimation Approach for Training Large Vocabulary Language Models INTERSPEECH 2017
The Extended SPaRKy Restaurant Corpus: Designing a Corpus with Variable Information Density INTERSPEECH 2017
Approximated and Domain-Adapted LSTM Language Models for First-Pass Decoding in Speech Recognition INTERSPEECH 2017
Estimation of Gap Between Current Language Models and Human Performance INTERSPEECH 2017
Incremental Dialogue Act Recognition: Token- vs Chunk-Based Classification INTERSPEECH 2017
Long-Short Range Context Neural Networks for Language Modeling EMNLP 2016
Sequential Recurrent Neural Networks for Language Modeling INTERSPEECH 2016
Sub-Word Similarity based Search for Embeddings: Inducing Rare-Word Embeddings for Word Similarity Tasks and Language Modelling COLING 2016
Event participant modelling with neural networks EMNLP 2016
Unsupervised morph segmentation and statistical language models for vocabulary expansion ACL 2016
RelationFactory: A Fast, Modular and Effective System for Knowledge Base Population EACL 2014
Automatic Food Categorization from Large Unlabeled Corpora and Its Impact on Relation Extraction EACL 2014
Separating Brands from Types: an Investigation of Different Features for the Food Domain COLING 2014
Unsupervised Parsing for Generating Surface-Based Relation Extraction Patterns EACL 2014
Towards Contextual Healthiness Classification of Food Items - A Linguistic Approach IJCNLP 2013
Combining Generative and Discriminative Model Scores for Distant Supervision EMNLP 2013
Predicative Adjectives: An Unsupervised Criterion to Extract Subjective Adjectives NAACL 2013
Generalization Methods for In-Domain and Cross-Domain Opinion Holder Extraction EACL 2012
Convolution Kernels for Opinion Holder Extraction NAACL 2010
A Comparative Study of Word Co-occurrence for Term Clustering in Language Model-based Sentence Retrieval NAACL 2010
Exploring Correlation of Dependency Relation Paths for Answer Extraction ACL 2006
Exploring Correlation of Dependency Relation Paths for Answer Extraction COLING 2006
Exploring Syntactic Relation Patterns for Question Answering IJCNLP 2005