Mikel Artetxe

46 papers · 2016–2025 · 8 top CS/AI conferences

Achievements

🧭 Keyword Pioneer 🐣 Hot Topic Early Bird 🗺️ Taxonomy Completionist (10) 🌉 Interdisciplinary Bridge 🌍 Conference Polyglot (8)

🏃 Academic Marathon (9) 🐝 Cross-Pollinator (13) 🌈 Renaissance Researcher (5) 👥 Mega-Team (24) 🤝 Dynamic Duo (22) 🏆 Keyword Champion (2) 🔬 Deep Specialist (22) 🧬 Topic Evolution 🗃️ Keyword Collector (168) ❓ The Questioner (4) ⚡ Prolific Year (6) 🚀 Conference Pioneer 📈 Trend Setter 💎 Century Club (46) 🔥 Unstoppable (10)

Conferences

ACL (18) EMNLP (18) NAACL (4) NIPS (2) CONLL (1) EACL (1) ICLR (1) IJCNLP (1)

Top co-authors

Eneko Agirre (22) Gorka Labaka (15) Aitor Soroa (10) Luke Zettlemoyer (7) Aitor Ormazabal (7) Veselin Stoyanov (5) Naman Goyal (5) Julen Etxaniz (4) Shruti Bhosale (4) Jingfei Du (4)

Keywords

cross-lingual transfer (11) machine translation (10) language model (8) multilingual model (7) low-resource language (6) large language model (6) unsupervised machine translation (5) cross-lingual word embedding (5) multilingual language model (4) bilingual lexicon induction (4) unsupervised learning (4) transfer learning (4) masked language model (4) multilingual nlp (4) few-shot learning (3) cross-lingual embedding (3) zero-shot learning (3) in-context learning (3) multilingual machine translation (3) word embedding (3)

Papers

Instructing Large Language Models for Low-Resource Languages: A Systematic Study for Basque · EMNLP 2025
BOUQuET: dataset, Benchmark and Open initiative for Universal Quality Evaluation in Translation · EMNLP 2025
Emergent Abilities of Large Language Models under Continued Pre-training for Language Adaptation · ACL 2025
WiCkeD: A Simple Method to Make Multiple Choice Benchmarks More Challenging · ACL 2025
Translate, Then Detect: Leveraging Machine Translation for Cross-Lingual Toxicity Classification · EMNLP 2025
Latxa: An Open Language Model and Evaluation Suite for Basque · ACL 2024
Improving Factuality in Clinical Abstractive Multi-Document Summarization by Guided Continued Pre-training · NAACL 2024
Do Multilingual Language Models Think Better in English? · NAACL 2024
Gender-specific Machine Translation with Large Language Models · EMNLP 2024
BertaQA: How Much Do Language Models Know About Local Culture? · NIPS 2024
The Belebele Benchmark: a Parallel Reading Comprehension Dataset in 122 Language Variants · ACL 2024
Revisiting Machine Translation for Cross-lingual Classification · EMNLP 2023
On the Role of Parallel Data in Cross-lingual Transfer Learning · ACL 2023
Training Trajectories of Language Models Across Scales · ACL 2023
Mini-Model Adaptation: Efficiently Extending Pretrained Models to New Languages via Aligned Shallow Training · ACL 2023
CombLM: Adapting Black-Box Language Models through Small Fine-Tuned Models · EMNLP 2023
Improving Language Plasticity via Pretraining with Active Forgetting · NIPS 2023
Few-shot Learning with Multilingual Generative Language Models · EMNLP 2022
Principled Paraphrase Generation with Parallel Corpora · ACL 2022
Multilingual Machine Translation with Hyper-Adapters · EMNLP 2022
Does Corpus Quality Really Matter for Low-Resource Languages? · EMNLP 2022
Don’t Prompt, Search! Mining-based Zero-Shot Learning with Language Models · EMNLP 2022
Rethinking the Role of Demonstrations: What Makes In-Context Learning Work? · EMNLP 2022
Prompting ELECTRA: Few-Shot Learning with Discriminative Pre-Trained Models · EMNLP 2022
Efficient Large Scale Language Modeling with Mixtures of Experts · EMNLP 2022
PoeLM: A Meter- and Rhyme-Controllable Language Model for Unsupervised Poetry Generation · EMNLP 2022
On the Role of Bidirectionality in Language Model Pre-Training · EMNLP 2022
PARADISE: Exploiting Parallel Data for Multilingual Sequence-to-Sequence Pretraining · NAACL 2022
Lifting the Curse of Multilinguality by Pre-training Modular Transformers · NAACL 2022
Multilingual Machine Translation: Closing the Gap between Shared and Language-specific Encoder-Decoders · EACL 2021
Beyond Offline Mapping: Learning Cross-lingual Word Embeddings through Context Anchoring · IJCNLP 2021
Beyond Offline Mapping: Learning Cross-lingual Word Embeddings through Context Anchoring · ACL 2021
On the Cross-lingual Transferability of Monolingual Representations · ACL 2020
Translation Artifacts in Cross-lingual Transfer Learning · EMNLP 2020
Unsupervised Multilingual Sentence Embeddings for Parallel Corpus Mining · ACL 2020
A Call for More Rigor in Unsupervised Cross-lingual Learning · ACL 2020
An Effective Approach to Unsupervised Machine Translation · ACL 2019
Bilingual Lexicon Induction through Unsupervised Machine Translation · ACL 2019
Analyzing the Limitations of Cross-lingual Word Embedding Mappings · ACL 2019
Margin-based Parallel Corpus Mining with Multilingual Sentence Embeddings · ACL 2019
Unsupervised Statistical Machine Translation · EMNLP 2018
A robust self-learning method for fully unsupervised cross-lingual mappings of word embeddings · ACL 2018
Unsupervised Neural Machine Translation · ICLR 2018
Uncovering Divergent Linguistic Information in Word Embeddings with Lessons for Intrinsic and Extrinsic Evaluation · CONLL 2018
Learning bilingual word embeddings with (almost) no bilingual data · ACL 2017
Learning principled bilingual mappings of word embeddings while preserving monolingual invariance · EMNLP 2016