Lukas Edman

17 papers · 2019–2025 · 5 top CS/AI conferences

Achievements

🏃 Academic Marathon (6) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🌍 Conference Polyglot (5) 🐝 Cross-Pollinator (14) 🧬 Topic Evolution 🔥 Unstoppable (7) 💎 Century Club (17) ❓ The Questioner 🗃️ Keyword Collector (68)

Conferences

EMNLP (8) · ACL (4) · CONLL (2) · SEMEVAL (2) · NAACL (1)

Top co-authors

Alexander Fraser (6) · Antonio Toral (5) · Gertjan van Noord (3) · Lisa Bylinina (3) · Esther Ploeger (2) · Konstantin Chernyshev (2) · Tommaso Caselli (2) · Jennifer Spenader (2) · Frank Van Den Berg (2) · Ekaterina Garanina (2)

Keywords

low-resource language (4) · large language model (4) · neural machine translation (4) · unsupervised machine translation (2) · text classification (2) · machine translation (2) · sexism detection (2) · token understanding (2) · multi-task learning (2) · transfer learning (2) · morphological segmentation (2) · domain-adaptive pre-training (2) · data selection (1) · synthetic data (1) · text representation (1) · benchmark evaluation (1) · masked language modeling (1) · parallel corpus (1) · subword tokenization (1) · token prediction (1)

Papers

Findings of the WMT 2025 Shared Task LLMs with Limited Resources for Slavic Languages: MT and QA · EMNLP 2025
Positional Overload: Positional Debiasing and Context Window Extension for Large Language Models using Set Encoding · ACL 2025
EXECUTE: A Multilingual Benchmark for LLM Token Understanding · ACL 2025
Mask and You Shall Receive: Optimizing Masked Language Modeling For Pretraining BabyLMs · EMNLP 2025
Are BabyLMs Second Language Learners? · CONLL 2024
CUTE: Measuring LLMs’ Understanding of Their Tokens · EMNLP 2024
Too Much Information: Keeping Training Simple for BabyLMs · CONLL 2023
LCT-1 at SemEval-2023 Task 10: Pre-training and Multi-task Learning for Sexism Detection and Classification · SEMEVAL 2023
LCT-1 at SemEval-2023 Task 10: Pre-training and Multi-task Learning for Sexism Detection and Classification · ACL 2023
Too Much Information: Keeping Training Simple for BabyLMs · EMNLP 2023
RUG-1-Pegasussers at SemEval-2022 Task 3: Data Generation Methods to Improve Recognizing Appropriate Taxonomic Word Relations · SEMEVAL 2022
RUG-1-Pegasussers at SemEval-2022 Task 3: Data Generation Methods to Improve Recognizing Appropriate Taxonomic Word Relations · NAACL 2022
Subword-Delimited Downsampling for Better Character-Level Translation · EMNLP 2022
Unsupervised Translation of German–Lower Sorbian: Exploring Training and Novel Transfer Methods on a Low-Resource Language · EMNLP 2021
Data Selection for Unsupervised Translation of German–Upper Sorbian · EMNLP 2020
Machine Translation for English–Inuktitut with Segmentation, Data Acquisition and Pre-Training · EMNLP 2020
Neural Machine Translation for English–Kazakh with Morphological Segmentation and Synthetic Data · ACL 2019