Aitor Soroa
37 papers · 2006–2026 · 9 top CS/AI conferences
Achievements
🌍 Conference Polyglot (9)
🐣 Hot Topic Early Bird
🧭 Keyword Pioneer
🌉 Interdisciplinary Bridge
🏃 Academic Marathon (19)
🤝 Dynamic Duo (26)
👥 Mega-Team (54)
🔬 Deep Specialist (12)
🏆 Keyword Champion
🚀 Conference Pioneer
⚡ Prolific Year (5)
🗃️ Keyword Collector (108)
💎 Century Club (36)
🔥 Unstoppable (8)
❓ The Questioner (3)
Conferences
ACL (10)
EMNLP (8)
NAACL (5)
COLING (4)
SEMEVAL (3)
EACL (2)
IJCNLP (2)
NeurIPS (2)
CONLL (1)
Keywords
low-resource language (7)
large language model (5)
machine translation (4)
multilingual nlp (3)
bilingual lexicon induction (3)
language model (3)
multilingual model (3)
cross-lingual word embedding (3)
transfer learning (3)
conversational question answering (2)
multilingual dataset (2)
zero-shot learning (2)
multilingual language model (2)
multilingual corpus (2)
representation learning (2)
embedding alignment (2)
information retrieval (2)
dialogue system (2)
cross-lingual transfer (1)
lexical semantics (1)
Papers
Machine Translation for Low-Resource Languages through Monolingual Data and LLM: A Case Study of English-to-Basque
EACL 2026
EuskañolDS: A Naturally Sourced Corpus for Basque-Spanish Code-Switching
NAACL 2025
Instructing Large Language Models for Low-Resource Languages: A Systematic Study for Basque
EMNLP 2025
The First Workshop on Multilingual Counterspeech Generation at COLING 2025: Overview of the Shared Task
COLING 2025
A LLM-based Ranking Method for the Evaluation of Automatic Counter-Narrative Generation
EMNLP 2024
BertaQA: How Much Do Language Models Know About Local Culture?
NeurIPS 2024
Latxa: An Open Language Model and Evaluation Suite for Basque
ACL 2024
Do Multilingual Language Models Think Better in English?
NAACL 2024
XNLIeu: a dataset for cross-lingual NLI in Basque
NAACL 2024
Scaling Laws for BERT in Low-Resource Settings
ACL 2023
The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset
NeurIPS 2022
Principled Paraphrase Generation with Parallel Corpora
ACL 2022
Does Corpus Quality Really Matter for Low-Resource Languages?
EMNLP 2022
PoeLM: A Meter- and Rhyme-Controllable Language Model for Unsupervised Poetry Generation
EMNLP 2022
IrekiaLFes: a New Open Benchmark and Baseline Systems for Spanish Automatic Text Simplification
EMNLP 2022
Beyond Offline Mapping: Learning Cross-lingual Word Embeddings through Context Anchoring
IJCNLP 2021
Beyond Offline Mapping: Learning Cross-lingual Word Embeddings through Context Anchoring
ACL 2021
Improving Conversational Question Answering Systems after Deployment using Feedback-Weighted Learning
COLING 2020
Spot The Bot: A Robust and Efficient Framework for the Evaluation of Conversational Dialogue Systems
EMNLP 2020
Automatic Evaluation vs. User Preference in Neural Textual QuestionAnswering over COVID-19 Scientific Literature
EMNLP 2020
DoQA - Accessing Domain-Specific FAQs via Conversational QA
ACL 2020
Analyzing the Limitations of Cross-lingual Word Embedding Mappings
ACL 2019
Learning Text Representations for 500K Classification Tasks on Named Entity Disambiguation
CONLL 2018
The risk of sub-optimal use of Open Source NLP Software: UKB is inadvertently state-of-the-art in knowledge-based WSD
ACL 2018
Alleviating Poor Context with Background Knowledge for Named Entity Disambiguation
ACL 2016
Random Walks and Neural Network Language Models on Knowledge Bases
NAACL 2015
Improving distant supervision using inference learning
ACL 2015
Improving distant supervision using inference learning
IJCNLP 2015
“One Entity per Discourse” and “One Entity per Collocation” Improve Named-Entity Disambiguation
COLING 2014
PATHS: A System for Accessing Cultural Heritage Collections
ACL 2013
Comparing Taxonomies for Organising Collections of Documents
COLING 2012
Kyoto: An Integrated System for Specific Domain WSD
SEMEVAL 2010
A Study on Similarity and Relatedness Using Distributional and WordNet-based Approaches
NAACL 2009
Personalizing PageRank for Word Sense Disambiguation
EACL 2009
UBC-AS: A Graph Based Unsupervised System for Induction and Classification
SEMEVAL 2007
SemEval-2007 Task 02: Evaluating Word Sense Induction and Discrimination Systems
SEMEVAL 2007
Two graph-based algorithms for state-of-the-art WSD
EMNLP 2006