Jakub Simko
13 papers · 2022–2026 · 5 conferences · across top CS/AI conferences
Achievements
Taxonomy Completionist (28)
Conference Polyglot (4)
Renaissance Researcher (5)
Interdisciplinary Bridge
Keyword Pioneer
Hot Topic Early Bird
Dynamic Duo (10)
Century Club (10)
The Questioner
Unstoppable (5)
Keyword Collector (60)
Conferences
EMNLP (7)
EACL (3)
ACL (1)
IJCAI (1)
NAACL (1)
Keywords
large language model (5)
data augmentation (3)
text classification (3)
low-resource language (3)
text augmentation (3)
machine-generated text detection (2)
synthetic data generation (2)
paraphrase generation (2)
synthetic data (2)
ensemble learning (1)
cross-lingual transfer (1)
transfer learning (1)
few-shot learning (1)
information retrieval (1)
multilingual NLP (1)
knowledge distillation (1)
domain generalization (1)
model robustness (1)
adversarial attack (1)
intent classification (1)
Papers
RoSE: Round-robin Synthetic Data Evaluation for Selecting LLM Generators without Human Test Sets
EACL 2026
MultiCW: A Large-Scale Balanced Benchmark Dataset for Training Robust Check-Worthiness Detection Models
EACL 2026
Better as Generators Than Classifiers: Leveraging LLMs and Synthetic Data for Low-Resource Multilingual Classification
EACL 2026
A Rigorous Evaluation of LLM Data Generation Strategies for Low-Resource Languages
EMNLP 2025
Use Random Selection for Now: Investigation of Few-Shot Selection Strategies in LLM-based Text Augmentation
EMNLP 2025
LLMs vs Established Text Augmentation Techniques for Classification: When do the Benefits Outweight the Costs?
NAACL 2025
Effects of diversity incentives on sample diversity and downstream model performance in LLM-based text augmentation
ACL 2024
Fighting Randomness with Randomness: Mitigating Optimisation Instability of Fine-Tuning using Delayed Ensemble and Noisy Interpolation
EMNLP 2024
Authorship Obfuscation in Multilingual Machine-Generated Text Detection
EMNLP 2024
ChatGPT to Replace Crowdsourcing of Paraphrases for Intent Classification: Higher Diversity and Comparable Model Robustness
EMNLP 2023
MULTITuDE: Large-Scale Multilingual Machine-Generated Text Detection Benchmark
EMNLP 2023
Multilingual Previously Fact-Checked Claim Retrieval
EMNLP 2023
Black-box Audit of YouTube's Video Recommendation: Investigation of Misinformation Filter Bubble Dynamics (Extended Abstract)
IJCAI 2022