Jan Cegin
9 papers · 2023–2026 · 4 conferences · across top CS/AI conferences
Achievements
Conference Polyglot (3)
Interdisciplinary Bridge
Taxonomy Completionist (23)
Keyword Pioneer
Cross-Pollinator (15)
The Questioner
Conferences
EMNLP (4)
EACL (3)
ACL (1)
NAACL (1)
Keywords
large language model (5)
data augmentation (3)
low-resource language (3)
text augmentation (3)
text classification (2)
synthetic data generation (2)
synthetic datum (2)
ensemble learning (1)
model selection (1)
cross-lingual transfer (1)
intent classification (1)
paraphrase generation (1)
pre-trained language model (1)
transfer learning (1)
parameter-efficient fine-tuning (1)
knowledge distillation (1)
sample selection (1)
noise regularization (1)
model robustness (1)
domain generalization (1)
Papers
MultiCW: A Large-Scale Balanced Benchmark Dataset for Training Robust Check-Worthiness Detection Models
EACL 2026
RoSE: Round-robin Synthetic Data Evaluation for Selecting LLM Generators without Human Test Sets
EACL 2026
Better as Generators Than Classifiers: Leveraging LLMs and Synthetic Data for Low-Resource Multilingual Classification
EACL 2026
Use Random Selection for Now: Investigation of Few-Shot Selection Strategies in LLM-based Text Augmentation
EMNLP 2025
A Rigorous Evaluation of LLM Data Generation Strategies for Low-Resource Languages
EMNLP 2025
LLMs vs Established Text Augmentation Techniques for Classification: When do the Benefits Outweight the Costs?
NAACL 2025
Effects of diversity incentives on sample diversity and downstream model performance in LLM-based text augmentation
ACL 2024
Fighting Randomness with Randomness: Mitigating Optimisation Instability of Fine-Tuning using Delayed Ensemble and Noisy Interpolation
EMNLP 2024
ChatGPT to Replace Crowdsourcing of Paraphrases for Intent Classification: Higher Diversity and Comparable Model Robustness
EMNLP 2023