Data Augmentation

3622 directly classified papers

Papers

No One-Size-Fits-All: Building Systems For Translation to Bashkir, Kazakh, Kyrgyz, Tatar and Chuvash Using Synthetic And Original Data EACL 2026

The Impact of Highlighting Subjective Language on Perceived News Trustworthiness EACL 2026

TabGeoFlow: A Geometric Flow Matching Model for Tabular Data Synthesis AAAI 2026

Neural Tangent Kernels Under Stochastic Data Augmentation AAAI 2026

Attribution Analysis-based Concept Alignment: A Human-in-the-loop Data Debugging Framework AAAI 2026

MIMIC: Multi-party Dialogue Augmentation via Speaker Stylistic Transfer EACL 2026

Enhancing Urdu Sentiment Classification through Instruction-Tuned LLMs and Cross-Lingual Transfer EACL 2026

Aleph-Alpha-GermanWeb: Improving German-language LLM pre-training with model-based data curation and synthetic data generation EACL 2026

Boundary-Aware LLM Augmentation for Low-Resource Event Argument Extraction EACL 2026

STAR-1: Safer Alignment of Reasoning LLMs with 1K Data AAAI 2026

MathSmith: Towards Extremely Hard Mathematical Reasoning by Forging Synthetic Problems with a Reinforced Policy AAAI 2026

DevLake at LoResMT 2026: The Impact of Pre-training and Model Scale on Russian-Bashkir Low-Resource Translation EACL 2026

VietMix: A Naturally-Occurring Parallel Corpus and Augmentation Framework for Vietnamese-English Code-Mixed Machine Translation EACL 2026

How DDAIR you? Disambiguated Data Augmentation for Intent Recognition EACL 2026

R-GDA: Reflective Guidance Data Augmentation with Multi-Agent Feedback for Domain-Specific Named Entity Recognition EACL 2026

BhashaKritika: Building Synthetic Pretraining Data at Scale for Indic Languages AAAI 2026

CGMIS: Concept-Graph Based Multi-Hop Instructions Synthesis for Enhancing Long-Context Reasoning AAAI 2026

Spectral Property-Driven Data Augmentation for Hyperspectral Single-Source Domain Generalization AAAI 2026

DP-GenG: Differentially Private Dataset Distillation Guided by DP-Generated Data AAAI 2026

SVS-GAN for Semantic Synthesis of Traffic Videos for Autonomous Driving WACV 2026

DiRe: Diversity-promoting Regularization for Dataset Condensation WACV 2026

Angeliki Linardatou at SemEval-2025 Task 11: Multi-label Emotion Detection ACL 2025

iShumei-Chinchunmei at SemEval-2025 Task 4: A balanced forgetting and retention multi-task framework using effective unlearning loss ACL 2025

FJWU_Squad at SemEval-2025 Task 1: An Idiom Visual Understanding Dataset for Idiom Learning SEMEVAL 2025

Towards Adversarially Robust Dataset Distillation by Curvature Regularization AAAI 2025