Research Explorer
Data Augmentation
3622 directly classified papers
Papers per year
2002: 2
2006: 1
2008: 2
2009: 1
2011: 3
2012: 3
2013: 9
2014: 8
2015: 7
2016: 35
2017: 45
2018: 108
2019: 239
2020: 329
2021: 477
2022: 518
2023: 607
2024: 561
2025: 546
2026: 121
Papers
A Survey on Efficient Large Language Model Training: From Data-centric Perspectives
ACL 2025
Interactive platform for the exploration of large-scale ‘living’ systematic maps
ACL 2025
K/DA: Automated Data Generation Pipeline for Detoxifying Implicitly Offensive Language in Korean
ACL 2025
Overlapping Context with Variable-Length Stride Increases Diversity when Training Large Language Model for Code
ACL 2025
Failing Forward: Improving Generative Error Correction for ASR with Synthetic Data and Retrieval Augmentation
ACL 2025
D-GEN: Automatic Distractor Generation and Evaluation for Reliable Assessment of Generative Models
ACL 2025
Explainable Depression Detection in Clinical Interviews with Personalized Retrieval-Augmented Generation
ACL 2025
Anastasia at SemEval-2025 Task 9: Subtask 1, Ensemble Learning with Data Augmentation and Focal Loss for Food Risk Classification
ACL 2025
HTU at SemEval-2025 Task 11: Divide and Conquer - Multi-Label emotion classification using 6 DziriBERTs submodels with Label-fused Iterative Mask Filling technique for low-resource data augmentation
ACL 2025
Tuebingen at SemEval-2025 Task 10: Class Weighting, External Knowledge and Data Augmentation in BERT Models
ACL 2025
Ustnlp16 at SemEval-2025 Task 9: Improving Model Performance through Imbalance Handling and Focal Loss
ACL 2025
Scalable Vision Language Model Training via High Quality Data Curation
ACL 2025
ScanEZ: Integrating Cognitive Models with Self-Supervised Learning for Spatiotemporal Scanpath Prediction
ACL 2025
Introducing OmniGEC: A Silver Multilingual Dataset for Grammatical Error Correction
ACL 2025
PATeam at SemEval-2025 Task 9: LLM-Augmented Fusion for AI-Driven Food Safety Hazard Detection
ACL 2025
BrightCookies at SemEval-2025 Task 9: Exploring Data Augmentation for Food Hazard Classification
ACL 2025
Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia
ACL 2025
Transforming Causal LLM into MLM Encoder for Detecting Social Media Manipulation in Telegram
ACL 2025
Emphasizing Discriminative Features for Dataset Distillation in Complex Scenarios
CVPR 2025
Gender Swapping as a Data Augmentation Technique: Developing Gender-Balanced Datasets for Ukrainian Language Processing
ACL 2025
TeleAI at SemEval-2025 Task 11: Bridging the Gap in Text-Based Emotion Detection with Prompt Engineering and Data Augmentation
ACL 2025
Enhancing Vision-Language Compositional Understanding with Multimodal Synthetic Data
CVPR 2025
Augmenting Math Word Problems via Iterative Question Composing
AAAI 2025
BJTU at BEA 2025 Shared Task: Task-Aware Prompt Tuning and Data Augmentation for Evaluating AI Math Tutors
ACL 2025
Augmenting Perceptual Super-Resolution via Image Quality Predictors
CVPR 2025