Research Explorer
Data Augmentation
3622 directly classified papers
Papers per year
2002: 2
2006: 1
2008: 2
2009: 1
2011: 3
2012: 3
2013: 9
2014: 8
2015: 7
2016: 35
2017: 45
2018: 108
2019: 239
2020: 329
2021: 477
2022: 518
2023: 607
2024: 561
2025: 546
2026: 121
Papers
Fine-grained Control of Generative Data Augmentation in IoT Sensing
NIPS 2024
BIGOS V2 Benchmark for Polish ASR: Curated Datasets and Tools for Reproducible Evaluation
NIPS 2024
Grammar-based Data Augmentation for Low-Resource Languages: The Case of Guarani-Spanish Neural Machine Translation
NAACL 2024
CCSum: A Large-Scale and High-Quality Dataset for Abstractive News Summarization
NAACL 2024
Edu-ConvoKit: An Open-Source Library for Education Conversation Data
NAACL 2024
DetDiffusion: Synergizing Generative and Perceptive Models for Enhanced Data Generation and Perception
CVPR 2024
Enhancing Dialogue State Tracking Models through LLM-backed User-Agents Simulation
ACL 2024
Tweak to Trust: Assessing the Reliability of Summarization Metrics in Contact Centers via Perturbed Summaries
NAACL 2024
Masked Thought: Simply Masking Partial Reasoning Steps Can Improve Mathematical Reasoning Learning of Language Models
ACL 2024
Synthesize, Partition, then Adapt: Eliciting Diverse Samples from Foundation Models
NIPS 2024
Brandeis at VarDial 2024 DSL-ML Shared Task: Multilingual Models, Simple Baselines and Data Augmentation
NAACL 2024
ATLAS: A System for PDF-centric Human Interaction Data Collection
NAACL 2024
DialogCC: An Automated Pipeline for Creating High-Quality Multi-Modal Dialogue Dataset
NAACL 2024
WaveCoder: Widespread And Versatile Enhancement For Code Large Language Models By Instruction Tuning
ACL 2024
Improving Adversarial Data Collection by Supporting Annotators: Lessons from GAHD, a German Hate Speech Dataset
NAACL 2024
DataDreamer: A Tool for Synthetic Data Generation and Reproducible LLM Workflows
ACL 2024
DKE-Research at SemEval-2024 Task 2: Incorporating Data Augmentation with Generative Models and Biomedical Knowledge to Enhance Inference Robustness
NAACL 2024
EPIC: Effective Prompting for Imbalanced-Class Data Synthesis in Tabular Data Classification via Large Language Models
NIPS 2024
MathGenie: Generating Synthetic Data with Question Back-translation for Enhancing Mathematical Reasoning of LLMs
ACL 2024
ABEX: Data Augmentation for Low-Resource NLU via Expanding Abstract Descriptions
ACL 2024
Assemblage: Automatic Binary Dataset Construction for Machine Learning
NIPS 2024
ROUGE-K: Do Your Summaries Have Keywords?
NAACL 2024
On the Robustness of Neural Models for Full Sentence Transformation
NAACL 2024
Oasis: Data Curation and Assessment System for Pretraining of Large Language Models
IJCAI 2024
Time-MMD: Multi-Domain Multimodal Dataset for Time Series Analysis
NIPS 2024