Data Augmentation

3622 directly classified papers

Papers

Weaving Context Across Images: Improving Vision-Language Models through Focus-Centric Visual Chains ACL 2025

Nonlinear functional regression by functional deep neural network with kernel embedding JMLR 2025

A Survey on Efficient Large Language Model Training: From Data-centric Perspectives ACL 2025

VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search EMNLP 2025

Interactive platform for the exploration of large-scale ‘living’ systematic maps ACL 2025

Rethinking Data Selection at Scale: Random Selection is Almost All You Need EMNLP 2025

Zhoumou at SemEval-2025 Task 1: Leveraging Multimodal Data Augmentation and Large Language Models for Enhanced Idiom Understanding SEMEVAL 2025

K/DA: Automated Data Generation Pipeline for Detoxifying Implicitly Offensive Language in Korean ACL 2025

Overlapping Context with Variable-Length Stride Increases Diversity when Training Large Language Model for Code ACL 2025

Trans-Sent at SemEval-2025 Task 11: Text-based Multi-label Emotion Detection using Pre-Trained BERT Transformer Models ACL 2025

Failing Forward: Improving Generative Error Correction for ASR with Synthetic Data and Retrieval Augmentation ACL 2025

Improving Influence-based Instruction Tuning Data Selection for Balanced Learning of Diverse Capabilities EMNLP 2025

D-GEN: Automatic Distractor Generation and Evaluation for Reliable Assessment of Generative Models ACL 2025

LLM-based Adversarial Dataset Augmentation for Automatic Media Bias Detection NAACL 2025

Explainable Depression Detection in Clinical Interviews with Personalized Retrieval-Augmented Generation ACL 2025

Deep at SemEval-2025 Task 11: A Multi-Stage Approach to Emotion Detection ACL 2025

Anastasia at SemEval-2025 Task 9: Subtask 1, Ensemble Learning with Data Augmentation and Focal Loss for Food Risk Classification. ACL 2025

Investigating the Effect of Backtranslation for Indic Languages COLING 2025

HTU at SemEval-2025 Task 11: Divide and Conquer - Multi-Label emotion classification using 6 DziriBERTs submodels with Label-fused Iterative Mask Filling technique for low-resource data augmentation. ACL 2025

Boosting Sentiment Analysis in Persian through a GAN-Based Synthetic Data Augmentation Method COLING 2025

Tuebingen at SemEval-2025 Task 10: Class Weighting, External Knowledge and Data Augmentation in BERT Models ACL 2025

Enhancing Low-Resource Text Classification with LLM-Generated Corpora : A Case Study on Olfactory Reference Extraction IJCNLP 2025

Ustnlp16 at SemEval-2025 Task 9: Improving Model Performance through Imbalance Handling and Focal Loss ACL 2025

Habib University at SemEval-2025 Task 9: Using Ensemble Models for Food Hazard Detection ACL 2025

Augmenting Math Word Problems via Iterative Question Composing AAAI 2025