Papers
Rethinking Data Mixture for Large Language Models: A Comprehensive Survey and New Perspectives
EACL 2026
Developing and Utilizing a Large-Scale Cantonese Dataset for Multi-Tasking in Large Language Models
EMNLP 2025
Balancing the Budget: Understanding Trade-offs Between Supervised and Preference-Based Finetuning
ACL 2025
Long Context is Not Long at All: A Prospector of Long-Dependency Data for Large Language Models
ACL 2024