Papers
3D-GRAND: A Million-Scale Dataset for 3D-LLMs with Better Grounding and Less Hallucination
CVPR 2025
ToVo: Toxicity Taxonomy via Voting
NAACL 2025
The Kyrgyz Seed Dataset Submission to the WMT25 Open Language Data Initiative Shared Task
EMNLP 2025
Explainable CED: A Dataset for Explainable Critical Error Detection in Machine Translation
NAACL 2024
GeoGPT4V: Towards Geometric Multi-modal Large Language Models with Geometric Image Generation
EMNLP 2024
You Make me Feel like a Natural Question: Training QA Systems on Transformed Trivia Questions
EMNLP 2024
IndicLLMSuite: A Blueprint for Creating Pre-training and Fine-Tuning Datasets for Indian Languages
ACL 2024
Improving Multilingual Instruction Finetuning via Linguistically Natural and Diverse Datasets
EMNLP 2024