Papers
DHA: Learning Decoupled-Head Attention from Transformer Checkpoints via Adaptive Heads Fusion
NeurIPS 2024
Distilling Robustness into Natural Language Inference Models with Domain-Targeted Augmentation
ACL 2024
JiuZhang3.0: Efficiently Improving Mathematical Reasoning by Training Small Data Synthesis Models
NeurIPS 2024
SciInstruct: a Self-Reflective Instruction Annotated Dataset for Training Scientific Language Models
NeurIPS 2024