Feiyu Duan
7 papers · 2024–2025 · 5 conferences · across top CS/AI conferences
Achievements
Interdisciplinary Bridge
Keyword Pioneer
Conference Polyglot (5)
Cross-Pollinator (13)
Taxonomy Completionist (15)
Prolific Year (5)
Conferences
ACL (2)
EMNLP (2)
AAAI (1)
COLING (1)
NeurIPS (1)
Top co-authors
Keywords
large language model (6)
data selection (3)
language model (2)
data quality (2)
transfer learning (2)
information retrieval (1)
scaling law (1)
continual pre-training (1)
retrieval-augmented generation (1)
downstream performance (1)
mixture ratio (1)
cross-domain generalization (1)
pre-training data (1)
corpus filtering (1)
multi-stage training (1)
query generation (1)
length control (1)
quality rating (1)
missing information (1)
knowledge density (1)
Papers
Preference Curriculum: LLMs Should Always Be Pretrained on Their Preferred Data
ACL 2025
Enhancing LLMs via High-Knowledge Data Selection
AAAI 2025
FRAME: Boosting LLMs with A Four-Quadrant Multi-Stage Pretraining Strategy
ACL 2025
LLMs Know What They Need: Leveraging a Missing Information Guided Framework to Empower Retrieval-Augmented Generation
COLING 2025
FIRE: Flexible Integration of Data Quality Ratings for Effective Pretraining
EMNLP 2025
D-CPT Law: Domain-specific Continual Pre-Training Scaling Law for Large Language Models
NeurIPS 2024
PositionID: LLMs can Control Lengths, Copy and Paste with Explicit Positional Awareness
EMNLP 2024