Seungone Kim
22 papers · 2022–2025 · 8 top CS/AI conferences
Achievements
🐝 Cross-Pollinator (13)
🌍 Conference Polyglot (8)
🌉 Interdisciplinary Bridge
🧭 Keyword Pioneer
🌈 Renaissance Researcher (5)
👥 Mega-Team (49)
🤝 Dynamic Duo (10)
🔬 Deep Specialist (11)
❓ The Questioner
⚡ Prolific Year (8)
🗃️ Keyword Collector (77)
💎 Century Club (22)
Conferences
ACL (5)
EMNLP (5)
ICLR (5)
NAACL (2)
NIPS (2)
COLING (1)
EACL (1)
ICML (1)
Keywords
large language model (8)
language model (5)
model evaluation (3)
instruction following (3)
instruction tuning (3)
chain of thought (3)
zero-shot learning (2)
benchmark evaluation (2)
language model evaluation (2)
text classification (1)
text generation (1)
transfer learning (1)
commonsense knowledge (1)
language model adaptation (1)
multilingual nlp (1)
few-shot learning (1)
llm evaluation (1)
preference alignment (1)
model alignment (1)
dialogue safety (1)
Papers
Evaluating Language Models as Synthetic Data Generators
ACL 2025
LLM-as-an-Interviewer: Beyond Static Testing Through Dynamic LLM Evaluation
ACL 2025
The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models
NAACL 2025
Measuring Sycophancy of Language Models in Multi-turn Dialogues
EMNLP 2025
KMMLU: Measuring Massive Multitask Language Understanding in Korean
NAACL 2025
Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages
ICLR 2025
Bridging the Data Provenance Gap Across Text, Speech, and Video
ICLR 2025
Better Instruction-Following Through Minimum Bayes Risk
ICLR 2025
Prometheus: Inducing Fine-Grained Evaluation Capability in Language Models
ICLR 2024
FLASK: Fine-grained Language Model Evaluation based on Alignment Skill Sets
ICLR 2024
Language Models as Compilers: Simulating Pseudocode Execution Improves Algorithmic Reasoning in Language Models
EMNLP 2024
Consent in Crisis: The Rapid Decline of the AI Data Commons
NIPS 2024
Multi-Task Inference: Can Large Language Models Follow Multiple Instructions at Once?
ACL 2024
LangBridge: Multilingual Reasoning Without Multilingual Supervision
ACL 2024
Prometheus-Vision: Vision-Language Model as a Judge for Fine-Grained Evaluation
ACL 2024
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models
EMNLP 2024
Aligning to Thousands of Preferences via System Message Generalization
NIPS 2024
Self-Explore: Enhancing Mathematical Reasoning in Language Models with Fine-grained Rewards
EMNLP 2024
The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning
EMNLP 2023
Exploring the Benefits of Training Expert Language Models over Instruction Tuning
ICML 2023
CoTEVer: Chain of Thought Prompting Annotation Toolkit for Explanation Verification
EACL 2023
Mind the Gap! Injecting Commonsense Knowledge for Abstractive Dialogue Summarization
COLING 2022