Seungone Kim

22 papers · 2022–2025 · 8 top CS/AI conferences

Achievements

🐝 Cross-Pollinator (13) 🌍 Conference Polyglot (8) 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🌈 Renaissance Researcher (5) 👥 Mega-Team (49) 🤝 Dynamic Duo (10) 🔬 Deep Specialist (11) ❓ The Questioner ⚡ Prolific Year (8) 🗃️ Keyword Collector (77) 💎 Century Club (22)

Conferences

ACL (5) EMNLP (5) ICLR (5) NAACL (2) NIPS (2) COLING (1) EACL (1) ICML (1)

Top co-authors

Minjoon Seo (10) Seonghyeon Ye (7) Graham Neubig (5) Shayne Longpre (5) Joel Jang (5) Jamin Shin (4) Seongyun Lee (4) Niklas Muennighoff (4) Doyoung Kim (4) Juyoung Suk (4)

Keywords

large language model (8) language model (5) model evaluation (3) instruction following (3) instruction tuning (3) chain of thought (3) zero-shot learning (2) benchmark evaluation (2) language model evaluation (2) text classification (1) text generation (1) transfer learning (1) commonsense knowledge (1) language model adaptation (1) multilingual nlp (1) few-shot learning (1) llm evaluation (1) preference alignment (1) model alignment (1) dialogue safety (1)

Papers

Evaluating Language Models as Synthetic Data Generators · ACL 2025
LLM-as-an-Interviewer: Beyond Static Testing Through Dynamic LLM Evaluation · ACL 2025
The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models · NAACL 2025
Measuring Sycophancy of Language Models in Multi-turn Dialogues · EMNLP 2025
KMMLU: Measuring Massive Multitask Language Understanding in Korean · NAACL 2025
Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages · ICLR 2025
Bridging the Data Provenance Gap Across Text, Speech, and Video · ICLR 2025
Better Instruction-Following Through Minimum Bayes Risk · ICLR 2025
Prometheus: Inducing Fine-Grained Evaluation Capability in Language Models · ICLR 2024
FLASK: Fine-grained Language Model Evaluation based on Alignment Skill Sets · ICLR 2024
Language Models as Compilers: Simulating Pseudocode Execution Improves Algorithmic Reasoning in Language Models · EMNLP 2024
Consent in Crisis: The Rapid Decline of the AI Data Commons · NIPS 2024
Multi-Task Inference: Can Large Language Models Follow Multiple Instructions at Once? · ACL 2024
LangBridge: Multilingual Reasoning Without Multilingual Supervision · ACL 2024
Prometheus-Vision: Vision-Language Model as a Judge for Fine-Grained Evaluation · ACL 2024
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models · EMNLP 2024
Aligning to Thousands of Preferences via System Message Generalization · NIPS 2024
Self-Explore: Enhancing Mathematical Reasoning in Language Models with Fine-grained Rewards · EMNLP 2024
The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning · EMNLP 2023
Exploring the Benefits of Training Expert Language Models over Instruction Tuning · ICML 2023
CoTEVer: Chain of Thought Prompting Annotation Toolkit for Explanation Verification · EACL 2023
Mind the Gap! Injecting Commonsense Knowledge for Abstractive Dialogue Summarization · COLING 2022