conftrace_

Chanjun Park

53 papers · 2021–2026 · across 7 top CS/AI conferences

Achievements

🌈 Renaissance Researcher (8) 🐝 Cross-Pollinator (14) 🧭 Keyword Pioneer 🌍 Conference Polyglot (7) 🌉 Interdisciplinary Bridge

🗺️ Taxonomy Completionist (85) 🤝 Dynamic Duo (38) 🏆 Keyword Champion (2) 🔬 Deep Specialist (16) ⚡ Prolific Year (6) 🔥 Unstoppable (5) 💎 Century Club (52) 🗃️ Keyword Collector (220) ❓ The Questioner (6)

Conferences

EMNLP (16) NAACL (14) ACL (10) COLING (6) IJCNLP (3) AACL (2) EACL (2)

Top co-authors

Heuiseok Lim (39) Jaehyung Seo (22) Hyeonseok Moon (22) Sugyeong Eo (19) Yungi Kim (10) DaHyun Jung (8) Dahyun Kim (7) Seungyoon Lee (7) Sukyung Lee (7) Jihoo Kim (6)

Research topics

Keywords

large language model (20) korean language (7) machine translation (7) question answering (6) benchmark evaluation (5) automatic speech recognition (4) retrieval-augmented generation (4) language model (3) low-resource language (3) critical error detection (3) language model evaluation (3) text generation (3) ensemble learning (2) few-shot learning (2) benchmark dataset (2) model ensemble (2) speech recognition (2) transfer learning (2) quality estimation (2) in-context learning (2)

Papers

LangSAE Editing: Improving Multilingual Information Retrieval via Post-hoc Language Identity Removal · ACL 2026
Rethinking KenLM: Good and Bad Model Ensembles for Efficient Text Quality Filtering in Large Web Corpora · ACL 2025
From Ambiguity to Accuracy: The Transformative Effect of Coreference Resolution on Retrieval-Augmented Generation systems · ACL 2025
Enhancing Automatic Term Extraction with Large Language Models via Syntactic Retrieval · ACL 2025
Can Code-Switched Texts Activate a Knowledge Switch in LLMs? A Case Study on English-Korean Code-Switching · EMNLP 2025
Dataverse: Open-Source ETL (Extract, Transform, Load) Pipeline for Large Language Models · NAACL 2025
Mixture-of-Clustered-Experts: Advancing Expert Specialization and Generalization in Instruction Tuning · EMNLP 2025
Benchmark Profiling: Mechanistic Diagnosis of LLM Benchmarks · EMNLP 2025
MultiDocFusion: Hierarchical and Multimodal Chunking Pipeline for Enhanced RAG on Long Industrial Documents · EMNLP 2025
LP Data Pipeline: Lightweight, Purpose-driven Data Pipeline for Large Language Models · EMNLP 2025
ZEBRA: Leveraging Model-Behavioral Knowledge for Zero-Annotation Preference Dataset Construction · EMNLP 2025
HAWK: Highlighting Entity-aware Knowledge for Alleviating Information Sparsity in Long Contexts · EMNLP 2025
CharacterGPT: A Persona Reconstruction Framework for Role-Playing Agents · NAACL 2025
MIRAGE: A Metric-Intensive Benchmark for Retrieval-Augmented Generation Evaluation · NAACL 2025
Representing the Under-Represented: Cultural and Core Capability Benchmarks for Developing Thai Large Language Models · COLING 2025
sDPO: Don’t Use Your Data All at Once · COLING 2025
Find the Intention of Instruction: Comprehensive Evaluation of Instruction Understanding for Large Language Models · NAACL 2025
FLEX: A Benchmark for Evaluating Robustness of Fairness in Large Language Models · NAACL 2025
Open Ko-LLM Leaderboard2: Bridging Foundational and Practical Evaluation for Korean LLMs · NAACL 2025
Understanding LLM Development Through Longitudinal Study: Insights from the Open Ko-LLM Leaderboard · NAACL 2025
CoME: An Unlearning-based Approach to Conflict-free Model Editing · NAACL 2025
LCIRC: A Recurrent Compression Approach for Efficient Long-form Context and Query Dependent Modeling in LLMs · NAACL 2025
Search if you don’t know! Knowledge-Augmented Korean Grammatical Error Correction with Large Language Models · EMNLP 2024
Open Ko-LLM Leaderboard: Evaluating Large Language Models in Korean with Ko-H5 Benchmark · ACL 2024
Length-aware Byte Pair Encoding for Mitigating Over-segmentation in Korean Machine Translation · ACL 2024
KoCommonGEN v2: A Benchmark for Navigating Korean Commonsense Reasoning Challenges in Large Language Models · ACL 2024
Detecting Critical Errors Considering Cross-Cultural Factors in English-Korean Translation · COLING 2024
Leveraging Pre-existing Resources for Data-Efficient Counter-Narrative Generation in Korean · COLING 2024
Hyper-BTS Dataset: Scalability and Enhanced Analysis of Back TranScription (BTS) for ASR Post-Processing · EACL 2024
Generative Interpretation: Toward Human-Like Evaluation for Educational Question-Answer Pair Generation · EACL 2024
Where am I? Large Language Models Wandering between Semantics and Structures in Long Contexts · EMNLP 2024
Evalverse: Unified and Accessible Library for Large Language Model Evaluation · EMNLP 2024
SAAS: Solving Ability Amplification Strategy for Enhanced Mathematical Reasoning in Large Language Models · EMNLP 2024
Translation of Multifaceted Data without Re-Training of Machine Translation Systems · EMNLP 2024
Explainable CED: A Dataset for Explainable Critical Error Detection in Machine Translation · NAACL 2024
Exploring Inherent Biases in LLMs within Korean Social Context: A Comparative Analysis of ChatGPT and GPT-4 · NAACL 2024
SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling · NAACL 2024
CHEF in the Language Kitchen: A Generative Data Augmentation Leveraging Korean Morpheme Ingredients · EMNLP 2023
KEBAP: Korean Error Explainable Benchmark Dataset for ASR and Post-processing · EMNLP 2023
Informative Evidence-guided Prompt-based Fine-tuning for English-Korean Critical Error Detection · IJCNLP 2023
PEEP-Talk: A Situational Dialogue-based Chatbot for English Education · ACL 2023
Improving Formality-Sensitive Machine Translation Using Data-Centric Approaches and Prompt Engineering · ACL 2023
Informative Evidence-guided Prompt-based Fine-tuning for English-Korean Critical Error Detection · AACL 2023
PicTalky: Augmentative and Alternative Communication for Language Developmental Disabilities · IJCNLP 2022
A Dog Is Passing Over The Jet? A Text-Generation Dataset for Korean Commonsense Reasoning and Evaluation · NAACL 2022
KU X Upstage’s Submission for the WMT22 Quality Estimation: Critical Error Detection Shared Task · EMNLP 2022
Focus on FoCus: Is FoCus focused on Context, Knowledge and Persona? · COLING 2022
QUAK: A Synthetic Quality Estimation Dataset for Korean-English Neural Machine Translation · COLING 2022
PicTalky: Augmentative and Alternative Communication for Language Developmental Disabilities · AACL 2022
BTS: Back TranScription for Speech-to-Text Post-Processor using Text-to-Speech-to-Text · ACL 2021
Two Heads are Better than One? Verification of Ensemble Effect in Neural Machine Translation · EMNLP 2021
Should we find another model?: Improving Neural Machine Translation Performance with ONE-Piece Tokenization Method without Model Modification · NAACL 2021
BTS: Back TranScription for Speech-to-Text Post-Processor using Text-to-Speech-to-Text · IJCNLP 2021