Chanjun Park
53 papers · 2021–2026 · 7 conferences · across top CS/AI conferences
Achievements
Renaissance Researcher (8)
Cross-Pollinator (14)
Keyword Pioneer
Conference Polyglot (7)
Interdisciplinary Bridge
Taxonomy Completionist (85)
Dynamic Duo (38)
Keyword Champion (2)
Deep Specialist (16)
Prolific Year (6)
Unstoppable (5)
Century Club (52)
Keyword Collector (220)
The Questioner (6)
Conferences
EMNLP (16)
NAACL (14)
ACL (10)
COLING (6)
IJCNLP (3)
AACL (2)
EACL (2)
Keywords
large language model (20)
korean language (7)
machine translation (7)
question answering (6)
benchmark evaluation (5)
automatic speech recognition (4)
retrieval-augmented generation (4)
language model (3)
low-resource language (3)
critical error detection (3)
language model evaluation (3)
text generation (3)
ensemble learning (2)
few-shot learning (2)
benchmark dataset (2)
model ensemble (2)
speech recognition (2)
transfer learning (2)
quality estimation (2)
in-context learning (2)
Papers
LangSAE Editing: Improving Multilingual Information Retrieval via Post-hoc Language Identity Removal
ACL 2026
Rethinking KenLM: Good and Bad Model Ensembles for Efficient Text Quality Filtering in Large Web Corpora
ACL 2025
From Ambiguity to Accuracy: The Transformative Effect of Coreference Resolution on Retrieval-Augmented Generation systems
ACL 2025
Enhancing Automatic Term Extraction with Large Language Models via Syntactic Retrieval
ACL 2025
Can Code-Switched Texts Activate a Knowledge Switch in LLMs? A Case Study on English-Korean Code-Switching
EMNLP 2025
Dataverse: Open-Source ETL (Extract, Transform, Load) Pipeline for Large Language Models
NAACL 2025
Mixture-of-Clustered-Experts: Advancing Expert Specialization and Generalization in Instruction Tuning
EMNLP 2025
Benchmark Profiling: Mechanistic Diagnosis of LLM Benchmarks
EMNLP 2025
MultiDocFusion: Hierarchical and Multimodal Chunking Pipeline for Enhanced RAG on Long Industrial Documents
EMNLP 2025
LP Data Pipeline: Lightweight, Purpose-driven Data Pipeline for Large Language Models
EMNLP 2025
ZEBRA: Leveraging Model-Behavioral Knowledge for Zero-Annotation Preference Dataset Construction
EMNLP 2025
HAWK: Highlighting Entity-aware Knowledge for Alleviating Information Sparsity in Long Contexts
EMNLP 2025
CharacterGPT: A Persona Reconstruction Framework for Role-Playing Agents
NAACL 2025
MIRAGE: A Metric-Intensive Benchmark for Retrieval-Augmented Generation Evaluation
NAACL 2025
Representing the Under-Represented: Cultural and Core Capability Benchmarks for Developing Thai Large Language Models
COLING 2025
sDPO: Don't Use Your Data All at Once
COLING 2025
Find the Intention of Instruction: Comprehensive Evaluation of Instruction Understanding for Large Language Models
NAACL 2025
FLEX: A Benchmark for Evaluating Robustness of Fairness in Large Language Models
NAACL 2025
Open Ko-LLM Leaderboard2: Bridging Foundational and Practical Evaluation for Korean LLMs
NAACL 2025
Understanding LLM Development Through Longitudinal Study: Insights from the Open Ko-LLM Leaderboard
NAACL 2025
CoME: An Unlearning-based Approach to Conflict-free Model Editing
NAACL 2025
LCIRC: A Recurrent Compression Approach for Efficient Long-form Context and Query Dependent Modeling in LLMs
NAACL 2025
Search if you don't know! Knowledge-Augmented Korean Grammatical Error Correction with Large Language Models
EMNLP 2024
Open Ko-LLM Leaderboard: Evaluating Large Language Models in Korean with Ko-H5 Benchmark
ACL 2024
Length-aware Byte Pair Encoding for Mitigating Over-segmentation in Korean Machine Translation
ACL 2024
KoCommonGEN v2: A Benchmark for Navigating Korean Commonsense Reasoning Challenges in Large Language Models
ACL 2024
Detecting Critical Errors Considering Cross-Cultural Factors in English-Korean Translation
COLING 2024
Leveraging Pre-existing Resources for Data-Efficient Counter-Narrative Generation in Korean
COLING 2024
Hyper-BTS Dataset: Scalability and Enhanced Analysis of Back TranScription (BTS) for ASR Post-Processing
EACL 2024
Generative Interpretation: Toward Human-Like Evaluation for Educational Question-Answer Pair Generation
EACL 2024
Where am I? Large Language Models Wandering between Semantics and Structures in Long Contexts
EMNLP 2024
Evalverse: Unified and Accessible Library for Large Language Model Evaluation
EMNLP 2024
SAAS: Solving Ability Amplification Strategy for Enhanced Mathematical Reasoning in Large Language Models
EMNLP 2024
Translation of Multifaceted Data without Re-Training of Machine Translation Systems
EMNLP 2024
Explainable CED: A Dataset for Explainable Critical Error Detection in Machine Translation
NAACL 2024
Exploring Inherent Biases in LLMs within Korean Social Context: A Comparative Analysis of ChatGPT and GPT-4
NAACL 2024
SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling
NAACL 2024
CHEF in the Language Kitchen: A Generative Data Augmentation Leveraging Korean Morpheme Ingredients
EMNLP 2023
KEBAP: Korean Error Explainable Benchmark Dataset for ASR and Post-processing
EMNLP 2023
Informative Evidence-guided Prompt-based Fine-tuning for English-Korean Critical Error Detection
IJCNLP 2023
PEEP-Talk: A Situational Dialogue-based Chatbot for English Education
ACL 2023
Improving Formality-Sensitive Machine Translation Using Data-Centric Approaches and Prompt Engineering
ACL 2023
Informative Evidence-guided Prompt-based Fine-tuning for English-Korean Critical Error Detection
AACL 2023
PicTalky: Augmentative and Alternative Communication for Language Developmental Disabilities
IJCNLP 2022
A Dog Is Passing Over The Jet? A Text-Generation Dataset for Korean Commonsense Reasoning and Evaluation
NAACL 2022
KU X Upstage's Submission for the WMT22 Quality Estimation: Critical Error Detection Shared Task
EMNLP 2022
Focus on FoCus: Is FoCus focused on Context, Knowledge and Persona?
COLING 2022
QUAK: A Synthetic Quality Estimation Dataset for Korean-English Neural Machine Translation
COLING 2022
PicTalky: Augmentative and Alternative Communication for Language Developmental Disabilities
AACL 2022
BTS: Back TranScription for Speech-to-Text Post-Processor using Text-to-Speech-to-Text
ACL 2021
Two Heads are Better than One? Verification of Ensemble Effect in Neural Machine Translation
EMNLP 2021
Should we find another model?: Improving Neural Machine Translation Performance with ONE-Piece Tokenization Method without Model Modification
NAACL 2021
BTS: Back TranScription for Speech-to-Text Post-Processor using Text-to-Speech-to-Text
IJCNLP 2021