Junxian He

54 papers · 2018–2026 · 8 conferences · across top CS/AI conferences

Achievements

🧭 Keyword Pioneer 🐣 Hot Topic Early Bird 🗺️ Taxonomy Completionist (19) 🌉 Interdisciplinary Bridge 🌍 Conference Polyglot (8) 🔬 Deep Specialist (11) 🧬 Topic Evolution 🤝 Dynamic Duo (17) 👑 Triple Crown 🗃️ Keyword Collector (181) ❓ The Questioner (3) ⚡ Prolific Year (11) 📈 Trend Setter 💎 Century Club (53) 🔥 Unstoppable (8)

Conferences

EMNLP (14) ACL (12) ICLR (10) NIPS (8) ICML (7) CoNLL (1) EACL (1) IJCNLP (1)

Top co-authors

Graham Neubig (17) Taylor Berg-Kirkpatrick (11) Shiqi Chen (8) Xuezhe Ma (6) Jinghan Zhang (6) Junteng Liu (5) Yuxi Xie (4) Siyang Gao (4) Chang Ma (4) Chunting Zhou (4)

Research topics

Reinforcement Learning (1) Reasoning (1)

Keywords

large language model (11) language model (9) unsupervised learning (8) text generation (6) representation learning (4) dependency parsing (4) question answering (4) instruction tuning (3) cross-lingual transfer (3) synthetic data (3) machine translation (3) variational autoencoder (3) low-resource language (3) benchmark evaluation (2) grammar induction (2) text summarization (2) visual grounding (2) constituency parsing (2) contrastive learning (2) prompt engineering (2)

Papers

How Can Synthetic Data Improve Multilingual Language Model Pretraining? A Data Quality Perspective · ACL 2026
B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners · ICLR 2025
OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis · ACL 2025
Genius: A Generalizable and Purely Unsupervised Self-Training Framework For Advanced Reasoning · ACL 2025
Revisiting Scaling Laws for Language Models: The Role of Data Quality and Training Strategies · ACL 2025
High-Dimensional Interlingual Representations of Large Language Models · ACL 2025
Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging · ICML 2025
Non-myopic Generation of Language Models for Reasoning and Planning · ICLR 2025
Predictive Data Selection: The Data That Predicts Is the Data That Teaches · ICML 2025
Diving into Self-Evolving Training for Multimodal Reasoning · ICML 2025
CodeIO: Condensing Reasoning Patterns via Code Input-Output Prediction · ICML 2025
On the Perception Bottleneck of VLMs for Chart Understanding · EMNLP 2025
Why Is Spatial Reasoning Hard for VLMs? An Attention Mechanism Perspective on Focus Areas · ICML 2025
Belief Revision: The Adaptability of Large Language Models Reasoning · EMNLP 2024
IntentionQA: A Benchmark for Evaluating Purchase Intention Comprehension Abilities of Language Models in E-commerce · EMNLP 2024
DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving · NIPS 2024
Uncertainty of Thoughts: Uncertainty-Aware Planning Enhances Information Seeking in LLMs · NIPS 2024
AgentBoard: An Analytical Evaluation Board of Multi-turn LLM Agents · NIPS 2024
Prompt Optimization via Adversarial In-Context Learning · ACL 2024
InstructCoder: Instruction Tuning Large Language Models for Code Editing · ACL 2024
In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation · ICML 2024
Can LLMs Express Their Uncertainty? An Empirical Evaluation of Confidence Elicitation in LLMs · ICLR 2024
What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning · ICLR 2024
On the Universal Truthfulness Hyperplane Inside LLMs · EMNLP 2024
Automatic Model Selection with Large Language Models for Reasoning · EMNLP 2023
Mega: Moving Average Equipped Gated Attention · ICLR 2023
Composing Parameter-Efficient Modules with Arithmetic Operation · NIPS 2023
Self-Evaluation Guided Beam Search for Reasoning · NIPS 2023
Contrastive Learning of Sentence Embeddings from Scratch · EMNLP 2023
C-Eval: A Multi-Level Multi-Discipline Chinese Evaluation Suite for Foundation Models · NIPS 2023
FELM: Benchmarking Factuality Evaluation of Large Language Models · NIPS 2023
Simple Temporal Adaptation to Changing Label Sets: Hashtag Prediction via Dense KNN · EMNLP 2023
Prompt Consistency for Zero-Shot Task Generalization · EMNLP 2022
CTRLsum: Towards Generic Controllable Text Summarization · EMNLP 2022
Capturing Structural Locality in Non-parametric Language Models · ICLR 2022
Towards a Unified View of Parameter-Efficient Transfer Learning · ICLR 2022
Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval · ICML 2022
Efficient Nearest Neighbor Language Models · EMNLP 2021
Dependency Induction Through the Lens of Visual Perception · CoNLL 2021
The Source-Target Domain Mismatch Problem in Machine Translation · EACL 2021
Dependency Induction Through the Lens of Visual Perception · EMNLP 2021
Revisiting Self-Training for Neural Sequence Generation · ICLR 2020
On the Sentence Embeddings from Pre-trained Language Models · EMNLP 2020
Learning Sparse Prototypes for Text Generation · NIPS 2020
A Probabilistic Formulation of Unsupervised Text Style Transfer · ICLR 2020
Lagging Inference Networks and Posterior Collapse in Variational Autoencoders · ICLR 2019
A Surprisingly Effective Fix for Deep Latent Variable Modeling of Text · EMNLP 2019
A Surprisingly Effective Fix for Deep Latent Variable Modeling of Text · IJCNLP 2019
Texar: A Modularized, Versatile, and Extensible Toolkit for Text Generation · ACL 2019
Cross-Lingual Syntactic Transfer through Unsupervised Adaptation of Invertible Projections · ACL 2019
Choosing Transfer Languages for Cross-Lingual Learning · ACL 2019
Unsupervised Learning of Syntactic Structure with Invertible Neural Projections · EMNLP 2018
Texar: A Modularized, Versatile, and Extensible Toolbox for Text Generation · ACL 2018
StructVAE: Tree-structured Latent Variable Models for Semi-supervised Semantic Parsing · ACL 2018