Conghui He
64 papers · 2021–2026 · 10 conferences · across top CS/AI conferences
Achievements
Conference Polyglot (10)
Keyword Pioneer
Interdisciplinary Bridge
Taxonomy Completionist (11)
Cross-Pollinator (14)
Renaissance Researcher (7)
Mega-Team (38)
Keyword Champion (2)
Dynamic Duo (19)
Deep Specialist (12)
Grand Slam
Century Club (58)
Unstoppable (5)
The Questioner (4)
Prolific Year (15)
Keyword Collector (235)
Conferences
ACL (18)
CVPR (10)
ICCV (8)
EMNLP (7)
ICLR (7)
AAAI (6)
ECCV (5)
ICML (1)
NAACL (1)
NIPS (1)
Keywords
large language model (15)
multimodal learning (6)
language model (5)
data selection (5)
vision-language model (5)
hallucination mitigation (4)
benchmark evaluation (4)
remote sensing (3)
document understanding (3)
chain-of-thought reasoning (3)
building segmentation (3)
vision language model (3)
mathematical reasoning (3)
instruction tuning (3)
semantic segmentation (3)
multi-view learning (2)
document parsing (2)
video understanding (2)
temporal reasoning (2)
in-context learning (2)
Papers
MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing
ACL 2026
The Data Frontier for Large Language Models: Selection, Synthesis, and Tools
ACL 2026
REST: Stress Testing Large Reasoning Models by Asking Multiple Problems at Once
ACL 2026
Tracing the Roots: A Multi-Agent Framework for Uncovering Data Lineage in Post-Training LLMs
ACL 2026
ChartVerse: Scaling Chart Reasoning via Reliable Programmatic Synthesis from Scratch
ACL 2026
Heterogeneous Adaptive Policy Optimization: Tailoring Optimization to Every Token’s Nature
ACL 2026
Efficient Pretraining Data Selection for Language Models via Multi-Actor Collaboration
ACL 2025
Meta-rater: A Multi-dimensional Data Selection Method for Pre-training Language Models
ACL 2025
A Strategic Coordination Framework of Small LMs Matches Large LMs in Data Synthesis
ACL 2025
Data Whisperer: Efficient Data Selection for Task-Specific LLM Fine-Tuning via Few-Shot In-Context Learning
ACL 2025
CipherBank: Exploring the Boundary of LLM Reasoning Capabilities through Cryptography Challenge
ACL 2025
OpenHuEval: Evaluating Large Language Model on Hungarian Specifics
ACL 2025
LEMMA: Learning from Errors for MatheMatical Advancement in LLMs
ACL 2025
Token Pruning in Multimodal Large Language Models: Are We Solving the Right Problem?
ACL 2025
IPDreamer: Appearance-Controllable 3D Object Generation with Complex Image Prompts
ICLR 2025
VHM: Versatile and Honest Vision Language Model for Remote Sensing Image Analysis
AAAI 2025
UrBench: A Comprehensive Benchmark for Evaluating Large Multimodal Models in Multi-View Urban Scenarios
AAAI 2025
Utilize the Flow Before Stepping into the Same River Twice: Certainty Represented Knowledge Flow for Refusal-Aware Instruction Tuning
AAAI 2025
SongComposer: A Large Language Model for Lyric and Melody Generation in Song Composition
ACL 2025
MathFusion: Enhancing Mathematical Problem-solving of LLM through Instruction Fusion
ACL 2025
Dataset Distillation with Neural Characteristic Function: A Minmax Perspective
CVPR 2025
OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video Understanding?
CVPR 2025
Conical Visual Concentration for Efficient Large Vision-Language Models
CVPR 2025
OmniDocBench: Benchmarking Diverse PDF Document Parsing with Comprehensive Annotations
CVPR 2025
Image Over Text: Transforming Formula Recognition Evaluation with Character Detection Matching
CVPR 2025
Middo: Model-Informed Dynamic Data Optimization for Enhanced LLM Fine-Tuning via Closed-Loop Learning
EMNLP 2025
Stop Looking for “Important Tokens” in Multimodal Language Models: Duplication Matters More
EMNLP 2025
MetaLadder: Ascending Mathematical Solution Quality via Analogical-Problem Reasoning Transfer
EMNLP 2025
BenchMAX: A Comprehensive Multilingual Evaluation Suite for Large Language Models
EMNLP 2025
Where am I? Cross-View Geo-localization with Natural Language Descriptions
ICCV 2025
Leveraging BEV Paradigm for Ground-to-Aerial Image Synthesis
ICCV 2025
LEGION: Learning to Ground and Explain for Synthetic Image Detection
ICCV 2025
OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation
ICCV 2025
VRBench: A Benchmark for Multi-Step Reasoning in Long Narrative Videos
ICCV 2025
Harnessing Diversity for Important Data Selection in Pretraining Large Language Models
ICLR 2025
OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text
ICLR 2025
GeoX: Geometric Problem Solving Through Unified Formalized Vision-Language Pre-training
ICLR 2025
Large Language Models Meet Symbolic Provers for Logical Reasoning Evaluation
ICLR 2025
MIA-DPO: Multi-Image Augmented Direct Preference Optimization For Large Vision-Language Models
ICLR 2025
LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models
ICLR 2025
GRAIT: Gradient-Driven Refusal-Aware Instruction Tuning for Effective Hallucination Mitigation
NAACL 2025
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models
ICML 2024
Benchmarking Chinese Commonsense Reasoning of LLMs: From Chinese-Specifics to Reasoning-Memorization Correlations
ACL 2024
Parrot Captions Teach CLIP to Spot Text
ECCV 2024
LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-Training
EMNLP 2024
LOCR: Location-Guided Transformer for Optical Character Recognition
EMNLP 2024
LongWanjuan: Towards Systematic Measurement for Long Text Quality
EMNLP 2024
OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation
CVPR 2024
3D Building Reconstruction from Monocular Remote Sensing Images with Multi-level Supervisions
CVPR 2024
SG-BEV: Satellite-Guided BEV Fusion for Cross-View Semantic Segmentation
CVPR 2024
ProtLLM: An Interleaved Protein-Language LLM with Protein-as-Word Pre-Training
ACL 2024
VIGC: Visual Instruction Generation and Correction
AAAI 2024
InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD
NIPS 2024
MMBench: Is Your Multi-Modal Model an All-around Player?
ECCV 2024
ShareGPT4V: Improving Large Multi-Modal Models with Better Captions
ECCV 2024
Cross-view image geo-localization with Panorama-BEV Co-Retrieval Network
ECCV 2024
Think Twice Before Driving: Towards Scalable Decoders for End-to-End Autonomous Driving
CVPR 2023
V3Det: Vast Vocabulary Visual Detection Dataset
ICCV 2023
SEPT: Towards Scalable and Efficient Visual Pre-training
AAAI 2023
OmniCity: Omnipotent City Understanding With Multi-Level and Multi-View Images
CVPR 2023
PersFormer: 3D Lane Detection via Perspective Transformer and the OpenLane Benchmark
ECCV 2022
Joint Semantic-geometric Learning for Polygonal Building Segmentation
AAAI 2021
Influence Selection for Active Learning
ICCV 2021
3D Building Reconstruction From Monocular Remote Sensing Images
ICCV 2021