Ge Zhang
68 papers · 2018–2026 · 15 top CS/AI conferences
Achievements
Interdisciplinary Bridge
Taxonomy Completionist (15)
Conference Polyglot (14)
Keyword Pioneer
Hot Topic Early Bird
Academic Marathon (7)
Dynamic Duo (21)
Grand Slam
Mega-Team (32)
Deep Specialist (15)
Topic Evolution
Keyword Champion (2)
The Questioner (2)
Keyword Collector (263)
Century Club (61)
Unstoppable (5)
Trend Setter
Prolific Year (25)
Conferences
ACL (17)
ICLR (11)
NIPS (9)
EMNLP (7)
NAACL (5)
AAAI (3)
ICCV (3)
COLING (2)
CVPR (2)
EACL (2)
ICML (2)
IJCAI (2)
AACL (1)
ECCV (1)
SEMEVAL (1)
Research topics
Keywords
large language model (22)
benchmark evaluation (10)
transfer learning (5)
multimodal large language model (5)
instruction tuning (4)
multimodal learning (4)
vision-language model (4)
knowledge graph (3)
reinforcement learning (3)
multimodal understanding (3)
domain adaptation (2)
efficient computing (2)
visual reasoning (2)
language model (2)
code generation (2)
depth estimation (2)
chain-of-thought reasoning (2)
question answering (2)
factuality evaluation (2)
visual question answering (2)
Papers
Video SimpleQA: Towards Factuality Evaluation in Large Video Language Models
AAAI 2026
COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values
EACL 2026
MMRA: A Benchmark for Evaluating Multi-Granularity and Multi-Image Relational Association Capabilities in Large Visual Language Models
EACL 2026
EA-Agent: A Structured Multi-Step Reasoning Agent for Entity Alignment
ACL 2026
PRISM: Probabilistic Reward Model with Inherent Structural Modeling
ACL 2026
Do LLMs Know Tool Irrelevance? Demystifying Structural Alignment Bias in Tool Invocations
ACL 2026
CriticLean: Critic-Guided Reinforcement Learning for Mathematical Formalization
ACL 2026
Vamba: Understanding Hour-Long Videos with Hybrid Mamba-Transformers
ICCV 2025
SimpleVQA: Multimodal Factuality Evaluation for Multimodal Large Language Models
ICCV 2025
M2RC-EVAL: Massively Multilingual Repository-level Code Completion Evaluation
ACL 2025
Can Large Language Models Detect Errors in Long Chain-of-Thought Reasoning?
ACL 2025
Toward Modality Gap: Vision Prototype Learning for Weakly-supervised Semantic Segmentation with CLIP
AAAI 2025
OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models
ACL 2025
Omni-MATH: A Universal Olympiad Level Mathematic Benchmark for Large Language Models
ICLR 2025
KOR-Bench: Benchmarking Language Models on Knowledge-Orthogonal Reasoning Tasks
ICLR 2025
MTU-Bench: A Multi-granularity Tool-Use Benchmark for Large Language Models
ICLR 2025
MuPT: A Generative Symbolic Music Pretrained Transformer
ICLR 2025
VCR: A Task for Pixel-Level Complex Reasoning in Vision Language Models via Restoring Occluded Text
ICLR 2025
OmniEdit: Building Image Editing Generalist Models Through Specialist Supervision
ICLR 2025
McEval: Massively Multilingual Code Evaluation
ICLR 2025
COIG-CQIA: Quality is All You Need for Chinese Instruction Fine-tuning
NAACL 2025
Beyond Bradley-Terry Models: A General Preference Model for Language Model Alignment
ICML 2025
CLaMP 2: Multimodal Music Information Retrieval Across 101 Languages Using Large Language Models
NAACL 2025
LIME: Less Is More for MLLM Evaluation
ACL 2025
KARPA: A Training-free Method of Adapting Knowledge Graph as References for Large Language Model's Reasoning Path Aggregation
ACL 2025
Can MLLMs Understand the Deep Implication Behind Chinese Images?
ACL 2025
MMMU-Pro: A More Robust Multi-discipline Multimodal Understanding Benchmark
ACL 2025
MIO: A Foundation Model on Multimodal Tokens
EMNLP 2025
MARS-Bench: A Multi-turn Athletic Real-world Scenario Benchmark for Dialogue Evaluation
EMNLP 2025
OAgents: An Empirical Study of Building Effective Agents
EMNLP 2025
TableBench: A Comprehensive and Complex Benchmark for Table Question Answering
AAAI 2025
Training Socially Aligned Language Models on Simulated Social Interactions
ICLR 2024
II-Bench: An Image Implication Understanding Benchmark for Multimodal Large Language Models
NIPS 2024
RoleAgent: Building, Interacting, and Benchmarking High-quality Role-Playing Agents from Scripts
NIPS 2024
D-CPT Law: Domain-specific Continual Pre-Training Scaling Law for Large Language Models
NIPS 2024
MAmmoTH2: Scaling Instructions from the Web
NIPS 2024
MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark
NIPS 2024
DDK: Distilling Domain Knowledge for Efficient Large Language Models
NIPS 2024
AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling
ACL 2024
E2-LLM: Efficient and Extreme Length Extension of Large Language Models
ACL 2024
ChatMusician: Understanding and Generating Music Intrinsically with LLM
ACL 2024
CIF-Bench: A Chinese Instruction-Following Benchmark for Evaluating the Generalizability of Large Language Models
ACL 2024
SciMMIR: Benchmarking Scientific Multi-modal Information Retrieval
ACL 2024
OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement
ACL 2024
CMDAG: A Chinese Metaphor Dataset with Annotated Grounds as CoT for Boosting Metaphor Generation
COLING 2024
MORE-3S: Multimodal-based Offline Reinforcement Learning with Shared Semantic Spaces
COLING 2024
MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI
CVPR 2024
Improving Depth Completion via Depth Feature Upsampling
CVPR 2024
UniIR: Training and Benchmarking Universal Multimodal Information Retrievers
ECCV 2024
VideoScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation
EMNLP 2024
MMTE: Corpus and Metrics for Evaluating Machine Translation Quality of Metaphorical Language
EMNLP 2024
MERT: Acoustic Music Understanding Model with Large-Scale Self-supervised Training
ICLR 2024
Massive Editing for Large Language Models via Meta Learning
ICLR 2024
MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning
ICLR 2024
AutoAgents: A Framework for Automatic Agent Generation
IJCAI 2024
MusiLingo: Bridging Music and Text with Pre-trained Language Models for Music Captioning and Query Response
NAACL 2024
MARBLE: Music Audio Representation Benchmark for Universal Evaluation
NIPS 2023
LRRU: Long-short Range Recurrent Updating Networks for Depth Completion
ICCV 2023
1Cademy at Semeval-2022 Task 1: Investigating the Effectiveness of Multilingual, Multitask, and Language-Agnostic Tricks for the Reverse Dictionary Task
SEMEVAL 2022
Second Thoughts are Best: Learning to Re-Align With Human Values from Text Edits
NIPS 2022
Aligning Generative Language Models with Human Values
NAACL 2022
Dual-discriminative Graph Neural Network for Imbalanced Graph-level Anomaly Detection
NIPS 2022
HERB: Measuring Hierarchical Regional Bias in Pre-trained Language Models
AACL 2022
1Cademy at Semeval-2022 Task 1: Investigating the Effectiveness of Multilingual, Multitask, and Language-Agnostic Tricks for the Reverse Dictionary Task
NAACL 2022
1Cademy @ Causal News Corpus 2022: Enhance Causal Span Detection via Beam-Search-based Position Selector
EMNLP 2022
1Cademy @ Causal News Corpus 2022: Leveraging Self-Training in Causality Classification of Socio-Political Event Data
EMNLP 2022
Tilting the playing field: Dynamical loss functions for machine learning
ICML 2021
Finding Communities with Hierarchical Semantics by Distinguishing General and Specialized topics
IJCAI 2018