Songyang Zhang
52 papers · 2017–2026 · 13 top CS/AI conferences
Achievements
Keyword Pioneer
Taxonomy Completionist (13)
Interdisciplinary Bridge
Renaissance Researcher (6)
Hot Topic Early Bird
Conference Polyglot (13)
Cross-Pollinator (12)
Mega-Team (24)
Grand Slam
Deep Specialist (10)
Topic Evolution
Dynamic Duo (19)
Keyword Collector (220)
The Questioner (3)
Prolific Year (7)
Conference Pioneer
Trend Setter
Century Club (49)
Unstoppable (7)
Conferences
ACL (10)
CVPR (7)
AAAI (6)
ECCV (6)
EMNLP (4)
ICCV (4)
NAACL (4)
NIPS (4)
ICML (2)
IJCAI (2)
COLING (1)
ICLR (1)
INTERSPEECH (1)
Research topics
Keywords
large language model (14)
benchmark evaluation (7)
evaluation benchmark (5)
scene graph generation (4)
vision transformer (3)
few-shot learning (3)
vision language model (3)
image classification (3)
multi-modal learning (3)
graph neural network (3)
visual recognition (2)
unsupervised learning (2)
game theory (2)
semantic segmentation (2)
reinforcement learning (2)
class imbalance (2)
grammar induction (2)
automatic speech recognition (2)
mathematical reasoning (2)
natural language understanding (2)
Papers
RouteMoA: Dynamic Routing without Pre-Inference Boosts Efficient Mixture-of-Agents
ACL 2026
Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination
AAAI 2026
Knowledge-to-Verification: Exploring RLVR for LLMs in Knowledge-Intensive Domains
ACL 2026
OpenHuEval: Evaluating Large Language Model on Hungarian Specifics
ACL 2025
CompassVerifier: A Unified and Robust Verifier for LLMs Evaluation and Outcome Reward
EMNLP 2025
LiT: Delving into a Simple Linear Diffusion Transformer for Image Generation
ICCV 2025
UrBench: A Comprehensive Benchmark for Evaluating Large Multimodal Models in Multi-View Urban Scenarios
AAAI 2025
DualGFL: Federated Learning with a Dual-Level Coalition-Auction Game
AAAI 2025
Condor: Enhance LLM Alignment with Knowledge-Driven Data Synthesis and Refinement
ACL 2025
Capability Salience Vector: Fine-grained Alignment of Loss and Capabilities for Downstream Task Scaling Law
ACL 2025
Are Your LLMs Capable of Stable Reasoning?
ACL 2025
InternLM-Law: An Open-Sourced Chinese Legal Large Language Model
COLING 2025
MathBench: Evaluating the Theory and Application Proficiency of LLMs with a Hierarchical Mathematics Benchmark
ACL 2024
Prism: A Framework for Decoupling and Assessing the Capabilities of VLMs
NIPS 2024
ProSA: Assessing and Understanding the Prompt Sensitivity of LLMs
EMNLP 2024
FedSC: Provable Federated Self-supervised Learning with Spectral Contrastive Objective over Non-i.i.d. Data
ICML 2024
T-Eval: Evaluating the Tool Utilization Capability of Large Language Models Step by Step
ACL 2024
Benchmarking Chinese Commonsense Reasoning of LLMs: From Chinese-Specifics to Reasoning-Memorization Correlations
ACL 2024
LLaST: Improved End-to-end Speech Translation System Leveraged by Large Language Models
ACL 2024
LawBench: Benchmarking Legal Knowledge of Large Language Models
EMNLP 2024
MMBench: Is Your Multi-Modal Model an All-around Player?
ECCV 2024
BotChat: Evaluating LLMs’ Capabilities of Having Multi-Turn Dialogues
NAACL 2024
Fake Alignment: Are LLMs Really Aligned Well?
NAACL 2024
Ada-LEval: Evaluating long-context LLMs with length-adaptable benchmarks
NAACL 2024
InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD
NIPS 2024
GTA: A Benchmark for General Tool Agents
NIPS 2024
From Pixels to Graphs: Open-Vocabulary Scene Graph Generation with Vision-Language Models
CVPR 2024
Make-A-Video: Text-to-Video Generation without Text-Video Data
ICLR 2023
Improving Pixel-based MIM by Reducing Wasted Modeling Capability
ICCV 2023
RIFormer: Keep Your Vision Backbone Effective but Removing Token Mixer
CVPR 2023
TG-VQA: Ternary Game of Video Question Answering
IJCAI 2023
Expanding Language-Image Pretrained Models for General Video Recognition
ECCV 2022
The Devil Is in the Labels: Noisy Label Correction for Robust Scene Graph Generation
CVPR 2022
SGTR: End-to-End Scene Graph Generation With Transformer
CVPR 2022
Action Quality Assessment with Temporal Parsing Transformer
ECCV 2022
MUGEN: A Playground for Video-Audio-Text Multimodal Understanding and GENeration
ECCV 2022
Learning Semantic Correspondence with Sparse Annotations
ECCV 2022
Learning a Grammar Inducer from Massive Uncurated Instructional Videos
EMNLP 2022
SAT: 2D Semantics Assisted Training for 3D Visual Grounding
ICCV 2021
Dynamic Grained Encoder for Vision Transformers
NIPS 2021
Learning Implicit Temporal Alignment for Few-shot Video Classification
IJCAI 2021
Video-aided Unsupervised Grammar Induction
NAACL 2021
Bipartite Graph Network With Adaptive Message Passing for Unbiased Scene Graph Generation
CVPR 2021
Distribution Alignment: A Unified Framework for Long-Tail Visual Recognition
CVPR 2021
Boundary Proposal Network for Two-stage Natural Language Video Localization
AAAI 2021
Transformer with Bidirectional Decoder for Speech Recognition
INTERSPEECH 2020
Learning 2D Temporal Adjacent Networks for Moment Localization with Natural Language
AAAI 2020
Part-aware Prototype Network for Few-shot Semantic Segmentation
ECCV 2020
A Dual Attention Network with Semantic Embedding for Few-Shot Learning
AAAI 2019
Dynamic Context Correspondence Network for Semantic Alignment
ICCV 2019
LatentGNN: Learning Efficient Non-local Relations for Visual Recognition
ICML 2019
Predicting Salient Face in Multiple-Face Videos
CVPR 2017