Xiaojun Chang
80 papers · 2015–2026 · 13 top CS/AI conferences
Achievements
Hot Topic Early Bird
Interdisciplinary Bridge
Conference Polyglot (13)
Taxonomy Completionist (13)
Keyword Pioneer
Conference Loyalist (23)
Dynamic Duo (27)
Grand Slam
Mega-Team (21)
Deep Specialist (16)
Keyword Champion (2)
Unstoppable (11)
Prolific Year (10)
Keyword Collector (334)
Century Club (75)
The Questioner
Trend Setter
Conference Pioneer
Conferences
CVPR (23)
AAAI (12)
IJCAI (11)
ICCV (7)
ICLR (7)
ECCV (6)
ICML (4)
NIPS (3)
ACL (2)
EMNLP (2)
EACL (1)
IJCNLP (1)
JMLR (1)
Keywords
neural architecture search (9)
feature extraction (5)
self-supervised learning (5)
model compression (4)
video understanding (4)
zero-shot learning (4)
semi-supervised learning (4)
multimodal learning (4)
knowledge distillation (4)
video classification (4)
vision-language navigation (4)
action recognition (4)
domain adaptation (3)
person re-identification (3)
cross-modal retrieval (3)
contrastive learning (3)
data augmentation (3)
vision transformer (3)
attention mechanism (3)
event detection (3)
Papers
Measuring Social Bias in Vision-Language Models with Face-Only Counterfactuals from Real Photos
ACL 2026
MDK12-Bench: A Multi-Discipline Benchmark for Evaluating Reasoning in Multimodal Large Language Models
AAAI 2026
Correspondence Coverage Matters for Multi-Modal Dataset Distillation
AAAI 2026
Think Before You Segment: An Object-aware Reasoning Agent for Referring Audio-Visual Segmentation
AAAI 2026
Token Painter: Training-Free Text-Guided Image Inpainting via Mask Autoregressive Models
AAAI 2026
Towards Efficient General Feature Prediction in Masked Skeleton Modeling
ICCV 2025
Dense Audio-Visual Event Localization Under Cross-Modal Consistency and Multi-Temporal Granularity Collaboration
AAAI 2025
Towards Open-Vocabulary Audio-Visual Event Localization
CVPR 2025
RoomTour3D: Geometry-Aware Video-Instruction Tuning for Embodied Navigation
CVPR 2025
HC-LLM: Historical-Constrained Large Language Models for Radiology Report Generation
AAAI 2025
Shot2Story: A New Benchmark for Comprehensive Understanding of Multi-shot Videos
ICLR 2025
Let LLM Tell What to Prune and How Much to Prune
ICML 2025
Sitcom-Crafter: A Plot-Driven Human Motion Generation System in 3D Scenes
ICLR 2025
OpenING: A Comprehensive Benchmark for Judging Open-ended Interleaved Image-Text Generation
CVPR 2025
Label-anticipated Event Disentanglement for Audio-Visual Video Parsing
ECCV 2024
Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation
AAAI 2024
SSMG: Spatial-Semantic Map Guided Diffusion Model for Free-Form Layout-to-Image Generation
AAAI 2024
ProAgent: Building Proactive Cooperative Agents with Large Language Models
AAAI 2024
Video Recognition in Portrait Mode
CVPR 2024
MLP Can Be A Good Transformer Learner
CVPR 2024
Masked Distillation Advances Self-Supervised Transformer Architecture Search
ICLR 2024
SWAP-NAS: Sample-Wise Activation Patterns for Ultra-fast NAS
ICLR 2024
LongVLM: Efficient Long Video Understanding via Large Language Models
ECCV 2024
Learning with Counterfactual Explanations for Radiology Report Generation
ECCV 2024
Maximum Entropy Heterogeneous-Agent Reinforcement Learning
ICLR 2024
ViLPAct: A Benchmark for Compositional Generalization on Multimodal Human Activities
EACL 2023
ViewCo: Discovering Text-Supervised Segmentation Masks via Multi-View Semantic Consistency
ICLR 2023
Dynamic Graph Enhanced Contrastive Learning for Chest X-Ray Report Generation
CVPR 2023
3D-TOGO: Towards Text-Guided Cross-Category 3D Object Generation
AAAI 2023
Vision Language Navigation with Knowledge-driven Environmental Dreamer
IJCAI 2023
MARLlib: A Scalable and Efficient Multi-agent Reinforcement Learning Library
JMLR 2023
Mask Propagation for Efficient Video Semantic Segmentation
NIPS 2023
HTML: Hybrid Temporal-scale Multimodal Learning Framework for Referring Video Object Segmentation
ICCV 2023
FULLER: Unified Multi-modality Multi-task 3D Perception via Multi-level Gradient Calibration
ICCV 2023
Erratum to: 3D-TOGO: Towards Text-Guided Cross-Category 3D Object Generation
AAAI 2023
Knowledge Distillation via the Target-Aware Transformer
CVPR 2022
Dual-AI: Dual-Path Actor Interaction Learning for Group Activity Recognition
CVPR 2022
Automated Progressive Learning for Efficient Training of Vision Transformers
CVPR 2022
Self-Supervised Global-Local Structure Modeling for Point Cloud Domain Adaptation With Reliable Voted Pseudo Labels
CVPR 2022
An Efficient Spatio-Temporal Pyramid Transformer for Action Detection
ECCV 2022
PAR: Political Actor Representation Learning with Social Context and Expert Knowledge
EMNLP 2022
Policy Diagnosis via Measuring Role Diversity in Cooperative Multi-agent RL
ICML 2022
Cross-Modal Clinical Graph Transformer for Ophthalmic Report Generation
CVPR 2022
BaLeNAS: Differentiable Architecture Search via the Bayesian Learning Rule
CVPR 2022
Beyond Fixation: Dynamic Window Visual Transformer
CVPR 2022
iDARTS: Differentiable Architecture Search with Stochastic Implicit Gradients
ICML 2021
Dynamic Slimmable Network
CVPR 2021
SOON: Scenario Oriented Object Navigation With Graph-Based Exploration
CVPR 2021
Vision-Language Navigation With Random Environmental Mixup
ICCV 2021
BossNAS: Exploring Hybrid CNN-Transformers With Block-Wisely Self-Supervised Neural Architecture Search
ICCV 2021
Exploring Inter-Channel Correlation for Diversity-Preserved Knowledge Distillation
ICCV 2021
UPDeT: Universal Multi-agent RL via Policy Decoupling with Transformers
ICLR 2021
Person Search Challenges and Solutions: A Survey
IJCAI 2021
Hierarchical Neural Architecture Search for Deep Stereo Matching
NIPS 2020
Block-Wisely Supervised Neural Architecture Search With Knowledge Distillation
CVPR 2020
Differentiable Neural Architecture Search in Equivalent Space with Exploration Enhancement
NIPS 2020
Overcoming Multi-Model Forgetting in One-Shot NAS With Diversity Maximization
CVPR 2020
Unity Style Transfer for Person Re-Identification
CVPR 2020
Mining Inter-Video Proposal Relations for Video Object Detection
ECCV 2020
Vision-Dialog Navigation by Exploring Cross-Modal Memory
CVPR 2020
Vision-Language Navigation With Self-Supervised Auxiliary Reasoning Tasks
CVPR 2020
Quadratic Sparse Gaussian Graphical Model Estimation Method for Massive Variables
IJCAI 2020
ZSTAD: Zero-Shot Temporal Activity Detection
CVPR 2020
Unsupervised Multimodal Neural Machine Translation with Pseudo Visual Pivoting
ACL 2020
Multi-Head Attention with Diversity for Learning Grounded Multilingual Multimodal Representations
EMNLP 2019
Distributionally Robust Semi-Supervised Learning for People-Centric Sensing
AAAI 2019
Multi-Head Attention with Diversity for Learning Grounded Multilingual Multimodal Representations
IJCNLP 2019
Teaching Semi-Supervised Classifier via Generalized Distillation
IJCAI 2018
RCAA: Relational Context-Aware Agents for Person Search
ECCV 2018
Uncertainty Sampling for Action Recognition via Maximizing Expected Average Precision
IJCAI 2018
Reinforcement Cutting-Agent Learning for Video Object Segmentation
CVPR 2018
Complex Event Detection by Identifying Reliable Shots From Untrimmed Videos
ICCV 2017
Discriminative Dictionary Learning With Ranking Metric Embedded for Person Re-Identification
IJCAI 2017
Top-k Supervise Feature Selection via ADMM for Integer Programming
IJCAI 2017
Self-paced Mixture of Regressions
IJCAI 2017
Adaptive Semi-Supervised Learning with Discriminative Least Squares Regression
IJCAI 2017
How Unlabeled Web Videos Help Complex Event Detection?
IJCAI 2017
They Are Not Equally Reliable: Semantic Event Search Using Differentiated Concept Classifiers
CVPR 2016
Semantic Concept Discovery for Large-Scale Zero-Shot Event Detection
IJCAI 2015
Complex Event Detection using Semantic Saliency and Nearly-Isotonic SVM
ICML 2015