Jun Chen

79 papers · 2018–2026 · across 15 top CS/AI conferences

Achievements

🧭 Keyword Pioneer 🗺️ Taxonomy Completionist (17) 🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (6) 🌍 Conference Polyglot (15)

🤝 Dynamic Duo (16) 🏆 Grand Slam 🧬 Topic Evolution 🏆 Keyword Champion (2) 🚀 Conference Pioneer 🔥 Unstoppable (8) ⚡ Prolific Year (15) 🗃️ Keyword Collector (282) 💎 Century Club (75) ❓ The Questioner (2) 📈 Trend Setter

Conferences

AAAI (12) CVPR (9) ICCV (9) ECCV (8) NIPS (8) INTERSPEECH (6) ICLR (5) ACL (4) IJCAI (4) WACV (4) EMNLP (3) MICCAI (3) ICML (2) JMLR (1) NAACL (1)

Top co-authors

Mohamed Elhoseiny (16) Yong Liu (11) Hong Chen (6) Guang Dai (6) Xiaohong Liu (6) Zhiyong Wu (6) Huan Liu (6) Han Zhou (5) Wei Dong (5) Bin Gu (5)

Research topics

Privacy (1)

Keywords

image restoration (7) contrastive learning (5) attention mechanism (4) multimodal learning (4) domain adaptation (4) generalization bound (4) graph neural network (4) vision-language model (3) stochastic gradient descent (3) model compression (3) speech enhancement (3) video understanding (3) deep learning (3) convolutional neural network (3) unsupervised learning (3) knowledge distillation (2) information theory (2) speech recognition (2) semantic segmentation (2) zero-shot learning (2)

Papers

Zero-Reference Joint Low-Light Enhancement and Deblurring via Visual Autoregressive Modeling with VLM-Derived Modulation AAAI 2026
Robust Pedestrian Detection with Uncertain Modality AAAI 2026
From Recognition to Reasoning: Benchmarking and Enhancing MLLMs on Real-World Receipt Document Understanding ACL 2026
Listening Like Humans: Semantics-Guided Noise-Robust Multimodal Speech Recognition ACL 2026
LITA-GS: Illumination-Agnostic Novel View Synthesis via Reference-Free 3D Gaussian Splatting and Physical Priors CVPR 2025
Document Haystacks: Vision-Language Reasoning Over Piles of 1000+ Documents CVPR 2025
WikiAutoGen: Towards Multi-Modal Wikipedia-Style Article Generation ICCV 2025
Local Masked Reconstruction for Efficient Self-Supervised Learning on High-Resolution Images WACV 2025
Temporal Model-Based Federated Active Medical Image Classification MICCAI 2025
Eliminating Language Bias for Medical Visual Question Answering with Counterfactual Contrastive Training MICCAI 2025
Low-Light Image Enhancement via Generative Perceptual Priors AAAI 2025
Error Analysis Affected by Heavy-Tailed Gradients for Non-Convex Pairwise Stochastic Gradient Descent AAAI 2025
Balancing Privacy and Performance: A Many-in-One Approach for Image Anonymization AAAI 2025
Improving Retrieval Augmented Language Model with Self-Reasoning AAAI 2025
CLIMD: A Curriculum Learning Framework for Imbalanced Multimodal Diagnosis MICCAI 2025
LLaSE-G1: Incentivizing Generalization Capability for LLaMA-based Speech Enhancement ACL 2025
LongVU: Spatiotemporal Adaptive Compression for Long Video-Language Understanding ICML 2025
Manifold Constraint Reduces Exposure Bias in Accelerated Diffusion Sampling ICLR 2025
How does Labeling Error Impact Contrastive Learning? A Perspective from Data Dimensionality Reduction ICML 2025
GoodDrag: Towards Good Practices for Drag Editing with Diffusion Models ICLR 2025
Diffusion-Based Imaginative Coordination for Bimanual Manipulation ICCV 2025
4D-Bench: Benchmarking Multi-modal Large Language Models for 4D Object Understanding ICCV 2025
Structured Optimal Brain Pruning for Large Language Models EMNLP 2024
ECMamba: Consolidating Selective State Space Model with Retinex Guidance for Efficient Multiple Exposure Correction NIPS 2024
Minimum Entropy Coupling with Bottleneck NIPS 2024
How Does Black-Box Impact the Learning Guarantee of Stochastic Compositional Optimization? NIPS 2024
Explore 3D Dance Generation via Reward Model from Automatically-Ranked Demonstrations AAAI 2024
A Multimodal, Multi-Task Adapting Framework for Video Action Recognition AAAI 2024
SimCalib: Graph Neural Network Calibration Based on Similarity between Nodes AAAI 2024
Shallow-Deep Collaborative Learning for Unsupervised Visible-Infrared Person Re-Identification CVPR 2024
GLARE: Low Light Image Enhancement via Generative Latent Feature based Codebook Retrieval ECCV 2024
Meerkat: Audio-Visual Large Language Model for Grounding in Space and Time ECCV 2024
Timestep-Aware Correction for Quantized Diffusion Models ECCV 2024
UNICORN: A Unified Causal Video-Oriented Language-Modeling Framework for Temporal Video-Language Tasks EMNLP 2024
MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models ICLR 2024
Decentralized Riemannian Conjugate Gradient Method on the Stiefel Manifold ICLR 2024
Cross-Scale Domain Adaptation with Comprehensive Information for Pansharpening IJCAI 2024
Learning Discretized Neural Networks under Ricci Flow JMLR 2024
SumCSE: Summary as a transformation for Contrastive Learning NAACL 2024
Meta-Auxiliary Learning for Future Depth Prediction in Videos WACV 2023
Exploring Open-Vocabulary Semantic Segmentation from CLIP Vision Encoder Distillation Only ICCV 2023
Towards Grand Unified Representation Learning for Unsupervised Visible-Infrared Person Re-Identification ICCV 2023
Learning Global-aware Kernel for Image Harmonization ICCV 2023
Unified Data-Free Compression: Pruning and Quantization without Fine-Tuning ICCV 2023
MammalNet: A Large-Scale Video Benchmark for Mammal Recognition and Behavior Understanding CVPR 2023
Contrastive Semi-Supervised Learning for Underwater Image Restoration via Reliable Bank CVPR 2023
Fine-Grained Theoretical Analysis of Federated Zeroth-Order Optimization NIPS 2023
SUBP: Soft Uniform Block Pruning for 1×N Sparse CNNs Multithreading Acceleration NIPS 2023
Focus on the Sound around You: Monaural Target Speaker Extraction via Distance and Speaker Information INTERSPEECH 2023
MC-SpEx: Towards Effective Speaker Extraction with Multi-Scale Interfusion and Conditional Speaker Modulation INTERSPEECH 2023
Gesper: A Restoration-Enhancement Framework for General Speech Reconstruction INTERSPEECH 2023
On the choice of Perception Loss Function for Learned Video Compression NIPS 2023
Stability-Based Generalization Analysis for Mixtures of Pointwise and Pairwise Learning AAAI 2023
On the Stability and Generalization of Triplet Learning AAAI 2023
RelTransformer: A Transformer-Based Long-Tail Visual Relationship Recognition CVPR 2022
VisualGPT: Data-Efficient Adaptation of Pretrained Language Models for Image Captioning CVPR 2022
SuperLine3D: Self-Supervised Line Segmentation and Description for LiDAR Point Cloud ECCV 2022
Speech Enhancement with Fullband-Subband Cross-Attention Network INTERSPEECH 2022
Few-Shot Class-Incremental Learning via Entropy-Regularized Data-Free Replay ECCV 2022
3DRefTransformer: Fine-Grained Object Identification in Real-World Scenes Using Natural Language WACV 2022
Learning to Train a Point Cloud Reconstruction Network without Matching ECCV 2022
Lossy Compression with Distribution Shift as Entropy Constrained Optimal Transport ICLR 2022
REMOTE: Reinforced Motion Transformation Network for Semi-supervised 2D Pose Estimation in Videos AAAI 2022
Resolution-Free Point Cloud Sampling Network with Data Distillation ECCV 2022
Towards Multi-Domain Single Image Dehazing via Test-Time Training CVPR 2022
Video Frame Interpolation Transformer CVPR 2022
A Novel Sequence-to-Subgraph Framework for Diagnosis Classification IJCAI 2021
Exploring Long Tail Visual Relationship Recognition With Large Vocabulary ICCV 2021
Universal Rate-Distortion-Perception Representations for Lossy Compression NIPS 2021
Automatic Detection of Alzheimer’s Disease Using Spontaneous Speech Only INTERSPEECH 2021
The Graph-based Mutual Attentive Network for Automatic Diagnosis IJCAI 2020
Towards Interpretable Clinical Diagnosis with Bayesian Network Ensembles Stacked on Entity-Aware CNNs ACL 2020
End-To-End Trainable Video Super-Resolution Based on a New Mechanism for Implicit Motion Estimation and Compensation WACV 2020
Temporal Positive-unlabeled Learning for Biomedical Hypothesis Generation via Risk Estimation NIPS 2020
When Pedestrian Detection Meets Nighttime Surveillance: A New Benchmark IJCAI 2020
Manifold Projection for Adversarial Defense on Face Recognition ECCV 2020
GridDehazeNet: Attention-Based Multi-Scale Network for Image Dehazing ICCV 2019
An Attention-Based Hybrid Network for Automatic Detection of Alzheimer’s Disease from Narrative Speech INTERSPEECH 2019
Keyphrase Generation with Correlation Constraints EMNLP 2018