Kai Zhang
166 papers · 2006–2026 · 19 conferences · across top CS/AI conferences
Achievements
Keyword Pioneer
Conference Polyglot
(18)
Interdisciplinary Bridge
Taxonomy Completionist
(19)
Hot Topic Early Bird
Keyword Trendsetter Combo
(4)
Conference Loyalist
(32)
Keyword Champion
Dynamic Duo
(20)
Grand Slam
Mega-Team
(22)
Triple Crown
Topic Pioneer
Deep Specialist
(27)
Topic Evolution
Century Club
(159)
Keyword Collector
(64)
Trend Setter
Conference Pioneer
Prolific Year
(13)
Unstoppable
(11)
The Questioner
(2)
Conferences
CVPR (32)
AAAI (18)
ACL (16)
EMNLP (16)
ICLR (15)
ICCV (12)
NIPS (12)
ECCV (10)
ICML (9)
IJCAI (6)
NAACL (5)
IJCNLP (3)
COLING (2)
AISTATS (2)
MICCAI (2)
MIDL (2)
NSDI (2)
AACL (1)
WACV (1)
Top co-authors
Research topics
Keywords
image restoration
(14)
large language model
(14)
diffusion model
(10)
convolutional neural network
(7)
zero-shot learning
(7)
self-supervised learning
(7)
3d reconstruction
(6)
graph neural network
(6)
model compression
(6)
attention mechanism
(6)
reinforcement learning
(6)
contrastive learning
(5)
representation learning
(5)
unsupervised learning
(4)
language model
(4)
transfer learning
(4)
image super-resolution
(4)
data augmentation
(4)
multimodal learning
(4)
image denoising
(4)
Papers
RadAgents: Multimodal Agentic Reasoning for Chest X-ray Interpretation with Radiologist-like Workflows
MIDL 2026
LADR: Locality-Aware Dynamic Rescue for Efficient Text-to-Image Generation with Diffusion Large Language Models
ACL 2026
Localized Low-Rank Adaptation within Clustered Parameter Subspaces
ACL 2026
From Darkness to Detail: Frequency-Aware SSMs for Low-Light Vision
WACV 2026
FantasyHSI: Video-Generation-Centric 4D Human Synthesis in Any Scene Through a Graph-Based Multi-Agent Framework
AAAI 2026
TTT-UNet: Enhancing U-Net with Test-Time Training Layers for Biomedical Image Segmentation
MIDL 2026
Markovian Linguistic-Temporal Bridge: Unlocking the Potential of LLMs for Time Series Forecasting
ACL 2026
RecCocktail: A Generalizable and Efficient Framework for LLM-Based Recommendation
AAAI 2026
RandAR: Decoder-only Autoregressive Visual Generation in Random Orders
CVPR 2025
Generating 3D-Consistent Videos from Unposed Internet Photos
CVPR 2025
Turbo3D: Ultra-fast Text-to-3D Generation
CVPR 2025
MegaSynth: Scaling Up 3D Scene Reconstruction with Synthesized Data
CVPR 2025
Brain-Heart-Gut Guided Multi-Constraint Knowledge Distillation for Early Alzheimer’s Disease Diagnosis
MICCAI 2025
EditGRPO: Reinforcement Learning with Post-Rollout Edits for Clinically Accurate Chest X-Ray Report Generation
IJCNLP 2025
Decoupling and Reconstructing: A Multimodal Sentiment Analysis Framework Towards Robustness
IJCAI 2025
WDMIR: Wavelet-Driven Multimodal Intent Recognition
IJCAI 2025
Buffer Anytime: Zero-Shot Video Depth and Normal from Image Priors
CVPR 2025
CPath-Omni: A Unified Multimodal Foundation Model for Patch and Whole Slide Image Analysis in Computational Pathology
CVPR 2025
Enhancing Low-Light Images: A Synthetic Data Perspective on Practical and Generalizable Solutions
AAAI 2025
Spin: Diffusion-based Semantic Image Painting Through Independent Information Injection
AAAI 2025
Intent Oriented Contrastive Learning for Sequential Recommendation
AAAI 2025
Harnessing Multimodal Large Language Models for Multimodal Sequential Recommendation
AAAI 2025
Adaptive Multimodal Fusion: Dynamic Attention Allocation for Intent Recognition
AAAI 2025
LazyDiT: Lazy Learning for the Acceleration of Diffusion Transformers
AAAI 2025
EditGRPO: Reinforcement Learning with Post-Rollout Edits for Clinically Accurate Chest X-Ray Report Generation
AACL 2025
AAAR-1.0: Assessing AI’s Potential to Assist Research
ICML 2025
Gaussian Mixture Flow Matching Models
ICML 2025
LVSM: A Large View Synthesis Model with Minimal 3D Inductive Bias
ICLR 2025
RelitLRM: Generative Relightable Radiance for Large Reconstruction Models
ICLR 2025
Problem-Parameter-Free Federated Learning
ICLR 2025
Stepwise Reasoning Disruption Attack of LLMs
ACL 2025
RolePlot: A Systematic Framework for Evaluating and Enhancing the Plot-Progression Capabilities of Role-Playing Agents
ACL 2025
MMMU-Pro: A More Robust Multi-discipline Multimodal Understanding Benchmark
ACL 2025
MindBridge: Scalable and Cross-Model Knowledge Editing via Memory-Augmented Modality
ACL 2025
Unified Parameter-Efficient Unlearning for LLMs
ICLR 2025
Testing Conditional Independence with Deep Neural Network Based Binary Expansion Testing (DeepBET)
AISTATS 2025
PathGen-1.6M: 1.6 Million Pathology Image-text Pairs Generation through Multi-agent Collaboration
ICLR 2025
Revealing the Barriers of Language Agents in Planning
NAACL 2025
MuirBench: A Comprehensive Benchmark for Robust Multi-image Understanding
ICLR 2025
Unveiling the Magic of Code Reasoning through Hypothesis Decomposition and Amendment
ICLR 2025
Long-LRM: Long-sequence Large Reconstruction Model for Wide-coverage Gaussian Splats
ICCV 2025
STAR: Spatial-Temporal Augmentation with Text-to-Video Models for Real-World Video Super-Resolution
ICCV 2025
Reverse Convolution and Its Applications to Image Restoration
ICCV 2025
RayZer: A Self-supervised Large View Synthesis Model
ICCV 2025
Baking Gaussian Splatting into Diffusion Denoiser for Fast and Scalable Single-stage Image-to-3D Generation and Reconstruction
ICCV 2025
1+1>2: A Synergistic Sparse and Low-Rank Compression Method for Large Language Models
EMNLP 2025
ReAL: How Can LLMs Simulate the Real Teacher? Retrieval-enhanced Agent for Adaptive Learning
EMNLP 2025
Beyond Dynamic Quantization: An Efficient Static Hierarchical Mix-precision Framework for Near-Lossless LLM Compression
EMNLP 2025
ECVC: Exploiting Non-Local Correlations in Multiple Frames for Contextual Video Compression
CVPR 2025
DORNet: A Degradation Oriented and Regularized Network for Blind Depth Super-Resolution
CVPR 2025
DATENeRF: Depth-Aware Text-based Editing of NeRFs
ECCV 2024
LRM-Zero: Training Large Reconstruction Models with Synthesized Data
NIPS 2024
MambaSCI: Efficient Mamba-UNet for Quad-Bayer Patterned Video Snapshot Compressive Imaging
NIPS 2024
DeltaDock: A Unified Framework for Accurate, Efficient, and Physically Reliable Molecular Docking
NIPS 2024
Neural Gaffer: Relighting Any Object via Diffusion
NIPS 2024
PathAsst: A Generative Foundation AI Assistant towards Artificial General Intelligence of Pathology
AAAI 2024
DiffRAW: Leveraging Diffusion Model to Generate DSLR-Comparable Perceptual Quality sRGB from Smartphone RAW Images
AAAI 2024
UMIE: Unified Multimodal Information Extraction with Instruction Tuning
AAAI 2024
π-Light: Programmatic Interpretable Reinforcement Learning for Resource-Limited Traffic Signal Control
AAAI 2024
SoftDedup: an Efficient Data Reweighting Method for Speeding Up Language Model Pre-training
ACL 2024
Leveraging Entity Information for Cross-Modality Correlation Learning: The Entity-Guided Multimodal Summarization
ACL 2024
CIF-Bench: A Chinese Instruction-Following Benchmark for Evaluating the Generalizability of Large Language Models
ACL 2024
RePair: Automated Program Repair with Process-based Feedback
ACL 2024
Knowledge Triplets Derivation from Scientific Publications via Dual-Graph Resonance
COLING 2024
Unmixing Diffusion for Self-Supervised Hyperspectral Image Denoising
CVPR 2024
Neural Directional Encoding for Efficient and Accurate View-Dependent Appearance Modeling
CVPR 2024
DiffSCI: Zero-Shot Snapshot Compressive Imaging via Iterative Spectral Diffusion Model
CVPR 2024
MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI
CVPR 2024
DetDiffusion: Synergizing Generative and Perceptive Models for Enhanced Data Generation and Perception
CVPR 2024
GPT-4V(ision) is a Human-Aligned Evaluator for Text-to-3D Generation
CVPR 2024
Deep Equilibrium Diffusion Restoration with Parallel Sampling
CVPR 2024
Equivariant Multi-Modality Image Fusion
CVPR 2024
Teaching Tailored to Talent: Adverse Weather Restoration via Prompt Pool and Depth-Anything Constraint
ECCV 2024
GS-LRM: Large Reconstruction Model for 3D Gaussian Splatting
ECCV 2024
MegaScenes: Scene-Level View Synthesis at Scale
ECCV 2024
MoVideo: Motion-Aware Video Generation with Diffusion Models
ECCV 2024
PathMMU: A Massive Multimodal Expert-Level Benchmark for Understanding and Reasoning in Pathology
ECCV 2024
Message Passing on Semantic-Anchor-Graphs for Fine-grained Emotion Representation Learning and Classification
EMNLP 2024
ARM: An Alignment-and-Replacement Module for Chinese Spelling Check Based on LLMs
EMNLP 2024
Dynamic Multi-granularity Attribution Network for Aspect-based Sentiment Analysis
EMNLP 2024
Do LLMs Overcome Shortcut Learning? An Evaluation of Shortcut Challenges in Large Language Models
EMNLP 2024
OneNet: A Fine-Tuning Free Framework for Few-Shot Entity Linking via Large Language Model Prompting
EMNLP 2024
MUFFIN: Curating Multi-Faceted Instructions for Improving Instruction Following
ICLR 2024
ImagenHub: Standardizing the evaluation of conditional image generation models
ICLR 2024
Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts
ICLR 2024
Instant3D: Fast Text-to-3D with Sparse-view Generation and Large Reconstruction Model
ICLR 2024
DMV3D: Denoising Multi-view Diffusion Using 3D Large Reconstruction Model
ICLR 2024
PF-LRM: Pose-Free Large Reconstruction Model for Joint Pose and Shape Prediction
ICLR 2024
LRM: Large Reconstruction Model for Single Image to 3D
ICLR 2024
High-Order Contrastive Learning with Fine-grained Comparative Levels for Sparse Ordinal Tensor Completion
ICML 2024
TravelPlanner: A Benchmark for Real-World Planning with Language Agents
ICML 2024
Federated Self-Explaining GNNs with Anti-shortcut Augmentations
ICML 2024
MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions
ICML 2024
Lightweight Image Super-Resolution via Flexible Meta Pruning
ICML 2024
Cross-View Diversity Embedded Consensus Learning for Multi-View Clustering
IJCAI 2024
Pre-training General User Representation with Multi-type APP Behaviors
IJCAI 2024
An Evaluation of State-of-the-Art Projectors in the Presence of Noise and Nonlinearity in the Beer-Lambert Law
MICCAI 2024
LLM-based Medical Assistant Personalization with Short- and Long-Term Memory Coordination
NAACL 2024
Mind’s Mirror: Distilling Self-Evaluation Capability and Comprehensive Thinking from Large Language Models
NAACL 2024
Dual-Channel Span for Aspect Sentiment Triplet Extraction
EMNLP 2023
Content- and Topology-Aware Representation Learning for Scientific Multi-Literature
EMNLP 2023
AdaptSSR: Pre-training User Model with Augmentation-Adaptive Self-Supervised Ranking
NIPS 2023
Automatic Evaluation of Attribution by Large Language Models
EMNLP 2023
Event-Based Frame Interpolation With Ad-Hoc Deblurring
CVPR 2023
CiaoSR: Continuous Implicit Attention-in-Attention Network for Arbitrary-Scale Image Super-Resolution
CVPR 2023
MagicBrush: A Manually Annotated Dataset for Instruction-Guided Image Editing
NIPS 2023
Keep Skills in Mind: Understanding and Implementing Skills in Commonsense Question Answering
IJCAI 2023
Aligning Instruction Tasks Unlocks Large Language Models as Zero-Shot Relation Extractors
ACL 2023
Enhancing Hierarchical Text Classification through Knowledge Graph Integration
ACL 2023
Ray Conditioning: Trading Photo-consistency for Photo-realism in Multi-view Image Generation
ICCV 2023
DDFM: Denoising Diffusion Model for Multi-Modality Image Fusion
ICCV 2023
RHGN: Relation-gated Heterogeneous Graph Network for Entity Alignment in Knowledge Graphs
ACL 2023
Towards Interpretable Video Super-Resolution via Alternating Optimization
ECCV 2022
Reference-Based Image Super-Resolution with Deformable Attention Transformer
ECCV 2022
IRON: Inverse Rendering by Optimizing Neural SDFs and Materials From Photometric Images
CVPR 2022
Structural Landmarking and Interaction Modelling: A “SLIM” Network for Graph Classification
AAAI 2022
WT-MVSNet: Window-based Transformers for Multi-view Stereo
NIPS 2022
Recurrent Video Restoration Transformer with Guided Deformable Attention
NIPS 2022
Efficient Federated Learning on Knowledge Graphs via Privacy-preserving Relation Embedding Aggregation
EMNLP 2022
CLOWER: A Pre-trained Language Model with Contrastive Learning over Word and Character Representations
COLING 2022
SAViT: Structure-Aware Vision Transformer Pruning via Collaborative Optimization
NIPS 2022
APG: Adaptive Parameter Generation Network for Click-Through Rate Prediction
NIPS 2022
ARF: Artistic Radiance Fields
ECCV 2022
Incorporating Dynamic Semantics into Pre-Trained Language Model for Aspect-based Sentiment Analysis
ACL 2022
ClusterGNN: Cluster-Based Coarse-To-Fine Graph Neural Network for Efficient Feature Matching
CVPR 2022
Contributions of Transformer Attention Heads in Multi- and Cross-lingual Tasks
IJCNLP 2021
Designing a Practical Degradation Model for Deep Blind Image Super-Resolution
ICCV 2021
Adversarial Language Games for Advanced Natural Language Intelligence
AAAI 2021
RpBERT: A Text-image Relation Propagation-based BERT Model for Multimodal NER
AAAI 2021
Hierarchical Conditional Flow: A Unified Framework for Image Super-Resolution and Image Rescaling
ICCV 2021
Mutual Affine Network for Spatially Variant Kernel Estimation in Blind Image Super-Resolution
ICCV 2021
The Heterogeneity Hypothesis: Finding Layer-Wise Differentiated Network Architectures
CVPR 2021
Towards Flexible Blind JPEG Artifacts Removal
ICCV 2021
Flow-Based Kernel Prior With Application to Blind Super-Resolution
CVPR 2021
Contributions of Transformer Attention Heads in Multi- and Cross-lingual Tasks
ACL 2021
NeuralAC: Learning Cooperation and Competition Effects for Match Outcome Prediction
AAAI 2021
PhySG: Inverse Rendering With Spherical Gaussians for Physics-Based Material Editing and Relighting
CVPR 2021
Open Hierarchical Relation Extraction
NAACL 2021
AIRCODE: Hidden Screen-Camera Communication on an Invisible and Inaudible Dual Channel
NSDI 2021
GMOT-40: A Benchmark for Generic Multiple Object Tracking
CVPR 2021
PQA: Perceptual Question Answering
CVPR 2021
GradTS: A Gradient-Based Automatic Auxiliary Task Selection Method Based on Transformer Networks
EMNLP 2021
Cascaded Semantic and Positional Self-Attention Network for Document Classification
EMNLP 2020
Neural Blind Deconvolution Using Deep Priors
CVPR 2020
Depth Sensing Beyond LiDAR Range
CVPR 2020
Deep Unfolding Network for Image Super-Resolution
CVPR 2020
Adaptive Structural Fingerprints for Graph Attention Networks
ICLR 2020
DHP: Differentiable Meta Pruning via HyperNetworks
ECCV 2020
Multi-Stage Pre-training for Automated Chinese Essay Scoring
EMNLP 2020
Learning Adaptive Random Features
AAAI 2019
Interactive Attention Transfer Network for Cross-Domain Sentiment Classification
AAAI 2019
Exploring Overall Contextual Information for Image Captioning in Human-Like Cognitive Style
ICCV 2019
Greedy Orthogonal Pivoting Algorithm for Non-Negative Matrix Factorization
ICML 2019
Toward Convolutional Blind Denoising of Real Photographs
CVPR 2019
Deep Plug-And-Play Super-Resolution for Arbitrary Blur Kernels
CVPR 2019
Unsupervised Context Rewriting for Open Domain Conversation
EMNLP 2019
Unsupervised Context Rewriting for Open Domain Conversation
IJCNLP 2019
TOI-CNN: a Solution of Information Extraction on Chinese Insurance Policy
NAACL 2019
G-NET: Effective GPU Sharing in NFV Systems
NSDI 2018
Learning a Single Convolutional Super-Resolution Network for Multiple Degradations
CVPR 2018
Learning Deep CNN Denoiser Prior for Image Restoration
CVPR 2017
Distributed Flexible Nonlinear Tensor Factorization
NIPS 2016
Entity Embedding-Based Anomaly Detection for Heterogeneous Categorical Events
IJCAI 2016
Covariate Shift in Hilbert Space: A Solution via Surrogate Kernels
ICML 2013
Scaling up Kernel SVM on Limited Resources: A Low-rank Linearization Approach
AISTATS 2012
Simplifying Mixture Models through Function Approximation
NIPS 2006