Guangtao Zhai
67 papers · 2020–2026 · 9 conferences · across top CS/AI conferences
Achievements
Academic Marathon (5)
Conference Polyglot (9)
Keyword Pioneer
Interdisciplinary Bridge
Cross-Pollinator (5)
Renaissance Researcher (10)
Taxonomy Completionist (94)
Conference Loyalist (20)
Deep Specialist (13)
Dynamic Duo (27)
Triple Crown
Grand Slam
Keyword Champion (7)
Prolific Year (14)
Unstoppable (6)
Keyword Collector (269)
The Questioner (3)
Century Club (62)
Conferences
CVPR (20)
ICCV (12)
AAAI (9)
NIPS (8)
ECCV (6)
ACL (4)
ICLR (3)
IJCAI (3)
ICML (2)
Keywords
large multimodal model (7)
video quality assessment (6)
benchmark evaluation (5)
image quality assessment (4)
diffusion model (4)
image restoration (3)
visual attention (3)
generative model (3)
3d vision (3)
multimodal large language model (3)
multimodal learning (3)
video compression (3)
video generation (3)
image generation (2)
visual perception (2)
implicit representation (2)
text-to-image generation (2)
neural rendering (2)
synthetic data generation (2)
object detection (2)
Papers
Scaling-up Perceptual Video Quality Assessment
AAAI 2026
VQAThinker: Exploring Generalizable and Explainable Video Quality Assessment via Reinforcement Learning
AAAI 2026
GeoX-Bench: Benchmarking Cross-View Geo-Localization and Pose Estimation Capabilities of Large Multimodal Models
AAAI 2026
One Battle After Another: Probing LLMs' Limits on Multi-Turn Instruction Following with a Benchmark Evolving Framework
ACL 2026
Market-Bench: Benchmarking Large Language Models on Economic and Trade Competition
ACL 2026
F-Bench: Rethinking Human Preference Evaluation Metrics for Benchmarking Face Generation, Customization, and Restoration
ICCV 2025
FPEM: Face Prior Enhanced Facial Attractiveness Prediction for Live Videos with Face Retouching
ICCV 2025
VRVVC: Variable-Rate NeRF-Based Volumetric Video Compression
AAAI 2025
Who is a Better Talker: Subjective and Objective Quality Assessment for AI-Generated Talking Heads
ICCV 2025
Textured Mesh Saliency: Bridging Geometry and Texture for Human Perception in 3D Graphics
AAAI 2025
Low-Light Image Enhancement via Generative Perceptual Priors
AAAI 2025
Medical Manifestation-Aware De-Identification
AAAI 2025
Redundancy Principles for MLLMs Benchmarks
ACL 2025
OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference
ACL 2025
AGAV-Rater: Adapting Large Multimodal Model for AI-Generated Audio-Visual Quality Assessment
ICML 2025
OBI-Bench: Can LMMs Aid in Study of Ancient Script on Oracle Bones?
ICLR 2025
A-Bench: Are LMMs Masters at Evaluating AI-generated Images?
ICLR 2025
LMM4LMM: Benchmarking and Evaluating Large-multimodal Image Generation with LMMs
ICCV 2025
Semantic versus Identity: A Divide-and-Conquer Approach towards Adjustable Medical Image De-Identification
ICCV 2025
Information Density Principle for MLLM Benchmarks
ICCV 2025
TR-PTS: Task-Relevant Parameter and Token Selection for Efficient Tuning
ICCV 2025
Q-Bench-Video: Benchmark the Video Quality Understanding of LMMs
CVPR 2025
4DGC: Rate-Aware 4D Gaussian Compression for Efficient Streamable Free-Viewpoint Video
CVPR 2025
Towards All-in-One Medical Image Re-Identification
CVPR 2025
FineVQ: Fine-Grained User Generated Content Video Quality Assessment
CVPR 2025
Image Quality Assessment: From Human to Machine Preference
CVPR 2025
Learning Hazing to Dehazing: Towards Realistic Haze Generation for Real-World Image Dehazing
CVPR 2025
Mesh Mamba: A Unified State Space Model for Saliency Prediction in Non-Textured and Textured Meshes
CVPR 2025
Shadow Generation Using Diffusion Model with Geometry Prior
CVPR 2025
Q-Eval-100K: Evaluating Visual Quality and Alignment Level for Text-to-Vision Content
CVPR 2025
AIGV-Assessor: Benchmarking and Evaluating the Perceptual Quality of Text-to-Video Generation with LMM
CVPR 2025
UniProcessor: A Text-induced Unified Low-level Image Processor
ECCV 2024
Face2QR: A Unified Framework for Aesthetic, Face-Preserving, and Scannable QR Code Generation
NIPS 2024
Adaptive Image Quality Assessment via Teaching Large Multimodal Model to Compare
NIPS 2024
GAIA: Rethinking Action Quality Assessment for AI-Generated Videos
NIPS 2024
On Learning Multi-Modal Forgery Representation for Diffusion Generated Video Detection
NIPS 2024
ResAD: A Simple Framework for Class Generalizable Anomaly Detection
NIPS 2024
Q-Instruct: Improving Low-level Visual Abilities for Multi-modality Foundation Models
CVPR 2024
Text2QR: Harmonizing Aesthetic Customization and Scanning Robustness for Text-Guided QR Code Generation
CVPR 2024
Towards Open-ended Visual Quality Comparison
ECCV 2024
GLARE: Low Light Image Enhancement via Generative Latent Feature based Codebook Retrieval
ECCV 2024
Free-VSC: Free Semantics from Visual Foundation Models for Unsupervised Video Semantic Compression
ECCV 2024
Q-Bench: A Benchmark for General-Purpose Foundation Models on Low-level Vision
ICLR 2024
Q-Align: Teaching LMMs for Visual Scoring via Discrete Text-Defined Levels
ICML 2024
DiffStega: Towards Universal Training-Free Coverless Image Steganography with Diffusion Models
IJCAI 2024
Non-Semantics Suppressed Mask Learning for Unsupervised Video Semantic Compression
ICCV 2023
AccFlow: Backward Accumulation for Long-Range Optical Flow
ICCV 2023
CASP-Net: Rethinking Video Saliency Prediction From an Audio-Visual Consistency Perceptual Perspective
CVPR 2023
GANHead: Towards Generative Animatable Neural Head Avatars
CVPR 2023
MD-VQA: Multi-Dimensional Quality Assessment for UGC Live Videos
CVPR 2023
Blind Image Quality Assessment via Vision-Language Correspondence: A Multitask Learning Perspective
CVPR 2023
MM-PCQA: Multi-Modal Learning for No-reference Point Cloud Quality Assessment
IJCAI 2023
Agglomerative Transformer for Human-Object Interaction Detection
ICCV 2023
End-to-End Human-Gaze-Target Detection With Transformers
CVPR 2022
Learning Invisible Markers for Hidden Codes in Offline-to-Online Photography
CVPR 2022
Perceptual Attacks of No-Reference Image Quality Models with Human-in-the-Loop
NIPS 2022
Iwin: Human-Object Interaction Detection via Transformer with Irregular Windows
ECCV 2022
CageNeRF: Cage-based Neural Radiance Field for Generalized 3D Deformation and Animation
NIPS 2022
Video-based Human-Object Interaction Detection from Tubelet Tokens
NIPS 2022
Learning Local Neighboring Structure for Robust 3D Shape Representation
AAAI 2021
Dual Attention Guided Gaze Target Detection in the Wild
CVPR 2021
Learning Spectral Dictionary for Local Representation of Mesh
IJCAI 2021
Self-Conditioned Probabilistic Learning of Video Rescaling
ICCV 2021
Looking Here or There? Gaze Following in 360-Degree Images
ICCV 2021
A New Ensemble Adversarial Attack Powered by Long-Term Gradient Memories
AAAI 2020
Blurry Video Frame Interpolation
CVPR 2020
Self-supervised Motion Representation via Scattering Local Motion Cues
ECCV 2020