Bin Zhu
27 papers · 2019–2026 · 11 top CS/AI conferences
Achievements
Interdisciplinary Bridge
Renaissance Researcher (10)
Academic Marathon (6)
Conference Polyglot (11)
Taxonomy Completionist (68)
Keyword Pioneer
Hot Topic Early Bird
Grand Slam
Conference Pioneer
Century Club (24)
Keyword Collector (141)
Prolific Year (9)
Unstoppable (7)
Conferences
CVPR (5)
AAAI (4)
ACL (4)
WACV (4)
ICCV (3)
NIPS (2)
ECCV (1)
EMNLP (1)
ICLR (1)
ICML (1)
IJCAI (1)
Research topics
Keywords
multimodal learning (3)
image generation (3)
object segmentation (2)
large language model (2)
robotic manipulation (2)
video understanding (2)
convolutional neural network (2)
adversarial training (2)
egocentric video (2)
food image (2)
generative model (2)
semantic segmentation (2)
secure computation (1)
video segmentation (1)
benchmark evaluation (1)
embedding space (1)
causal inference (1)
model security (1)
text-to-image synthesis (1)
representation learning (1)
Papers
OSCBench: Benchmarking Object State Change in Text-to-Video Generation
ACL 2026
Actor-Critic for Continuous Action Chunks: A Reinforcement Learning Framework for Long-Horizon Robotic Manipulation with Sparse Reward
AAAI 2026
Next Patch Prediction for AutoRegressive Visual Generation
AAAI 2026
Hand1000: Generating Realistic Hands from Text with Only 1,000 Images
AAAI 2025
Multimodal Interpretable Depression Analysis using Visual Physiological Audio and Textual Data
WACV 2025
Retrieval Augmented Recipe Generation
WACV 2025
Preference Optimization for Combinatorial Optimization Problems
ICML 2025
HD-EPIC: A Highly-Detailed Egocentric Video Dataset
CVPR 2025
RAGG: Retrieval-Augmented Grasp Generation Model
AAAI 2025
PolarNeXt: Rethink Instance Segmentation with Polar Representation
CVPR 2025
From Holistic to Localized: Local Enhanced Adapters for Efficient Visual Instruction Fine-Tuning
ICCV 2025
DreamDance: Animating Human Images by Enriching 3D Geometry Cues from 2D Poses
ICCV 2025
Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
EMNLP 2024
On the Vulnerability of Safety Alignment in Open-Access LLMs
ACL 2024
Enhancing Recipe Retrieval with Foundation Models: A Data Augmentation Perspective
ECCV 2024
LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment
ICLR 2024
Controlling Neural Style Transfer with Deep Reinforcement Learning
IJCAI 2023
Exploring Robust Overfitting for Pre-trained Language Models
ACL 2023
Towards Attack-tolerant Federated Learning via Critical Parameter Analysis
ICCV 2023
EPIC-KITCHENS VISOR Benchmark: VIdeo Segmentations and Object Relations
NIPS 2022
TaiSu: A 166M Large-scale High-Quality Dataset for Chinese Vision-Language Pre-training
NIPS 2022
Improving Robustness of Language Models from a Geometry-aware Perspective
ACL 2022
CPM R-CNN: Calibrating Point-Guided Misalignment in Object Detection
WACV 2021
CookGAN: Causality Based Text-to-Image Synthesis
CVPR 2020
Graph Neural Networks for Image Understanding Based on Multiple Cues: Group Emotion Recognition and Event Recognition as Use Cases
WACV 2020
FALCON: A Fourier Transform Based Approach for Fast and Secure Convolutional Neural Network Predictions
CVPR 2020
R2GAN: Cross-Modal Recipe Retrieval With Generative Adversarial Network
CVPR 2019