Bin Zhu

27 papers · 2019–2026 · 11 conferences · across top CS/AI conferences

Achievements

🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (10) 🏃 Academic Marathon (6) 🌍 Conference Polyglot (11) 🗺️ Taxonomy Completionist (68)

🧭 Keyword Pioneer 🐣 Hot Topic Early Bird 🏆 Grand Slam 🚀 Conference Pioneer 💎 Century Club (24) 🗃️ Keyword Collector (141) ⚡ Prolific Year (9) 🔥 Unstoppable (7)

Conferences

CVPR (5) AAAI (4) ACL (4) WACV (4) ICCV (3) NIPS (2) ECCV (1) EMNLP (1) ICLR (1) ICML (1) IJCAI (1)

Top co-authors

Jingjing Chen (5) Chong-Wah Ngo (5) Li Yuan (4) Yanbin Hao (4) Bin Lin (4) Yu-Gang Jiang (3) Yatian Pang (3) Jiaxi Cui (2) Harry Yang (2) Ser-Nam Lim (2)

Research topics

Privacy (2)

Keywords

multimodal learning (3) image generation (3) object segmentation (2) large language model (2) robotic manipulation (2) video understanding (2) convolutional neural network (2) adversarial training (2) egocentric video (2) food image (2) generative model (2) semantic segmentation (2) secure computation (1) video segmentation (1) benchmark evaluation (1) embedding space (1) causal inference (1) model security (1) text-to-image synthesis (1) representation learning (1)

Papers

OSCBench: Benchmarking Object State Change in Text-to-Video Generation · ACL 2026
Actor-Critic for Continuous Action Chunks: A Reinforcement Learning Framework for Long-Horizon Robotic Manipulation with Sparse Reward · AAAI 2026
Next Patch Prediction for AutoRegressive Visual Generation · AAAI 2026
Hand1000: Generating Realistic Hands from Text with Only 1,000 Images · AAAI 2025
Multimodal Interpretable Depression Analysis using Visual Physiological Audio and Textual Data · WACV 2025
Retrieval Augmented Recipe Generation · WACV 2025
Preference Optimization for Combinatorial Optimization Problems · ICML 2025
HD-EPIC: A Highly-Detailed Egocentric Video Dataset · CVPR 2025
RAGG: Retrieval-Augmented Grasp Generation Model · AAAI 2025
PolarNeXt: Rethink Instance Segmentation with Polar Representation · CVPR 2025
From Holistic to Localized: Local Enhanced Adapters for Efficient Visual Instruction Fine-Tuning · ICCV 2025
DreamDance: Animating Human Images by Enriching 3D Geometry Cues from 2D Poses · ICCV 2025
Video-LLaVA: Learning United Visual Representation by Alignment Before Projection · EMNLP 2024
On the Vulnerability of Safety Alignment in Open-Access LLMs · ACL 2024
Enhancing Recipe Retrieval with Foundation Models: A Data Augmentation Perspective · ECCV 2024
LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment · ICLR 2024
Controlling Neural Style Transfer with Deep Reinforcement Learning · IJCAI 2023
Exploring Robust Overfitting for Pre-trained Language Models · ACL 2023
Towards Attack-tolerant Federated Learning via Critical Parameter Analysis · ICCV 2023
EPIC-KITCHENS VISOR Benchmark: VIdeo Segmentations and Object Relations · NIPS 2022
TaiSu: A 166M Large-scale High-Quality Dataset for Chinese Vision-Language Pre-training · NIPS 2022
Improving Robustness of Language Models from a Geometry-aware Perspective · ACL 2022
CPM R-CNN: Calibrating Point-Guided Misalignment in Object Detection · WACV 2021
CookGAN: Causality Based Text-to-Image Synthesis · CVPR 2020
Graph Neural Networks for Image Understanding Based on Multiple Cues: Group Emotion Recognition and Event Recognition as Use Cases · WACV 2020
FALCON: A Fourier Transform Based Approach for Fast and Secure Convolutional Neural Network Predictions · CVPR 2020
R2GAN: Cross-Modal Recipe Retrieval With Generative Adversarial Network · CVPR 2019