Bowen Zhang

67 papers · 2016–2026 · 16 top CS/AI conferences

Achievements

🌍 Conference Polyglot (16) 🏃 Academic Marathon (9) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🐝 Cross-Pollinator (12)

🌈 Renaissance Researcher (11) 🗺️ Taxonomy Completionist (107) 🤝 Dynamic Duo (13) 👥 Mega-Team (29) 🧬 Topic Evolution 🏆 Grand Slam 🚀 Conference Pioneer 🗃️ Keyword Collector (246) 💎 Century Club (59) ⚡ Prolific Year (6) 🔥 Unstoppable (6) ❓ The Questioner 📈 Trend Setter

Conferences

EMNLP (11) AAAI (10) CVPR (9) ICLR (7) COLING (5) ECCV (5) NIPS (5) ACL (4) ICCV (3) ICML (2) ACML (1) INTERSPEECH (1) MICCAI (1) NAACL (1) SEMEVAL (1) WACV (1)

Top co-authors

Yinfei Yang (13) Zhe Gan (9) Haotian Zhang (8) Xianzhi Du (7) Fuqiang Niu (7) Dong Chen (6) Yifan Liu (6) Fei Sha (5) Zhengfeng Lai (5) Chen Chen (5)

Keywords

stance detection (6) semantic segmentation (5) multimodal learning (5) large language model (5) zero-shot learning (4) diffusion model (4) contrastive learning (4) knowledge distillation (3) attention mechanism (3) representation learning (3) graph neural network (3) transfer learning (3) question answering (2) few-shot learning (2) vision transformer (2) semi-supervised learning (2) benchmark evaluation (2) domain adaptation (2) image retrieval (2) image segmentation (2)

Papers

TwiUSD: A Benchmark Dataset and Structure-Aware LLM Framework for User Stance Detection ACL 2026
PBR3DGen: A VLM-Guided Mesh Generation with High-Quality PBR Texture AAAI 2026
From Logical to Computational Sparsity: Structure-Aware Block-Sparse Attention for Long-Code Completion ACL 2026
Improving Day-Ahead Grid Carbon Intensity Forecasting by Joint Modeling of Local-Temporal and Cross-Variable Dependencies Across Different Frequencies AAAI 2026
Induce, Align, Predict: Zero-Shot Stance Detection via Cognitive Inductive Reasoning AAAI 2026
Spiking-Aided Neural Architecture for Efficient and Robust WiFi Sensing AAAI 2026
MoETTA: Test-Time Adaptation Under Mixed Distribution Shifts with MoE-LayerNorm AAAI 2026
Sat2Flow: A Structure-Aware Diffusion Framework for Human Flow Generation from Satellite Imagery AAAI 2026
EC-DIT: Scaling Diffusion Transformers with Adaptive Expert-Choice Routing ICLR 2025
Core Knowledge Learning Framework for Graph AAAI 2025
Improve Vision Language Model Chain-of-thought Reasoning ACL 2025
Structured 3D Latents for Scalable and Versatile 3D Generation CVPR 2025
Adapting to Observation Length of Trajectory Prediction via Contrastive Learning CVPR 2025
SPARK: Simulating the Co-evolution of Stance and Topic Dynamics in Online Discourse with LLM-based Agents EMNLP 2025
CLIP-UP: A Simple and Efficient Mixture-of-Experts CLIP Training Recipe with Sparse Upcycling EMNLP 2025
Gaussian Variation Field Diffusion for High-fidelity Video-to-4D Synthesis ICCV 2025
STIV: Scalable Text and Image Conditioned Video Generation ICCV 2025
MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning ICLR 2025
Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models ICLR 2025
MMEgo: Towards Building Egocentric Multimodal LLMs for Video QA ICLR 2025
Contrastive Localized Language-Image Pre-Training ICML 2025
Hi-Patch: Hierarchical Patch GNN for Irregular Multivariate Time Series ICML 2025
Decomposing Disease Descriptions for Enhanced Pathology Detection: A Multi-Aspect Vision-Language Pre-training Framework CVPR 2024
MV-Adapter: Multimodal Video Transfer Learning for Video Text Retrieval CVPR 2024
MIPS at SemEval-2024 Task 3: Multimodal Emotion-Cause Pair Extraction in Conversations with Multimodal Language Models NAACL 2024
MIPS at SemEval-2024 Task 3: Multimodal Emotion-Cause Pair Extraction in Conversations with Multimodal Language Models SEMEVAL 2024
Compress3D: a Compressed Latent Space for 3D Generation from a Single Image ECCV 2024
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training ECCV 2024
VeCLIP: Improving CLIP Training via Visual-enriched Captions ECCV 2024
Extract, Define, Canonicalize: An LLM-based Framework for Knowledge Graph Construction EMNLP 2024
Advancing Semantic Textual Similarity Modeling: A Regression Framework with Translated ReLU and Smooth K2 Loss EMNLP 2024
Pcc-tuning: Breaking the Contrastive Learning Ceiling in Semantic Textual Similarity EMNLP 2024
BPKD: Boundary Privileged Knowledge Distillation for Semantic Segmentation WACV 2024
iTrendRNN: An Interpretable Trend-Aware RNN for Meteorological Spatiotemporal Prediction AAAI 2024
MetaDiff: Meta-Learning with Conditional Diffusion for Few-Shot Learning AAAI 2024
MOFI: Learning Image Representations from Noisy Entity Annotated Images ICLR 2024
Compressing LLMs: The Truth is Rarely Pure and Never Simple ICLR 2024
Ferret: Refer and Ground Anything Anywhere at Any Granularity ICLR 2024
Amodal Scene Analysis via Holistic Occlusion Relation Inference and Generative Mask Completion AAAI 2024
GaussianCube: A Structured and Explicit Radiance Representation for 3D Generative Modeling NIPS 2024
Weak-eval-Strong: Evaluating and Eliciting Lateral Thinking of LLMs with Situation Puzzles NIPS 2024
Structural Attention: Rethinking Transformer for Unpaired Medical Image Synthesis MICCAI 2024
A Challenge Dataset and Effective Models for Conversational Stance Detection COLING 2024
EDDA: An Encoder-Decoder Data Augmentation Framework for Zero-Shot Stance Detection COLING 2024
MoZIP: A Multilingual Benchmark to Evaluate Large Language Models in Intellectual Property COLING 2024
WildlifeMapper: Aerial Image Analysis for Multi-Species Detection and Identification CVPR 2024
RodinHD: High-Fidelity 3D Avatar Generation with Diffusion Models ECCV 2024
Dynamic Token Pruning in Plain Vision Transformers for Semantic Segmentation ICCV 2023
ZegCLIP: Towards Adapting CLIP for Zero-Shot Semantic Segmentation CVPR 2023
Stance Detection on Social Media with Background Knowledge EMNLP 2023
STAIR: Learning Sparse Text and Image Representation in Grounded Tokens EMNLP 2023
MetaPortrait: Identity-Preserving Talking Head Generation With Fast Personalized Adaptation CVPR 2023
Margin Calibration for Long-Tailed Visual Recognition ACML 2022
Noise Learning for Text Classification: A Benchmark COLING 2022
Sentiment Interpretable Logic Tensor Network for Aspect-Term Sentiment Analysis COLING 2022
StyleSwin: Transformer-Based GAN for High-Resolution Image Generation CVPR 2022
SegViT: Semantic Segmentation with Plain Vision Transformers NIPS 2022
Censer: Curriculum Semi-supervised Learning for Speech Recognition Based on Self-supervised Pre-training INTERSPEECH 2022
Dynamic Neural Representational Decoders for High-Resolution Semantic Segmentation NIPS 2021
FlexMatch: Boosting Semi-Supervised Learning with Curriculum Pseudo Labeling NIPS 2021
Visually Grounded Concept Composition EMNLP 2021
Systematic Generalization on gSCAN: What is Nearly Solved and What is Next? EMNLP 2021
Enhancing Cross-target Stance Detection with Transferable Semantic-Emotion Knowledge ACL 2020
Learning to Represent Image and Text with Denotation Graph EMNLP 2020
A Probabilistic Model for Joint Learning of Word Embeddings from Texts and Images EMNLP 2018
Cross-Modal and Hierarchical Modeling of Video and Text ECCV 2018
Real-Time Action Recognition With Enhanced Motion Vector CNNs CVPR 2016