Serena Yeung-Levy
20 papers · 2024–2026 · 11 conferences · across top CS/AI conferences
Achievements
Cross-Pollinator
(13)
Conference Polyglot
(10)
Interdisciplinary Bridge
Keyword Pioneer
Renaissance Researcher
(6)
Dynamic Duo
(11)
Mega-Team
(23)
Century Club
(19)
Prolific Year
(6)
The Questioner
(3)
Keyword Collector
(62)
Conferences
CVPR (5)
ECCV (3)
ICLR (3)
NIPS (2)
ACL (1)
EACL (1)
EMNLP (1)
ICCV (1)
ICML (1)
MLHC (1)
WACV (1)
Top co-authors
Keywords
vision-language model
(6)
benchmark evaluation
(3)
visual question answering
(3)
vision language model
(3)
zero-shot learning
(2)
question answering
(2)
biomedical imaging
(2)
chain-of-thought reasoning
(1)
catastrophic forgetting
(1)
image classification
(1)
prototype learning
(1)
self-supervised learning
(1)
domain adaptation
(1)
reinforcement learning
(1)
image captioning
(1)
transfer learning
(1)
multimodal learning
(1)
test-time adaptation
(1)
multi-modal learning
(1)
information retrieval
(1)
Papers
PaperSearchQA: Learning to Search and Reason over Scientific Papers with RLVR
EACL 2026
Video-STaR: Self-Training Enables Video Instruction Tuning with Any Supervision
ICLR 2025
CellFlux: Simulating Cellular Morphology Changes via Flow Matching
ICML 2025
The Impact of Image Resolution on Biomedical Multimodal Large Language Models
MLHC 2025
Just Shift It: Test-Time Prototype Shifting for Zero-Shot Generalization with Vision-Language Models
WACV 2025
BIOMEDICA: An Open Biomedical Image-Caption Archive, Dataset, and Vision-Language Models Derived from Scientific Literature
CVPR 2025
Automated Generation of Challenging Multiple-Choice Questions for Vision Language Model Evaluation
CVPR 2025
NegVQA: Can Vision Language Models Understand Negation?
ACL 2025
MicroVQA: A Multimodal Reasoning Benchmark for Microscopy-Based Scientific Research
CVPR 2025
Apollo: An Exploration of Video Understanding in Large Multimodal Models
CVPR 2025
Data or Language Supervision: What Makes CLIP Better than DINO?
EMNLP 2025
Feather the Throttle: Revisiting Visual Token Pruning for Vision-Language Model Acceleration
ICCV 2025
Foundation Models Secretly Understand Neural Network Weights: Enhancing Hypernetwork Architectures with Foundation Models
ICLR 2025
Video Action Differencing
ICLR 2025
VideoAgent: Long-form Video Understanding with Large Language Model as Agent
ECCV 2024
Why are Visually-Grounded Language Models Bad at Image Classification?
NIPS 2024
Describing Differences in Image Sets with Natural Language
CVPR 2024
Depth-guided NeRF Training via Earth Mover's Distance
ECCV 2024
Viewpoint textual inversion: discovering scene representations and 3D view control in 2D diffusion models
ECCV 2024
Micro-Bench: A Microscopy Benchmark for Vision-Language Understanding
NIPS 2024