Ruohan Gao
32 papers · 2017–2025 · 7 conferences · across top CS/AI conferences
Achievements
Interdisciplinary Bridge
Renaissance Researcher (8)
Academic Marathon (8)
Conference Polyglot (7)
Taxonomy Completionist (36)
Deep Specialist (13)
Dynamic Duo (11)
Topic Evolution
Keyword Champion (2)
Prolific Year (5)
Century Club (32)
Keyword Collector (118)
Conference Pioneer
Unstoppable (9)
Conferences
CVPR (12)
ICCV (7)
ECCV (5)
CORL (4)
ICLR (2)
AAAI (1)
NIPS (1)
Top co-authors
Keywords
multimodal learning (10)
audio-visual learning (5)
differentiable rendering (3)
object recognition (3)
room acoustics (3)
action recognition (3)
room impulse response (3)
3d reconstruction (2)
multisensory learning (2)
neural network (2)
impact sound (2)
tactile sensing (2)
multisensory perception (2)
video classification (1)
source separation (1)
robotic manipulation (1)
speech separation (1)
sim-to-real transfer (1)
human detection (1)
cross-modal learning (1)
Papers
Learning to Highlight Audio by Watching Movies
CVPR 2025
AURELIA: Test-time Reasoning Distillation in Audio-Visual LLMs
ICCV 2025
GenFlowRL: Shaping Rewards with Generative Object-Centric Flow in Visual Reinforcement Learning
ICCV 2025
EgoAdapt: Adaptive Multisensory Distillation and Policy Learning for Efficient Egocentric Perception
ICCV 2025
Hearing Anywhere in Any Environment
CVPR 2025
AVTrustBench: Assessing and Enhancing Reliability and Robustness in Audio-Visual LLMs
ICCV 2025
Differentiable Room Acoustic Rendering with Multi-View Vision Priors
ICCV 2025
Multisensory Machine Intelligence
AAAI 2025
The Audio-Visual Conversational Graph: From an Egocentric-Exocentric Perspective
CVPR 2024
Hearing Anything Anywhere
CVPR 2024
Meerkat: Audio-Visual Large Language Model for Grounding in Space and Time
ECCV 2024
Spherical World-Locking for Audio-Visual Localization in Egocentric Videos
ECCV 2024
An Extensible Multi-modal Multi-task Object Dataset with Materials
ICLR 2023
NOIR: Neural Signal Operated Intelligent Robots for Everyday Activities
CORL 2023
RealImpact: A Dataset of Impact Sound Fields for Real Objects
CVPR 2023
The ObjectFolder Benchmark: Multisensory Learning With Neural and Real Objects
CVPR 2023
SoundCam: A Dataset for Finding Humans Using Room Acoustics
NIPS 2023
ObjectFolder 2.0: A Multisensory Object Dataset for Sim2Real Transfer
CVPR 2022
See, Hear, and Feel: Smart Sensory Fusion for Robotic Manipulation
CORL 2022
Visual Acoustic Matching
CVPR 2022
DiffImpact: Differentiable Rendering and Identification of Impact Sounds
CORL 2021
ObjectFolder: A Dataset of Objects with Implicit Visual, Auditory, and Tactile Representations
CORL 2021
VisualVoice: Audio-Visual Speech Separation With Cross-Modal Consistency
CVPR 2021
Learning to Set Waypoints for Audio-Visual Navigation
ICLR 2021
Listen to Look: Action Recognition by Previewing Audio
CVPR 2020
VisualEchoes: Spatial Image Representation Learning through Echolocation
ECCV 2020
Co-Separating Sounds of Visual Objects
ICCV 2019
2.5D Visual Sound
CVPR 2019
ShapeCodes: Self-Supervised Feature Learning by Lifting Views to Viewgrids
ECCV 2018
Im2Flow: Motion Hallucination From Static Images for Action Recognition
CVPR 2018
Learning to Separate Object Sounds by Watching Unlabeled Video
ECCV 2018
On-Demand Learning for Deep Image Restoration
ICCV 2017