Peihao Chen
20 papers · 2019–2026 · 8 top CS/AI conferences
Achievements
Interdisciplinary Bridge
Academic Marathon
(6)
Renaissance Researcher
(7)
Conference Polyglot
(8)
Taxonomy Completionist
(39)
Keyword Pioneer
Hot Topic Early Bird
Dynamic Duo
(17)
Grand Slam
Topic Evolution
Prolific Year
(5)
Keyword Collector
(88)
Century Club
(19)
Unstoppable
(7)
Conferences
CVPR (6)
NeurIPS (4)
AAAI (3)
ECCV (2)
ICCV (2)
ICLR (1)
ICML (1)
IJCAI (1)
Keywords
large language model
(4)
self-supervised learning
(3)
embodied ai
(2)
action recognition
(2)
video representation learning
(2)
vision-language model
(2)
spatial reasoning
(2)
video representation
(2)
curriculum learning
(1)
trajectory prediction
(1)
vision-language navigation
(1)
3d vision
(1)
multi-modal learning
(1)
video understanding
(1)
zero-shot learning
(1)
cross-modal learning
(1)
natural language understanding
(1)
audio-visual learning
(1)
temporal modeling
(1)
object tracking
(1)
Papers
NaVLA$^2$: A Vision-Language-Audio-Action Model for Multimodal Instruction Navigation
AAAI 2026
LSceneLLM: Enhancing Large 3D Scene Understanding Using Adaptive Visual Preferences
CVPR 2025
3D-Mem: 3D Scene Memory for Embodied Exploration and Reasoning
CVPR 2025
Enhancing User-Oriented Proactivity in Open-Domain Dialogues with Critic Guidance
IJCAI 2025
MultiPLY: A Multisensory Object-Centric Embodied Large Language Model in 3D World
CVPR 2024
CoVLM: Composing Visual Entities and Relationships in Large Language Models Via Communicative Decoding
ICLR 2024
FlexAttention for Efficient High-Resolution Vision-Language Models
ECCV 2024
3D-VLA: A 3D Vision-Language-Action Generative World Model
ICML 2024
RILA: Reflective and Imaginative Language Agent for Zero-Shot Semantic Audio-Visual Navigation
CVPR 2024
Learning Vision-and-Language Navigation from YouTube Videos
ICCV 2023
FGPrompt: Fine-grained Goal Prompting for Image-goal Navigation
NeurIPS 2023
Masked Motion Encoding for Self-Supervised Video Representation Learning
CVPR 2023
3D-LLM: Injecting the 3D World into Large Language Models
NeurIPS 2023
Weakly-Supervised Multi-Granularity Map Learning for Vision-and-Language Navigation
NeurIPS 2022
Learning Active Camera for Multi-Object Navigation
NeurIPS 2022
RSPNet: Relative Speed Perception for Unsupervised Video Representation Learning
AAAI 2021
Foley Music: Learning to Generate Music from Videos
ECCV 2020
Dense Regression Network for Video Grounding
CVPR 2020
Location-Aware Graph Convolutional Networks for Video Question Answering
AAAI 2020
Self-Supervised Moving Vehicle Tracking With Stereo Sound
ICCV 2019