conftrace_

Po-Yao Huang

23 papers · 2018–2024 · 11 conferences · across top CS/AI conferences

Achievements

🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (6) 🏃 Academic Marathon (6) 🌍 Conference Polyglot (11) 🗺️ Taxonomy Completionist (43) 🧭 Keyword Pioneer 🐣 Hot Topic Early Bird 🀝 Dynamic Duo (10) 🧬 Topic Evolution 💎 Century Club (23) ⚑ Prolific Year (6) 🗃️ Keyword Collector (92) 🔥 Unstoppable (7) 🚀 Conference Pioneer

Conferences

ACL (4) EMNLP (3) ICCV (3) CVPR (2) ECCV (2) ICLR (2) IJCNLP (2) NIPS (2) ICML (1) INTERSPEECH (1) NAACL (1)

Papers

Demystifying CLIP Data · ICLR 2024
MoDE: CLIP Data Experts via Clustering · CVPR 2024
Altogether: Image Captioning via Re-aligning Alt-text · EMNLP 2024
Self-Supervised Audio-Visual Soundscape Stylization · ECCV 2024
VoiceCraft: Zero-Shot Speech Editing and Text-to-Speech in the Wild · ACL 2024
Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles · ICML 2023
STMT: A Spatial-Temporal Mesh Transformer for MoCap-Based Action Recognition · CVPR 2023
CiT: Curation in Training for Effective Vision-Language Data · ICCV 2023
Diffusion Models as Masked Autoencoders · ICCV 2023
Generating Hashtags for Short-form Videos with Guided Signals · ACL 2023
MAViL: Masked Audio-Video Learners · NIPS 2023
Masked Autoencoders that Listen · NIPS 2022
AudioTagging Done Right: 2nd comparison of deep learning methods for environmental sound classification · INTERSPEECH 2022
Multilingual Multimodal Pre-training for Zero-Shot Cross-Lingual Transfer of Vision-Language Models · NAACL 2021
VLM: Task-agnostic Video-Language Model Pre-training for Video Understanding · ACL 2021
VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding · EMNLP 2021
Space-Time Crop & Attend: Improving Cross-Modal Video Representation Learning · ICCV 2021
Support-set bottlenecks for video-text representation learning · ICLR 2021
VLM: Task-agnostic Video-Language Model Pre-training for Video Understanding · IJCNLP 2021
Unsupervised Multimodal Neural Machine Translation with Pseudo Visual Pivoting · ACL 2020
Multi-Head Attention with Diversity for Learning Grounded Multilingual Multimodal Representations · IJCNLP 2019
Multi-Head Attention with Diversity for Learning Grounded Multilingual Multimodal Representations · EMNLP 2019
RCAA: Relational Context-Aware Agents for Person Search · ECCV 2018