conftrace_

Haoxuan You

24 papers · 2019–2025 · 10 conferences · across top CS/AI conferences

Achievements

🏃 Academic Marathon (6)
🌍 Conference Polyglot (10)
🧭 Keyword Pioneer
🌉 Interdisciplinary Bridge
🐝 Cross-Pollinator (12)
🌈 Renaissance Researcher (8)
🗺️ Taxonomy Completionist (54)
👥 Mega-Team (23)
🤝 Dynamic Duo (12)
❓ The Questioner
⚑ Prolific Year (7)
🚀 Conference Pioneer
💎 Century Club (24)
📈 Trend Setter
🗃️ Keyword Collector (103)
🔥 Unstoppable (7)

Conferences

ICLR (5) AAAI (4) EMNLP (4) ACL (2) CVPR (2) ECCV (2) NIPS (2) ICCV (1) IJCAI (1) NAACL (1)

Papers

MMEgo: Towards Building Egocentric Multimodal LLMs for Video QA · ICLR 2025
MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning · ICLR 2025
DyCoke: Dynamic Compression of Tokens for Fast Video Large Language Models · CVPR 2025
CoBIT: A Contrastive Bi-directional Image-Text Generation Model · ICLR 2024
JourneyBench: A Challenging One-Stop Vision-Language Understanding Benchmark of Generated Images · NIPS 2024
Ferret: Refer and Ground Anything Anywhere at Any Granularity · ICLR 2024
Dataset Bias Mitigation in Multiple-Choice Visual Question Answering and Beyond · EMNLP 2023
UniFine: A Unified and Fine-grained Approach for Zero-shot Vision-Language Understanding · ACL 2023
IdealGPT: Iteratively Decomposing Vision and Language Reasoning via Large Language Models · EMNLP 2023
SGEITL: Scene Graph Enhanced Image-Text Learning for Visual Commonsense Reasoning · AAAI 2022
Rethinking Network Design and Local Geometry in Point Cloud: A Simple Residual MLP Framework · ICLR 2022
Learning Visual Representation from Modality-Shared Contrastive Language-Image Pre-training · ECCV 2022
Understanding ME? Multimodal Evaluation for Fine-grained Visual Commonsense · EMNLP 2022
Find Someone Who: Visual Commonsense Understanding in Human-Centric Grounding · EMNLP 2022
Bridging the Gap between Recognition-level Pre-training and Commonsensical Vision-language Tasks · ACL 2022
Unsupervised Vision-and-Language Pre-training Without Parallel Images and Captions · NAACL 2021
Learning Visual Commonsense for Robust Scene Graph Generation · ECCV 2020
Dynamic Fusion With Intra- and Inter-Modality Attention Flow for Visual Question Answering · CVPR 2019
PVRNet: Point-View Relation Neural Network for 3D Shape Recognition · AAAI 2019
MeshNet: Mesh Neural Network for 3D Shape Representation · AAAI 2019
Hypergraph Neural Networks · AAAI 2019
Decoding EEG by Visual-guided Deep Neural Networks · IJCAI 2019
Multi-Modality Latent Interaction Network for Visual Question Answering · ICCV 2019
PointDAN: A Multi-Scale 3D Domain Adaption Network for Point Cloud Representation · NIPS 2019