conftrace_

Bowen Shi

37 papers · 2019–2026 · 11 conferences · across top CS/AI conferences

Achievements

🧭 Keyword Pioneer · 🌈 Renaissance Researcher (5) · 🌉 Interdisciplinary Bridge · 🗺️ Taxonomy Completionist (16) · 🐣 Hot Topic Early Bird · 🔬 Deep Specialist (10) · 🏆 Keyword Champion (2) · 🤝 Dynamic Duo (13) · 🏆 Grand Slam · 🗃️ Keyword Collector (137) · ⚡ Prolific Year (8) · 📈 Trend Setter · 💎 Century Club (35) · 🔥 Unstoppable (7)

Conferences

INTERSPEECH (8) ACL (7) ICLR (4) CVPR (3) EMNLP (3) ICML (3) NIPS (3) ECCV (2) ICCV (2) AAAI (1) JMLR (1)

Papers

Profiling-Free Mixed-Precision Quantization for MoE LLMs via Fuzzy Rule Interpolation · ACL 2026
CT-FineBench: A Diagnostic Fidelity Benchmark for Fine-Grained Evaluation of CT Report Generation · ACL 2026
METEOR: Multi-Encoder Collaborative Token Pruning for Efficient Vision Language Models · ICCV 2025
MDCure: A Scalable Pipeline for Multi-Document Instruction-Following · ACL 2025
MusicFlow: Cascaded Flow Matching for Text Guided Music Generation · ICML 2024
BarLeRIa: An Efficient Tuning Framework for Referring Image Segmentation · ICLR 2024
Scaling Speech Technology to 1,000+ Languages · JMLR 2024
Hybrid Distillation: Connecting Masked Autoencoders with Contrastive Learners · ICLR 2024
Generative Pre-training for Speech with Flow Matching · ICLR 2024
Learning Fine-Grained Controllability on Speech Generation via Efficient Fine-Tuning · INTERSPEECH 2024
Towards Privacy-Aware Sign Language Translation at Scale · ACL 2024
XLAVS-R: Cross-Lingual Audio-Visual Speech Representation Learning for Noise-Robust Speech Perception · ACL 2024
Bootstrap AutoEncoders With Contrastive Paradigm for Self-supervised Gaze Estimation · ICML 2024
UMG-CLIP: A Unified Multi-Granularity Vision Generalist for Open-World Understanding · ECCV 2024
Pose-Oriented Transformer with Uncertainty-Guided Refinement for 2D-to-3D Human Pose Estimation · AAAI 2023
Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale · NIPS 2023
Adapting Shortcut With Normalizing Flow: An Efficient Tuning Framework for Visual Recognition · CVPR 2023
ReVISE: Self-Supervised Speech Resynthesis With Visual Input for Universal and Generalized Speech Regeneration · CVPR 2023
TTIC’s Submission to WMT-SLT 23 · EMNLP 2023
SEGA: Structural Entropy Guided Anchor View for Graph Contrastive Learning · ICML 2023
MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation · INTERSPEECH 2023
Expresso: A Benchmark and Analysis of Discrete Expressive Speech Resynthesis · INTERSPEECH 2023
AiluRus: A Scalable ViT Framework for Dense Prediction · NIPS 2023
TTIC’s WMT-SLT 22 Sign Language Translation System · EMNLP 2022
Open-Domain Sign Language Translation Learned from Online Video · EMNLP 2022
A Transformer-Based Decoder for Semantic Segmentation with Multi-level Context Mining · ECCV 2022
Robust Self-Supervised Audio-Visual Speech Recognition · INTERSPEECH 2022
Learning Lip-Based Audio-Visual Speaker Embeddings with AV-HuBERT · INTERSPEECH 2022
Searching for fingerspelled content in American Sign Language · ACL 2022
Learning Audio-Visual Speech Representation by Masked Multimodal Cluster Prediction · ICLR 2022
u-HuBERT: Unified Mixed-Modal Speech Pretraining And Zero-Shot Transfer to Unlabeled Modality · NIPS 2022
Fingerspelling Detection in American Sign Language · CVPR 2021
A Joint Framework for Audio Tagging and Weakly Supervised Acoustic Event Detection Using DenseNet with Global Average Pooling · INTERSPEECH 2020
A Cross-Task Analysis of Text Span Representations · ACL 2020
Compression of Acoustic Event Detection Models with Quantized Distillation · INTERSPEECH 2019
On the Contributions of Visual and Textual Supervision in Low-Resource Semantic Speech Retrieval · INTERSPEECH 2019
Fingerspelling Recognition in the Wild With Iterative Visual Attention · ICCV 2019