Raphael Tang

29 papers · 2019–2026 · 5 conferences · across top CS/AI conferences

Achievements

🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (8) 🏃 Academic Marathon (6) 🌍 Conference Polyglot (5) 🗺️ Taxonomy Completionist (62) 🌟 Keyword Trendsetter Combo (3) 🤝 Dynamic Duo (23) 🔥 Unstoppable (7) 💎 Century Club (28) 📈 Trend Setter ⚡ Prolific Year (6) ❓ The Questioner 🗃️ Keyword Collector (123)

Conferences

EMNLP (13) ACL (9) IJCNLP (3) NAACL (3) EACL (1)

Top co-authors

Jimmy Lin (23) Ji Xin (9) Yaoliang Yu (6) Zhiying Jiang (5) Yao Lu (5) Jaejun Lee (5) Wenyan Li (5) Pontus Stenetorp (5) Crystina Zhang (4) Ferhan Ture (4)

Keywords

model compression (5) knowledge distillation (4) neural network (4) inference acceleration (3) question answering (3) transfer learning (3) speech recognition (3) keyword spotting (3) vision-language model (2) diffusion model (2) document classification (2) information bottleneck (2) compositional representation (2) classifier cascade (2) selective prediction (2) text classification (2) automatic speech recognition (2) bert model (2) text-to-image generation (2) semantic similarity (2)

Papers

The Role of Mixed-Language Documents for Multilingual Large Language Model Pretraining · ACL 2026
Lost in Embeddings: Information Loss in Vision–Language Models · EMNLP 2025
Multilingual Language Model Pretraining using Machine-translated Data · EMNLP 2025
Strings from the Library of Babel: Random Sampling as a Strong Baseline for Prompt Optimisation · NAACL 2024
Words Worth a Thousand Pictures: Measuring and Understanding Perceptual Variability in Text-to-Image Generation · EMNLP 2024
Understanding Retrieval Robustness for Retrieval-augmented Image Captioning · ACL 2024
FoodieQA: A Multimodal Dataset for Fine-Grained Understanding of Chinese Food Culture · EMNLP 2024
Found in the Middle: Permutation Self-Consistency Improves Listwise Ranking in Large Language Models · NAACL 2024
What the DAAM: Interpreting Stable Diffusion Using Cross Attention · ACL 2023
“Low-Resource” Text Classification: A Parameter-Free Classification Method with Compressors · ACL 2023
Operator Selection and Ordering in a Pipeline Approach to Efficiency Optimizations for Transformers · ACL 2023
SpeechNet: Weakly Supervised, End-to-End Speech Recognition at Industrial Scale · EMNLP 2022
How Does BERT Rerank Passages? An Attribution Analysis with Information Bottlenecks · EMNLP 2021
The Art of Abstention: Selective Prediction and Error Regularization for Natural Language Processing · IJCNLP 2021
BERxiT: Early Exiting for BERT with Better Fine-Tuning and Extension to Regression · EACL 2021
The Art of Abstention: Selective Prediction and Error Regularization for Natural Language Processing · ACL 2021
Voice Query Auto Completion · EMNLP 2021
Covidex: Neural Ranking Models and Keyword Search Infrastructure for the COVID-19 Open Research Dataset · EMNLP 2020
Showing Your Work Doesn’t Always Work · ACL 2020
Exploring the Limits of Simple Learners in Knowledge Distillation for Document Classification with DocBERT · ACL 2020
Inserting Information Bottlenecks for Attribution in Transformers · EMNLP 2020
Howl: A Deployed, Open-Source Wake Word Detection System · EMNLP 2020
DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference · ACL 2020
Honkling: In-Browser Personalization for Ubiquitous Keyword Spotting · IJCNLP 2019
Incorporating Contextual and Syntactic Structures Improves Semantic Similarity Modeling · IJCNLP 2019
Rethinking Complex Neural Network Architectures for Document Classification · NAACL 2019
Natural Language Generation for Effective Knowledge Distillation · EMNLP 2019
Honkling: In-Browser Personalization for Ubiquitous Keyword Spotting · EMNLP 2019
Incorporating Contextual and Syntactic Structures Improves Semantic Similarity Modeling · EMNLP 2019