Ruoming Pang
33 papers · 2018–2025 · 10 conferences · across top CS/AI conferences
Achievements
Keyword Pioneer
Taxonomy Completionist (11)
Renaissance Researcher (5)
Interdisciplinary Bridge
Conference Polyglot (10)
Academic Marathon (7)
Cross-Pollinator (12)
Dynamic Duo (10)
Mega-Team (29)
Topic Evolution
Prolific Year (5)
Unstoppable (8)
Trend Setter
Century Club (33)
Keyword Collector (115)
The Questioner
Conferences
INTERSPEECH (13)
ICLR (6)
ACL (3)
CVPR (3)
ECCV (2)
NAACL (2)
EMNLP (1)
ICCV (1)
ICML (1)
NIPS (1)
Keywords
speech recognition (4)
end-to-end model (4)
automatic speech recognition (4)
convolutional neural network (4)
word error rate (3)
latency optimization (3)
large language model (3)
on-device speech recognition (3)
language model (3)
neural architecture search (3)
shallow fusion (2)
recurrent neural network transducer (2)
object detection (2)
end-to-end speech recognition (2)
tool use (2)
domain adaptation (2)
model scaling (2)
efficient computing (2)
knowledge distillation (2)
attention mechanism (2)
Papers
MMAU: A Holistic Benchmark of Agent Capabilities Across Diverse Domains
NAACL 2025
Improve Vision Language Model Chain-of-thought Reasoning
ACL 2025
Can External Validation Tools Improve Annotation Quality for LLM-as-a-Judge?
ACL 2025
Talking Turns: Benchmarking Audio Foundation Models on Turn-Taking Dynamics
ICLR 2025
EC-DIT: Scaling Diffusion Transformers with Adaptive Expert-Choice Routing
ICLR 2025
Step-by-Step Reasoning for Math Problems via Twisted Sequential Monte Carlo
ICLR 2025
Instruction-Following Pruning for Large Language Models
ICML 2025
ToolSandbox: A Stateful, Conversational, Interactive Evaluation Benchmark for LLM Tool Use Capabilities
NAACL 2025
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training
ECCV 2024
STAIR: Learning Sparse Text and Image Representation in Grounded Tokens
EMNLP 2023
Vector-quantized Image Modeling with Improved VQGAN
ICLR 2022
Sentence-Select: Large-Scale Language Model Data Selection for Rare-Word Speech Recognition
INTERSPEECH 2022
A Language Agnostic Multilingual Streaming On-Device ASR System
INTERSPEECH 2022
Bridging the Gap Between Streaming and Non-Streaming ASR Systems by Distilling Ensembles of CTC and RNN-T Models
INTERSPEECH 2021
Searching for Fast Model Families on Datacenter Accelerators
CVPR 2021
Dual-mode ASR: Unify and Improve Streaming ASR with Full-context Modeling
ICLR 2021
Unsupervised Learning of Disentangled Speech Content and Style Representation
INTERSPEECH 2021
An Efficient Streaming Non-Recurrent On-Device End-to-End Model with Improvements to Rare-Word Modeling
INTERSPEECH 2021
BigNAS: Scaling Up Neural Architecture Search with Big Single-Stage Models
ECCV 2020
Parallel Rescoring with Transformer for Streaming On-Device Speech Recognition
INTERSPEECH 2020
ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context
INTERSPEECH 2020
Emitting Word Timings with End-to-End Models
INTERSPEECH 2020
Improving Tail Performance of a Deliberation E2E ASR Model Using a Large Text Corpus
INTERSPEECH 2020
Conformer: Convolution-augmented Transformer for Speech Recognition
INTERSPEECH 2020
EfficientDet: Scalable and Efficient Object Detection
CVPR 2020
Hierarchical Generative Modeling for Controllable Speech Synthesis
ICLR 2019
Monotonic Infinite Lookback Attention for Simultaneous Machine Translation
ACL 2019
Shallow-Fusion End-to-End Contextual Biasing
INTERSPEECH 2019
Two-Pass End-to-End Speech Recognition
INTERSPEECH 2019
MnasNet: Platform-Aware Neural Architecture Search for Mobile
CVPR 2019
Searching for MobileNetV3
ICCV 2019
Compression of End-to-End Models
INTERSPEECH 2018
Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis
NIPS 2018