Bryan Catanzaro
64 papers · 2013–2026 · 12 top CS/AI conferences
Achievements
🌍 Conference Polyglot (12)
🏃 Academic Marathon (12)
🧭 Keyword Pioneer
🌉 Interdisciplinary Bridge
🐝 Cross-Pollinator (11)
🤝 Dynamic Duo (29)
👑 Triple Crown
👥 Mega-Team (69)
🔬 Deep Specialist (10)
🏆 Keyword Champion (2)
🚀 Conference Pioneer
🗃️ Keyword Collector (209)
📈 Trend Setter
⚡ Prolific Year (10)
🔥 Unstoppable (8)
❓ The Questioner (2)
💎 Century Club (63)
Conferences
ICLR (13)
NIPS (11)
ICML (10)
EMNLP (7)
CVPR (6)
ACL (5)
EACL (3)
ICCV (3)
ECCV (2)
INTERSPEECH (2)
IJCNLP (1)
WACV (1)
Keywords
large language model (8)
language model (7)
text generation (4)
instruction tuning (4)
neural network (4)
retrieval-augmented generation (4)
question answering (3)
video generation (3)
knowledge distillation (3)
zero-shot learning (2)
speech synthesis (2)
image generation (2)
domain adaptation (2)
unsupervised pretraining (2)
dialogue generation (2)
few-shot learning (2)
reward modeling (2)
contrastive learning (2)
factual accuracy (2)
benchmark evaluation (2)
Papers
Nemotron-CrossThink: Scaling Self-Learning beyond Math Reasoning
EACL 2026
MIND: Math Informed syNthetic Dialogues for Pretraining LLMs
ICLR 2025
MM-Embed: Universal Multimodal Retrieval with Multimodal LLMs
ICLR 2025
RADIOv2.5: Improved Baselines for Agglomerative Vision Foundation Models
CVPR 2025
ETTA: Elucidating the Design Space of Text-to-Audio Models
ICML 2025
Audio Flamingo 2: An Audio-Language Model with Long-Audio Understanding and Expert Reasoning Abilities
ICML 2025
Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders
ICLR 2025
Synthio: Augmenting Small-Scale Audio Classification Datasets with Synthetic Data
ICLR 2025
Nemotron-CORTEXA: Enhancing LLM Agents for Software Engineering Tasks via Improved Localization and Solution Diversity
ICML 2025
FeatSharp: Your Vision Model Features, Sharper
ICML 2025
UniWav: Towards Unified Pre-training for Speech Representation Learning and Generation
ICLR 2025
ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities
ICLR 2025
NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models
ICLR 2025
Nemotron-CC: Transforming Common Crawl into a Refined Long-Horizon Pretraining Dataset
ACL 2025
Fugatto 1: Foundational Generative Audio Transformer Opus 1
ICLR 2025
AceMath: Advancing Frontier Math Reasoning with Post-Training and Reward Modeling
ACL 2025
Retrieval meets Long Context Large Language Models
ICLR 2024
ChatQA: Surpassing GPT-4 on Conversational QA and RAG
NIPS 2024
Compact Language Models via Pruning and Knowledge Distillation
NIPS 2024
RankRAG: Unifying Context Ranking with Retrieval-Augmented Generation in LLMs
NIPS 2024
Data, Data Everywhere: A Guide for Pretraining Dataset Construction
EMNLP 2024
LLM-Evolve: Evaluation for LLM’s Evolving Capability on Benchmarks
EMNLP 2024
ODIN: Disentangled Reward Mitigates Hacking in RLHF
ICML 2024
Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities
ICML 2024
InstructRetro: Instruction Tuning post Retrieval-Augmented Pretraining
ICML 2024
Leveraging Bitstream Metadata for Fast, Accurate, Generalized Compressed Video Quality Enhancement
WACV 2024
P-Flow: A Fast and Data-Efficient Zero-Shot TTS through Speech Prompting
NIPS 2023
CleanUNet 2: A Hybrid Speech Denoising Model on Waveform and Spectrogram
INTERSPEECH 2023
RAD-MMM: Multilingual Multiaccented Multispeaker Text To Speech
INTERSPEECH 2023
Re-ViLM: Retrieval-Augmented Visual Language Model for Zero and Few-Shot Image Captioning
EMNLP 2023
Context Generation Improves Open Domain Question Answering
EACL 2023
Preserve Your Own Correlation: A Noise Prior for Video Diffusion Models
ICCV 2023
BigVGAN: A Universal Neural Vocoder with Large-Scale Training
ICLR 2023
Adding Instructions during Pretraining: Effective way of Controlling Toxicity in Language Models
EACL 2023
Shall We Pretrain Autoregressive Language Models with Retrieval? A Comprehensive Study
EMNLP 2023
Exploring the Limits of Domain-Adaptive Training for Detoxifying Large-Scale Language Models
NIPS 2022
Evaluating Parameter Efficient Learning for Generation
EMNLP 2022
Multi-Stage Prompting for Knowledgeable Dialogue Generation
ACL 2022
Efficient Token Mixing for Transformers via Adaptive Fourier Neural Operators
ICLR 2022
Factuality Enhanced Language Models for Open-Ended Text Generation
NIPS 2022
End-to-End Training of Neural Retrievers for Open-Domain Question Answering
IJCNLP 2021
Long-Short Transformer: Efficient Transformers for Language and Vision
NIPS 2021
Dual Contrastive Loss and Attention for GANs
ICCV 2021
Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis
ICLR 2021
DiffWave: A Versatile Diffusion Model for Audio Synthesis
ICLR 2021
End-to-End Training of Neural Retrievers for Open-Domain Question Answering
ACL 2021
View Generalization for Single Image Textured 3D Models
CVPR 2021
MEGATRON-CNTRL: Controllable Story Generation with External Knowledge Using Large-Scale Language Models
EMNLP 2020
Training Question Answering Models From Synthetic Data
EMNLP 2020
Can Q-Learning with Graph Networks Learn a Generalizable Branching Heuristic for a SAT Solver?
NIPS 2020
Large Scale Multi-Actor Generative Dialog Modeling
ACL 2020
Neural FFTs for Universal Texture Image Synthesis
NIPS 2020
Panoptic-Based Image Synthesis
CVPR 2020
Few-shot Video-to-Video Synthesis
NIPS 2019
Graphical Contrastive Losses for Scene Graph Parsing
CVPR 2019
Improving Semantic Segmentation via Video Propagation and Label Relaxation
CVPR 2019
Unsupervised Video Interpolation Using Cycle Consistency
ICCV 2019
Video-to-Video Synthesis
NIPS 2018
High-Resolution Image Synthesis and Semantic Manipulation With Conditional GANs
CVPR 2018
SDC-Net: Video prediction using spatially-displaced convolution
ECCV 2018
Image Inpainting for Irregular Holes Using Partial Convolutions
ECCV 2018
Deep Speech 2: End-to-End Speech Recognition in English and Mandarin
ICML 2016
Persistent RNNs: Stashing Recurrent Weights On-Chip
ICML 2016
Deep learning with COTS HPC systems
ICML 2013