Mostafa Dehghani
35 papers · 2018–2024 · 11 conferences · across top CS/AI conferences
Achievements
Conference Polyglot (11)
Academic Marathon (6)
Keyword Pioneer
Interdisciplinary Bridge
Cross-Pollinator (3)
Renaissance Researcher (5)
Taxonomy Completionist (57)
Deep Specialist (12)
Mega-Team (43)
Dynamic Duo (17)
Triple Crown
Topic Pioneer
Keyword Collector (91)
Prolific Year (8)
Trend Setter
Conference Pioneer
Century Club (35)
The Questioner (3)
Conferences
ICLR (12)
NIPS (5)
CVPR (4)
EMNLP (3)
ICML (3)
ACL (2)
IJCNLP (2)
ECCV (1)
ICCV (1)
JMLR (1)
NAACL (1)
Keywords
few-shot learning (5)
transformer architecture (5)
model architecture (4)
transfer learning (3)
model scaling (3)
large language model (3)
parameter-efficient fine-tuning (3)
object detection (2)
scaling law (2)
pretrained model (2)
representation learning (2)
adaptive computation (2)
convolutional neural network (2)
vision transformer (2)
video transformer (2)
multi-task learning (2)
image recognition (2)
video understanding (2)
action recognition (2)
language model (2)
Papers
On Scaling Up a Multilingual Vision and Language Model
CVPR 2024
End-to-End Spatio-Temporal Action Localisation with Video Transformers
CVPR 2024
Scaling Instruction-Finetuned Language Models
JMLR 2024
Fractal Patterns May Illuminate the Success of Next-Token Prediction
NIPS 2024
Low-Rank Adaptation for Multilingual Summarization: An Empirical Study
NAACL 2024
Frozen Feature Augmentation for Few-Shot Image Classification
CVPR 2024
Transcending Scaling Laws with 0.1% Extra Compute
EMNLP 2023
DSI++: Updating Transformer Memory with New Documents
EMNLP 2023
Patch n' Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution
NIPS 2023
Scaling Laws vs Model Architectures: How does Inductive Bias Influence Scaling?
EMNLP 2023
$\Lambda$-DARTS: Mitigating Performance Collapse by Harmonizing Operation Selection among Cells
ICLR 2023
UL2: Unifying Language Learning Paradigms
ICLR 2023
Scaling Vision Transformers to 22 Billion Parameters
ICML 2023
Adaptive Computation with Elastic Input Sequence
ICML 2023
Sparse Upcycling: Training Mixture-of-Experts from Dense Checkpoints
ICLR 2023
Scale Efficiently: Insights from Pretraining and Finetuning Transformers
ICLR 2022
Exploring the Limits of Large Scale Pre-training
ICLR 2022
Scenic: A JAX Library for Computer Vision Research and Beyond
CVPR 2022
Simple Open-Vocabulary Object Detection with Vision Transformers
ECCV 2022
Confident Adaptive Language Modeling
NIPS 2022
Transformer Memory as a Differentiable Search Index
NIPS 2022
The Efficiency Misnomer
ICLR 2022
Discrete Representations Strengthen Vision Transformer Robustness
ICLR 2022
TokenLearner: Adaptive Space-Time Tokenization for Videos
NIPS 2021
Parameter-efficient Multi-task Fine-tuning for Transformers via Shared Hypernetworks
ACL 2021
Are Pretrained Convolutions Better than Pretrained Transformers?
ACL 2021
ViViT: A Video Vision Transformer
ICCV 2021
IDF++: Analyzing and Improving Integer Discrete Flows for Lossless Compression
ICLR 2021
Long Range Arena: A Benchmark for Efficient Transformers
ICLR 2021
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
ICLR 2021
OmniNet: Omnidirectional Representations from Transformers
ICML 2021
Parameter-efficient Multi-task Fine-tuning for Transformers via Shared Hypernetworks
IJCNLP 2021
Are Pretrained Convolutions Better than Pretrained Transformers?
IJCNLP 2021
Universal Transformers
ICLR 2019
Fidelity-Weighted Learning
ICLR 2018