Dara Bahri
24 papers · 2018–2024 · 5 conferences · across top CS/AI conferences
Achievements
Academic Marathon (6)
Conference Polyglot (5)
Interdisciplinary Bridge
Keyword Pioneer
Cross-Pollinator (9)
Taxonomy Completionist (36)
Dynamic Duo (18)
The Questioner (2)
Prolific Year (9)
Unstoppable (5)
Century Club (24)
Keyword Collector (68)
Conferences
ICLR (7)
ICML (6)
ACL (5)
NIPS (4)
IJCNLP (2)
Keywords
transformer architecture (3)
model architecture (3)
neural network (3)
constituency parsing (2)
few-shot learning (2)
sharpness-aware minimization (2)
large language model (2)
masked language modeling (2)
language model (2)
dependency parsing (2)
convolutional neural network (2)
neural network optimization (1)
neural text generation (1)
unsupervised parsing (1)
machine translation (1)
artifact detection (1)
neural network training (1)
label noise (1)
early exit (1)
information retrieval (1)
Papers
A Universal Class of Sharpness-Aware Minimization Algorithms
ICML 2024
UL2: Unifying Language Learning Paradigms
ICLR 2023
Sharpness-Aware Minimization Leads to Low-Rank Features
NIPS 2023
Churn Reduction via Distillation
ICLR 2022
ExT5: Towards Extreme Multi-Task Scaling for Transfer Learning
ICLR 2022
Transformer Memory as a Differentiable Search Index
NIPS 2022
Sharpness-Aware Minimization Improves Language Model Generalization
ACL 2022
ED2LM: Encoder-Decoder to Language Model for Faster Document Re-ranking Inference
ACL 2022
Confident Adaptive Language Modeling
NIPS 2022
Charformer: Fast Character Transformers via Gradient-based Subword Tokenization
ICLR 2022
Scarf: Self-Supervised Contrastive Learning using Random Feature Corruption
ICLR 2022
StructFormer: Joint Unsupervised Induction of Dependency and Constituency Structure from Masked Language Modeling
IJCNLP 2021
Are Pretrained Convolutions Better than Pretrained Transformers?
ACL 2021
StructFormer: Joint Unsupervised Induction of Dependency and Constituency Structure from Masked Language Modeling
ACL 2021
HyperGrid Transformers: Towards A Single Model for Multiple Tasks
ICLR 2021
Long Range Arena: A Benchmark for Efficient Transformers
ICLR 2021
Locally Adaptive Label Smoothing Improves Predictive Churn
ICML 2021
Synthesizer: Rethinking Self-Attention for Transformer Models
ICML 2021
OmniNet: Omnidirectional Representations from Transformers
ICML 2021
Are Pretrained Convolutions Better than Pretrained Transformers?
IJCNLP 2021
Deep k-NN for Noisy Labels
ICML 2020
Sparse Sinkhorn Attention
ICML 2020
Reverse Engineering Configurations of Neural Text Generation Models
ACL 2020
Diminishing Returns Shape Constraints for Interpretability and Regularization
NIPS 2018