Yoshua Bengio
293 papers · 2003–2026 · 18 conferences · across top CS/AI conferences
Achievements
🗺️ Taxonomy Completionist (44)
🧭 Keyword Pioneer
🌉 Interdisciplinary Bridge
🌈 Renaissance Researcher (8)
🐣 Hot Topic Early Bird
🏃 Academic Marathon (22)
🌟 Keyword Trendsetter Combo (6)
🏠 Conference Loyalist (78)
📛 The Namer
👑 Domain Dominant (40)
🏆 Keyword Champion
👑 Triple Crown
🌱 Topic Pioneer
🔬 Deep Specialist (13)
🤝 Dynamic Duo (25)
🏆 Grand Slam
👥 Mega-Team (39)
⚡ Prolific Year (29)
📈 Trend Setter
🚀 Conference Pioneer
🔥 Unstoppable (17)
❓ The Questioner (8)
💎 Century Club (292)
🗃️ Keyword Collector (224)
Conferences
NIPS (78)
ICLR (73)
ICML (52)
ACL (15)
JMLR (15)
AISTATS (14)
AAAI (13)
INTERSPEECH (8)
UAI (7)
EMNLP (4)
ICCV (3)
NAACL (3)
EACL (2)
IJCNLP (2)
ECCV (1)
CVPR (1)
IJCAI (1)
CLEAR (1)
Research topics
Keywords
representation learning (27)
neural network (23)
recurrent neural network (19)
variational inference (16)
generative model (15)
generative flow network (12)
deep learning (11)
feature learning (9)
semi-supervised learning (9)
unsupervised learning (9)
generative adversarial network (9)
reinforcement learning (9)
graph neural network (9)
attention mechanism (8)
restricted boltzmann machine (7)
variational autoencoder (7)
deep neural network (6)
out-of-distribution generalization (6)
question answering (5)
speech recognition (5)
Papers
Extendable Planning via Multiscale Diffusion
AAAI 2026
Outsourced Diffusion Sampling: Efficient Posterior Inference in Latent Spaces of Generative Models
ICML 2025
HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models
ICLR 2025
Rejecting Hallucinated State Targets during Planning
ICML 2025
Structure Language Models for Protein Conformation Generation
ICLR 2025
Learning Diverse Attacks on Large Language Models for Robust Red-Teaming and Safety Tuning
ICLR 2025
Monte Carlo Tree Diffusion for System 2 Planning
ICML 2025
VCR: A Task for Pixel-Level Complex Reasoning in Vision Language Models via Restoring Occluded Text
ICLR 2025
AI for Global Climate Cooperation: Modeling Global Climate Negotiations, Agreements, and Long-Term Cooperation in RICE-N
ICML 2025
Geometric Signatures of Compositionality Across a Language Model’s Lifetime
ACL 2025
On the Transfer of Object-Centric Representation Learning
ICLR 2025
RL, but don’t do anything I wouldn’t do
UAI 2025
Can a Bayesian Oracle Prevent Harm from an Agent?
UAI 2025
Action abstractions for amortized sampling
ICLR 2025
BigDocs: An Open Dataset for Training Multimodal Models on Document and Code Tasks
ICLR 2025
MAP: Low-compute Model Merging with Amortized Pareto Fronts via Quadratic Approximation
ICLR 2025
Towards Improving Exploration through Sibling Augmented GFlowNets
ICLR 2025
Adaptive teachers for amortized samplers
ICLR 2025
Ant Colony Sampling with GFlowNets for Combinatorial Optimization
AISTATS 2025
AssembleFlow: Rigid Flow Matching with Inertial Frames for Molecular Assembly
ICLR 2025
Efficient Diversity-Preserving Diffusion Alignment via Gradient-Informed GFlowNets
ICLR 2025
Meta Flow Matching: Integrating Vector Fields on the Wasserstein Manifold
ICLR 2025
Towards a Formal Theory of Representational Compositionality
ICML 2025
Improving Gradient-Guided Nested Sampling for Posterior Inference
ICML 2024
Amortizing intractable inference in diffusion models for vision, language, and control
NIPS 2024
Pre-Training and Fine-Tuning Generative Flow Networks
ICLR 2024
Tree Cross Attention
ICLR 2024
Delta-AI: Local objectives for amortized inference in sparse graphical models
ICLR 2024
Amortizing intractable inference in large language models
ICLR 2024
Consciousness-Inspired Spatio-Temporal Abstractions for Better Generalization in Reinforcement Learning
ICLR 2024
Local Search GFlowNets
ICLR 2024
Iterated Denoising Energy Matching for Sampling from Boltzmann Densities
ICML 2024
Regeneration Learning: A Learning Paradigm for Data Generation
AAAI 2024
Simulation-Free Schrödinger Bridges via Score and Flow Matching
AISTATS 2024
Cycle Consistency Driven Object Discovery
ICLR 2024
Object centric architectures enable efficient causal representation learning
ICLR 2024
PhyloGFN: Phylogenetic inference with generative flow networks
ICLR 2024
Memory Efficient Neural Processes via Constant Memory Attention Block
ICML 2024
Metacognitive Capabilities of LLMs: An Exploration in Mathematical Problem Solving
NIPS 2024
RGFN: Synthesizable Molecular Generation Using GFlowNets
NIPS 2024
Diffusion Generative Flow Samplers: Improving learning signals through partial trajectory optimization
ICLR 2024
Improved off-policy training of diffusion samplers
NIPS 2024
Trajectory Flow Matching with Applications to Clinical Time Series Modelling
NIPS 2024
Expected flow networks in stochastic environments and two-player zero-sum games
ICLR 2024
Discrete Probabilistic Inference as Control in Multi-path Environments
UAI 2024
PhAST: Physics-Aware, Scalable, and Task-Specific GNNs for Accelerated Catalyst Design
JMLR 2024
Learning to Scale Logits for Temperature-Conditional GFlowNets
ICML 2024
SatBird: a Dataset for Bird Species Distribution Modeling using Remote Sensing and Citizen Science Data
NIPS 2023
The Effect of Diversity in Meta-Learning
AAAI 2023
Adaptive Discrete Communication Bottlenecks with Dynamic Vector Quantization for Heterogeneous Representational Coarseness
AAAI 2023
Predictive Inference with Feature Conformal Prediction
ICLR 2023
Latent State Marginalization as a Low-cost Approach for Improving Exploration
ICLR 2023
Robust and Controllable Object-Centric Learning through Energy-based Models
ICLR 2023
GFlowNets and variational inference
ICLR 2023
Generative Augmented Flow Networks
ICLR 2023
Equivariance with Learned Canonicalization Functions
ICML 2023
Multi-Objective GFlowNets
ICML 2023
GFlowNet-EM for Learning Compositional Latent Variable Models
ICML 2023
FAENet: Frame Averaging Equivariant GNN for Materials Modeling
ICML 2023
Interventional Causal Representation Learning
ICML 2023
Synergies between Disentanglement and Sparsity: Generalization and Identifiability in Multi-Task Learning
ICML 2023
MixupE: Understanding and improving Mixup from directional derivative perspective
UAI 2023
Stochastic Generative Flow Networks
UAI 2023
A theory of continuous generative flow networks
ICML 2023
GFlowOut: Dropout with Generative Flow Networks
ICML 2023
Learning GFlowNets From Partial Episodes For Improved Convergence And Stability
ICML 2023
Better Training of GFlowNets with Local Credit and Incomplete Trajectories
ICML 2023
Hyena Hierarchy: Towards Larger Convolutional Language Models
ICML 2023
Discrete Key-Value Bottleneck
ICML 2023
Stateful Active Facilitator: Coordination and Environmental Heterogeneity in Cooperative Multi-Agent Reinforcement Learning
ICLR 2023
Latent Bottlenecked Attentive Neural Processes
ICLR 2023
Improving day-ahead Solar Irradiance Time Series Forecasting by Leveraging Spatio-Temporal Context
NIPS 2023
Let the Flows Tell: Solving Graph Combinatorial Problems with GFlowNets
NIPS 2023
Laughing Hyena Distillery: Extracting Compact Recurrences From Convolutions
NIPS 2023
Reusable Slotwise Mechanisms
NIPS 2023
Contrastive Retrospection: honing in on critical steps for rapid learning and generalization in RL
NIPS 2023
Joint Bayesian Inference of Graphical Structure and Parameters with a Single Generative Flow Network
NIPS 2023
HyenaDNA: Long-Range Genomic Sequence Modeling at Single Nucleotide Resolution
NIPS 2023
GEO-Bench: Toward Foundation Models for Earth Monitoring
NIPS 2023
DynGFN: Towards Bayesian Inference of Gene Regulatory Networks with GFlowNets
NIPS 2023
Combining Parameter-efficient Modules for Task-level Generalisation
EACL 2023
GFlowNet Foundations
JMLR 2023
Benchmarking Graph Neural Networks
JMLR 2023
Neural Attentive Circuits
NIPS 2022
ClimateGAN: Raising Climate Change Awareness by Generating Images of Floods
ICLR 2022
Chunked Autoregressive GAN for Conditional Waveform Synthesis
ICLR 2022
Graph Neural Networks with Learnable Structural and Positional Representations
ICLR 2022
Coordination Among Neural Modules Through a Shared Global Workspace
ICLR 2022
Compositional Attention: Disentangling Search and Retrieval
ICLR 2022
Properties from mechanisms: an equivariance perspective on identifiable representation learning
ICLR 2022
Unifying Likelihood-free Inference with Black-box Optimization and Beyond
ICLR 2022
Continuous-Time Meta-Learning with Forward Mode Differentiation
ICLR 2022
Building Robust Ensembles via Margin Boosting
ICML 2022
Generative Flow Networks for Discrete Probabilistic Modeling
ICML 2022
Temporal abstractions-augmented temporally contrastive learning: An alternative to the Laplacian in RL
UAI 2022
Bayesian structure learning with generative flow networks
UAI 2022
Controlled Sparsity via Constrained Optimization or: How I Learned to Stop Tuning Penalties and Love Constraints
NIPS 2022
Discrete Compositional Representations as an Abstraction for Goal Conditioned Reinforcement Learning
NIPS 2022
Trajectory balance: Improved credit assignment in GFlowNets
NIPS 2022
Temporal Latent Bottleneck: Synthesis of Fast and Slow Processing Mechanisms in Sequence Learning
NIPS 2022
Weakly Supervised Representation Learning with Sparse Perturbations
NIPS 2022
Is a Modular Architecture Enough?
NIPS 2022
MAgNet: Mesh Agnostic Neural PDE Solver
NIPS 2022
Multi-scale Feature Learning Dynamics: Insights for Double Descent
ICML 2022
Biological Sequence Design with GFlowNets
ICML 2022
Towards Scaling Difference Target Propagation by Learning Backprop Targets
ICML 2022
VIM: Variational Independent Modules for Video Prediction
CLEAR 2022
Systematic generalisation with group invariant predictions
ICLR 2021
Object-Centric Image Generation from Layouts
AAAI 2021
CausalWorld: A Robotic Manipulation Benchmark for Causal Structure and Transfer Learning
ICLR 2021
Flow Network based Generative Models for Non-Iterative Diverse Candidate Generation
NIPS 2021
FloW: A Dataset and Benchmark for Floating Waste Detection in Inland Waters
ICCV 2021
Neural Production Systems
NIPS 2021
Dynamic Inference with Neural Interpreters
NIPS 2021
The Causal-Neural Connection: Expressiveness, Learnability, and Inference
NIPS 2021
Invariance Principle Meets Information Bottleneck for Out-of-Distribution Generalization
NIPS 2021
Discrete-Valued Neural Communication
NIPS 2021
A Consciousness-Inspired Planning Agent for Model-Based Reinforcement Learning
NIPS 2021
Gradient Starvation: A Learning Proclivity in Neural Networks
NIPS 2021
Recurrent Independent Mechanisms
ICLR 2021
Predicting Infectiousness for Proactive Contact Tracing
ICLR 2021
Saliency is a Possible Red Herring When Diagnosing Poor Generalization
ICLR 2021
Learning Neural Generative Dynamics for Molecular Conformation Generation
ICLR 2021
Factorizing Declarative and Procedural Knowledge in Structured, Dynamical Environments
ICLR 2021
Fast And Slow Learning Of Recurrent Independent Mechanisms
ICLR 2021
RNNLogic: Learning Logic Rules for Reasoning on Knowledge Graphs
ICLR 2021
Spatially Structured Recurrent Modules
ICLR 2021
An End-to-End Framework for Molecular Conformation Generation via Bilevel Programming
ICML 2021
An Analysis of the Adaptation Speed of Causal Models
AISTATS 2021
Neural Function Modules with Sparse Arguments: A Dynamic Approach to Integrating Information across Layers
AISTATS 2021
GraphMix: Improved Training of GNNs for Semi-Supervised Learning
AAAI 2021
Meta-Learning Framework with Applications to Zero-Shot Time-Series Forecasting
AAAI 2021
Visual Concept Reasoning Networks
AAAI 2021
hBERT + BiasCorp - Fighting Racism on the Web
EACL 2021
Deep Verifier Networks: Verification of Deep Discriminative Models with Deep Generative Models
AAAI 2021
Parameterizing Branch-and-Bound Search Trees to Learn Branching Policies
AAAI 2021
Revisiting Fundamentals of Experience Replay
ICML 2020
Learning to Navigate The Synthetically Accessible Chemical Space Using Reinforcement Learning
ICML 2020
Learning to Combine Top-Down and Bottom-Up Signals in Recurrent Neural Networks with Attention over Modules
ICML 2020
Small-GAN: Speeding up GAN Training using Core-Sets
ICML 2020
Perceptual Generative Autoencoders
ICML 2020
Experience Grounds Language
EMNLP 2020
On the interplay between noise and curvature and its effect on optimization and generalization
AISTATS 2020
Reinforcement Learning with Competitive Ensembles of Information-Constrained Primitives
ICLR 2020
Untangling tradeoffs between recurrence and self-attention in artificial neural networks
NIPS 2020
Hybrid Models for Learning to Branch
NIPS 2020
Your GAN is Secretly an Energy-based Model and You Should Use Discriminator Driven Latent Sampling
NIPS 2020
N-BEATS: Neural basis expansion analysis for interpretable time series forecasting
ICLR 2020
Learning the Arrow of Time for Problems in Reinforcement Learning
ICLR 2020
Exploiting Syntactic Structure for Better Language Modeling: A Syntactic Distance Approach
ACL 2020
The Variational Bandwidth Bottleneck: Stochastic Evaluation on an Information Budget
ICLR 2020
DiVA: Diverse Visual Feature Aggregation for Deep Metric Learning
ECCV 2020
Compositional Generalization by Factorizing Alignment and Translation
ACL 2020
A Meta-Transfer Objective for Learning to Disentangle Causal Mechanisms
ICLR 2020
Combating False Negatives in Adversarial Imitation Learning (Student Abstract)
AAAI 2020
Interpolation Consistency Training for Semi-supervised Learning
IJCAI 2019
Updates of Equilibrium Prop Match Gradients of Backprop Through Time in an RNN with Static Input
NIPS 2019
MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis
NIPS 2019
Unsupervised State Representation Learning in Atari
NIPS 2019
Non-normal Recurrent Neural Network (nnRNN): learning long time dependencies while improving expressivity with transient dynamics
NIPS 2019
Variational Temporal Abstraction
NIPS 2019
How to Initialize your Network? Robust Initialization for WeightNorm & ResNets
NIPS 2019
Gradient based sample selection for online continual learning
NIPS 2019
On Adversarial Mixup Resynthesis
NIPS 2019
Wasserstein Dependency Measure for Representation Learning
NIPS 2019
Towards Non-Saturating Recurrent Units for Modelling Long-Term Dependencies
AAAI 2019
Combined Reinforcement Learning via Abstract Representations
AAAI 2019
Do Neural Dialog Systems Use the Conversation History Effectively? An Empirical Study
ACL 2019
Interactive Language Learning by Question Answering
EMNLP 2019
Learning Fixed Points in Generative Adversarial Networks: From Image-to-Image Translation to Disease Detection and Localization
ICCV 2019
Tell, Draw, and Repeat: Generating and Modifying Images Based on Continual Linguistic Instruction
ICCV 2019
Modeling the Long Term Future in Model-Based Reinforcement Learning
ICLR 2019
An Empirical Study of Example Forgetting during Deep Neural Network Learning
ICLR 2019
Recall Traces: Backtracking Models for Efficient Reinforcement Learning
ICLR 2019
Probabilistic Planning with Sequential Monte Carlo methods
ICLR 2019
InfoBot: Transfer and Exploration via the Information Bottleneck
ICLR 2019
BabyAI: A Platform to Study the Sample Efficiency of Grounded Language Learning
ICLR 2019
On the Relation Between the Sharpest Directions of DNN Loss and the SGD Step Length
ICLR 2019
Learning deep representations by mutual information estimation and maximization
ICLR 2019
h-detach: Modifying the LSTM Gradient Towards Better Optimization
ICLR 2019
Deep Graph Infomax
ICLR 2019
Quaternion Recurrent Neural Networks
ICLR 2019
Adversarial Domain Adaptation for Stable Brain-Machine Interfaces
ICLR 2019
State-Reification Networks: Improving Generalization by Modeling the Distribution of Hidden Representations
ICML 2019
GMNN: Graph Markov Neural Networks
ICML 2019
On the Spectral Bias of Neural Networks
ICML 2019
Manifold Mixup: Better Representations by Interpolating Hidden States
ICML 2019
Interactive Language Learning by Question Answering
IJCNLP 2019
Learning Problem-Agnostic Speech Representations from Multiple Self-Supervised Tasks
INTERSPEECH 2019
Speech Model Pre-Training for End-to-End Spoken Language Understanding
INTERSPEECH 2019
Learning Speaker Representations with Mutual Information
INTERSPEECH 2019
Straight to the Tree: Constituency Parsing with Neural Syntactic Distance
ACL 2018
Deep Complex Networks
ICLR 2018
Residual Connections Encourage Iterative Inference
ICLR 2018
Twin Networks: Matching the Future for Sequence Generation
ICLR 2018
Learning General Purpose Distributed Sentence Representations via Large Scale Multi-task Learning
ICLR 2018
Graph Attention Networks
ICLR 2018
Fraternal Dropout
ICLR 2018
Boundary Seeking GANs
ICLR 2018
Commonsense mining as knowledge base completion? A study on the impact of novelty
NAACL 2018
Learning Hierarchical Structures On-The-Fly with a Recurrent-Recursive Model for Sequences
ACL 2018
Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations
JMLR 2018
HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering
EMNLP 2018
Mutual Information Neural Estimation
ICML 2018
Focused Hierarchical RNNs for Conditional Sequence Processing
ICML 2018
Twin Regularization for Online Speech Recognition
INTERSPEECH 2018
Quaternion Convolutional Neural Networks for End-to-End Automatic Speech Recognition
INTERSPEECH 2018
Neural Models for Key Phrase Extraction and Question Generation
ACL 2018
Sparse Attentive Backtracking: Temporal Credit Assignment Through Reminding
NIPS 2018
Bayesian Model-Agnostic Meta-Learning
NIPS 2018
Image-to-image translation for cross-domain disentanglement
NIPS 2018
MetaGAN: An Adversarial Approach to Few-Shot Learning
NIPS 2018
Dendritic cortical microcircuits approximate the backpropagation algorithm
NIPS 2018
Improving Speech Recognition by Revising Gated Recurrent Units
INTERSPEECH 2017
Variational Walkback: Learning a Transition Operator as a Stochastic Recurrent Net
NIPS 2017
Plug & Play Generative Networks: Conditional Iterative Generation of Images in Latent Space
CVPR 2017
GibbsNet: Iterative Adversarial Inference for Deep Graphical Models
NIPS 2017
Towards an Automatic Turing Test: Learning to Evaluate Dialogue Responses
ACL 2017
A Closer Look at Memorization in Deep Networks
ICML 2017
Sharp Minima Can Generalize For Deep Nets
ICML 2017
Plan, Attend, Generate: Planning for Sequence-to-Sequence Models
NIPS 2017
Z-Forcing: Training Stochastic Recurrent Networks
NIPS 2017
Dynamic Layer Normalization for Adaptive Neural Acoustic Modeling in Speech Recognition
INTERSPEECH 2017
Unitary Evolution Recurrent Neural Networks
ICML 2016
Bidirectional Helmholtz Machines
ICML 2016
Noisy Activation Functions
ICML 2016
Architectural Complexity Measures of Recurrent Neural Networks
NIPS 2016
Professor Forcing: A New Algorithm for Training Recurrent Networks
NIPS 2016
A Character-level Decoder without Explicit Segmentation for Neural Machine Translation
ACL 2016
Generating Factoid Questions With Recurrent Neural Networks: The 30M Factoid Question-Answer Corpus
ACL 2016
Pointing the Unknown Words
ACL 2016
Towards End-to-End Speech Recognition with Deep Convolutional Neural Networks
INTERSPEECH 2016
Multi-Way, Multilingual Neural Machine Translation with a Shared Attention Mechanism
NAACL 2016
On Multiplicative Integration with Recurrent Neural Networks
NIPS 2016
Knowledge Matters: Importance of Prior Information for Optimization
JMLR 2016
Binarized Neural Networks
NIPS 2016
Deconstructing the Ladder Network Architecture
ICML 2016
Gated Feedback Recurrent Neural Networks
ICML 2015
BilBOWA: Fast Bilingual Distributed Representations without Word Alignments
ICML 2015
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
ICML 2015
On Using Very Large Target Vocabulary for Neural Machine Translation
ACL 2015
Equilibrated adaptive learning rates for non-convex optimization
NIPS 2015
On Using Very Large Target Vocabulary for Neural Machine Translation
IJCNLP 2015
Attention-Based Models for Speech Recognition
NIPS 2015
BinaryConnect: Training Deep Neural Networks with binary weights during propagations
NIPS 2015
A Recurrent Latent Variable Model for Sequential Data
NIPS 2015
Identifying and attacking the saddle point problem in high-dimensional non-convex optimization
NIPS 2014
Deep Generative Stochastic Networks Trainable by Backprop
ICML 2014
Marginalized Denoising Auto-encoders for Nonlinear Representations
ICML 2014
On the Number of Linear Regions of Deep Neural Networks
NIPS 2014
Generative Adversarial Nets
NIPS 2014
Iterative Neural Autoregressive Distribution Estimator NADE-k
NIPS 2014
How transferable are features in deep neural networks?
NIPS 2014
Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation
EMNLP 2014
What Regularized Auto-Encoders Learn from the Data-Generating Distribution
JMLR 2014
Maxout Networks
ICML 2013
Texture Modeling with Convolutional Spike-and-Slab RBMs and Deep Extensions
AISTATS 2013
Better Mixing via Deep Representations
ICML 2013
Multi-Prediction Deep Boltzmann Machines
NIPS 2013
Stochastic Ratio Matching of RBMs for Sparse High-Dimensional Inputs
NIPS 2013
Generalized Denoising Auto-Encoders as Generative Models
NIPS 2013
On the difficulty of training recurrent neural networks
ICML 2013
Joint Learning of Words and Meaning Representations for Open-Text Semantic Parsing
AISTATS 2012
Deep Learning for NLP (without Magic)
ACL 2012
Random Search for Hyper-Parameter Optimization
JMLR 2012
Learning Algorithms for the Classification Restricted Boltzmann Machine
JMLR 2012
Deep Sparse Rectifier Neural Networks
AISTATS 2011
Discussion of “The Neural Autoregressive Distribution Estimator”
AISTATS 2011
Shallow vs. Deep Sum-Product Networks
NIPS 2011
Algorithms for Hyper-Parameter Optimization
NIPS 2011
The Manifold Tangent Classifier
NIPS 2011
On Tracking The Partition Function
NIPS 2011
A Spike and Slab Restricted Boltzmann Machine
AISTATS 2011
Deep Learners Benefit More from Out-of-Distribution Examples
AISTATS 2011
Tempered Markov Chain Monte Carlo for training of Restricted Boltzmann Machines
AISTATS 2010
Why Does Unsupervised Pre-training Help Deep Learning?
JMLR 2010
Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion
JMLR 2010
Understanding the difficulty of training deep feedforward neural networks
AISTATS 2010
Word Representations: A Simple and General Method for Semi-Supervised Learning
ACL 2010
Why Does Unsupervised Pre-training Help Deep Learning?
AISTATS 2010
Slow, Decorrelated Features for Pretraining Complex Cell-like Networks
NIPS 2009
Exploring Strategies for Training Deep Neural Networks
JMLR 2009
Quadratic Features and Deep Architectures for Chunking
NAACL 2009
Incorporating Functional Knowledge in Neural Networks
JMLR 2009
An Infinite Factor Model Hierarchy Via a Noisy-Or Mechanism
NIPS 2009
Augmented Functional Time Series Representation and Forecasting with Gaussian Processes
NIPS 2007
Topmoumoute Online Natural Gradient Algorithm
NIPS 2007
Learning the 2-D Topology of Images
NIPS 2007
Greedy Layer-Wise Training of Deep Networks
NIPS 2006
No Unbiased Estimator of the Variance of K-Fold Cross-Validation
JMLR 2004
Unsupervised Sense Disambiguation Using Bilingual Probabilistic Models
ACL 2004
Extensions to Metric-Based Model Selection
JMLR 2003
A Neural Probabilistic Language Model
JMLR 2003