Research Explorer
Papers
Trends
Conferences
Explore
Authors
Topics
Keywords
Achievements
About
Methodology
Pretraining
2471 directly classified papers
Papers per year
2009: 2
2010: 1
2012: 1
2013: 1
2014: 4
2015: 5
2016: 19
2017: 26
2018: 33
2019: 117
2020: 218
2021: 311
2022: 333
2023: 451
2024: 448
2025: 373
2026: 128
Papers
MINT: Boosting Audio-Language Model via Multi-Target Pre-Training and Instruction Tuning
INTERSPEECH 2024
Optimizing Diffusion Noise Can Serve As Universal Motion Priors
CVPR 2024
PI3D: Efficient Text-to-3D Generation with Pseudo-Image Diffusion
CVPR 2024
DiffMorpher: Unleashing the Capability of Diffusion Models for Image Morphing
CVPR 2024
Do Vision and Language Encoders Represent the World Similarly?
CVPR 2024
Streaming Decoder-Only Automatic Speech Recognition with Discrete Speech Units: A Pilot Study
INTERSPEECH 2024
Exploring Pre-trained Speech Model for Articulatory Feature Extraction in Dysarthric Speech Using ASR
INTERSPEECH 2024
SkillDiffuser: Interpretable Hierarchical Planning via Skill Abstractions in Diffusion-Based Task Execution
CVPR 2024
Retrieval-Augmented Egocentric Video Captioning
CVPR 2024
Speech and Language Recognition with Low-rank Adaptation of Pretrained Models
INTERSPEECH 2024
Data-Efficient Multimodal Fusion on a Single GPU
CVPR 2024
Insect-Foundation: A Foundation Model and Large-scale 1M Dataset for Visual Insect Understanding
CVPR 2024
KOALA: Empirical Lessons Toward Memory-Efficient and Fast Diffusion Models for Text-to-Image Synthesis
NIPS 2024
An Initial Investigation of Language Adaptation for TTS Systems under Low-resource Scenarios
INTERSPEECH 2024
LoRA-MER: Low-Rank Adaptation of Pre-Trained Speech Models for Multimodal Emotion Recognition Using Mutual Information
INTERSPEECH 2024
Low-Rank Rescaled Vision Transformer Fine-Tuning: A Residual Design Approach
CVPR 2024
LUWA Dataset: Learning Lithic Use-Wear Analysis on Microscopic Images
CVPR 2024
Whisper-PMFA: Partial Multi-Scale Feature Aggregation for Speaker Verification using Whisper Models
INTERSPEECH 2024
DL3DV-10K: A Large-Scale Scene Dataset for Deep Learning-based 3D Vision
CVPR 2024
Cross-transfer Knowledge between Speech and Text Encoders to Evaluate Customer Satisfaction
INTERSPEECH 2024
VoxSim: A perceptual voice similarity dataset
INTERSPEECH 2024
BEACON: Benchmark for Comprehensive RNA Tasks and Language Models
NIPS 2024
Textual-Driven Adversarial Purification for Speaker Verification
INTERSPEECH 2024
All in One: Multi-task Prompting for Graph Neural Networks (Extended Abstract)
IJCAI 2024
SHMT: Self-supervised Hierarchical Makeup Transfer via Latent Diffusion Models
NIPS 2024