conftrace
Foundation Models
278 papers
Papers per year
2021: 5
2022: 13
2023: 23
2024: 104
2025: 117
2026: 16
Papers
(Almost) Free Modality Stitching of Foundation Models
EMNLP 2025
Contra4: Evaluating Contrastive Cross-Modal Reasoning in Audio, Video, Image, and 3D
EMNLP 2025
fLSA: Learning Semantic Structures in Document Collections Using Foundation Models
EMNLP 2025
SEMMA: A Semantic Aware Knowledge Graph Foundation Model
EMNLP 2025
SciSketch: An Open-source Framework for Automated Schematic Diagram Generation in Scientific Papers
EMNLP 2025
Enhancing Foundation Models in Transaction Understanding with LLM-based Sentence Embeddings
EMNLP 2025
VisualEDU: A Benchmark for Assessing Coding and Visual Comprehension through Educational Problem-Solving Video Generation
EMNLP 2025
jina-embeddings-v4: Universal Embeddings for Multimodal Multilingual Retrieval
EMNLP 2025
Detect Anything 3D in the Wild
ICCV 2025
Equipping Vision Foundation Model with Mixture of Experts for Out-of-Distribution Detection
ICCV 2025
Find Any Part in 3D
ICCV 2025
SAM4D: Segment Anything in Camera and LiDAR Streams
ICCV 2025
Scaling Laws for Native Multimodal Models
ICCV 2025
SegAnyPET: Universal Promptable Segmentation from Positron Emission Tomography Images
ICCV 2025
Enhancing Prompt Generation with Adaptive Refinement for Camouflaged Object Detection
ICCV 2025
FoundIR: Unleashing Million-scale Training Data to Advance Foundation Models for Image Restoration
ICCV 2025
OpenVision: A Fully-Open, Cost-Effective Family of Advanced Vision Encoders for Multimodal Learning
ICCV 2025
Unified Multimodal Understanding via Byte-Pair Visual Encoding
ICCV 2025
F-Bench: Rethinking Human Preference Evaluation Metrics for Benchmarking Face Generation, Customization, and Restoration
ICCV 2025
SkySense V2: A Unified Foundation Model for Multi-modal Remote Sensing
ICCV 2025
FE-CLIP: Frequency Enhanced CLIP Model for Zero-Shot Anomaly Detection and Segmentation
ICCV 2025
RoboTron-Mani: All-in-One Multimodal Large Model for Robotic Manipulation
ICCV 2025
DH-FaceVid-1K: A Large-Scale High-Quality Dataset for Face Video Generation
ICCV 2025
Can Generative Geospatial Diffusion Models Excel as Discriminative Geospatial Foundation Models?
ICCV 2025
Correspondence as Video: Test-Time Adaption on SAM2 for Reference Segmentation in the Wild
ICCV 2025