Papers - Conftrace

Visual Agents as Fast and Slow Thinkers

Guangyan Sun, Mingyu Jin, Zhenting Wang et al.

2025 ICLR

Visual Description Grounding Reduces Hallucinations and Boosts Reasoning in LVLMs

Sreyan Ghosh, Chandra Kiran Reddy Evuru, Sonal Kumar et al.

2025 ICLR

Visual Haystacks: A Vision-Centric Needle-In-A-Haystack Benchmark

Tsung-Han Wu, Giscard Biamby, Jerome Quenum et al.

2025 ICLR

Visually Consistent Hierarchical Image Classification

Seulki Park, Youren Zhang, Stella X. Yu et al.

2025 ICLR

Visually Guided Decoding: Gradient-Free Hard Prompt Inversion with Language Models

Donghoon Kim, Minji Bae, Kyuhong Shim et al.

2025 ICLR

Visual-O1: Understanding Ambiguous Instructions via Multi-modal Multi-turn Chain-of-thoughts Reasoning

Minheng Ni, YuTao Fan, Lei Zhang et al.

2025 ICLR

Visual Perturbation and Adaptive Hard Negative Contrastive Learning for Compositional Reasoning in Vision-Language Models

Xin Huang, Ruibin Li, Tong Jia et al.

2025 IJCAI

VisualPredicator: Learning Abstract World Models with Neuro-Symbolic Predicates for Robot Planning

Yichao Liang, Nishanth Kumar, Hao Tang et al.

2025 ICLR

VLAS: Vision-Language-Action Model with Speech Instructions for Customized Robot Manipulation

Wei Zhao, Pengxiang Ding, Zhang Min et al.

2025 ICLR

VL-Cache: Sparsity and Modality-Aware KV Cache Compression for Vision-Language Model Inference Acceleration

Dezhan Tu, Danylo Vashchilenko, Yuzhe Lu et al.

2025 ICLR

VL-ICL Bench: The Devil in the Details of Multimodal In-Context Learning

Yongshuo Zong, Ondrej Bohdal, Timothy Hospedales

2025 ICLR

VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks

Ziyan Jiang, Rui Meng, Xinyi Yang et al.

2025 ICLR

VOILA: Evaluation of MLLMs For Perceptual Understanding and Analogical Reasoning

Nilay Yilmaz, Maitreya Patel, Yiran Lawrence Luo et al.

2025 ICLR

VoxDialogue: Can Spoken Dialogue Systems Understand Information Beyond Words?

Xize Cheng, Ruofan Hu, Xiaoda Yang et al.

2025 ICLR

VSTAR: Generative Temporal Nursing for Longer Dynamic Video Synthesis

Yumeng Li, William H. Beluch, Margret Keuper et al.

2025 ICLR

VTDexManip: A Dataset and Benchmark for Visual-tactile Pretraining and Dexterous Manipulation with Reinforcement Learning

Qingtao Liu, Yu Cui, Zhengnan Sun et al.

2025 ICLR

VVC-Gym: A Fixed-Wing UAV Reinforcement Learning Environment for Multi-Goal Long-Horizon Problems

Xudong Gong, Feng Dawei, Kele Xu et al.

2025 ICLR

Walk the Talk? Measuring the Faithfulness of Large Language Model Explanations

Katie Matton, Robert Ness, John Guttag et al.

2025 ICLR

Ward: Provable RAG Dataset Inference via LLM Watermarks

Nikola Jovanović, Robin Staab, Maximilian Baader et al.

2025 ICLR

WardropNet: Traffic Flow Predictions via Equilibrium-Augmented Learning

Kai Jungel, Dario Paccagnan, Axel Parmentier et al.

2025 ICLR

Warm Diffusion: Recipe for Blur-Noise Mixture Diffusion Models

Hao-Chien Hsueh, Wen-Hsiao Peng, Ching-Chun Huang

2025 ICLR

Wasserstein Distances, Neuronal Entanglement, and Sparsity

Shashata Sawmya, Linghao Kong, Ilia Markov et al.

2025 ICLR

Wasserstein-Regularized Conformal Prediction under General Distribution Shift

Rui Xu, Chao Chen, Yue Sun et al.

2025 ICLR

Watch Less, Do More: Implicit Skill Discovery for Video-Conditioned Policy

Jiangxing Wang, Zongqing Lu

2025 ICLR

Watermark Anything With Localized Messages

Tom Sander, Pierre Fernandez, Alain Oliviero Durmus et al.

2025 ICLR