Papers
Burn After Reading: Do Multimodal Large Language Models Truly Capture Order of Events in Image Sequences?
Yingjin Song, Yupei Du, Denis Paperno et al.
Bypass Back-propagation: Optimization-based Structural Pruning for Large Language Models via Policy Gradient
Yuan Gao, Zujing Liu, Weizhong Zhang et al.
Bypassing LLM Guardrails: An Empirical Analysis of Evasion Attacks against Prompt Injection and Jailbreak Detection Systems
William Hackett, Lewis Birch, Stefan Trawicki et al.
Byte Latent Transformer: Patches Scale Better Than Tokens
Artidoro Pagnoni, Ramakanth Pasunuru, Pedro Rodriguez et al.
C2KD: Cross-layer and Cross-head Knowledge Distillation for Small Language Model-based Recommendation
Xiao Chen, Changyi Ma, Wenqi Fan et al.
C2LEVA: Toward Comprehensive and Contamination-Free Language Model Evaluation
Yanyang Li, Wong Tin Long, Cheung To Hung et al.
C²RBench: A Chinese Complex Reasoning Benchmark for Large Language Models
Junru Wu, Tianhao Shen, Linxi Su et al.
CADReview: Automatically Reviewing CAD Programs with Error Detection and Correction
Jiali Chen, Xusen Hei, HongFei Liu et al.
CA-GAR: Context-Aware Alignment of LLM Generation for Document Retrieval
Heng Yu, Junfeng Kang, Rui Li et al.
CAIDAS at SemEval-2025 Task 7: Enriching Sparse Datasets with LLM-Generated Content for Improved Information Retrieval
Dominik Benchert, Severin Meßlinger, Sven Goller et al.
CAISA at SemEval-2025 Task 7: Multilingual and Cross-lingual Fact-Checked Claim Retrieval
Muqaddas Haroon, Shaina Ashraf, Ipek Baris et al.
CalibraEval: Calibrating Prediction Distribution to Mitigate Selection Bias in LLMs-as-Judges
Haitao Li, Junjie Chen, Qingyao Ai et al.
Call for Rigor in Reporting Quality of Instruction Tuning Data
Hyeonseok Moon, Jaehyung Seo, Heuiseok Lim
CaLMQA: Exploring culturally specific long-form question answering across 23 languages
Shane Arora, Marzena Karpinska, Hung-Ting Chen et al.
CAMI: A Counselor Agent Supporting Motivational Interviewing through State Inference and Topic Exploration
Yizhe Yang, Palakorn Achananuparp, Heyan Huang et al.
CAMPHOR: Collaborative Agents for Multi-input Planning and High-Order Reasoning On Device
Yicheng Fu, Raviteja Anantha, Jianpeng Cheng
Can a Large Language Model Keep My Secrets? A Study on LLM-Controlled Agents
Niklas Hemken, Sai Koneru, Florian Jacob et al.
Can a Single Model Master Both Multi-turn Conversations and Tool Use? CoALM: A Unified Conversational Agentic Language Model
Emre Can Acikgoz, Jeremiah Greer, Akul Datta et al.
Can Community Notes Replace Professional Fact-Checkers?
Nadav Borenstein, Greta Warren, Desmond Elliott et al.
Can Explicit Gender Information Improve Zero-Shot Machine Translation?
Van-Hien Tran, Huy Hien Vu, Hideki Tanaka et al.
Can External Validation Tools Improve Annotation Quality for LLM-as-a-Judge?
Arduin Findeis, Floris Weers, Guoli Yin et al.
Can GPTZero’s AI Vocabulary Distinguish Between LLM-Generated and Student-Written Essays?
Veronica Schmalz, Anaïs Tack
Can Graph Descriptive Order Affect Solving Graph Problems with LLMs?
Yuyao Ge, Shenghua Liu, Baolong Bi et al.
Can Graph Neural Networks Learn Language with Extremely Weak Text Supervision?
Zihao Li, Lecheng Zheng, Bowen Jin et al.
Can Hallucination Correction Improve Video-Language Alignment?
Lingjun Zhao, Mingyang Xie, Paola Cascante-Bonilla et al.