Papers
Is Information Density Uniform when Utterances are Grounded on Perception and Discourse?
Matteo Gay, Coleman Haley, Mario Giulianelli et al.
Is Micro Domain-Adaptive Pre-Training Effective for Real-World Operations? Multi-Step Evaluation Reveals Potential and Bottlenecks
Masaya Tsunokake, Yuta Koreeda, Terufumi Morishita et al.
Is Sentiment Banana-Shaped? Exploring the Geometry and Portability of Sentiment Concept Vectors
Laurits Lyngbaek, Pascale Feldkamp, Yuri Bizzoni et al.
Is This LLM Library Learning? Evaluation Must Account For Compute and Behaviour
Ian Berlot-Attwell, Tobias Sesterhenn, Frank Rudzicz et al.
Iterative Structured Pruning for Large Language Models with Multi-Domain Calibration
Guangxin Wu, Hao Zhang, Zhang Zhibin et al.
It’s All About the Confidence: An Unsupervised Approach for Multilingual Historical Entity Linking using Large Language Models
Cristian Santini, Marieke van Erp, Mehwish Alam
ITUNLP2 at MWE-2026 AdMIRe 2: Modular Zero-Shot Pipelines for Multimodal Idiom Grounding and Ranking
Özge Umut, Bora Şenceylan
ITUNLP at MWE-2026 AdMIRe 2: A Zero-Shot LLM Pipeline for Multimodal Idiom Understanding and Ranking
Atakan Site, Oğuz Ali Arslan, Gülşen Eryiğit
IYKYK: Using language models to decode extremist cryptolects
Christine de Kock, Arij Riabi, Zeerak Talat et al.
Jailbreaking Safeguarded Text-to-Image Models via Large Language Models
Zhengyuan Jiang, Yuepeng Hu, Yuchen Yang et al.
Jailbreaks as Inference-Time Alignment: A Framework for Understanding Safety Failures in LLMs
James Beetham, Souradip Chakraborty, Mengdi Wang et al.
JEEM: Vision-Language Understanding in Four Arabic Dialects
Karima Kadaoui, Hanin Atwany, Hamdan Al-Ali et al.
JiraiBench: A Bilingual Benchmark for Evaluating Large Language Models’ Detection of Human risky health behavior Content in Jirai Community
Yunze Xiao, Tingyu He, Lionel Z. Wang et al.
Joint Multimodal Preference Optimization for Fine-Grained Visual-Textual Alignment
Jiwon Kim, Hyunsoo Yoon
Journey Before Destination: On the importance of Visual Faithfulness in Slow Thinking
Rheeya Uppaal, Phu Mon Htut, Min Bai et al.
JuriFindIT: an Italian legal retrieval dataset
Niko Dalla Noce, Davide Colla, Sina Farhang Doust et al.
KAD: A Framework for Proxy-based Test-time Alignment with Knapsack Approximation Deferral
Ayoub Hammal, Pierre Zweigenbaum, Caio Corro
Kahaani: A Multimodal Co-Creative Storytelling System
Samee Arif, Muhammad Saad Haroon, Aamina Jamal Khan et al.
Kashif-AI at AbjadGenEval Shared Task: A Transformer-based Approach for Arabic AI-Generated Text Detection
Fatimah Mohamed Emad Eldin
KazakhOCR: A Synthetic Benchmark for Evaluating Multimodal Models in Low-Resource Kazakh Script OCR
Henry Gagnier, Sophie Gagnier, Ashwin Kirubakaran
KETCHUP: K-Step Return Estimation for Sequential Knowledge Distillation
Jiabin Fan, Guoqing Luo, Michael Bowling et al.
KG-CRAFT: Knowledge Graph-based Contrastive Reasoning with LLMs for Enhancing Automated Fact-checking
Vítor Lourenço, Aline Paes, Tillman Weyde et al.
KGHaluBench: A Knowledge Graph-Based Hallucination Benchmark for Evaluating the Breadth and Depth of LLM Knowledge
Alex Robertson, Huizhi Liang, Mahbub Gani et al.
KidsArtBench: Multi-Dimensional Children’s Art Evaluation with Attribute-Aware MLLMs
Mingrui Ye, Chanjin Zheng, Zengyi Yu et al.
K-LegalDeID: A Benchmark Dataset and KLUEBERT-CRF for De-identification in Korean Court Judgments
Wooseok Choi, Hyungbin Kim, Yon Dohn Chung