Papers
VOLTA: Improving Generative Diversity by Variational Mutual Information Maximizing Autoencoder
Yueen Ma, DaFeng Chi, Jingjing Li et al.
WangLab at MEDIQA-CORR 2024: Optimized LLM-based Programs for Medical Error Detection and Correction
Augustin Toma, Ronald Xie, Steven Palayew et al.
WangLab at MEDIQA-M3G 2024: Multimodal Medical Answer Generation using Large Language Models
Ronald Xie, Steven Palayew, Augustin Toma et al.
WaterJudge: Quality-Detection Trade-off when Watermarking Large Language Models
Piotr Molenda, Adian Liusie, Mark Gales
Wav2pos: Exploring syntactic analysis from audio for Highland Puebla Nahuatl
Robert Pugh, Varun Sreedhar, Francis Tyers
WebWISE: Unlocking Web Interface Control for LLMs via Sequential Exploration
Heyi Tao, Sethuraman T V, Michal Shlapentokh-Rothman et al.
Weighted Layer Averaging RoBERTa for Black-Box Machine-Generated Text Detection
Ayan Datta, Aryan Chandramania, Radhika Mamidi
Weight-Inherited Distillation for Task-Agnostic BERT Compression
Taiqiang Wu, Cheng Hou, Shanshan Lao et al.
Werkzeug at SemEval-2024 Task 8: LLM-Generated Text Detection via Gated Mixture-of-Experts Fine-Tuning
Youlin Wu, Kaichun Wang, Kai Ma et al.
What Are We Measuring When We Evaluate Large Vision-Language Models? An Analysis of Latent Factors and Biases
Anthony Tiong, Junqi Zhao, Boyang Li et al.
What Causes the Failure of Explicit to Implicit Discourse Relation Recognition?
Wei Liu, Stephen Wan, Michael Strube
whatdoyoumeme at SemEval-2024 Task 4: Hierarchical-Label-Aware Persuasion Detection using Translated Texts
Nishan Chatterjee, Marko Pranjic, Boshko Koloski et al.
What Drives Performance in Multilingual Language Models?
Sina Bagheri Nezhad, Ameeta Agrawal
What explains the success of cross-modal fine-tuning with ORCA?
Paloma García-de-Herreros, Vagrant Gautam, Philipp Slusallek et al.
What if you said that differently?: How Explanation Formats Affect Human Feedback Efficacy and User Perception
Chaitanya Malaviya, Subin Lee, Dan Roth et al.
What Makes Math Word Problems Challenging for LLMs?
Kv Aditya Srivatsa, Ekaterina Kochmar
What Matters in Training a GPT4-Style Language Model with Multimodal Inputs?
Yan Zeng, Hanbo Zhang, Jiani Zheng et al.
What’s wrong with your model? A Quantitative Analysis of Relation Classification
Elisa Bassignana, Rob van der Goot, Barbara Plank
When Does Monolingual Data Help Multilingual Translation: The Role of Domain and Model Scale
Christos Baziotis, Biao Zhang, Alexandra Birch et al.
When Elote, Choclo and Mazorca are not the Same. Isomorphism-Based Perspective to the Spanish Varieties Divergences
Cristina España-Bonet, Ankur Bhatt, Koel Dutta Chowdhury et al.
When Hindsight is Not 20/20: Testing Limits on Reflective Thinking in Large Language Models
Yanhong Li, Chenghao Yang, Allyson Ettinger
When Life Gives You Lemons, Make Cherryade: Converting Feedback from Bad Responses into Good Labels
Weiyan Shi, Emily Dinan, Kurt Shuster et al.
When Quantization Affects Confidence of Large Language Models?
Irina Proskurina, Luc Brun, Guillaume Metzler et al.
When XGBoost Outperforms GPT-4 on Text Classification: A Case Study
Matyas Bohacek, Michal Bravansky