Robustness

49 papers

Papers per year (chart; year labels not recovered): 1, 2, 3, 1, 1, 3, 1, 37

Papers

Debiased Orthogonal Boundary-Driven Efficient Noise Mitigation ACL 2026

Stable Language Guidance for Vision–Language–Action Models ACL 2026

Illusions of Confidence? Diagnosing LLM Truthfulness via Neighborhood Consistency ACL 2026

When Correct Beliefs Collapse: Epistemic Resilience of LLMs under Clinical Pressure ACL 2026

Teach a Reward Model to Correct Itself: Reward Guided Adversarial Failure Discovery for Robust Reward Modeling ACL 2026

Sycophants in the Courtroom: Are LLMs Fragile to Juridical Authority and Evolving Legal Standards? ACL 2026

Locket: Robust Feature-Locking Technique for Language Models ACL 2026

Robertha: Eigenspectrum Regularized Attention for Robust Natural Language Understanding ACL 2026

Evaluating Robustness of Large Language Models Against Multilingual Typographical Errors ACL 2026

SEE: Signal Embedding Energy for Quantifying Noise Interference in Large Audio Language Models ACL 2026

Identity-Robust Language Model Generation via Content Integrity Preservation ACL 2026

Backdoor Collapse: Eliminating Unknown Threats Via Known Backdoor Aggregation In Language Models ACL 2026

Probing the Safety Robustness of LLMs in Latent Space ACL 2026

Efficient Prior-Guided Reasoning for Robust Retrieval-Augmented Generation under Conflicts ACL 2026

RST-Guarder: Enhancing Long-Context Robustness for Safeguards via RST Parsing and Probabilistic Inference ACL 2026

Truth or Sophistry? LoFa: A Benchmark for LLM Robustness Against Logical Fallacies ACL 2026

Merging Triggers, Breaking Backdoors: Defensive Poisoning for Instruction-Tuned Language Models ACL 2026

SafetyMem: Adaptive Jailbreak Defense via Dual-Component Safety Memory ACL 2026

Quantifying and Improving the Robustness of Retrieval-Augmented Language Models Against Spurious Features in Grounding Data ACL 2026

Resolving the Security-Auditability Dilemma with Auditable Latent Chain-of-Thought Alignment ACL 2026

DiVE: Decoupling Intra-layer Visual Evidence for Mitigating Hallucinations in Large Vision-Language Models ACL 2026

Impatient Users Confuse AI Agents: High-fidelity Simulations of Human Traits for Testing Agents ACL 2026

Mitigating Over-Refusal in Aligned Large Language Models via Inference-Time Activation Energy ACL 2026

Retrieval-Augmented Defense: Adaptive and Controllable Jailbreak Prevention for Large Language Models ACL 2026

Still Between Us? Evaluating and Improving Voice Assistant Robustness to Third-Party Interruptions ACL 2026