conftrace
Security
95 papers
Papers per year
2017: 1
2022: 2
2023: 1
2024: 4
2025: 4
2026: 83
Papers
False Friends in the Shell: Unveiling the Emoticon Semantic Confusion in Large Language Models
ACL 2026
SoundBreak: A Systematic Study of Audio-Only Adversarial Attacks on Trimodal Models
ACL 2026
You Can Have a Second Chance: Unbiased and Multi-bit Watermarking for Diffusion Language Models with Regret-based Remasking
ACL 2026
VerilogLAVD: LLM-Aided Pattern Generation for Verilog CWE Detection
ACL 2026
On the (In-)Security of the Shuffling Defense in the Transformer Secure Inference
ACL 2026
Beyond Explicit Refusals: Soft-Failure Attacks on Retrieval-Augmented Generation
ACL 2026
ReasoningGuard: Safeguarding Large Reasoning Models with Inference-time Safety Aha Moments
ACL 2026
Frankentext: Stitching random text fragments into long-form narratives
ACL 2026
Backdoors in RLVR: Jailbreak Backdoors in LLMs From Verifiable Reward
ACL 2026
A Multi-Agent Framework for High-Interaction Terminal Simulation
ACL 2026
RedCoder: Automated Multi-Turn Red Teaming for Code LLMs
ACL 2026
PIArena: A Platform for Prompt Injection Evaluation
ACL 2026
Conjunctive Prompt Attacks in Multi-Agent LLM Systems
ACL 2026
SSG: Logit-Balanced Vocabulary Partitioning for LLM Watermarking
ACL 2026
From TDMA to CDMA: A Multi-bit Watermark for Diffusion Language Models
ACL 2026
When Efficiency Becomes a Vulnerability: Computational Cost Attacks on WebAgents
ACL 2026
CodeRipple: Wavelet-Based Detection of LLM-Generated Code
ACL 2026
BlindGuard: Safeguarding LLM-based Multi-Agent Systems under Unknown Attacks
ACL 2026
ASTRA: An Automated Framework for Strategy Discovery, Retrieval, and Evolution for Jailbreaking LLMs
ACL 2026
Activation Decomposition and Steering for LLM Backdoor Remediation
ACL 2026
Route to Rome Attack: Directing LLM Routers to Expensive Models via Adversarial Suffix Optimization
ACL 2026
Don’t Corrupt the Fact: A Trustworthy RAG Watermarking Framework based on Dual Factual Shield
ACL 2026
JARVIS or Ultron? A Survey on the Safety and Security Threats of Computer-Using Agents
ACL 2026
ReasMark: A Robust Watermark for Attributing LLM Reasoning Under Knowledge Distillation Attacks
ACL 2026
TROJail: Trajectory-Level Optimization for Multi-Turn Large Language Model Jailbreaks with Process Rewards
ACL 2026