conftrace_

Satyapriya Krishna

15 papers · 2021–2026 · 8 conferences · across top CS/AI conferences

Achievements

🌈 Renaissance Researcher (6) 🐝 Cross-Pollinator (5) 🌍 Conference Polyglot (8) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 👥 Mega-Team (31) ❓ The Questioner (3) 🔥 Unstoppable (5) 📈 Trend Setter 💎 Century Club (13)

Conferences

ACL (5) NIPS (3) ICML (2) EACL (1) EMNLP (1) ICLR (1) IJCNLP (1) NAACL (1)

Top co-authors

Rahul Gupta (8) Kai-Wei Chang (6) Himabindu Lakkaraju (5) Jwala Dhamala (4) Yada Pruksachatkun (4) Jiaqi Ma (2) Anil Ramakrishna (2) Apurv Verma (2) Chirag Agarwal (2) Aram Galstyan (2)

Keywords

large language model (3) post hoc explanation (2) continual learning (1) natural language processing (1) text classification (1) knowledge distillation (1) named entity recognition (1) machine learning (1) factuality evaluation (1) text generation (1) explainable ai (1) machine unlearning (1) feature attribution (1) safety alignment (1) reinforcement learning from human feedback (1) in-context learning (1) benchmark dataset (1) fairness metric (1) language model (1) benchmark evaluation (1)

Papers

From Narrow Unlearning to Emergent Misalignment in LLMs · ACL 2026
ARES: Adaptive Red-Teaming and End-to-End Repair of Policy-Reward System · ACL 2026
More RLHF, More Trust? On The Impact of Preference Alignment On Trustworthiness · ICLR 2025
Fact, Fetch, and Reason: A Unified Evaluation of Retrieval-Augmented Generation · NAACL 2025
Croissant: A Metadata Format for ML-Ready Datasets · NIPS 2024
Understanding the Effects of Iterative Prompting on Truthfulness · ICML 2024
Post Hoc Explanations of Language Models Can Improve Language Models · NIPS 2023
Towards Bridging the Gaps between the Right to Explanation and the Right to be Forgotten · ICML 2023
OpenXAI: Towards a Transparent Evaluation of Model Explanations · NIPS 2022
Measuring Fairness of Text Classifiers via Prediction Sensitivity · ACL 2022
Mitigating Gender Bias in Distilled Language Models via Counterfactual Role Reversal · ACL 2022
Does Robustness Improve Fairness? Approaching Fairness with Word Substitution Robustness Methods for Text Classification · IJCNLP 2021
Towards Realistic Single-Task Continuous Learning Research for NER · EMNLP 2021
ADePT: Auto-encoder based Differentially Private Text Transformation · EACL 2021
Does Robustness Improve Fairness? Approaching Fairness with Word Substitution Robustness Methods for Text Classification · ACL 2021