Ruoxi Jia

59 papers · 2019–2026 · 14 conferences · across top CS/AI conferences

Achievements

🧭 Keyword Pioneer 🌍 Conference Polyglot (13) 🗺️ Taxonomy Completionist (12) 🌉 Interdisciplinary Bridge 🏃 Academic Marathon (7)

🐝 Cross-Pollinator (14) 🌈 Renaissance Researcher (9) 🌱 Topic Pioneer 🏆 Grand Slam 🤝 Dynamic Duo (17) 🔬 Deep Specialist (12) 👑 Triple Crown 👥 Mega-Team (23) 🏆 Keyword Champion (2) ⚡ Prolific Year (17) 💎 Century Club (57) 🔥 Unstoppable (8) ❓ The Questioner (2) 🗃️ Keyword Collector (199)

Conferences

ICLR (13) ICML (8) NIPS (8) EMNLP (6) CVPR (5) ICCV (4) AAAI (3) ACL (3) AISTATS (3) NAACL (2) EACL (1) L4DC (1) UAI (1) WACV (1)

Top co-authors

Ming Jin (17) Yi Zeng (15) Dawn Song (12) Prateek Mittal (9) Jiachen T. Wang (8) Bo Li (8) Si Chen (7) Feiyang Kang (7) Hoang Anh Just (7) Bilgehan Sel (6)

Research topics

Differential Privacy (4) Privacy (3) Core Methods (1) Applications (1) Security & Privacy (1)

Keywords

large language model (8) data valuation (7) shapley value (5) model inversion attack (4) differential privacy (4) diffusion model (4) privacy-preserving machine learning (3) adversarial attack (3) backdoor attack (3) adversarial learning (2) language model (2) data selection (2) training datum (2) generative adversarial network (2) generative model (2) stochastic gradient descent (2) model security (2) instruction tuning (2) privacy attack (2) face recognition (2)

Papers

CONCORD: Concept-Informed Diffusion for Dataset Distillation WACV 2026
MAViS: A Multi-Agent Framework for Long-Sequence Video Storytelling EACL 2026
Optimizing Product Provenance Verification Using Data Valuation Methods AAAI 2026
LLMs Can Plan Only If We Tell Them ICLR 2025
Efficient Input-level Backdoor Defense on Text-to-Image Synthesis via Neuron Activation Variation ICCV 2025
LLMs Can Reason Faster Only If We Let Them ICML 2025
Just Enough Shifts: Mitigating Over-Refusal in Aligned Language Models with Targeted Representation Fine-Tuning ICML 2025
Demystifying Synthetic Data in LLM Pre-training: A Systematic Study of Scaling Laws, Benefits, and Pitfalls EMNLP 2025
Retracing the Past: LLMs Emit Training Data When They Get Lost EMNLP 2025
DiPT: Enhancing LLM Reasoning through Diversified Perspective-Taking NAACL 2025
MLAN: Language-Based Instruction Tuning Preserves and Transfers Knowledge in Multimodal Language Models ACL 2025
Mind Control through Causal Inference: Predicting Clean Images from Poisoned Data ICLR 2025
SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal ICLR 2025
Data Shapley in One Training Run ICLR 2025
AIR-BENCH 2024: A Safety Benchmark based on Regulation and Policies Specified Risk Categories ICLR 2025
Capturing the Temporal Dependence of Training Data Influence ICLR 2025
Detecting Adversarial Data Using Perturbation Forgery CVPR 2025
BEEAR: Embedding-based Adversarial Removal of Safety Backdoors in Instruction-tuned Language Models EMNLP 2024
Fairness-Aware Meta-Learning via Nash Bargaining NIPS 2024
Boosting Alignment for Post-Unlearning Text-to-Image Generative Models NIPS 2024
GREATS: Online Selection of High-Quality Data for LLM Training in Every Iteration NIPS 2024
Skin-in-the-Game: Decision Making via Multi-Stakeholder Alignment in LLMs ACL 2024
How Johnny Can Persuade LLMs to Jailbreak Them: Rethinking Persuasion to Challenge AI Safety by Humanizing LLMs ACL 2024
Efficient Data Shapley for Weighted Nearest Neighbor Algorithms AISTATS 2024
The Mirrored Influence Hypothesis: Efficient Data Influence Estimation by Harnessing Forward Passes CVPR 2024
Can We Trust the Performance Evaluation of Uncertainty Estimation Methods in Text Summarization? EMNLP 2024
FASTTRACK: Reliable Fact Tracing via Clustering and LLM-Powered Evidence Validation EMNLP 2024
Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To! ICLR 2024
Get more for less: Principled Data Selection for Warming Up Fine-Tuning in LLMs ICLR 2024
Position: A Safe Harbor for AI Evaluation and Red Teaming ICML 2024
Algorithm of Thoughts: Enhancing Exploration of Ideas in Large Language Models ICML 2024
Rethinking Data Shapley for Data Selection Tasks: Misleads and Merits ICML 2024
RigorLLM: Resilient Guardrails for Large Language Models against Undesired Content ICML 2024
Learning to Rank for Active Learning via Multi-Task Bilevel Optimization UAI 2024
Towards Robustness Certification Against Universal Perturbations ICLR 2023
LAVA: Data Valuation without Pre-Specified Learning Algorithms ICLR 2023
Learning-to-Learn to Guide Random Search: Derivative-Free Meta Blackbox Optimization on Manifold L4DC 2023
Data Banzhaf: A Robust Data Valuation Framework for Machine Learning AISTATS 2023
Revisiting Data-Free Knowledge Distillation with Poisoned Teachers ICML 2023
2D-Shapley: A Framework for Fragmented Data Valuation ICML 2023
Performance Scaling via Optimal Transport: Enabling Data Selection from Partially Revealed Sources NIPS 2023
A Privacy-Friendly Approach to Data Valuation NIPS 2023
A Randomized Approach to Tight Privacy Accounting NIPS 2023
On Solution Functions of Optimization: Universal Approximation and Covering Number Bounds AAAI 2023
Practical Membership Inference Attacks Against Large-Scale Multi-Modal Models: A Pilot Study ICCV 2023
Selective Differential Privacy for Language Modeling NAACL 2022
Just Fine-tune Twice: Selective Differential Privacy for Large Language Models EMNLP 2022
Renyi Differential Privacy of Propose-Test-Release and Applications to Private and Robust Machine Learning NIPS 2022
CATER: Intellectual Property Protection on Text Generation APIs via Conditional Watermarks NIPS 2022
Label-Only Model Inversion Attacks via Boundary Repulsion CVPR 2022
Adversarial Unlearning of Backdoors via Implicit Hypergradient ICLR 2022
Knowledge-Enriched Distributional Model Inversion Attacks ICCV 2021
InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective ICLR 2021
Scalability vs. Utility: Do We Have To Sacrifice One for the Other in Data Importance Quantification? CVPR 2021
Improving Robustness to Model Inversion Attacks via Mutual Information Regularization AAAI 2021
Rethinking the Backdoor Attacks' Triggers: A Frequency Perspective ICCV 2021
Robust anomaly detection and backdoor attack detection via differential privacy ICLR 2020
The Secret Revealer: Generative Model-Inversion Attacks Against Deep Neural Networks CVPR 2020
Towards Efficient Data Valuation Based on the Shapley Value AISTATS 2019