Rui Zheng
43 papers · 2020–2026 · 11 top CS/AI conferences
Achievements
Conference Polyglot (11)
Academic Marathon (5)
Interdisciplinary Bridge
Keyword Pioneer
Hot Topic Early Bird
Dynamic Duo (34)
Triple Crown
Grand Slam
Mega-Team (34)
Deep Specialist (10)
Topic Evolution
Conference Pioneer
Prolific Year (9)
The Questioner
Trend Setter
Unstoppable (6)
Keyword Collector (170)
Century Club (40)
Conferences
ACL (15)
EMNLP (9)
AAAI (4)
COLING (4)
ICLR (3)
ICML (2)
MICCAI (2)
CVPR (1)
IJCAI (1)
IJCNLP (1)
NIPS (1)
Keywords
large language model (9)
language model (5)
reinforcement learning from human feedback (5)
reinforcement learning (5)
adversarial attack (4)
reward model (4)
text classification (4)
preference alignment (3)
reward modeling (3)
contrastive learning (3)
adversarial training (3)
adversarial defense (3)
code generation (3)
model robustness (3)
language model alignment (3)
continual learning (2)
transfer learning (2)
distribution shift (2)
out-of-distribution generalization (2)
adversarial robustness (2)
Papers
What Makes a Good Speech Tokenizer for LLM-Centric Speech Generation? A Systematic Study
AAAI 2026
Time-Frequency Token Advantage Clipping for Training Efficient Large Reasoning Model
AAAI 2026
MetaAct-RL: Training Language Models for Reasoning Through Meta-Action-Based Reinforcement Learning
AAAI 2026
Have the VLMs Lost Confidence? A Study of Sycophancy in VLMs
ICLR 2025
Alleviating Shifted Distribution in Human Preference Alignment through Meta-Learning
AAAI 2025
AgentGym: Evaluating and Training Large Language Model-based Agents across Diverse Environments
ACL 2025
Multi-Programming Language Sandbox for LLMs
ACL 2025
SPA-VL: A Comprehensive Safety Preference Alignment Dataset for Vision Language Models
CVPR 2025
Toward Optimal LLM Alignments Using Two-Player Games
EMNLP 2025
Fine-Grained Manipulation of Arithmetic Neurons
EMNLP 2025
RMB: Comprehensively benchmarking reward models in LLM alignment
ICLR 2025
LoRAMoE: Alleviating World Knowledge Forgetting in Large Language Models via MoE-Style Plugin
ACL 2024
StepCoder: Improving Code Generation with Reinforcement Learning from Compiler Feedback
ACL 2024
Uncertainty Aware Learning for Language Model Alignment
ACL 2024
Rescue: Ranking LLM Responses with Partial Ordering to Improve Response Generation
ACL 2024
Reliable Source Approximation: Source-Free Unsupervised Domain Adaptation for Vestibular Schwannoma MRI Segmentation
MICCAI 2024
ORTicket: Let One Robust BERT Ticket Transfer across Different Tasks
COLING 2024
Subspace Defense: Discarding Adversarial Perturbations by Learning a Subspace for Clean Signals
COLING 2024
RoCoSDF: Row-Column Scanned Neural Signed Distance Fields for Freehand 3D Ultrasound Imaging Shape Reconstruction
MICCAI 2024
Improving Discriminative Capability of Reward Models in RLHF Using Contrastive Learning
EMNLP 2024
Reward Modeling Requires Automatic Adjustment Based on Data Quality
EMNLP 2024
Improving Generalization of Alignment with Human Preferences through Group Invariant Learning
ICLR 2024
DACO: Towards Application-Driven and Comprehensive Data Analysis via Code Generation
NIPS 2024
Linear Alignment: A Closed-form Solution for Aligning Human Preferences without Tuning and Feedback
ICML 2024
Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning
ICML 2024
Enhancing Contrastive Learning with Noise-Guided Attack: Towards Continual Relation Extraction in the Wild
ACL 2024
Self-Polish: Enhance Reasoning in Large Language Models via Problem Refinement
EMNLP 2023
CASN: Class-Aware Score Network for Textual Adversarial Detection
ACL 2023
Modeling the Q-Diversity in a Min-max Play Game for Robust Optimization
ACL 2023
Characterizing the Impacts of Instances on Robustness
ACL 2023
Detecting Adversarial Samples through Sharpness of Loss Landscape
ACL 2023
Connectivity Patterns are Task Embeddings
ACL 2023
Loose lips sink ships: Mitigating Length Bias in Reinforcement Learning from Human Feedback
EMNLP 2023
RealBehavior: A Framework for Faithfully Characterizing Foundation Models’ Human-like Behavior Mechanisms
EMNLP 2023
Orthogonal Subspace Learning for Language Model Continual Learning
EMNLP 2023
PlugAT: A Plug and Play Module to Defend against Textual Adversarial Attack
COLING 2022
Efficient Adversarial Training with Robust Early-Bird Tickets
EMNLP 2022
Flooding-X: Improving BERT’s Resistance to Adversarial Attacks via Loss-Restricted Fine-Tuning
ACL 2022
Robust Lottery Tickets for Pre-trained Language Models
ACL 2022
Decorrelate Irrelevant, Purify Relevant: Overcome Textual Spurious Correlations from a Feature Perspective
COLING 2022
TextFlint: Unified Multilingual Robustness Evaluation Toolkit for Natural Language Processing
ACL 2021
TextFlint: Unified Multilingual Robustness Evaluation Toolkit for Natural Language Processing
IJCNLP 2021
GestureDet: Real-time Student Gesture Analysis with Multi-dimensional Attention-based Detector
IJCAI 2020