conftrace_

Kai Chen

191 papers · 2012–2026 · 18 top CS/AI conferences

Achievements

🗺️ Taxonomy Completionist (41) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (6) 🐣 Hot Topic Early Bird

🏃 Academic Marathon (14) 🏠 Conference Loyalist (21) 🌟 Keyword Trendsetter Combo (7) 🤝 Dynamic Duo (41) 👑 Triple Crown 🏆 Keyword Champion (4) 🧬 Topic Evolution 🏆 Grand Slam 👥 Mega-Team (30) 🌱 Topic Pioneer 🔬 Deep Specialist (23) 🚀 Conference Pioneer 🔥 Unstoppable (12) ❓ The Questioner (6) 💎 Century Club (184) 🗃️ Keyword Collector (112) ⚡ Prolific Year (64) 📈 Trend Setter

Conferences

CVPR (34) ACL (28) NIPS (21) AAAI (18) ECCV (14) ICCV (14) NSDI (13) EMNLP (11) ICLR (10) INTERSPEECH (7) WACV (4) IJCAI (4) NAACL (3) COLING (3) MICCAI (2) ICML (2) OSDI (2) RSS (1)

Top co-authors

Dahua Lin (43) Wenwei Zhang (28) Songyang Zhang (19) Lanqing Hong (18) Haodong Duan (17) Zhenguo Li (17) Jiaqi Wang (16) Chen Change Loy (15) Yining Li (12) Dit-Yan Yeung (12)

Research topics

Privacy (2) Mathematics (1)

Keywords

large language model (36) object detection (13) benchmark evaluation (13) diffusion model (12) semantic segmentation (9) language model (8) synthetic data (7) evaluation benchmark (6) multimodal learning (6) instruction tuning (6) instance segmentation (6) self-supervised learning (6) multimodal large language model (6) reinforcement learning (5) multi-modal learning (5) knowledge distillation (5) code generation (5) representation learning (5) vision-language model (5) image segmentation (4)

Papers

MagicDrive3D: Controllable 3D Generation for Any-View Rendering in Street Scenes WACV 2026
MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing ACL 2026
Powering Verifiable Learning via Automated Evolutionary Data Synthesis ACL 2026
League of LLMs: A Benchmark-Free Paradigm for Mutual Evaluation of Large Language Models ACL 2026
FedProxy: Federated Fine-Tuning of LLMs via Proxy SLMs and Heterogeneity-Aware Fusion ACL 2026
Timely Machine: Awareness of Time Makes Test-Time Scaling Agentic ACL 2026
Rethinking Flow and Diffusion Bridge Models for Speech Enhancement AAAI 2026
Enhancing Logical Expressiveness in Graph Neural Networks via Path-Neighbor Aggregation AAAI 2026
FaceShot: Bring Any Character into Life ICLR 2025
Social Recommendation via Graph-Level Counterfactual Augmentation AAAI 2025
DuMo: Dual Encoder Modulation Network for Precise Concept Erasure AAAI 2025
Semantic-guided Masked Mutual Learning for Multi-modal Brain Tumor Segmentation with Arbitrary Missing Modalities AAAI 2025
LLM-DR: A Novel LLM-Aided Diffusion Model for Rule Generation on Temporal Knowledge Graphs AAAI 2025
Promptable Anomaly Segmentation with SAM Through Self-Perception Tuning AAAI 2025
RepeatLeakage: Leak Prompts from Repeating as Large Language Model Is a Good Repeater AAAI 2025
Mixture of insighTful Experts (MoTE): The Synergy of Reasoning Chains and Expert Mixtures in Self-Alignment ACL 2025
Scaling up the State Size of RNN LLMs for Long-Context Scenarios ACL 2025
Redundancy Principles for MLLMs Benchmarks ACL 2025
CritiQ: Mining Data Quality Criteria from Human Preferences ACL 2025
OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference ACL 2025
Condor: Enhance LLM Alignment with Knowledge-Driven Data Synthesis and Refinement ACL 2025
Capability Salience Vector: Fine-grained Alignment of Loss and Capabilities for Downstream Task Scaling Law ACL 2025
What are the Essential Factors in Crafting Effective Long Context Multi-Hop Instruction Datasets? Insights and Best Practices ACL 2025
SWE-Fixer: Training Open-Source LLMs for Effective and Efficient GitHub Issue Resolution ACL 2025
InternLM-XComposer2.5-Reward: A Simple Yet Effective Multi-Modal Reward Model ACL 2025
MIG: Automatic Data Selection for Instruction Tuning by Maximizing Information Gain in Semantic Space ACL 2025
Are Your LLMs Capable of Stable Reasoning? ACL 2025
FedMKT: Federated Mutual Knowledge Transfer for Large and Small Language Models COLING 2025
Prompting Large Language Models to Tackle the Full Software Development Lifecycle: A Case Study COLING 2025
Hybrid Reciprocal Transformer with Triplet Feature Alignment for Scene Graph Generation CVPR 2025
TAPT: Test-Time Adversarial Prompt Tuning for Robust Inference in Vision-Language Models CVPR 2025
Auto Cherry-Picker: Learning from High-quality Generative Data Driven by Language CVPR 2025
SocialMOIF: Multi-Order Intention Fusion for Pedestrian Trajectory Prediction CVPR 2025
EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions CVPR 2025
UnitCoder: Scalable Code Synthesis from Pre-training Corpora EMNLP 2025
MusKGC: A Flexible Multi-source Knowledge Enhancement Framework for Open-World Knowledge Graph Completion EMNLP 2025
STEER-BENCH: A Benchmark for Evaluating the Steerability of Large Language Models EMNLP 2025
Corrupted but Not Broken: Understanding and Mitigating the Negative Impacts of Corrupted Data in Visual Instruction Tuning EMNLP 2025
CompassVerifier: A Unified and Robust Verifier for LLMs Evaluation and Outcome Reward EMNLP 2025
Training Language Models to Critique With Multi-agent Feedback EMNLP 2025
MagicDrive-V2: High-Resolution Long Video Generation for Autonomous Driving with Adaptive Control ICCV 2025
PatchScaler: An Efficient Patch-Independent Diffusion Model for Image Super-Resolution ICCV 2025
Information Density Principle for MLLM Benchmarks ICCV 2025
MotionShot: Adaptive Motion Transfer across Arbitrary Objects for Text-to-Video Generation ICCV 2025
Creation-MMBench: Assessing Context-Aware Creative Intelligence in MLLMs ICCV 2025
CryoGEN: Generative Energy-based Models for Cryogenic Electron Tomography Reconstruction ICLR 2025
RMP-SAM: Towards Real-Time Multi-Purpose Segment Anything ICLR 2025
Mask-DPO: Generalizable Fine-grained Factuality Alignment of LLMs ICLR 2025
MindSearch: Mimicking Human Minds Elicits Deep AI Searcher ICLR 2025
ClipGS: Clippable Gaussian Splatting for Interactive Cinematic Visualization of Volumetric Medical Data MICCAI 2025
GREEN: Carbon-efficient Resource Scheduling for Machine Learning Clusters NSDI 2025
Enabling Efficient GPU Communication over Multiple NICs with FuseLink OSDI 2025
Automated Evaluation of Large Vision-Language Models on Self-Driving Corner Cases WACV 2025
TrackDiffusion: Tracklet-Conditioned Video Generation via Diffusion Models WACV 2025
Calib3D: Calibrating Model Preferences for Reliable 3D Scene Understanding WACV 2025
Ada-LEval: Evaluating long-context LLMs with length-adaptable benchmarks NAACL 2024
MagicDrive: Street View Generation with Diverse 3D Geometry Control ICLR 2024
GeoDiffusion: Text-Prompted Geometric Control for Object Detection Data Generation ICLR 2024
Lean Workbook: A large-scale Lean problem set formalized from natural language math problems NIPS 2024
MMBench-Video: A Long-Form Multi-Shot Benchmark for Holistic Video Understanding NIPS 2024
Shopping MMLU: A Massive Multi-Task Online Shopping Benchmark for Large Language Models NIPS 2024
GTA: A Benchmark for General Tool Agents NIPS 2024
Vision Foundation Model Enables Generalizable Object Pose Estimation NIPS 2024
HumanVid: Demystifying Training Data for Camera-controllable Human Image Animation NIPS 2024
Efficient LLM Jailbreak via Adaptive Dense-to-sparse Constrained Optimization NIPS 2024
MotionBooth: Motion-Aware Customized Text-to-Video Generation NIPS 2024
InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD NIPS 2024
ANAH-v2: Scaling Analytical Hallucination Annotation of Large Language Models NIPS 2024
CriticEval: Evaluating Large-scale Language Model as Critic NIPS 2024
ANAH: Analytical Annotation of Hallucinations in Large Language Models ACL 2024
A Unified Temporal Knowledge Graph Reasoning Model Towards Interpolation and Extrapolation ACL 2024
Open-Vocabulary SAM: Segment and Recognize Twenty-thousand Classes Interactively ECCV 2024
Reliable and Efficient Concept Erasure of Text-to-Image Diffusion Models ECCV 2024
A Task is Worth One Word: Learning with Task Prompts for High-Quality Versatile Image Inpainting ECCV 2024
LLM-REDIAL: A Large-Scale Dataset for Conversational Recommender Systems Created from User Behaviors with LLMs ACL 2024
LawBench: Benchmarking Legal Knowledge of Large Language Models EMNLP 2024
How Susceptible are Large Language Models to Ideological Manipulation? EMNLP 2024
ProSA: Assessing and Understanding the Prompt Sensitivity of LLMs EMNLP 2024
Scaling Behavior for Large Language Models regarding Numeral Systems: An Example using Pythia EMNLP 2024
LLaST: Improved End-to-end Speech Translation System Leveraged by Large Language Models ACL 2024
Flow Scheduling with Imprecise Knowledge NSDI 2024
Accelerating Neural Recommendation Training with Embedding Scheduling NSDI 2024
Towards Domain-Specific Network Transport for Distributed DNN Training NSDI 2024
STAIR: Spatial-Temporal Reasoning with Auditable Intermediate Results for Video Question Answering AAAI 2024
Everything2Motion: Synchronizing Diverse Inputs via a Unified Framework for Human Motion Synthesis AAAI 2024
MathBench: Evaluating the Theory and Application Proficiency of LLMs with a Hierarchical Mathematics Benchmark ACL 2024
T-Eval: Evaluating the Tool Utilization Capability of Large Language Models Step by Step ACL 2024
Any-point Trajectory Modeling for Policy Learning RSS 2024
Prism: A Framework for Decoupling and Assessing the Capabilities of VLMs NIPS 2024
YOLOv10: Real-Time End-to-End Object Detection NIPS 2024
Enhanced Scale-aware Depth Estimation for Monocular Endoscopic Scenes with Geometric Modeling MICCAI 2024
Gaining Wisdom from Setbacks: Aligning Large Language Models via Mistake Analysis ICLR 2024
Safer-Instruct: Aligning Language Models with Automated Preference Data NAACL 2024
EpiGEN: An Efficient Multi-Api Code GENeration Framework under Enterprise Scenario COLING 2024
BotChat: Evaluating LLMs’ Capabilities of Having Multi-Turn Dialogues NAACL 2024
PIA: Your Personalized Image Animator via Plug-and-Play Modules in Text-to-Image Models CVPR 2024
OMG-Seg: Is One Model Good Enough For All Segmentation? CVPR 2024
UVEB: A Large-scale Benchmark and Baseline Towards Real-World Underwater Video Enhancement CVPR 2024
EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI CVPR 2024
Make-It-Vivid: Dressing Your Animatable Biped Cartoon Characters from Text CVPR 2024
DetDiffusion: Synergizing Generative and Perceptive Models for Enhanced Data Generation and Perception CVPR 2024
RTMO: Towards High-Performance One-Stage Real-Time Multi-Person Pose Estimation CVPR 2024
Towards Language-Driven Video Inpainting via Multimodal Large Language Models CVPR 2024
From Pixels to Graphs: Open-Vocabulary Scene Graph Generation with Vision-Language Models CVPR 2024
Differentiable Model Scaling using Differentiable Topk ICML 2024
Can AI Assistants Know What They Don’t Know? ICML 2024
AlchemistCoder: Harmonizing and Eliciting Code Capability by Hindsight Tuning on Multi-source Data NIPS 2024
DataElixir: Purifying Poisoned Dataset to Mitigate Backdoor Attacks via Diffusion Models AAAI 2024
UMA: Facilitating Backdoor Scanning via Unlearning-Based Model Ablation AAAI 2024
Temporal Knowledge Graph Extrapolation via Causal Subhistory Identification IJCAI 2024
LLM Factoscope: Uncovering LLMs’ Factual Discernment through Measuring Inner States ACL 2024
Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models ACL 2024
4D Contrastive Superflows are Dense 3D Representation Learners ECCV 2024
MMBENCH: Is Your Multi-Modal Model an All-around Player? ECCV 2024
ScanReason: Empowering 3D Visual Grounding with Reasoning Capabilities ECCV 2024
AnyControl: Create Your Artwork with Versatile Control on Text-to-Image Generation ECCV 2024
Eyes Closed, Safety On: Protecting Multimodal LLMs via Image-to-Text Transformation ECCV 2024
Implicit Concept Removal of Diffusion Models ECCV 2024
Shape-guided Configuration-aware Learning for Endoscopic-image-based Pose Estimation of Flexible Robotic Instruments ECCV 2024
RankCSE: Unsupervised Sentence Representations Learning via Learning to Rank ACL 2023
Globally Consistent Federated Graph Autoencoder for Non-IID Graphs IJCAI 2023
Semantics-Aware Dynamic Localization and Refinement for Referring Image Segmentation AAAI 2023
GlyphControl: Glyph Conditional Control for Visual Text Generation NIPS 2023
Boosting Point Clouds Rendering via Radiance Mapping AAAI 2023
TG-VQA: Ternary Game of Video Question Answering IJCAI 2023
Learning Shape Primitives via Implicit Convexity Regularization ICCV 2023
Robo3D: Towards Robust and Reliable 3D Perception against Corruptions ICCV 2023
Mixed Autoencoder for Self-Supervised Visual Representation Learning CVPR 2023
RIFormer: Keep Your Vision Backbone Effective but Removing Token Mixer CVPR 2023
Dense Distinct Query for End-to-End Object Detection CVPR 2023
Consistent-Teacher: Towards Reducing Inconsistent Pseudo-Targets in Semi-Supervised Object Detection CVPR 2023
SRNIC: A Scalable Architecture for RDMA NICs NSDI 2023
FLASH: Towards a High-performance Hardware Acceleration Architecture for Cross-silo Federated Learning NSDI 2023
Improving Pixel-based MIM by Reducing Wasted Modeling Capability ICCV 2023
UMC: A Unified Bandwidth-efficient and Multi-resolution based Collaborative Perception Framework ICCV 2023
Deep Fusion Transformer Network with Weighted Vector-Wise Keypoints Voting for Robust 6D Object Pose Estimation ICCV 2023
Segment Any Point Cloud Sequences by Distilling Vision Foundation Models NIPS 2023
Task-customized Masked Autoencoder via Mixture of Cluster-conditional Experts ICLR 2023
TransRank: Self-Supervised Video Representation Learning via Ranking-Based Transformation Recognition CVPR 2022
OCSampler: Compressing Videos to One Clip With Single-Step Sampling CVPR 2022
Video K-Net: A Simple, Strong, and Unified Baseline for Video Segmentation CVPR 2022
Revisiting Skeleton-Based Action Recognition CVPR 2022
GCFSR: A Generative and Controllable Face Super Resolution Method Without Facial and GAN Priors CVPR 2022
Group R-CNN for Weakly Semi-Supervised Object Detection With Points CVPR 2022
LAVT: Language-Aware Vision Transformer for Referring Image Segmentation CVPR 2022
Dense Siamese Network for Dense Unsupervised Learning ECCV 2022
CODA: A Real-World Road Corner Case Dataset for Object Detection in Autonomous Driving ECCV 2022
Sim-to-Real 6D Object Pose Estimation via Iterative Self-Training for Robotic Bin Picking ECCV 2022
SMASH: Improving SMAll Language Models’ Few-SHot Ability with Prompt-Based Distillation EMNLP 2022
Tiara: A Scalable and Efficient Hardware Acceleration Architecture for Stateful Layer-4 Load Balancing NSDI 2022
FAERY: An FPGA-accelerated Embedding-based Retrieval System OSDI 2022
Deliberated Domain Bridging for Domain Adaptive Semantic Segmentation NIPS 2022
Attacking Video Recognition Models with Bullet-Screen Comments AAAI 2022
Task-Customized Self-Supervised Pre-training with Scalable Dynamic Routing AAAI 2022
RotateQVS: Representing Temporal Information as Rotations in Quaternion Vector Space for Temporal Knowledge Graph Completion ACL 2022
Seesaw Loss for Long-Tailed Instance Segmentation CVPR 2021
MultiSiam: Self-Supervised Multi-Instance Siamese Representation Learning for Autonomous Driving ICCV 2021
SGPA: Structure-Guided Prior Adaptation for Category-Level 6D Object Pose Estimation ICCV 2021
K-Net: Towards Unified Image Segmentation NIPS 2021
Positional Encoding As Spatial Inductive Bias in GANs CVPR 2021
Learning To Identify Correct 2D-2D Line Correspondences on Sphere CVPR 2021
DPCRN: Dual-Path Convolution Recurrent Network for Single Channel Speech Enhancement INTERSPEECH 2021
Temporal ROI Align for Video Object Recognition AAAI 2021
Few-Shot Object Detection via Association and DIscrimination NIPS 2021
Learning Icosahedral Spherical Probability Map Based on Bingham Mixture Model for Vanishing Point Estimation ICCV 2021
U-Net Based Direct-Path Dominance Test for Robust Direction-of-Arrival Estimation INTERSPEECH 2020
Real-Time Scene Text Detection with Differentiable Binarization AAAI 2020
Side-Aware Boundary Localization for More Precise Object Detection ECCV 2020
Prime Sample Attention in Object Detection CVPR 2020
Nonlinear Residual Echo Suppression Based on Multi-Stream Conv-TasNet INTERSPEECH 2020
Extracting Symptoms and their Status from Clinical Conversations ACL 2019
An End-to-End Audio Classification System Based on Raw Waveforms and Mix-Training Strategy INTERSPEECH 2019
CARAFE: Content-Aware ReAssembly of FEatures ICCV 2019
Hybrid Task Cascade for Instance Segmentation CVPR 2019
Region Proposal by Guided Anchoring CVPR 2019
Libra R-CNN: Towards Balanced Learning for Object Detection CVPR 2019
Speech Separation Using Independent Vector Analysis with an Amplitude Variable Gaussian Mixture Model INTERSPEECH 2019
Compression of CTC-Trained Acoustic Models by Dynamic Frame-Wise Distillation or Segment-Wise N-Best Hypotheses Imitation INTERSPEECH 2019
Semi-supervised Learning for Information Extraction from Dialogue INTERSPEECH 2018
PowerMan: An Out-of-Band Management Network for Datacenters Using Power Line Communication NSDI 2018
QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension ICLR 2018
Optimizing Video Object Detection via a Scale-Time Lattice CVPR 2018
Enabling Wide-Spread Communications on Optical Fabric with MegaSwitch NSDI 2017
Discover and Learn New Objects From Documentaries CVPR 2017
Enabling ECN in Multi-Service Multi-Queue Data Centers NSDI 2016
Planning with Task-Oriented Knowledge Acquisition for a Service Robot IJCAI 2016
Explicit Path Control in Commodity Data Centers: Design and Applications NSDI 2015
Information-Agnostic Flow Scheduling for Commodity Data Centers NSDI 2015
Distributed Representations of Words and Phrases and their Compositionality NIPS 2013
Large Scale Distributed Deep Networks NIPS 2012
OSA: An Optical Switching Architecture for Data Center Networks with Unprecedented Flexibility NSDI 2012