conftrace_

Wenhai Wang

61 papers · 2018–2026 · 11 conferences · across top CS/AI conferences

Achievements

🌍 Conference Polyglot (11)
🏃 Academic Marathon (7)
🌉 Interdisciplinary Bridge
🧭 Keyword Pioneer
🐝 Cross-Pollinator (12)
🌈 Renaissance Researcher (7)
🗺️ Taxonomy Completionist (69)
🔬 Deep Specialist (16)
🧬 Topic Evolution
👥 Mega-Team (38)
👑 Triple Crown
🤝 Dynamic Duo (28)
🏆 Grand Slam
💎 Century Club (58)
🔥 Unstoppable (8)
📈 Trend Setter
🗃️ Keyword Collector (209)
⚡ Prolific Year (9)

Conferences

CVPR (13) NIPS (10) ECCV (9) ICCV (6) AAAI (5) ICLR (5) ACL (4) IJCAI (4) ICML (3) EMNLP (1) NAACL (1)

Papers

EvoMoE: Expert Evolution in Mixture of Experts for Multimodal Large Language Models AAAI 2026
Selective Knowledge Distillation: Fusing LLM Semantic Strengths with DNN Efficiency for Binary Code Similarity Detection ACL 2026
LLM-VA: Resolving the Jailbreak-Overrefusal Trade-off via Vector Alignment ACL 2026
PVC: Progressive Visual Token Compression for Unified Image and Video Processing in Large Vision-Language Models CVPR 2025
Docopilot: Improving Multimodal Models for Document-Level Understanding CVPR 2025
Diffuse&Refine: Intrinsic Knowledge Generation and Aggregation for Incremental Object Detection IJCAI 2025
MuLan: Adapting Multilingual Diffusion Models for Hundreds of Languages with Negligible Cost ICML 2025
CoMemo: LVLMs Need Image Context with Image Memory ICML 2025
UltraModel: A Modeling Paradigm for Industrial Objects IJCAI 2025
ChemVLM: Exploring the Power of Multimodal Large Language Models in Chemistry Area AAAI 2025
Uncovering LLM-Generated Code: A Zero-Shot Synthetic Code Detector via Code Rewriting AAAI 2025
OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference ACL 2025
Sticking to the Mean: Detecting Sticky Tokens in Text Embedding Models ACL 2025
Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures ICLR 2025
Unbiased Region-Language Alignment for Open-Vocabulary Dense Prediction ICCV 2025
OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text ICLR 2025
Lumina-Image 2.0: A Unified and Efficient Image Generative Framework ICCV 2025
HoVLE: Unleashing the Power of Monolithic Vision-Language Models with Holistic Vision-Language Embedding CVPR 2025
Vision Model Pre-training on Interleaved Image-Text Data via Latent Compression Learning NIPS 2024
Needle In A Multimodal Haystack NIPS 2024
InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD NIPS 2024
VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks NIPS 2024
AVSegFormer: Audio-Visual Segmentation with Transformer AAAI 2024
InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks CVPR 2024
Efficient Deformable ConvNets: Rethinking Dynamic and Sparse Operator for Vision Applications CVPR 2024
ControlLLM: Augment Language Models with Tools by Searching on Graphs ECCV 2024
The All-Seeing Project V2: Towards General Relation Comprehension of the Open World ECCV 2024
Distilling Knowledge from Large-Scale Image Models for Object Detection ECCV 2024
Bounding Box Stability against Feature Dropout Reflects Detector Generalization across Environments ICLR 2024
The All-Seeing Project: Towards Panoptic Visual Recognition and Understanding of the Open World ICLR 2024
RoboCodeX: Multimodal Code Generation for Robotic Behavior Synthesis ICML 2024
Tram: A Token-level Retrieval-augmented Mechanism for Source Code Summarization NAACL 2024
EmbodiedGPT: Vision-Language Pre-Training via Embodied Chain of Thought NIPS 2023
FB-BEV: BEV Representation from Forward-Backward View Transformations ICCV 2023
Vision Transformer Adapter for Dense Predictions ICLR 2023
InternImage: Exploring Large-Scale Vision Foundation Models With Deformable Convolutions CVPR 2023
VisionLLM: Large Language Model is also an Open-Ended Decoder for Vision-Centric Tasks NIPS 2023
Leveraging Vision-Centric Multi-Modal Expertise for 3D Object Detection NIPS 2023
CP-BCS: Binary Code Summarization Guided by Control Flow Graph and Pseudo Code EMNLP 2023
Planning-Oriented Autonomous Driving CVPR 2023
Uni-Perceiver v2: A Generalist Model for Large-Scale Vision and Vision-Language Tasks CVPR 2023
Towards Ultra-Resolution Neural Style Transfer via Thumbnail Instance Normalization AAAI 2022
VL-LTR: Learning Class-Wise Visual-Linguistic Representation for Long-Tailed Visual Recognition ECCV 2022
Panoptic SegFormer: Delving Deeper Into Panoptic Segmentation With Transformers CVPR 2022
BEVFormer: Learning Bird’s-Eye-View Representation from Multi-Camera Images via Spatiotemporal Transformers ECCV 2022
Uni-Perceiver-MoE: Learning Sparse Generalist Models with Conditional MoEs NIPS 2022
SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers NIPS 2021
DetCo: Unsupervised Contrastive Learning for Object Detection ICCV 2021
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction Without Convolutions ICCV 2021
Segmenting Transparent Objects in the Wild with Transformer IJCAI 2021
Generalized Focal Loss V2: Learning Reliable Localization Quality Estimation for Dense Object Detection CVPR 2021
Segmenting Transparent Objects in the Wild ECCV 2020
PolarMask: Single Shot Instance Segmentation With Polar Representation CVPR 2020
Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense Object Detection NIPS 2020
Differentiable Hierarchical Graph Grouping for Multi-Person Pose Estimation ECCV 2020
Scene Text Image Super-resolution in the wild ECCV 2020
AE TextSpotter: Learning Visual and Linguistic Representation for Ambiguous Text Spotting ECCV 2020
Efficient and Accurate Arbitrary-Shaped Text Detection With Pixel Aggregation Network ICCV 2019
Selective Kernel Networks CVPR 2019
Shape Robust Text Detection With Progressive Scale Expansion Network CVPR 2019
Mixed Link Networks IJCAI 2018