Pingchuan Ma
36 papers · 2019–2025 · 12 conferences · across top CS/AI conferences
Achievements
Taxonomy Completionist (13)
Keyword Pioneer
Interdisciplinary Bridge
Renaissance Researcher (7)
Hot Topic Early Bird
Conference Polyglot (12)
Dynamic Duo (10)
Triple Crown
Grand Slam
Keyword Collector (148)
Trend Setter
Prolific Year (9)
Unstoppable (7)
The Questioner
Century Club (36)
Conferences
INTERSPEECH (7)
ICML (5)
CVPR (4)
ICLR (4)
NIPS (4)
ECCV (3)
AAAI (2)
EMNLP (2)
ICCV (2)
CORL (1)
IJCAI (1)
WACV (1)
Keywords
diffusion model (4)
generative model (4)
differentiable simulation (4)
lip reading (3)
large language model (2)
style transfer (2)
end-to-end model (2)
visual speech recognition (2)
image generation (2)
representation learning (2)
speech recognition (2)
audio-visual speech recognition (2)
flow matching (2)
automatic speech recognition (2)
similarity learning (1)
embedding space (1)
image segmentation (1)
feature extraction (1)
transfer learning (1)
semantic segmentation (1)
Papers
TopoGaussian: Inferring Internal Topology Structures from Visual Clues
ICLR 2025
Stochastic Interpolants for Revealing Stylistic Flows across the History of Art
ICCV 2025
SCFlow: Implicitly Learning Style and Content Disentanglement with Flow Models
ICCV 2025
ROICtrl: Boosting Instance Control for Visual Generation
CVPR 2025
Fabrica: Dual-Arm Assembly of General Multi-Part Objects via Integrated Planning and Learning
CORL 2025
DepthFM: Fast Generative Monocular Depth Estimation with Flow Matching
AAAI 2025
Does VLM Classification Benefit from LLM Description Semantics?
AAAI 2025
LLM and Simulation as Bilevel Optimizers: A New Paradigm to Advance Physical Scientific Discovery
ICML 2024
NeuralFluid: Neural Fluidic System Design and Control with Differentiable Simulation
NIPS 2024
Physically Compatible 3D Object Modeling from a Single Image
NIPS 2024
Dynamic Data Pruning for Automatic Speech Recognition
INTERSPEECH 2024
ZigMa: A DiT-style Zigzag Mamba Diffusion Model
ECCV 2024
WaSt-3D: Wasserstein-2 Distance for Scene-to-Scene Stylization on 3D Gaussians
ECCV 2024
FMBoost: Boosting Latent Diffusion with Flow Matching
ECCV 2024
Split and Merge: Aligning Position Biases in LLM-based Evaluators
EMNLP 2024
MSRS: Training Multimodal Speech Recognition Models from Scratch with Sparse Mask Optimization
INTERSPEECH 2024
Cross-Image-Attention for Conditional Embeddings in Deep Metric Learning
CVPR 2023
SynthVSR: Scaling Up Visual Speech Recognition With Synthetic Supervision
CVPR 2023
Explain Any Concept: Segment Anything Meets Concept-Based Explanation
NIPS 2023
SoftZoo: A Soft Robot Co-design Benchmark For Locomotion In Diverse Environments
ICLR 2023
Jointly Learning Visual and Auditory Speech Representations from Raw Data
ICLR 2023
Streaming Audio-Visual Speech Recognition with Alignment Regularization
INTERSPEECH 2023
Learning Neural Constitutive Laws from Motion Observations for Generalizable PDE Dynamics
ICML 2023
SparseVSR: Lightweight and Noise Robust Visual Speech Recognition
INTERSPEECH 2023
InsightPilot: An LLM-Empowered Automated Data Exploration System
EMNLP 2023
DiffuseBot: Breeding Soft Robots With Physics-Augmented Generative Diffusion Models
NIPS 2023
RISP: Rendering-Invariant State Predictor with Differentiable Simulation and Rendering for Cross-Domain Parameter Estimation
ICLR 2022
Fast Aquatic Swimmer Optimization with Differentiable Projective Dynamics and Neural Network Hydrodynamic Models
ICML 2022
LiRA: Learning Visual Speech Representations from Audio Through Self-Supervision
INTERSPEECH 2021
Lip-Reading With Densely Connected Temporal Convolutional Networks
WACV 2021
Prediction-Guided Multi-Objective Reinforcement Learning for Continuous Robot Control
ICML 2020
Metamorphic Testing and Certified Mitigation of Fairness Violations in NLP Models
IJCAI 2020
Efficient Continuous Pareto Exploration in Multi-Task Learning
ICML 2020
Video-Driven Speech Reconstruction Using Generative Adversarial Networks
INTERSPEECH 2019
A Content Transformation Block for Image Style Transfer
CVPR 2019
Investigating the Lombard Effect Influence on End-to-End Audio-Visual Speech Recognition
INTERSPEECH 2019