Dr. rer. nat. · Postdoctoral Researcher
TU Dortmund University & Lamarr Institute for Machine Learning and AI
I am a Postdoctoral Researcher at TU Dortmund University and the Lamarr Institute for Machine Learning and AI, one of Germany's six national AI competence centers. I completed my PhD in Statistics in February 2026, with a dissertation on large-scale data reduction based on coresets.
My research sits at the intersection of scalable Bayesian inference, coreset theory, and computational statistics. I develop methods that compress massive datasets while provably preserving the statistical structure needed for downstream inference, and I apply them to high-dimensional generative models and particle physics simulations in collaboration with CERN/ATLAS.
I am currently open to opportunities in quantitative research, applied ML/statistics, and methodological roles in industry (pharmaceutical statistics, tech, and quantitative finance).
- Coreset Theory — data compression with statistical guarantees for Bayesian models
- Scalable Bayesian Inference — MCMC, variational methods, approximate posteriors
- Generative Models — normalizing flows, diffusion models, score matching
- Computational Statistics — Monte Carlo methods, benchmarking, high-performance computing
- Applications — particle physics (CERN/ATLAS), high-dimensional classification, risk modeling
| Year | Title | Venue |
|---|---|---|
| 2026 | Coreset Methods for Multivariate Distributions | AISTATS 2026 |
| 2024 | Scalable Bayesian p-Generalized Probit and Logistic Regression | Advances in Data Analysis and Classification |
| 2023 | Bayesian Analysis for Dimensionality and Complexity Reduction | ML under Resource Constraints, De Gruyter |
Under Review
- A Benchmark Suite for Monte Carlo Sampling Algorithms — 2024
| Package | Language | Description |
|---|---|---|
| BayesPprobit | R (CRAN) | Scalable Bayesian estimation for p-generalized probit/logistic regression via coreset-accelerated MCMC |
| MCBench | Julia | Comprehensive benchmark suite for Monte Carlo sampling algorithms with standardized test distributions and convergence diagnostics |
| Degree | Institution | Year | Note |
|---|---|---|---|
| PhD in Statistics | TU Dortmund, Germany | 2021–2025 | magna cum laude |
| MSc in Quantitative Economics | Universität Göttingen, Germany | 2017–2020 | Young Statistician Award |
| BSc in Quantitative Economics | Xi'an Jiaotong University, China | 2011–2015 | |
Languages and Frameworks: Python · R · Julia · SAS · SQL · PySpark · PyTorch
Bayesian Methods: MCMC · Coreset Theory · Prior Design · Variational Inference
ML / Deep Learning: Gradient Boosting · Normalizing Flows · Diffusion Models · GANs · BNNs
Tools: Git · AWS · Hadoop/Spark
Languages (Human): Chinese (Native) · German (C1) · English (Professional)
Personal website: zeyudsai.github.io