Blog
Posts on math, machine learning, and research.
-
On the Concentration of Empirical Means on the Simplex
This post starts from the familiar problem of estimating the mean of a bounded random variable, and then traces what changes as the object of interest becomes more structured. For simplex-valued means, it is more natural to think in terms of distributional concentration, which leads to sharper and more meaningful bounds. Weissman's inequality captures this especially well in the one-hot categorical setting, while a McDiarmid-based argument extends the picture to general soft simplex-valued samples.
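As a taste of the one-hot setting, here is a minimal sketch that inverts the standard form of Weissman's L1 bound, P(||p̂ − p||₁ ≥ ε) ≤ (2^k − 2)·exp(−nε²/2), into a sample-size estimate. The function name and the example numbers are illustrative, not taken from the post.

```python
import numpy as np

def weissman_sample_size(k, eps, delta):
    """Smallest n with (2**k - 2) * exp(-n * eps**2 / 2) <= delta."""
    # Invert the tail bound: n >= (2 / eps**2) * log((2**k - 2) / delta).
    return int(np.ceil((2.0 / eps**2) * np.log((2.0**k - 2.0) / delta)))

# e.g. a 10-category distribution, L1 error at most 0.1 with probability 0.95:
print(weissman_sample_size(k=10, eps=0.1, delta=0.05))
```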
-
Influence Functions and Hessian Estimation
This post reviews influence functions as a practical tool for understanding how individual training points affect a model’s parameters, loss, and predictions. Starting from empirical risk minimization, it derives first-order approximations for deleting or perturbing a training sample, and discusses several ways to estimate the inverse Hessian in practice, including direct inversion, conjugate gradients, stochastic estimation, diagonal approximation, and outer-product approximation.
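To make the conjugate-gradient route concrete, here is a minimal sketch of the standard first-order influence score I(z, z_test) = −∇L(z_test)ᵀ H⁻¹ ∇L(z), computed with Hessian-vector products only. The function name, the damping term, and the toy explicit Hessian are assumptions for illustration; in practice the HVP would come from the model's autodiff.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

def influence_on_test_loss(grad_train, grad_test, hvp, dim, damping=1e-2):
    """Approximate -grad_test^T H^{-1} grad_train via conjugate gradients,
    using only Hessian-vector products (plus damping for conditioning)."""
    H = LinearOperator((dim, dim), matvec=lambda v: hvp(v) + damping * v)
    s, info = cg(H, grad_test)  # solve (H + damping * I) s = grad_test
    assert info == 0, "CG did not converge"
    return -float(s @ grad_train)

# Toy example: an explicit SPD matrix standing in for the real Hessian.
rng = np.random.default_rng(0)
d = 20
M = rng.standard_normal((d, d))
H_mat = M.T @ M / d + np.eye(d)
score = influence_on_test_loss(
    grad_train=rng.standard_normal(d),
    grad_test=rng.standard_normal(d),
    hvp=lambda v: H_mat @ v,
    dim=d,
)
print(score)
```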
-
Least Squares: Closed Form, QR, SVD, Gradient Descent, and Ridge Regression
This post gives a unified overview of five common methods for solving least-squares problems: the normal equations, QR decomposition, SVD, gradient descent, and ridge regression. It focuses on the main issues that distinguish them in practice, including numerical stability, rank deficiency, minimum-norm solutions, and robustness to noise.
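For a side-by-side feel, here is a minimal numpy sketch of all five solvers on a toy full-rank problem; the random data, step size, iteration count, and ridge parameter are illustrative choices, not the post's.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 5))
b = rng.standard_normal(100)

# Normal equations: solve A^T A x = A^T b; fast, but squares A's condition number.
x_ne = np.linalg.solve(A.T @ A, A.T @ b)

# QR: A = QR, then solve R x = Q^T b; numerically stable for full-rank A.
Q, R = np.linalg.qr(A)
x_qr = np.linalg.solve(R, Q.T @ b)

# SVD (via lstsq): handles rank deficiency, returns the minimum-norm solution.
x_svd, *_ = np.linalg.lstsq(A, b, rcond=None)

# Gradient descent on ||Ax - b||^2: converges for step < 2 / sigma_max(A)^2.
x_gd = np.zeros(A.shape[1])
step = 1.0 / np.linalg.norm(A, 2) ** 2
for _ in range(5000):
    x_gd -= step * (A.T @ (A @ x_gd - b))

# Ridge: (A^T A + lam * I) x = A^T b trades a little bias for stability.
lam = 1e-2
x_ridge = np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ b)
```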