Mean and Variance Simulator — Numerical Stability and Algorithm Comparison

What this page does

This page implements two methods for computing mean and variance: classical full-vector (sum & sumsq) and incremental (Welford). The goals are:

  • show the recurrences and derive them formally;
  • implement interactive demos where values can be entered one-by-one (streaming) or pasted as a vector;
  • benchmark accuracy and timings to illustrate the trade-off between speed and numerical stability.

Theory: recurrences, definitions and initialization

Mean recurrence (simple proof)

The arithmetic mean after n observations is

\(\displaystyle \mu_n = \frac{1}{n}\sum_{i=1}^n x_i.\)

Separating the last term and rearranging gives the incremental update:

\(\displaystyle \mu_n = \mu_{n-1} + \frac{x_n - \mu_{n-1}}{n}.\)

This requires only the previous mean and the new value (O(1) work per element) and is algebraically equivalent to recomputing the mean from scratch.
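The mean recurrence can be sketched as a small loop (a minimal illustration; `incrementalMean` is an illustrative name, not the page's actual function):

```javascript
// Incremental mean: mu_n = mu_{n-1} + (x_n - mu_{n-1}) / n
// Only the previous mean and the count are kept -- O(1) state per element.
function incrementalMean(values) {
  let n = 0, mean = 0;
  for (const x of values) {
    n += 1;
    mean += (x - mean) / n;
  }
  return mean;
}
```

Because each step only folds in the deviation of the new value, this is algebraically identical to summing everything and dividing by n at the end.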

Variance recurrence (Welford) — definition of M2

Define the corrected second moment

\(\displaystyle M2_n = \sum_{i=1}^n (x_i - \mu_n)^2,\)

so that the unbiased sample variance is

\(\displaystyle s^2 = \frac{M2_n}{n-1},\quad n\ge2.\)

Compact update used in code

\(\displaystyle M2_n = M2_{n-1} + \delta\cdot(x_n - \mu_n),\qquad\delta = x_n - \mu_{n-1}.\)

This leads to the standard Welford update implemented in the script.

Step-by-step derivation (recommended for learning)

For completeness: for \(i \le n-1\), write

\[ x_i-\mu_n=(x_i-\mu_{n-1})+(\mu_{n-1}-\mu_n). \]

Expanding \((x_i-\mu_n)^2\) and summing over \(i=1\ldots n-1\), the cross term in \((x_i-\mu_{n-1})\) vanishes because \(\sum_{i=1}^{n-1}(x_i-\mu_{n-1})=0\). After algebraic simplification and using \(\mu_n=\mu_{n-1}+\delta/n\) one obtains

\[ M2_n = M2_{n-1} + \delta\,(x_n-\mu_n). \]

Initialization (streaming)

n = 0
mean = 0
M2 = 0

-- on first element x1:
n = 1
mean = x1
M2 = 0

From then on apply the incremental updates above for every new observation.
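Putting the initialization and the two recurrences together gives a streaming accumulator (a minimal sketch; `makeWelford` is an illustrative name, not the page's actual function):

```javascript
// Welford's online algorithm, following the recurrences above.
// State: n (count), mean (running mean), M2 (sum of squared deviations from the mean).
function makeWelford() {
  let n = 0, mean = 0, M2 = 0;
  return {
    add(x) {
      n += 1;
      const delta = x - mean;      // delta = x_n - mu_{n-1}
      mean += delta / n;           // mu_n = mu_{n-1} + delta / n
      M2 += delta * (x - mean);    // M2_n = M2_{n-1} + delta * (x_n - mu_n)
    },
    mean: () => mean,
    // Unbiased sample variance s^2 = M2 / (n - 1), defined for n >= 2.
    variance: () => (n > 1 ? M2 / (n - 1) : NaN),
  };
}
```

Note that the first `add` reproduces the initialization above automatically: with n = 1, `mean` becomes x1 and `M2` stays 0.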

Numerical note (concrete example)

Suppose data are around \(10^8\) with very small variance (e.g. variance ≈ \(10^{-2}\)). The naive formula \(\frac{1}{n-1}\big(\sum x_i^2 - n\bar x^2\big)\) subtracts two large numbers (order \(10^{16}\)) that differ by a tiny quantity; this causes catastrophic cancellation and large relative errors. Welford's updates avoid that subtraction and keep rounding error small relative to the data scale.
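The cancellation can be reproduced directly (a self-contained sketch; function names are illustrative):

```javascript
// Naive one-pass formula: Var = (sum x^2 - n * mean^2) / (n - 1).
// With data near 1e8 and tiny spread, sumsq and n*mean^2 are both ~1e16
// and their true difference (~0.02) is below double-precision resolution there.
function naiveVariance(values) {
  let n = 0, sum = 0, sumsq = 0;
  for (const x of values) { n += 1; sum += x; sumsq += x * x; }
  const mean = sum / n;
  return (sumsq - n * mean * mean) / (n - 1);
}

// Welford's one-pass updates: no large-magnitude subtraction occurs.
function welfordVariance(values) {
  let n = 0, mean = 0, M2 = 0;
  for (const x of values) {
    n += 1;
    const delta = x - mean;
    mean += delta / n;
    M2 += delta * (x - mean);
  }
  return M2 / (n - 1);
}

// Data centered at 1e8 with true sample variance 0.01:
const data = [1e8 - 0.1, 1e8, 1e8 + 0.1];
// welfordVariance(data) stays close to 0.01; naiveVariance(data) can be
// wildly off (even zero or negative) due to catastrophic cancellation.
```

Running both on `data` makes the difference in relative error immediately visible.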

How to use (modes)

  1. Step-by-step: enter values one at a time with Add value to see incremental updates.
  2. Vector mode: paste a comma-separated list; the page processes values element-by-element (keeps streaming semantics but stores values for plotting).
  3. Benchmark: compare Welford (stable) vs sum/sumsq (fast but numerically fragile).

Simulator — Input & Controls (online)


Implementation notes

This page implements Welford's algorithm precisely as derived above (delta-based updates) and compares it with the classical sum/sumsq formula. Inputs are filtered to finite numbers only.

Optional Homework — comparing online methods

Goal: prove the recurrences, test online algorithms, and show the numerical advantages of Welford over simple incremental sum & sumsq.

N = number of elements, Trials = how many runs to average. The benchmark reports ms per run and the absolute difference in variance: |Var_sum − Var_welford|.
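A benchmark harness along these lines could look as follows (a hypothetical sketch mirroring the page's N/Trials controls, not its actual code; `bench` is an illustrative name):

```javascript
// Run fn(data) `trials` times and report average ms per run plus the last result.
// performance.now() is available in browsers and in Node.js >= 16.
function bench(fn, data, trials) {
  const t0 = performance.now();
  let result = 0;
  for (let t = 0; t < trials; t++) result = fn(data);
  return { msPerRun: (performance.now() - t0) / trials, result };
}
```

Calling `bench` once with a sum/sumsq variance function and once with a Welford one yields both the timing comparison and the two variance values whose absolute difference the page reports.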

Benchmark results (online methods)

Press Run benchmark to execute. Charts show timings and numeric differences against Welford.

Numerical discussion & summary

Main advantages of Welford's online algorithm:

  • Numerical stability: avoids cancellation and overflow/underflow issues typical of the naive sums formula.
  • Low memory: O(1) extra space — ideal for streams.
  • Mergeability: partial summaries (n, mean, M2) can be combined for distributed computation.
  • Efficiency: O(1) per update and O(n) overall.
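The mergeability point can be made concrete: two partial summaries (n, mean, M2) combine with the standard parallel variant of Welford's updates (a sketch; `mergeSummaries` is an illustrative name):

```javascript
// Combine partial summaries a = {n, mean, M2} and b = {n, mean, M2}
// computed independently on two chunks of the data.
function mergeSummaries(a, b) {
  const n = a.n + b.n;
  const delta = b.mean - a.mean;
  const mean = a.mean + delta * (b.n / n);
  const M2 = a.M2 + b.M2 + delta * delta * (a.n * b.n / n);
  return { n, mean, M2 };
}
```

The merged triple is exactly what a single Welford pass over the concatenated data would produce (up to rounding), which is what makes the summaries usable in distributed computation.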

Comparison table

Mode | Formula type | Requires all data? | Numerical stability | Typical use
Full vector | classical mean & variance | Yes | Moderate | Small, static datasets
Incremental (Welford) | recurrence using δ, M2 | No | Excellent | Streaming or large datasets

Complexity

Operation | Time | Space
Classical mean/variance | O(n) | O(n) (if storing data)
Welford incremental update | O(1) per element, O(n) total | O(1)