Statistics Homework 4 — Law of Large Numbers
Overview: Law of Large Numbers & Probabilistic Convergence
Let \(X_1, X_2, \ldots, X_t\) be independent and identically distributed (i.i.d.) Bernoulli(\(p\)) random variables. Define the relative frequency: \[ f_t = \frac{1}{t}\sum_{i=1}^t X_i \] The Weak Law of Large Numbers states that \(f_t\) converges in probability to \(p\) as \(t \to \infty\).
The interactive simulation illustrates this convergence by generating \(m\) independent trajectories, each of length \(n\). We observe:
- Spaghetti plot: Multiple sample paths \(t \mapsto f_t\) showing how trajectories cluster around \(p\)
- Histogram: Empirical distribution of \(f_t\) at each time step, narrowing as \(t\) increases
- Central Limit Theorem connection: The distribution of \(f_t\) is approximately \(\mathcal{N}\big(p, \frac{p(1-p)}{t}\big)\) for large \(t\)
The variability of \(f_t\) shrinks at rate \(O(1/\sqrt{t})\), which can be observed by watching the histogram concentrate around \(p\) as the simulation progresses.
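The core of such a simulation can be sketched in a few lines of NumPy. This is an illustrative helper, not the app's actual code; the names `simulate_frequencies`, `p`, `n`, `m`, and `seed` are assumptions chosen to mirror the parameters described below:

```python
import numpy as np

def simulate_frequencies(p: float = 0.5, n: int = 300, m: int = 100,
                         seed: int = 42) -> np.ndarray:
    """Return an (m, n) array whose entry (r, t-1) holds f_t for trajectory r."""
    rng = np.random.default_rng(seed)
    x = rng.binomial(1, p, size=(m, n))   # i.i.d. Bernoulli(p) draws, one row per run
    t = np.arange(1, n + 1)               # time indices 1..n
    return np.cumsum(x, axis=1) / t       # f_t = S_t / t, broadcast across rows

freqs = simulate_frequencies()
# Each row of `freqs` is one spaghetti-plot trajectory; each column is one time slice.
```

Each row can be drawn as one line of the spaghetti plot, and each column provides the sample of \(f_t\) values behind the histogram at time \(t\).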
1. Simulation Parameters
Configure Parameters
Probability of success in each Bernoulli trial
Number of Bernoulli trials in each trajectory
Number of independent trajectories to simulate
Random number generator seed
Number of bins for histogram visualization
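The five parameters above could be sanity-checked before running the simulation. The function below is a sketch under assumed names and default values (`validate_params`, and the defaults \(p = 0.5\), \(n = 300\), \(m = 100\), 30 bins, are illustrative, not the app's actual configuration):

```python
def validate_params(p: float, n: int, m: int, seed: int, bins: int) -> None:
    """Basic sanity checks for the simulation parameters (illustrative only)."""
    if not (0.0 <= p <= 1.0):
        raise ValueError("p must lie in [0, 1]")
    if n < 1 or m < 1:
        raise ValueError("n and m must be positive integers")
    if bins < 1:
        raise ValueError("bins must be at least 1")

validate_params(p=0.5, n=300, m=100, seed=42, bins=30)
```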
2. Trajectory Visualization (Spaghetti Plot)
This visualization shows \(m\) sample paths of \(f_t\) over time. Each line represents one independent trajectory. The cluster tightens around \(p\) as \(t\) increases, demonstrating convergence. The right-side histogram shows the current distribution of \(f_t\) across all trajectories at time \(t\), with a red curve overlaying the theoretical Normal approximation \(\mathcal{N}\big(p, \frac{p(1-p)}{t}\big)\) predicted by the Central Limit Theorem.
3. Empirical Distribution of \(f_t\)
This histogram provides a detailed view of the empirical distribution of \(f_t\) across all trajectories at the current time \(t\). The bars show observed frequencies, while the red curve overlays the theoretical Normal density \(\mathcal{N}\big(p, \frac{p(1-p)}{t}\big)\). The purple dashed line marks the reference probability \(p\). As \(t\) increases, observe the histogram narrow around \(p\) and converge to the theoretical curve.
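The agreement between the empirical distribution of \(f_t\) and its Normal approximation can be checked numerically. The snippet below is a standalone illustration (not the app's plotting code): it draws many values of \(f_t\) directly from \(S_t \sim \text{Binomial}(t, p)\) and compares their sample mean and standard deviation with the CLT prediction:

```python
import numpy as np

p, t, m = 0.5, 300, 10_000
rng = np.random.default_rng(0)

# Draw S_t ~ Binomial(t, p) for m independent runs and form f_t = S_t / t.
f_t = rng.binomial(t, p, size=m) / t

# CLT prediction: f_t is approximately N(p, p(1-p)/t) for large t,
# so the sample mean should sit near p and the sample sd near theory_sd.
theory_sd = np.sqrt(p * (1 - p) / t)
```

With these settings the sample standard deviation lands within a fraction of a percent of \(\sqrt{p(1-p)/t} \approx 0.0289\), which is exactly the match the red overlay curve makes visible.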
4. Convergence to Reference Probability
This scatter plot displays the current relative frequency \(f_t\) for each trajectory (one dot per run) at time \(t\). The horizontal dashed line marks the reference probability \(p\). As \(t\) increases, the points cluster closer to \(p\), visually demonstrating the Law of Large Numbers: individual sample means converge to the population mean.
5. Theoretical Foundations
Mathematical Setup
Let \((X_i)_{i \geq 1}\) be independent and identically distributed Bernoulli(\(p\)) random variables. Define:
- Cumulative count: \(S_t = \sum_{i=1}^{t} X_i \sim \text{Binomial}(t, p)\)
- Relative frequency: \(f_t = \frac{S_t}{t} = \frac{1}{t}\sum_{i=1}^{t} X_i\) (sample mean)
Weak Law of Large Numbers
For any \(\varepsilon > 0\), Chebyshev's inequality gives: \[ \Pr\big(|f_t - p| \geq \varepsilon\big) \leq \frac{p(1-p)}{t\varepsilon^2} \xrightarrow{t \to \infty} 0 \] Thus \(f_t\) converges in probability to \(p\). For fixed \(\varepsilon\), the fraction of trajectories whose \(f_t\) deviates by at least \(\varepsilon\) vanishes as \(t\) increases.
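The Chebyshev bound above can be verified empirically. This is an illustrative check (the parameter values are arbitrary); note that the bound is conservative, so the observed deviation fraction typically sits well below it:

```python
import numpy as np

p, t, m, eps = 0.5, 200, 50_000, 0.05
rng = np.random.default_rng(1)

# f_t for m independent runs, drawn via S_t ~ Binomial(t, p).
f_t = rng.binomial(t, p, size=m) / t

empirical = np.mean(np.abs(f_t - p) >= eps)   # fraction of runs with |f_t - p| >= eps
chebyshev = p * (1 - p) / (t * eps**2)        # upper bound from Chebyshev's inequality

assert empirical <= chebyshev                 # the bound holds (here with much room to spare)
```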
Strong Law: By Kolmogorov's Strong Law of Large Numbers, \(f_t\) in fact converges almost surely: \(\lim_{t \to \infty} f_t = p\) with probability 1. See Durrett (2019, Chapter 2) or Billingsley (1995, Section 22).
Central Limit Theorem Connection
Since \(S_t \sim \text{Binomial}(t, p)\) has variance \(t \, p(1-p)\), the CLT states: \[ \sqrt{t}\,\big(f_t - p\big) \xrightarrow{d} \mathcal{N}\big(0,\, p(1-p)\big), \] so for large \(t\) the distribution of \(f_t\) is approximately \(\mathcal{N}\Big(p, \frac{p(1-p)}{t}\Big)\). This explains why:
- The histogram of \(f_t\) becomes approximately Gaussian for large \(t\)
- The spread (standard deviation) decays as \(\sqrt{\frac{p(1-p)}{t}} = O(1/\sqrt{t})\)
- The red theoretical curve overlaid on histograms matches the empirical distribution
For comprehensive treatments, see Durrett (2019, Chapter 3) or Feller (1968, Chapter VIII).
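The \(O(1/\sqrt{t})\) decay of the spread can be observed directly: quadrupling \(t\) should roughly halve the standard deviation of \(f_t\). A minimal numerical check (illustrative parameter values, not the app's code):

```python
import numpy as np

p, m = 0.5, 20_000
rng = np.random.default_rng(2)

spread = {}
for t in (100, 400, 1600):                   # each step quadruples t
    f_t = rng.binomial(t, p, size=m) / t     # m independent values of f_t
    spread[t] = float(f_t.std())
# Theory: spread[t] ~ sqrt(p*(1-p)/t), so each quadrupling of t halves the spread.
```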
Probabilistic vs Deterministic Convergence
Classical limits involve single numerical sequences. In contrast, \(f_t \to p\) in probability means: for any tolerance \(\varepsilon\), the probability over sample paths of exceeding that tolerance approaches zero. The spaghetti plot illustrates this—each path is random, but the mass of paths concentrates around \(p\).
Interpretation of Visualizations
- Spaghetti plot: Multiple sample paths \(f_t\) over time, tightening around \(p\)
- Histogram (inset & large): Empirical distribution of \(f_t\) at time \(t\), converging to Normal curve
- Scatter plot: Individual trajectory endpoints at time \(t\), clustering near \(p\)
Experimental Suggestions
Observe convergence: Increase \(n\) to 500 or 1000 and watch the histogram pinch tightly around \(p\).
Test different probabilities: Try \(p = 0.25, 0.5, 0.75\) to see how the center shifts while maintaining the convergence pattern.
Vary trajectory count: Use \(m = 500\) for smoother histogram visualization and clearer Normal approximation.
Recommended settings: \(m \geq 100\) trajectories and \(n \geq 300\) trials give clearly visible convergence behavior. The simulation updates every 15 ms for smooth real-time visualization.