MCMC methods enable sampling from complex distributions, often necessary in Bayesian statistics. The project explored three key MCMC algorithms—Random Walk Metropolis-Hastings (RWMH), Gibbs Sampling, and Hamiltonian Monte Carlo (HMC)—alongside two advanced extensions, the Metropolis Adjusted Langevin Algorithm (MALA) and Data Augmented Gibbs Sampling.
Methods Investigated
- Random Walk Metropolis-Hastings (RWMH):
RWMH samples points via a random walk, accepting or rejecting each proposed move using a probabilistic criterion. While simple and versatile, tuning the step size is critical to avoid poor exploration or inefficient sampling (a sketch follows this list).
- Gibbs Sampling:
Gibbs Sampling updates each variable from its full conditional, making it computationally efficient when the conditionals are tractable. It struggles with highly correlated variables and multi-modal distributions unless augmented (sketched after this list).
- Hamiltonian Monte Carlo (HMC):
Inspired by physics, HMC leverages gradients of the target distribution to explore it efficiently. It excels in high dimensions but requires careful tuning of the step size and trajectory length (sketched after this list).
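To make the discussion concrete, here is a minimal sketch of RWMH with a symmetric Gaussian proposal (the function names and defaults such as `step_size=0.5` are illustrative choices, not taken from the project code):

```python
import numpy as np

def rwmh(log_target, x0, n_samples=5000, step_size=0.5, seed=0):
    """Random Walk Metropolis-Hastings with a Gaussian proposal."""
    rng = np.random.default_rng(seed)
    x = np.atleast_1d(np.asarray(x0, dtype=float))
    log_p = log_target(x)
    samples = np.empty((n_samples, x.size))
    accepted = 0
    for i in range(n_samples):
        # Symmetric random-walk proposal.
        proposal = x + step_size * rng.standard_normal(x.size)
        log_p_prop = log_target(proposal)
        # Accept with probability min(1, p(proposal) / p(x)).
        if np.log(rng.uniform()) < log_p_prop - log_p:
            x, log_p = proposal, log_p_prop
            accepted += 1
        samples[i] = x
    return samples, accepted / n_samples

# Example: sample a standard 2D Gaussian and check the acceptance rate.
samples, acc_rate = rwmh(lambda x: -0.5 * np.sum(x**2), x0=np.zeros(2))
```

The acceptance rate returned here is exactly the tuning signal mentioned above: too high suggests overly timid steps, too low suggests wasteful rejections.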
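For Gibbs Sampling, a bivariate Gaussian makes the idea of tractable conditionals explicit, and also shows why strong correlation slows mixing (the target and parameter values here are illustrative):

```python
import numpy as np

def gibbs_bivariate_normal(rho=0.8, n_samples=5000, seed=0):
    """Gibbs sampler for a standard bivariate normal with correlation rho.

    Each full conditional is itself Gaussian:
    x1 | x2 ~ N(rho * x2, 1 - rho**2), and symmetrically for x2 | x1.
    As rho -> 1 the conditional variance shrinks and mixing slows.
    """
    rng = np.random.default_rng(seed)
    x1, x2 = 0.0, 0.0
    cond_sd = np.sqrt(1.0 - rho**2)
    samples = np.empty((n_samples, 2))
    for i in range(n_samples):
        x1 = rng.normal(rho * x2, cond_sd)  # draw x1 | x2
        x2 = rng.normal(rho * x1, cond_sd)  # draw x2 | x1
        samples[i] = (x1, x2)
    return samples
```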
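And a basic HMC sketch with an identity mass matrix and a fixed-length leapfrog trajectory (again, names and defaults are illustrative; production samplers add features such as step-size adaptation):

```python
import numpy as np

def hmc(log_target, grad_log_target, x0, n_samples=1000,
        step_size=0.1, n_leapfrog=20, seed=0):
    """Basic Hamiltonian Monte Carlo with an identity mass matrix."""
    rng = np.random.default_rng(seed)
    x = np.atleast_1d(np.asarray(x0, dtype=float))
    samples = np.empty((n_samples, x.size))
    for i in range(n_samples):
        p = rng.standard_normal(x.size)  # resample momentum
        x_new, p_new = x.copy(), p.copy()
        # Leapfrog integration: half momentum step, alternating full
        # position/momentum steps, closing half momentum step.
        p_new += 0.5 * step_size * grad_log_target(x_new)
        for _ in range(n_leapfrog - 1):
            x_new += step_size * p_new
            p_new += step_size * grad_log_target(x_new)
        x_new += step_size * p_new
        p_new += 0.5 * step_size * grad_log_target(x_new)
        # Metropolis correction on the Hamiltonian (negative log joint).
        h_old = -log_target(x) + 0.5 * p @ p
        h_new = -log_target(x_new) + 0.5 * p_new @ p_new
        if np.log(rng.uniform()) < h_old - h_new:
            x = x_new
        samples[i] = x
    return samples
```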
Key Extensions
- Metropolis Adjusted Langevin Algorithm (MALA):
Incorporates gradients to refine proposals, improving efficiency on challenging targets such as heavy-tailed or multi-modal distributions (sketched after this list).
- Data Augmented Gibbs Sampling:
Introduces latent variables to simplify sampling, particularly effective for distributions like the Lorentzian (Cauchy), as sketched after this list.
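A sketch of MALA under the same conventions as above (illustrative names; the proposal drifts along the gradient, and the accept step must correct for the asymmetric proposal):

```python
import numpy as np

def mala(log_target, grad_log_target, x0, n_samples=5000, step_size=0.1, seed=0):
    """Metropolis Adjusted Langevin Algorithm."""
    rng = np.random.default_rng(seed)

    def log_q(x_to, x_from):
        # Log density (up to a constant) of the Langevin proposal
        # N(x_from + (eps^2 / 2) * grad, eps^2 * I).
        mean = x_from + 0.5 * step_size**2 * grad_log_target(x_from)
        return -0.5 * np.sum((x_to - mean) ** 2) / step_size**2

    x = np.atleast_1d(np.asarray(x0, dtype=float))
    samples = np.empty((n_samples, x.size))
    for i in range(n_samples):
        drift = 0.5 * step_size**2 * grad_log_target(x)
        prop = x + drift + step_size * rng.standard_normal(x.size)
        # Metropolis-Hastings ratio with the asymmetric-proposal correction.
        log_alpha = (log_target(prop) - log_target(x)
                     + log_q(x, prop) - log_q(prop, x))
        if np.log(rng.uniform()) < log_alpha:
            x = prop
        samples[i] = x
    return samples
```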
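For the Lorentzian (Cauchy) case, one standard augmentation writes the Cauchy as a scale mixture of normals, after which both conditionals are standard distributions. A minimal sketch (the parameterisation below is the textbook Student-t construction with nu = 1, not necessarily the project's exact scheme):

```python
import numpy as np

def gibbs_cauchy(n_samples=5000, seed=0):
    """Data-augmented Gibbs sampler whose marginal for x is standard Cauchy.

    Augmentation: x | lam ~ N(0, 1/lam) with lam ~ Gamma(1/2, rate=1/2),
    whose marginal is Student-t with nu = 1, i.e. the Cauchy/Lorentzian.
    """
    rng = np.random.default_rng(seed)
    x = 0.0
    samples = np.empty(n_samples)
    for i in range(n_samples):
        # lam | x ~ Gamma(shape=1, rate=(1 + x^2)/2); numpy takes scale = 1/rate.
        lam = rng.gamma(shape=1.0, scale=2.0 / (1.0 + x**2))
        # x | lam ~ N(0, 1/lam)
        x = rng.normal(0.0, 1.0 / np.sqrt(lam))
        samples[i] = x
    return samples
```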
Insights and Conclusions
- All algorithms are effective for well-behaved distributions but face challenges with heavy tails or multi-modal targets.
- HMC generally outperformed the other methods in higher dimensions, but its performance diminished for heavy-tailed distributions.
- Advanced methods like MALA and Data Augmented Gibbs demonstrated superior performance for challenging cases, highlighting the importance of leveraging problem-specific features.
- Effective tuning of parameters such as the step size, guided by diagnostics like the acceptance rate, significantly impacts performance and must be approached methodically.
Notably, we learned that HMC has another hyperparameter, one that, to our knowledge, is not discussed in the literature: the scaling of the momentum noise. In heavy-tailed distributions, where the gradients may be small, the sampler can run away from the bulk of the distribution and fall into random-walk behaviour. This can be mitigated by scaling the noise by the modulus of the gradient, as sketched below.
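A minimal sketch of this idea, as a drop-in replacement for the usual momentum resampling step in the HMC sketch above (the helper name and the `floor` safeguard are illustrative additions; note that a position-dependent momentum scale formally changes the kinetic energy, so an exact sampler would need to account for it in the accept step, e.g. by treating the scale as a mass matrix):

```python
import numpy as np

def scaled_momentum(grad_log_target, x, rng, floor=1e-3):
    """Heuristic: scale the Gaussian momentum draw by |grad log p(x)|.

    In heavy tails the gradient is small, so the scaled momentum shrinks
    and the particle is discouraged from taking long ballistic excursions
    away from the bulk of the distribution. `floor` guards against a
    degenerate zero scale (an illustrative safeguard).
    """
    g = grad_log_target(np.atleast_1d(x))
    scale = max(np.linalg.norm(g), floor)
    return scale * rng.standard_normal(np.atleast_1d(x).size)
```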