Anyone who has used Hamiltonian Monte Carlo samplers has probably encountered the so-called tuning problem: in short, it is not at all obvious how to pick the algorithm’s various tuning parameters, including the path length, step size, and preconditioner. Worse still, the algorithm’s performance is quite sensitive to these choices. In the specific context of path length tuning, the No-U-Turn Sampler employs a remarkably clever way to locally adapt the path length by, eponymously, avoiding U-turns in the underlying leapfrog integration legs while preserving detailed balance. Thanks to this self-tuning feature, the No-U-Turn Sampler has become the default sampler in many probabilistic programming languages, including Stan, PyMC3, NIMBLE, Turing, and NumPyro. The invention of the No-U-Turn Sampler inevitably raises the question: can this local adaptation strategy be generalized to the algorithm’s other tuning parameters?
In a new preprint in collaboration with Bob Carpenter (Flatiron Institute) and Milo Marsden (Stanford), entitled “GIST: Gibbs self-tuning for locally adaptive Hamiltonian Monte Carlo”, we address this question by developing a new framework for self-tuning Hamiltonian Monte Carlo termed GIST. The resulting GIST samplers admit a relatively simple proof of correctness because they can be viewed as Gibbs samplers on an enlarged space that includes the tuning parameter of interest as an auxiliary variable. On this enlarged space, an enlargement of the target measure is defined simply by specifying the conditional distribution of the tuning parameter given the position and momentum variables. The GIST sampler then interleaves Gibbs refreshments of the momentum and the tuning parameter with a Metropolis-within-Gibbs step whose proposal is given by a measure-preserving involution on the enlarged space. If you’re thinking “Hey, wait a minute, isn’t that the essence of Hamiltonian Monte Carlo?”, then you’re spot on: the GIST sampler is a natural extension of the Hamiltonian Monte Carlo sampler that adaptively samples tuning parameters as well.
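To make the structure of a single GIST iteration concrete, here is a rough Python sketch. This is my own paraphrase of the recipe described above, not pseudocode from the paper: the helpers `log_density`, `sample_tuning`, `log_cond_tuning`, and `involution` are hypothetical placeholders, a standard Gaussian kinetic energy is assumed, and the acceptance ratio is simply the ratio of enlarged-space densities because the involution is measure-preserving.

```python
import numpy as np

def gist_step(theta, log_density, sample_tuning, log_cond_tuning, involution,
              rng=np.random.default_rng()):
    """One iteration of a generic GIST sampler (illustrative sketch)."""
    # Gibbs refreshment of the momentum (standard Gaussian kinetic energy assumed).
    rho = rng.standard_normal(theta.shape)
    # Gibbs refreshment of the tuning parameter from its conditional given (theta, rho).
    alpha = sample_tuning(theta, rho, rng)
    # Metropolis-within-Gibbs proposal: apply the measure-preserving involution
    # (e.g. some number of leapfrog steps followed by a momentum flip).
    theta_star, rho_star, alpha_star = involution(theta, rho, alpha)

    def log_joint(th, r, a):
        # Enlarged target: target density x Gaussian momentum x tuning-parameter conditional.
        return log_density(th) - 0.5 * np.dot(r, r) + log_cond_tuning(a, th, r)

    # Accept or reject; no Jacobian correction is needed for a measure-preserving involution.
    log_accept = log_joint(theta_star, rho_star, alpha_star) - log_joint(theta, rho, alpha)
    if np.log(rng.uniform()) < log_accept:
        return theta_star
    return theta
```

Dropping the tuning parameter and its conditional from the sketch recovers the usual Hamiltonian Monte Carlo accept/reject step, which is exactly the sense in which GIST extends the standard sampler.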
More importantly, by carefully specifying (i) the conditional distribution of the path length given the position and momentum, and (ii) a suitable measure-preserving involution on the enlarged space, we show that the No-U-Turn Sampler, the Apogee-to-Apogee Path Sampler, and randomized Hamiltonian Monte Carlo are all special cases of the GIST sampler. This unifying framework immediately provides a systematic way to study (e.g., prove the correctness of) these seemingly different locally adaptive Hamiltonian Monte Carlo samplers. In addition to being a useful theoretical tool, the GIST sampling framework also immediately opens the door not only to (i) simpler alternatives to the No-U-Turn Sampler for locally adapting the path length, which the paper evaluates in detail (the companion code is linked here), but also to (ii) local adaptation of the algorithm’s other tuning parameters, which will be the subject of future work, so stay tuned!
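To give a flavor of what such a simpler path-length alternative could look like inside the GIST template, here is a rough sketch in which the number of leapfrog steps is drawn uniformly between one and the number of steps to the first U-turn. To be clear, this particular conditional, the dot-product U-turn criterion, and the helper names are my own illustrative choices for this post, not necessarily the exact variants evaluated in the paper.

```python
import numpy as np

def leapfrog(theta, rho, grad_log_density, step_size):
    # One leapfrog step with a standard Gaussian kinetic energy.
    rho = rho + 0.5 * step_size * grad_log_density(theta)
    theta = theta + step_size * rho
    rho = rho + 0.5 * step_size * grad_log_density(theta)
    return theta, rho

def steps_to_u_turn(theta, rho, grad_log_density, step_size, max_steps=1000):
    # Count leapfrog steps until the trajectory starts heading back toward its start.
    theta0 = theta
    for n in range(1, max_steps + 1):
        theta, rho = leapfrog(theta, rho, grad_log_density, step_size)
        if np.dot(rho, theta - theta0) < 0.0:
            return n
    return max_steps

def path_length_gist_step(theta, log_density, grad_log_density, step_size,
                          rng=np.random.default_rng()):
    rho = rng.standard_normal(theta.shape)
    # Tuning-parameter conditional: number of leapfrog steps drawn uniformly
    # from {1, ..., U(theta, rho)}, where U counts steps to the first U-turn.
    u_fwd = steps_to_u_turn(theta, rho, grad_log_density, step_size)
    n_steps = int(rng.integers(1, u_fwd + 1))
    theta_star, rho_star = theta, rho
    for _ in range(n_steps):
        theta_star, rho_star = leapfrog(theta_star, rho_star, grad_log_density, step_size)
    rho_star = -rho_star  # momentum flip makes the proposal an involution
    # Reverse U-turn count; the reverse conditional puts mass 1/u_bwd on each n <= u_bwd.
    u_bwd = steps_to_u_turn(theta_star, rho_star, grad_log_density, step_size)
    log_accept = (log_density(theta_star) - 0.5 * np.dot(rho_star, rho_star) - np.log(u_bwd)
                  - log_density(theta) + 0.5 * np.dot(rho, rho) + np.log(u_fwd))
    if n_steps <= u_bwd and np.log(rng.uniform()) < log_accept:
        return theta_star
    return theta
```

The `u_fwd / u_bwd` factor in the acceptance ratio is exactly the ratio of tuning-parameter conditionals demanded by the GIST construction, and the proposal is rejected outright whenever the chosen number of steps exceeds the reverse U-turn count, since the reverse conditional assigns it zero probability.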