In the last five years, my research has contributed to Markov chain Monte Carlo, Markov chain mixing times, couplings, stochastic differential equations, and piecewise deterministic Markov processes, among other topics. The stochastic processes considered in my research display characteristics that frequently arise in applications, such as high dimensionality, multimodality, and multiscale behavior. This work has led to fruitful collaborations with computational statisticians (notably Bob Carpenter in the Center for Computational Mathematics at the Flatiron Institute) and computational chemists (notably Thomas F. Miller III in the Department of Chemistry at Caltech).

For a complete list of my publications, see my Google Scholar profile page. My research interests are also reflected on my MathOverflow profile page.

Tuning Hamiltonian Monte Carlo Samplers

The No-U-Turn Sampler employs a remarkably clever way to locally adapt the path length tuning parameter in Hamiltonian Monte Carlo (HMC): it eponymously avoids U-turns in the underlying leapfrog integration legs while, at the same time, preserving detailed balance. Thanks to this self-tuning feature, the No-U-Turn Sampler has become the workhorse sampler in many probabilistic programming languages, including Stan, PyMC3, NIMBLE, Turing, and NumPyro. Its invention inevitably raises the question: can this local adaptation strategy be generalized to the algorithm's other tuning parameters? In Bou-Rabee, Carpenter & Marsden (2024), we address this question by developing a simple and flexible framework for self-tuning HMC termed GIST, signifying Gibbs self-tuning for HMC. Remarkably, this framework generalizes and unifies almost all locally adaptive HMC samplers in current use, including the No-U-Turn Sampler, the Apogee-to-Apogee Path Sampler, and randomized Hamiltonian Monte Carlo. This unification points to an interesting universality of GIST samplers: correctness of a locally adaptive HMC sampler seems to nudge it toward the form of a GIST sampler. Beyond this unification, the GIST sampling framework unlocks (i) simpler alternatives to the No-U-Turn Sampler for path length tuning and (ii) local adaptation of HMC's other parameters, which will be the subject of forthcoming work.
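As a minimal illustration of randomized path-length tuning (a toy instance of the simpler alternatives mentioned above, not the GIST framework itself), the sketch below implements HMC in which the number of leapfrog steps is redrawn at each iteration; since the Metropolis correction is applied for each fixed draw, the target is preserved. All names and parameter values are illustrative.

```python
import numpy as np

def leapfrog(q, p, grad_U, eps, L):
    """L leapfrog steps for the Hamiltonian H(q, p) = U(q) + |p|^2 / 2."""
    p = p - 0.5 * eps * grad_U(q)
    for _ in range(L - 1):
        q = q + eps * p
        p = p - eps * grad_U(q)
    q = q + eps * p
    p = p - 0.5 * eps * grad_U(q)
    return q, p

def hmc_random_steps(q, U, grad_U, eps, L_max, n_iter, rng):
    """HMC with the number of leapfrog steps redrawn uniformly each iteration."""
    samples = []
    for _ in range(n_iter):
        p = rng.standard_normal(q.shape)
        L = int(rng.integers(1, L_max + 1))   # fresh path-length draw
        q_new, p_new = leapfrog(q, p, grad_U, eps, L)
        # Metropolis accept-reject: exact invariance holds for each fixed L
        dH = (U(q_new) + 0.5 * p_new @ p_new) - (U(q) + 0.5 * p @ p)
        if np.log(rng.uniform()) < -dH:
            q = q_new
        samples.append(q)
    return np.array(samples)
```

Run on a standard Gaussian target, the sampled moments match the target without any hand-tuning of a single fixed path length.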

Hamiltonian Monte Carlo Mixing Time

The Hamiltonian Monte Carlo (HMC) sampler is based on a combination of Hamiltonian dynamics and momentum randomizations. The Hamiltonian dynamics is discretized, and the discretization bias can either be borne or corrected by a Metropolis-Hastings accept-reject step. Despite its empirical success, until a few years ago there were almost no convergence bounds for HMC samplers. This situation has changed in recent years, as approaches to quantifying convergence to equilibrium based on coupling, conductance, and hypocoercivity have been developed. In a series of works (see below), we developed a variety of novel coupling approaches to prove mixing and complexity upper bounds for unadjusted and adjusted HMC in representative models that display high dimensionality and non-convexity. These couplings offer significant flexibility for extensions, including use as a potential diagnostic of HMC convergence.
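To give a concrete sense of how a coupling can act as an empirical convergence diagnostic, here is a sketch of the simplest possible construction, a synchronous coupling of two adjusted HMC chains that share momentum draws and accept-reject uniforms; the couplings in the papers are considerably more refined. All names and parameters are illustrative.

```python
import numpy as np

def leapfrog(q, p, grad_U, eps, L):
    """L leapfrog steps for H(q, p) = U(q) + |p|^2 / 2."""
    p = p - 0.5 * eps * grad_U(q)
    for _ in range(L - 1):
        q = q + eps * p
        p = p - eps * grad_U(q)
    q = q + eps * p
    p = p - 0.5 * eps * grad_U(q)
    return q, p

def hmc_step(q, p, u, U, grad_U, eps, L):
    """One adjusted HMC step with externally supplied randomness (p and u)."""
    q_new, p_new = leapfrog(q, p, grad_U, eps, L)
    dH = (U(q_new) + 0.5 * p_new @ p_new) - (U(q) + 0.5 * p @ p)
    return q_new if np.log(u) < -dH else q

def coupled_distances(q1, q2, U, grad_U, eps, L, n_iter, rng):
    """Drive two chains with identical momenta and uniforms (synchronous
    coupling) and record |q1 - q2|; geometric decay suggests fast mixing."""
    dists = []
    for _ in range(n_iter):
        p = rng.standard_normal(q1.shape)   # shared momentum refreshment
        u = rng.uniform()                   # shared accept-reject uniform
        q1 = hmc_step(q1, p, u, U, grad_U, eps, L)
        q2 = hmc_step(q2, p, u, U, grad_U, eps, L)
        dists.append(float(np.linalg.norm(q1 - q2)))
    return dists
```

For a Gaussian target the recorded distance contracts rapidly; on multimodal or non-convex targets, stalls in the decay of this distance flag slow mixing.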

Randomized Time Integrators for Hamiltonian MCMC

Measure-preserving ODEs, SDEs, and PDMPs play an important role in constructing non-reversible MCMC kernels, and among time integrators for the corresponding flows, randomized time integrators stand out in terms of their complexity properties. The essential idea behind these randomized time integrators is to replace the deterministic quadrature rule used in each time integration step with a suitable Monte Carlo integration rule. The resulting randomized time integrator not only requires less regularity of the underlying ODE, SDE, or PDMP, but remarkably, also leads to significantly better asymptotic bias properties for the corresponding Markov kernel, and hence better complexity. Variants of these schemes that are Metropolis-adjustable were developed in Bou-Rabee & Marsden (2022). Moreover, the effectiveness of these schemes for mean-field models was demonstrated in Bou-Rabee & Schuh (2023), and under minimal regularity of the underlying drift coefficients, 2.5/3.5-order randomized Runge-Kutta-Nyström (rRKN) schemes were developed in Bou-Rabee & Kleppe (2023).
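The idea of swapping a deterministic quadrature rule for a one-sample Monte Carlo rule can be sketched on a plain ODE dq/dt = f(q): rather than evaluating f at a fixed quadrature node, each step evaluates it at a uniformly random time within the step, so the quadrature is unbiased in expectation. This toy scheme only illustrates the randomization idea; it is not the rRKN schemes of the papers.

```python
import numpy as np

def randomized_euler(f, q0, h, n_steps, rng):
    """Integrate dq/dt = f(q): per step, the integral of f over [t, t+h] is
    estimated by h * f evaluated at a uniformly random intermediate time."""
    q = q0
    for _ in range(n_steps):
        u = rng.uniform()              # random quadrature node in (0, 1)
        q_stage = q + u * h * f(q)     # Euler predictor to the random time
        q = q + h * f(q_stage)         # step using the randomized quadrature
    return q
```

Because the random node is unbiased for the step integral, the scheme needs only minimal smoothness of f, in the spirit of the regularity gains described above.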

Piecewise Deterministic Markov Processes

Andersen dynamics is a piecewise deterministic Markov process that follows Hamiltonian dynamics and randomizes the velocities of selected components after exponential waiting times. It is closely related to the kinetic Langevin diffusion, except that it has neither diffusion nor explicit dissipation and involves a non-local jump operator. When the velocities of all components are randomized at once, Andersen dynamics reduces to randomized Hamiltonian Monte Carlo. Building on the geometric ergodicity verification in Bou-Rabee & Sanz-Serna (2017), we used a maximal coupling of the underlying velocity randomizations to obtain an optimal dimension dependence for Andersen dynamics with weak interactions in Bou-Rabee & Eberle (2022).
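A bare-bones simulation of Andersen dynamics might look as follows, assuming a velocity-Verlet discretization of the Hamiltonian flow between jumps: each coordinate carries its own exponential clock, and only a coordinate whose clock fires has its velocity refreshed from a standard normal. Names and parameter values are illustrative.

```python
import numpy as np

def andersen_dynamics(q, v, grad_U, rate, T, dt, rng):
    """Hamiltonian flow (velocity Verlet) punctuated by partial velocity
    randomizations at per-coordinate exponential waiting times."""
    d = q.size
    clocks = rng.exponential(1.0 / rate, size=d)  # next refresh time, per coord
    t, samples = 0.0, []
    while t < T:
        # one Verlet step of the deterministic Hamiltonian dynamics
        v = v - 0.5 * dt * grad_U(q)
        q = q + dt * v
        v = v - 0.5 * dt * grad_U(q)
        t += dt
        # fire elapsed clocks: refresh only the selected coordinates
        for i in np.where(clocks <= t)[0]:
            v[i] = rng.standard_normal()
            clocks[i] = t + rng.exponential(1.0 / rate)
        samples.append(q)
    return np.array(samples)
```

For a quadratic potential the long-time position average recovers the Gaussian stationary statistics, up to the small Verlet discretization bias.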

Path Integral Molecular Dynamics

Path Integral Molecular Dynamics (PIMD) is based on Feynman's path-integral formulation of quantum statistical mechanics and provides a practical tool to compute static and dynamic quantum statistics, especially when nuclear quantum effects are relevant. Applications include calculations of chemical reaction rates, diffusion coefficients, and absorption spectra. The appeal of PIMD is that it uses a classical ring-polymer-bead system to solve for properties of the quantum system.

At the heart of every numerical integrator for PIMD is a free ring-polymer update. Due to the fast harmonic motions present in the free ring-polymer system, a strongly stable approximation to this update is essential. Another requirement is that the integration scheme remain accurate in the infinite-bead limit. Unfortunately, existing numerical integration schemes for PIMD (including essentially all existing ring-polymer molecular dynamics (RPMD) and thermostatted RPMD (T-RPMD) methods) have the property that, for a given MD timestep, the overlap between the exact ring-polymer Boltzmann distribution and the sampled approximation vanishes in the continuum, or infinite-bead, limit.

A key outcome of the papers referenced below is that these and other problems can be avoided by introducing strongly stable and “dimension-free” numerical integration schemes, for which the sampled ring-polymer position distribution retains non-zero overlap with the exact distribution in the infinite-bead limit in relevant models; these conclusions are borne out numerically for quantum liquid water. Strong stability and dimension freedom are achieved by a very simple “Cayley” modification of the standard “BAOAB” splitting scheme from molecular dynamics, and the resulting “BCOCB” methods allow remarkably accurate and stable ergodic time-averaging; we therefore expect them to be widely used in future PIMD simulations.
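The essence of the Cayley modification can be seen on a single free ring-polymer normal mode with frequency omega, whose exact update is a rotation in phase space: replacing the matrix exponential with a Cayley transform yields an update that is strongly stable for every timestep, yet agrees with the exact rotation to high accuracy when h*omega is small. This is only a sketch on one mode, not the full BCOCB integrator.

```python
import numpy as np

def exact_mode_update(h, omega):
    """Exact free ring-polymer update for one normal mode (omega > 0):
    the rotation exp(h * A) with A = [[0, 1], [-omega^2, 0]]."""
    c, s = np.cos(h * omega), np.sin(h * omega)
    return np.array([[c, s / omega],
                     [-omega * s, c]])

def cayley_mode_update(h, omega):
    """Cayley approximation cay(h A) = (I - h A / 2)^{-1} (I + h A / 2).
    Its eigenvalues have modulus one for every h, so the update is
    strongly stable even for arbitrarily stiff modes."""
    A = np.array([[0.0, 1.0],
                  [-omega ** 2, 0.0]])
    I = np.eye(2)
    return np.linalg.solve(I - 0.5 * h * A, I + 0.5 * h * A)
```

The Cayley update has unit determinant and unit-modulus eigenvalues for any h*omega, whereas a Verlet-style approximation of the free update loses stability once h*omega exceeds 2; this is the stability property exploited in the BCOCB schemes.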

Numerical Solution of Stochastic Differential Equations

Sticky diffusion processes arise in a variety of models of physical and natural processes that are naturally described by a set of variables whose dimension can change. However, until a few years ago, there were no methods to simulate them. This situation has changed. With Eric Vanden-Eijnden (NYU), we developed a discretization of SDEs in space, rather than time, leading to a Markov jump process (MJP) approximation. Since the increments of the MJP can be uniformly bounded, these algorithms are numerically stable by construction. Another nice feature of the MJP is that it naturally handles boundary conditions (e.g. positivity constraints in population or interest rate models), which is far from obvious with standard methods. With Miranda Holmes-Cerfon (UBC), we showed how to naturally incorporate “sticky” boundary conditions into the MJP and demonstrated that the MJP can accelerate simulations considerably, in some cases by orders of magnitude. Beyond sticky diffusions, this MJP framework makes it possible to approximate SDEs and SPDEs with trickier characteristics (such as interface conditions), even in situations where the operators of the underlying processes are not yet well understood.
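A one-dimensional sketch of the space-discretization idea, under the assumption that upwinded jump rates are chosen to match a finite-difference discretization of the generator of dX = b(X) dt + sigma dW: the process hops between neighboring grid points after exponential waiting times, so every increment is bounded by the grid spacing, which is the source of the built-in numerical stability described above. Names and rate choices are illustrative; this sketch omits the sticky boundary conditions.

```python
import numpy as np

def mjp_step(x, b, sigma, dx, rng):
    """One Gillespie step of the grid MJP for dX = b(X) dt + sigma dW.
    Upwinded rates keep both jump rates nonnegative and reproduce the
    generator (sigma^2/2) f'' + b f' to first order in dx."""
    up = sigma ** 2 / (2 * dx ** 2) + max(b(x), 0.0) / dx     # rate of x -> x + dx
    dn = sigma ** 2 / (2 * dx ** 2) + max(-b(x), 0.0) / dx    # rate of x -> x - dx
    total = up + dn
    tau = rng.exponential(1.0 / total)                        # waiting time
    x_new = x + dx if rng.uniform() < up / total else x - dx
    return x_new, tau

def simulate_mjp(x0, b, sigma, dx, T, rng):
    """Simulate the jump process up to time T and return the visited states."""
    x, t, path = x0, 0.0, [x0]
    while t < T:
        x, tau = mjp_step(x, b, sigma, dx, rng)
        t += tau
        path.append(x)
    return np.array(path)
```

Every move is exactly one grid cell, so the scheme cannot blow up regardless of how stiff the drift is, and boundary behavior can be imposed simply by modifying the rates at boundary grid points.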