I specialize in theoretical and computational probability, with a focus on topics such as Markov Chain Monte Carlo, Markov Chain Mixing Time, Couplings, Stochastic Differential Equations, and Piecewise-deterministic Markov Processes. My research tackles questions about complex stochastic processes that arise in high-dimensional, multi-modal, and multi-scale settings. These challenges carry far-reaching implications for both advancing theoretical understanding and solving practical problems.

I am especially interested in how randomness interacts with high-dimensional spaces, often revealing elegant, universal patterns, e.g., concentration of measure phenomena, which play a critical role in understanding the behavior of MCMC methods in these settings. Through collaborations with leading experts, including my current role as a Visiting Scholar with Bob Carpenter at the Flatiron Institute, I aim to connect cutting-edge theory with real-world applications in computational probability, driving progress and innovation in the field.

For a complete list of my publications, please visit my Google Scholar profile. You can also explore my contributions to discussions on related mathematical topics on my MathOverflow profile. Additionally, you can watch my recent talk on the reversibility and mixing time of the No-U-Turn Sampler, presented at the Online Seminar on Monte Carlo.

Mixing Time of the No-U-Turn Sampler

The No-U-Turn Sampler (NUTS) is the default sampler for continuously differentiable densities in probabilistic programming languages like Stan (with over 100K users), PyMC3, NIMBLE, Turing, and NumPyro. However, due to its recursive architecture, even proving its reversibility has been a long-standing challenge. For a concise proof based on interpreting NUTS as an auxiliary variable method, see Bou-Rabee, Carpenter & Marsden (2024).

In Bou-Rabee & Oberdörster (2024), we prove that the mixing time of NUTS, when initialized in the concentration region of the canonical Gaussian measure, scales as d^{1/4}, up to logarithmic factors, where d is the dimension. This is the first mixing time guarantee for NUTS, and the scaling is expected to be sharp. The proof is based on a coupling argument that leverages the geometric structure of the target distribution: concentration of measure makes NUTS's locally adapted transitions nearly uniform, with high probability. This uniformity is formalized by interpreting NUTS as an accept/reject Markov chain, whose more uniform accept chain has analytically tractable mixing properties. Our analysis additionally uncovers a previously unnoticed issue with the path length adaptation procedure of NUTS, related to looping behavior, which we address in detail.
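
To make the path length adaptation concrete, here is a minimal Python sketch of the leapfrog step and the U-turn stopping condition that drives NUTS's locally adapted trajectory lengths. It is only illustrative: the full recursive doubling scheme and the accept/reject structure analyzed in the paper are omitted, and the function names are mine, not Stan's.

```python
import numpy as np

def leapfrog(q, p, grad_U, step_size):
    """One leapfrog step for H(q, p) = U(q) + |p|^2 / 2."""
    p = p - 0.5 * step_size * grad_U(q)
    q = q + step_size * p
    p = p - 0.5 * step_size * grad_U(q)
    return q, p

def u_turn(q_minus, q_plus, p_minus, p_plus):
    """NUTS stopping condition: the trajectory's two endpoints
    have begun to move toward each other."""
    dq = q_plus - q_minus
    return np.dot(dq, p_minus) < 0 or np.dot(dq, p_plus) < 0
```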

Advancing the No-U-Turn Sampler

The No-U-Turn Sampler (NUTS) employs a remarkably clever way to locally adapt the path length in Hamiltonian Monte Carlo (HMC), effectively avoiding U-turns during leapfrog integration while preserving detailed balance. This self-tuning capability has established NUTS as the default MCMC method for continuously differentiable densities in probabilistic programming languages such as Stan, PyMC3, NIMBLE, Turing, and NumPyro. Given its widespread success, a natural question arises: can this local adaptation strategy be extended to other HMC tuning parameters?

In Bou-Rabee, Carpenter & Marsden (2024), we address this question by developing a flexible framework for self-tuning HMC called GIST (Gibbs self-tuning for HMC). GIST not only generalizes the adaptation strategy of NUTS but also unifies a broad class of existing locally adaptive HMC samplers, including NUTS, the Apogee-to-Apogee Path Sampler, multinomial HMC, and randomized HMC. What makes this unification particularly compelling is that it reveals an underlying universality: when reversibility is preserved, the transition step of any locally adaptive HMC sampler naturally aligns with the transition step of a GIST sampler. Beyond this unification, the GIST framework unlocks both (i) simpler alternatives to the No-U-Turn Sampler for path length tuning and (ii) local adaptation of NUTS's other parameters. This makes GIST a powerful and versatile extension of NUTS, with broad implications for the future development of sampling algorithms.
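The sketch below gives a schematic reading of the GIST recipe for the special case of tuning the number of leapfrog steps; `sample_steps` and `steps_logpdf` are hypothetical placeholders for a user-chosen conditional distribution over the tuning parameter, and the `leapfrog` function from the NUTS sketch above is reused. See the paper for the precise construction and the general acceptance probability.

```python
import numpy as np

def gist_step(q, U, grad_U, step_size, sample_steps, steps_logpdf):
    """One GIST-style transition (schematic). `sample_steps(q, p)` draws the
    number of leapfrog steps given the current state; `steps_logpdf(L, q, p)`
    is the log-density of that draw. Both are user-supplied and define the
    self-tuning rule."""
    p = np.random.standard_normal(q.shape)   # full momentum refresh
    L = sample_steps(q, p)                   # Gibbs step over the tuning parameter
    q_new, p_new = q, p
    for _ in range(L):
        q_new, p_new = leapfrog(q_new, p_new, grad_U, step_size)
    # Metropolis-Hastings ratio, including the tuning-parameter kernel
    # evaluated in the reversed direction (momentum flip) at the proposal.
    log_ratio = (U(q) - U(q_new)
                 + 0.5 * (np.dot(p, p) - np.dot(p_new, p_new))
                 + steps_logpdf(L, q_new, -p_new) - steps_logpdf(L, q, p))
    return q_new if np.log(np.random.rand()) < log_ratio else q
```

Choosing `sample_steps` independently of the state recovers plain HMC with a randomized path length, while state-dependent choices yield NUTS-like adaptivity.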

Hamiltonian Monte Carlo Mixing Time

The Hamiltonian Monte Carlo (HMC) sampler combines Hamiltonian dynamics with momentum randomization. The Hamiltonian dynamics are discretized, and the resulting discretization bias is either accepted or corrected by a Metropolis-Hastings accept-reject step. Despite HMC's empirical success, convergence bounds for HMC samplers were, until recently, scarce. This has changed over the past few years with the development of new methods for quantifying convergence to equilibrium, including approaches based on coupling, conductance, and hypocoercivity. In a series of works (referenced below), we introduced novel coupling techniques that establish mixing and complexity upper bounds for both unadjusted and adjusted HMC, particularly in high-dimensional, non-convex models. These coupling techniques offer significant flexibility for further development, including potential applications as diagnostics to assess HMC convergence.
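As an illustration of the diagnostic idea, here is a minimal sketch of a synchronous coupling of two adjusted HMC chains, sharing the momentum draw and the accept/reject uniform, and reusing the `leapfrog` step above. The couplings used in the papers are substantially more refined; this version only conveys the mechanism.

```python
import numpy as np

def coupled_hmc_step(q1, q2, U, grad_U, step_size, n_steps):
    """One synchronously coupled step of two adjusted HMC chains: both
    chains share the momentum draw and the accept/reject uniform."""
    p = np.random.standard_normal(q1.shape)  # shared momentum refresh
    u = np.random.rand()                     # shared accept/reject uniform
    new_states = []
    for q in (q1, q2):
        qn, pn = q, p
        for _ in range(n_steps):
            qn, pn = leapfrog(qn, pn, grad_U, step_size)
        log_ratio = U(q) - U(qn) + 0.5 * (np.dot(p, p) - np.dot(pn, pn))
        new_states.append(qn if np.log(u) < log_ratio else q)
    return new_states[0], new_states[1]
```

Tracking the distance between the two chains over many such steps gives a rough empirical proxy for convergence.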

Randomized Time Integrators for Hamiltonian MCMC

Measure-preserving ODEs, SDEs, and PDMPs are key ingredients in constructing MCMC kernels. Among the time integrators for the corresponding flows, randomized time integrators have shown distinct advantages in terms of complexity. The core idea behind these randomized time integrators is to replace the deterministic quadrature rules typically used in each time integration step with a Monte Carlo integration rule. This approach not only reduces the regularity requirements on the underlying ODE, SDE, or PDMP, but also significantly improves the asymptotic bias properties of the resulting Markov kernel, and thus its complexity. Variants of these schemes that are Metropolis-adjustable were developed in Bou-Rabee & Marsden (2022). The effectiveness of these randomized integrators for mean-field models was demonstrated in Bou-Rabee & Schuh (2023). Additionally, under minimal regularity conditions on the drift coefficients, randomized Runge-Kutta-Nyström (rRKN) schemes of order 2.5 were introduced in Bou-Rabee & Kleppe (2023).
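A minimal example of the core idea, for a generic ODE y' = f(y): the deterministic midpoint rule evaluates f at the fixed node s = 1/2, while a randomized counterpart evaluates f at a uniformly random node, giving an unbiased Monte Carlo estimate of the time integral along the internal Euler predictor. This sketch only illustrates the mechanism; the rRKN schemes in the papers are built specifically for Hamiltonian and kinetic dynamics.

```python
import numpy as np

def randomized_midpoint_step(y, f, h):
    """One step of a randomized midpoint rule for y' = f(y): the time
    integral of f over the step is estimated by a single evaluation at a
    uniformly random intermediate node, rather than a fixed quadrature node."""
    u = np.random.rand()           # random quadrature node in [0, 1]
    y_mid = y + u * h * f(y)       # Euler predictor to the random node
    return y + h * f(y_mid)        # Monte Carlo estimate of the update
```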

Piecewise Deterministic Markov Processes

Andersen dynamics describes a piecewise deterministic Markov process that follows Hamiltonian dynamics, with the velocity of selected components randomized after exponentially distributed waiting times. It shares similarities with the kinetic Langevin diffusion but differs in that it lacks both diffusion and explicit dissipation, and instead involves a non-local jump operator. When the velocities of all components are randomized, it reduces to Randomized Hamiltonian Monte Carlo. In Bou-Rabee & Sanz-Serna (2017), we verified geometric ergodicity for Randomized Hamiltonian Monte Carlo, and in Bou-Rabee & Eberle (2022), we used a maximal coupling of the velocity randomizations to achieve optimal dimension dependence in Wasserstein convergence bounds for Andersen dynamics with weak interactions, without requiring global convexity of the potential energy.
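For intuition, here is a minimal sketch of the full-refreshment case, i.e., unadjusted Randomized HMC: the process follows Hamiltonian dynamics for exponentially distributed durations, with a complete velocity refresh at each event. The `leapfrog` step from the sketch above stands in for the exact flow; Andersen dynamics would instead refresh only randomly selected velocity components.

```python
import numpy as np

def randomized_hmc(q, grad_U, step_size, refresh_rate, t_final):
    """Unadjusted Randomized HMC: Hamiltonian motion for exponentially
    distributed durations, with a complete velocity refresh at each event.
    Leapfrog stands in for the exact Hamiltonian flow."""
    t = 0.0
    while t < t_final:
        p = np.random.standard_normal(q.shape)        # velocity randomization
        duration = np.random.exponential(1.0 / refresh_rate)
        for _ in range(max(1, int(duration / step_size))):
            q, p = leapfrog(q, p, grad_U, step_size)
        t += duration
    return q
```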

Path Integral Molecular Dynamics

Path Integral Molecular Dynamics (PIMD) leverages Feynman’s path-integral formulation of quantum statistical mechanics to compute static and dynamic quantum statistics, especially when quantum nuclear effects are significant. Applications include calculating chemical reaction rates, diffusion coefficients, and absorption spectra. PIMD’s appeal lies in its use of a classical ring-polymer-bead system to approximate quantum properties. Central to PIMD simulations is the free ring-polymer update, but the fast harmonic motions within the ring polymer necessitate a strongly stable approximation. Additionally, the integration scheme must remain accurate as the number of beads approaches infinity. Unfortunately, existing methods, including most ring-polymer molecular dynamics (RPMD) and thermostatted RPMD (T-RPMD) schemes, fail in this regard: the overlap between the exact and sampled ring-polymer Boltzmann distributions tends to zero in the infinite bead limit. A key result of the papers referenced below is the development of strongly stable, “dimension-free” numerical integration schemes that preserve non-zero overlap with the exact distribution in the infinite bead limit. These methods, validated numerically on quantum liquid water, enable highly accurate and stable ergodic time-averaging. The approach involves a simple “Cayley” modification of standard “BAOAB” splitting schemes from molecular dynamics, leading to new “BCOCB” methods which are expected to be widely adopted for future PIMD simulations.
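To illustrate the Cayley modification, here is a minimal sketch for a single free ring-polymer normal mode with frequency omega: the exact harmonic update exp(hA) (the "A" step in BAOAB) is replaced by the Cayley transform cay(hA) = (I - hA/2)^{-1}(I + hA/2) (the "C" step in BCOCB), which remains strongly stable for every step size. The full schemes act mode-by-mode in the ring-polymer normal-mode representation.

```python
import numpy as np

def exact_mode_update(h, omega):
    """Exact free ring-polymer update exp(h A) for one normal mode,
    i.e., the 'A' step of BAOAB for q'' = -omega^2 q."""
    return np.array([[np.cos(h * omega), np.sin(h * omega) / omega],
                     [-omega * np.sin(h * omega), np.cos(h * omega)]])

def cayley_mode_update(h, omega):
    """Cayley approximation cay(h A) = (I - h A / 2)^{-1} (I + h A / 2)
    of the same update, i.e., the 'C' step of BCOCB; strongly stable
    for every step size h and mode frequency omega."""
    A = np.array([[0.0, 1.0], [-omega ** 2, 0.0]])
    I = np.eye(2)
    return np.linalg.solve(I - 0.5 * h * A, I + 0.5 * h * A)
```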

Numerical Solution of Stochastic Differential Equations

Sticky diffusion processes frequently arise in models of physical and natural phenomena where the system’s variables can change dimension. However, until recently, there were no effective methods to simulate these processes. This changed in joint work with Eric Vanden-Eijnden (NYU), where we developed a method that discretizes SDEs in space rather than time, resulting in a Markov jump process (MJP) approximation. Because the increments of the MJP can be uniformly bounded, the algorithm is numerically stable by construction. Additionally, MJPs naturally handle boundary conditions, such as positivity constraints in population or interest rate models, which are challenging for standard methods. In collaboration with Miranda Holmes-Cerfon (UBC), we extended this approach to include “sticky” boundary conditions, demonstrating that MJPs can dramatically speed up simulations, sometimes by orders of magnitude. Beyond sticky diffusions, MJPs offer a practical approach for approximating SDEs and SPDEs with complex conditions, such as interface conditions, and can be applied to processes whose underlying differential operators are not yet fully understood, making them a versatile tool for simulating a wide range of stochastic processes.
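The following sketch shows the basic construction in one dimension, away from boundaries: jump rates to neighboring grid points come from an upwinded finite-difference approximation of the generator of dX = b(X) dt + sigma(X) dW, and the resulting jump process is simulated exactly with the Gillespie algorithm. Sticky or reflecting boundary conditions would enter through modified rates at boundary points; the function names here are illustrative.

```python
import numpy as np

def mjp_path(x0, b, sigma, dx, t_final):
    """Approximate dX = b(X) dt + sigma(X) dW by a Markov jump process on a
    grid of spacing dx, simulated exactly with the Gillespie algorithm.
    Upwinded finite-difference rates match the generator, and increments are
    bounded by dx, so the scheme is stable by construction."""
    x, t, path = x0, 0.0, [(0.0, x0)]
    while t < t_final:
        diffusive = sigma(x) ** 2 / (2.0 * dx ** 2)
        drift = b(x)
        rate_right = diffusive + max(drift, 0.0) / dx
        rate_left = diffusive + max(-drift, 0.0) / dx
        total = rate_right + rate_left
        t += np.random.exponential(1.0 / total)   # exponential holding time
        x += dx if np.random.rand() < rate_right / total else -dx
        path.append((t, x))
    return path
```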