I specialize in theoretical and computational probability, with a focus on topics such as Markov Chain Monte Carlo, Markov Chain Mixing Time, Couplings, Stochastic Differential Equations, and Piecewise-deterministic Markov Processes. My research tackles questions about complex stochastic processes that arise in high-dimensional, multi-modal, and multi-scale settings. These challenges carry far-reaching implications for both advancing theoretical understanding and solving practical problems.

I am especially interested in how randomness interacts with high-dimensional spaces, often revealing elegant, universal patterns, e.g., concentration of measure phenomena, which play a critical role in understanding the behavior of MCMC methods in these settings. Through collaborations with leading experts, including my current role as a Visiting Scholar with Bob Carpenter at the Flatiron Institute, I aim to connect cutting-edge theory with real-world applications in computational probability, driving progress and innovation in the field.

For a complete list of my publications, please visit my Google Scholar profile. You can also explore my contributions to discussions on related mathematical topics on my MathOverflow profile. Additionally, you can watch my recent talk on the reversibility and mixing time of the No-U-Turn Sampler, presented at the Online Seminar on Monte Carlo.

Mixing Time of the No-U-Turn Sampler

The No-U-Turn Sampler (NUTS) is the default sampler for continuously differentiable densities in probabilistic programming languages like Stan (with over 100K users), PyMC3, NIMBLE, Turing, and NumPyro. However, due to its recursive architecture, even proving its reversibility has been a long-standing challenge. For a concise proof based on interpreting NUTS as an auxiliary variable method, see Bou-Rabee, Carpenter & Marsden (2024).

In Bou-Rabee & Oberdörster (2024), we prove that the mixing time of NUTS, when initialized in the concentration region of the canonical Gaussian measure, scales as d^{1/4}, up to logarithmic factors, where d is the dimension. This is the first mixing time guarantee for NUTS, and the scaling is expected to be sharp. The proof is based on a coupling argument that leverages the geometric structure of the target distribution: concentration of measure makes NUTS's locally adapted transitions nearly uniform, with high probability. This uniformity is formalized by interpreting NUTS as an accept/reject Markov chain, whose more uniform accept chain has analytically tractable mixing properties. Our analysis additionally uncovers a previously unnoticed issue with the path length adaptation procedure of NUTS, related to looping behavior, which we address in detail.
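
To make the path length adaptation concrete, here is a minimal Python sketch of the leapfrog step and the U-turn stopping condition that drives NUTS's locally adapted trajectory lengths. It is only illustrative: the full recursive doubling scheme and the accept/reject structure analyzed in the paper are omitted, and the function names are mine, not Stan's.

```python
import numpy as np

def leapfrog(q, p, grad_U, step_size):
    """One leapfrog step for H(q, p) = U(q) + |p|^2 / 2."""
    p = p - 0.5 * step_size * grad_U(q)
    q = q + step_size * p
    p = p - 0.5 * step_size * grad_U(q)
    return q, p

def u_turn(q_minus, q_plus, p_minus, p_plus):
    """NUTS stopping condition: the trajectory's two endpoints
    have begun to move toward each other."""
    dq = q_plus - q_minus
    return np.dot(dq, p_minus) < 0 or np.dot(dq, p_plus) < 0
```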

Advancing the No-U-Turn Sampler

The No-U-Turn Sampler (NUTS) employs a remarkably clever way to locally adapt the path length in Hamiltonian Monte Carlo (HMC), effectively avoiding U-turns during leapfrog integration while preserving detailed balance. This self-tuning capability has established NUTS as the default MCMC method for continuously differentiable densities in probabilistic programming languages such as Stan, PyMC3, NIMBLE, Turing, and NumPyro. Given its widespread success, a natural question arises: can this local adaptation strategy be extended to other HMC tuning parameters?

In Bou-Rabee, Carpenter & Marsden (2024), we address this question by developing a flexible framework for self-tuning HMC called GIST (Gibbs self-tuning for HMC). GIST not only generalizes the adaptation strategy of NUTS but also unifies a broad class of existing locally adaptive HMC samplers, including NUTS, the Apogee-to-Apogee Path Sampler, multinomial HMC, and randomized HMC. What makes this unification particularly compelling is that it reveals an underlying universality: when reversibility is preserved, the transition step of any locally adaptive HMC sampler naturally aligns with the transition step of a GIST sampler. Beyond this unification, the GIST framework unlocks both (i) simpler alternatives to the No-U-Turn Sampler for path length tuning and (ii) local adaptation of NUTS's other parameters. This makes GIST a powerful and versatile extension of NUTS, with broad implications for the future development of sampling algorithms.
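The sketch below gives a schematic reading of the GIST recipe for the special case of tuning the number of leapfrog steps; `sample_steps` and `steps_logpdf` are hypothetical placeholders for a user-chosen conditional distribution over the tuning parameter, and the `leapfrog` function from the NUTS sketch above is reused. See the paper for the precise construction and the general acceptance probability.

```python
import numpy as np

def gist_step(q, U, grad_U, step_size, sample_steps, steps_logpdf):
    """One GIST-style transition (schematic). `sample_steps(q, p)` draws the
    number of leapfrog steps given the current state; `steps_logpdf(L, q, p)`
    is the log-density of that draw. Both are user-supplied and define the
    self-tuning rule."""
    p = np.random.standard_normal(q.shape)   # full momentum refresh
    L = sample_steps(q, p)                   # Gibbs step over the tuning parameter
    q_new, p_new = q, p
    for _ in range(L):
        q_new, p_new = leapfrog(q_new, p_new, grad_U, step_size)
    # Metropolis-Hastings ratio, including the tuning-parameter kernel
    # evaluated in the reversed direction (momentum flip) at the proposal.
    log_ratio = (U(q) - U(q_new)
                 + 0.5 * (np.dot(p, p) - np.dot(p_new, p_new))
                 + steps_logpdf(L, q_new, -p_new) - steps_logpdf(L, q, p))
    return q_new if np.log(np.random.rand()) < log_ratio else q
```

Choosing `sample_steps` independently of the state recovers plain HMC with a randomized path length, while state-dependent choices yield NUTS-like adaptivity.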

Hamiltonian Monte Carlo Mixing Time

The Hamiltonian Monte Carlo (HMC) sampler combines Hamiltonian dynamics with momentum randomization. The Hamiltonian dynamics are discretized, and the resulting discretization bias is either accepted or corrected by a Metropolis-Hastings accept-reject step. Despite HMC's empirical success, convergence bounds for HMC samplers were, until recently, scarce. This has changed over the past few years with the development of new methods for quantifying convergence to equilibrium, including approaches based on coupling, conductance, and hypocoercivity. In a series of works (referenced below), we introduced novel coupling techniques that establish mixing and complexity upper bounds for both unadjusted and adjusted HMC, particularly in high-dimensional, non-convex models. These coupling techniques offer significant flexibility for further development, including potential applications as diagnostics to assess HMC convergence.
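As an illustration of the diagnostic idea, here is a minimal sketch of a synchronous coupling of two adjusted HMC chains, sharing the momentum draw and the accept/reject uniform, and reusing the `leapfrog` step above. The couplings used in the papers are substantially more refined; this version only conveys the mechanism.

```python
import numpy as np

def coupled_hmc_step(q1, q2, U, grad_U, step_size, n_steps):
    """One synchronously coupled step of two adjusted HMC chains: both
    chains share the momentum draw and the accept/reject uniform."""
    p = np.random.standard_normal(q1.shape)  # shared momentum refresh
    u = np.random.rand()                     # shared accept/reject uniform
    new_states = []
    for q in (q1, q2):
        qn, pn = q, p
        for _ in range(n_steps):
            qn, pn = leapfrog(qn, pn, grad_U, step_size)
        log_ratio = U(q) - U(qn) + 0.5 * (np.dot(p, p) - np.dot(pn, pn))
        new_states.append(qn if np.log(u) < log_ratio else q)
    return new_states[0], new_states[1]
```

Tracking the distance between the two chains over many such steps gives a rough empirical proxy for convergence.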

Randomized Time Integrators for Hamiltonian MCMC

Measure-preserving ODEs, SDEs, and PDMPs are key ingredients in constructing MCMC kernels. Among the time integrators for the corresponding flows, randomized time integrators have shown distinct advantages in terms of complexity. The core idea behind these randomized time integrators is to replace the deterministic quadrature rules typically used in each time integration step with a Monte Carlo integration rule. This approach not only reduces the regularity requirements on the underlying ODE, SDE, or PDMP, but also significantly improves the asymptotic bias properties of the resulting Markov kernel, and thus its complexity. Variants of these schemes that are Metropolis-adjustable were developed in Bou-Rabee & Marsden (2022). The effectiveness of these randomized integrators for mean-field models was demonstrated in Bou-Rabee & Schuh (2023). Additionally, under minimal regularity conditions on the drift coefficients, randomized Runge-Kutta-Nyström (rRKN) schemes of order 2.5 were introduced in Bou-Rabee & Kleppe (2023).
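A minimal example of the core idea, for a generic ODE y' = f(y): the deterministic midpoint rule evaluates f at the fixed node s = 1/2, while a randomized counterpart evaluates f at a uniformly random node, giving an unbiased Monte Carlo estimate of the time integral along the internal Euler predictor. This sketch only illustrates the mechanism; the rRKN schemes in the papers are built specifically for Hamiltonian and kinetic dynamics.

```python
import numpy as np

def randomized_midpoint_step(y, f, h):
    """One step of a randomized midpoint rule for y' = f(y): the time
    integral of f over the step is estimated by a single evaluation at a
    uniformly random intermediate node, rather than a fixed quadrature node."""
    u = np.random.rand()           # random quadrature node in [0, 1]
    y_mid = y + u * h * f(y)       # Euler predictor to the random node
    return y + h * f(y_mid)        # Monte Carlo estimate of the update
```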

Piecewise Deterministic Markov Processes

Andersen dynamics describes a piecewise deterministic Markov process that follows Hamiltonian dynamics, with the velocity of selected components randomized after exponentially distributed waiting times. It shares similarities with the kinetic Langevin diffusion but differs in that it lacks both diffusion and explicit dissipation, and instead involves a non-local jump operator. When the velocities of all components are randomized, it reduces to Randomized Hamiltonian Monte Carlo. In Bou-Rabee & Sanz-Serna (2017), we verified geometric ergodicity for Randomized Hamiltonian Monte Carlo, and in Bou-Rabee & Eberle (2022), we used a maximal coupling of the velocity randomizations to achieve optimal dimension dependence in Wasserstein convergence bounds for Andersen dynamics with weak interactions, without requiring global convexity of the potential energy.
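For intuition, here is a minimal sketch of the full-refreshment case, i.e., unadjusted Randomized HMC: the process follows Hamiltonian dynamics for exponentially distributed durations, with a complete velocity refresh at each event. The `leapfrog` step from the sketch above stands in for the exact flow; Andersen dynamics would instead refresh only randomly selected velocity components.

```python
import numpy as np

def randomized_hmc(q, grad_U, step_size, refresh_rate, t_final):
    """Unadjusted Randomized HMC: Hamiltonian motion for exponentially
    distributed durations, with a complete velocity refresh at each event.
    Leapfrog stands in for the exact Hamiltonian flow."""
    t = 0.0
    while t < t_final:
        p = np.random.standard_normal(q.shape)        # velocity randomization
        duration = np.random.exponential(1.0 / refresh_rate)
        for _ in range(max(1, int(duration / step_size))):
            q, p = leapfrog(q, p, grad_U, step_size)
        t += duration
    return q
```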

Path Integral Molecular Dynamics

Path Integral Molecular Dynamics (PIMD) leverages Feynman’s path-integral formulation of quantum statistical mechanics to compute static and dynamic quantum statistics, especially when quantum nuclear effects are significant. Applications include calculating chemical reaction rates, diffusion coefficients, and absorption spectra. PIMD’s appeal lies in its use of a classical ring-polymer-bead system to approximate quantum properties. Central to PIMD simulations is the free ring-polymer update, but the fast harmonic motions within the ring polymer necessitate a strongly stable approximation. Additionally, the integration scheme must remain accurate as the number of beads approaches infinity. Unfortunately, existing methods, including most ring-polymer molecular dynamics (RPMD) and thermostatted RPMD (T-RPMD) schemes, fail in this regard: the overlap between the exact and sampled ring-polymer Boltzmann distributions tends to zero in the infinite bead limit. A key result of the papers referenced below is the development of strongly stable, “dimension-free” numerical integration schemes that preserve non-zero overlap with the exact distribution in the infinite bead limit. These methods, validated numerically on quantum liquid water, enable highly accurate and stable ergodic time-averaging. The approach involves a simple “Cayley” modification of standard “BAOAB” splitting schemes from molecular dynamics, leading to new “BCOCB” methods which are expected to be widely adopted for future PIMD simulations.
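To illustrate the Cayley modification, here is a minimal sketch for a single free ring-polymer normal mode with frequency omega: the exact harmonic update exp(hA) (the "A" step in BAOAB) is replaced by the Cayley transform cay(hA) = (I - hA/2)^{-1}(I + hA/2) (the "C" step in BCOCB), which remains strongly stable for every step size. The full schemes act mode-by-mode in the ring-polymer normal-mode representation.

```python
import numpy as np

def exact_mode_update(h, omega):
    """Exact free ring-polymer update exp(h A) for one normal mode,
    i.e., the 'A' step of BAOAB for q'' = -omega^2 q."""
    return np.array([[np.cos(h * omega), np.sin(h * omega) / omega],
                     [-omega * np.sin(h * omega), np.cos(h * omega)]])

def cayley_mode_update(h, omega):
    """Cayley approximation cay(h A) = (I - h A / 2)^{-1} (I + h A / 2)
    of the same update, i.e., the 'C' step of BCOCB; strongly stable
    for every step size h and mode frequency omega."""
    A = np.array([[0.0, 1.0], [-omega ** 2, 0.0]])
    I = np.eye(2)
    return np.linalg.solve(I - 0.5 * h * A, I + 0.5 * h * A)
```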

Numerical Solution of Stochastic Differential Equations

Sticky diffusion processes frequently arise in models of physical and natural phenomena where the system’s variables can change dimension. However, until recently, there were no effective methods to simulate these processes. This changed in joint work with Eric Vanden-Eijnden (NYU), where we developed a method that discretizes SDEs in space rather than time, resulting in a Markov jump process (MJP) approximation. Because the increments of the MJP can be uniformly bounded, the algorithm is numerically stable by construction. Additionally, MJPs naturally handle boundary conditions, such as positivity constraints in population or interest rate models, which are challenging for standard methods. In collaboration with Miranda Holmes-Cerfon (UBC), we extended this approach to include “sticky” boundary conditions, demonstrating that MJPs can dramatically speed up simulations, sometimes by orders of magnitude. Beyond sticky diffusions, MJPs offer a practical approach for approximating SDEs and SPDEs with complex conditions, such as interface conditions, and can be applied to processes whose underlying differential operators are not yet fully understood, making them a versatile tool for simulating a wide range of stochastic processes.
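The following sketch shows the basic construction in one dimension, away from boundaries: jump rates to neighboring grid points come from an upwinded finite-difference approximation of the generator of dX = b(X) dt + sigma(X) dW, and the resulting jump process is simulated exactly with the Gillespie algorithm. Sticky or reflecting boundary conditions would enter through modified rates at boundary points; the function names here are illustrative.

```python
import numpy as np

def mjp_path(x0, b, sigma, dx, t_final):
    """Approximate dX = b(X) dt + sigma(X) dW by a Markov jump process on a
    grid of spacing dx, simulated exactly with the Gillespie algorithm.
    Upwinded finite-difference rates match the generator, and increments are
    bounded by dx, so the scheme is stable by construction."""
    x, t, path = x0, 0.0, [(0.0, x0)]
    while t < t_final:
        diffusive = sigma(x) ** 2 / (2.0 * dx ** 2)
        drift = b(x)
        rate_right = diffusive + max(drift, 0.0) / dx
        rate_left = diffusive + max(-drift, 0.0) / dx
        total = rate_right + rate_left
        t += np.random.exponential(1.0 / total)   # exponential holding time
        x += dx if np.random.rand() < rate_right / total else -dx
        path.append((t, x))
    return path
```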