### Preprint: Weak and strong modes

Han Cheng Lie and I have just uploaded a revised preprint of our paper, “Equivalence of weak and strong modes of measures on topological vector spaces”, to the arXiv.
This addresses a natural question in the theory of modes (or maximum a posteriori estimators, in the case of posterior measure for a Bayesian inverse problem) in an infinite-dimensional space \(X\).
Such modes can be defined either *strongly* (a la Dashti et al. (2013), via a global maximisation) or *weakly* (a la Helin & Burger (2015), via a dense subspace \(E \subset X\)).
The question is, when are strong and weak modes equivalent?
The answer turns out to be rather subtle:
under reasonable uniformity conditions, the two kinds of modes are indeed equivalent, but finite-dimensional counterexamples exist when the uniformity conditions fail.

**Abstract.** A strong mode of a probability measure on a normed space \(X\) can be defined as a point \(u \in X\) such that the mass of the ball centred at \(u\) uniformly dominates the mass of all other balls in the small-radius limit. Helin and Burger weakened this definition by considering only pairwise comparisons with balls whose centres differ by vectors in a dense, proper linear subspace \(E\) of \(X\), and posed the question of when these two types of modes coincide. We show that, in a more general setting of metrisable vector spaces equipped with non-atomic measures that are finite on bounded sets, the density of \(E\) and a uniformity condition suffice for the equivalence of these two types of modes. We accomplish this by introducing a new, intermediate type of mode. We also show that these modes can be inequivalent if the uniformity condition fails. Our results shed light on the relationships between among various notions of maximum a posteriori estimator in non-parametric Bayesian inference.

Published on Monday 9 July 2018 at 08:00 UTC #publication #preprint #inverse-problems #modes #map-estimator

### Preprint: Random Bayesian inverse problems

Han Cheng Lie, Aretha Teckentrup, and I have just uploaded a preprint of our latest article, “Random forward models and log-likelihoods in Bayesian inverse problems”, to the arXiv. This paper considers the effect of approximating the likelihood in a Bayesian inverse problem by a random surrogate, as frequently happens in applications, with the aim of showing that the perturbed posterior distribution is close to the exact one in a suitable sense. This article considers general randomisation models, and thereby expands upon the previous investigations of Stuart and Teckentrup (2017) in the Gaussian setting.

**Abstract.** We consider the use of randomised forward models and log-likelihoods within the Bayesian approach to inverse problems. Such random approximations to the exact forward model or log-likelihood arise naturally when a computationally expensive model is approximated using a cheaper stochastic surrogate, as in Gaussian process emulation (kriging), or in the field of probabilistic numerical methods. We show that the Hellinger distance between the exact and approximate Bayesian posteriors is bounded by moments of the difference between the true and approximate log-likelihoods. Example applications of these stability results are given for randomised misfit models in large data applications and the probabilistic solution of ordinary differential equations.

Published on Tuesday 19 December 2017 at 08:30 UTC #publication #preprint #inverse-problems

### Preprint: Active subspace Metropolis-Hastings

Ingmar Schuster, Paul Constantine and I have just uploaded a preprint of our latest article, “Exact active subspace Metropolis–Hastings, with applications to the Lorenz-96 system”, to the arXiv. This paper reports on our first investigations into the acceleration of Markov chain Monte Carlo methods using active subspaces as compared to other adaptivity techniques, and is supported by the DFG through SFB 1114 *Scaling Cascades in Complex Systems*.

**Abstract.** We consider the application of active subspaces to inform a Metropolis–Hastings algorithm, thereby aggressively reducing the computational dimension of the sampling problem. We show that the original formulation, as proposed by Constantine, Kent, and Bui-Thanh (*SIAM J. Sci. Comput.*, 38(5):A2779–A2805, 2016), possesses asymptotic bias. Using pseudo-marginal arguments, we develop an asymptotically unbiased variant. Our algorithm is applied to a synthetic multimodal target distribution as well as a Bayesian formulation of a parameter inference problem for a Lorenz-96 system.

Published on Friday 8 December 2017 at 08:00 UTC #publication #preprint #mcmc #sfb1114

### Preprint: Computing with dense kernel matrices at near-linear cost

Florian Schäfer, Houman Owhadi and I have just uploaded a preprint of our latest paper, “Compression, inversion, and approximate PCA of dense kernel matrices at near-linear computational complexity” to the arXiv. This paper builds upon the probabilistic-numerical ideas of “gamblets” (elementary gables upon the solution of a PDE) introduced by Owhadi (2016) to provide near-linear cost \(\varepsilon\)-approximate compression, inversion and principal component analysis of dense kernel matrices, the entries of which come from Green's functions of suitable differential operators.

**Abstract.** Dense kernel matrices \(\Theta \in \mathbb{R}^{N \times N}\) obtained from point evaluations of a covariance function \(G\) at locations \(\{x_{i}\}_{1 \leq i \leq N}\) arise in statistics, machine learning, and numerical analysis. For covariance functions that are Green's functions elliptic boundary value problems and approximately equally spaced sampling points, we show how to identify a subset \(S \subset \{ 1,\dots, N \} \times \{ 1,\dots,N \}\), with \(\#S = O(N \log(N)\log^{d}(N/\varepsilon))\), such that the zero fill-in block-incomplete Cholesky decomposition of \(\Theta_{i,j} 1_{(i,j) \in S}\) is an \(\varepsilon\)-approximation of \(\Theta\). This block-factorisation can provably be obtained in \(O(N \log^{2}(N)(\log(1/\varepsilon)+\log^{2}(N))^{4d+1})\) complexity in time. Numerical evidence further suggests that element-wise Cholesky decomposition with the same ordering constitutes an \(O(N \log^{2}(N) \log^{2d}(N/\varepsilon))\) solver. The algorithm only needs to know the spatial configuration of the \(x_{i}\) and does not require an analytic representation of \(G\). Furthermore, an approximate PCA with optimal rate of convergence in the operator norm can be easily read off from this decomposition. Hence, by using only subsampling and the incomplete Cholesky decomposition, we obtain at nearly linear complexity the compression, inversion and approximate PCA of a large class of covariance matrices. By inverting the order of the Cholesky decomposition we also obtain a near-linear-time solver for elliptic PDEs.

Published on Tuesday 13 June 2017 at 07:00 UTC #publication #preprint #prob-num

### Preprint: Strong convergence rates of probabilistic integrators for ODEs

Han Cheng Lie, Andrew Stuart and I have just uploaded a preprint of our latest paper, “Strong convergence rates of probabilistic integrators for ordinary differential equations” to the arXiv. This paper is a successor to the convergence results presented by Conrad et al. (2016). We consider a generic probabilistic solver for an ODE, using a fixed time step of length \(\tau > 0\), where the mean of the solver has global error of order \(\tau^{q}\) and the variance of the truncation error model has order \(\tau^{1 + 2 p}\). Whereas Conrad et al. showed, for a Lipschitz driving vector field, that the mean-square error between the numerical solution \(U_{k}\) and the true solution \(u_{k}\) is bounded uniformly in time as

\( \displaystyle \max_{0 \leq k \leq T / \tau} \mathbb{E} [ \| U_{k} - u_{k} \|^{2} ] \leq C \tau^{2 \min \{ p, q \}} \)

(i.e. has the same order of convergence as the underlying deterministic method), we are able to relax the regularity assumptions on the vector field / flow an obtain a stronger mode of convergence (mean-square in the uniform norm) with the same convergence rate:

\( \displaystyle \mathbb{E} \left[ \max_{0 \leq k \leq T / \tau} \| U_{k} - u_{k} \|^{2} \right] \leq C \tau^{2 \min \{ p, q \}} \)

**Abstract.** Probabilistic integration of a continuous dynamical system is a way of systematically introducing model error, at scales no larger than errors inroduced by standard numerical discretisation, in order to enable thorough exploration of possible responses of the system to inputs.
It is thus a potentially useful approach in a number of applications such as forward uncertainty quantification, inverse problems, and data assimilation.
We extend the convergence analysis of probabilistic integrators for deterministic ordinary differential equations, as proposed by Conrad et al. (*Stat. Comput.*, 2016), to establish mean-square convergence in the uniform norm on discrete- or continuous-time solutions under relaxed regularity assumptions on the driving vector fields and their induced flows.
Specifically, we show that randomised high-order integrators for globally Lipschitz flows and randomised Euler integrators for dissipative vector fields with polynomially-bounded local Lipschitz constants all have the same mean-square convergence rate as their deterministic counterparts, provided that the variance of the integration noise is not of higher order than the corresponding deterministic integrator.

Published on Monday 13 March 2017 at 13:00 UTC #publication #preprint #prob-num