1.2 Markov Kernels Induced From Measure-Preserving Maps

Directly constructing a Markov kernel that targets $\varpi$, let alone an efficient Markov kernel, can be difficult. Instead of constructing a kernel directly, however, we can construct one indirectly by defining a family of measure-preserving maps (Petersen, 1989). Formally, let $\Gamma$ be some space of continuous, bijective maps, or isomorphisms, from the space into itself, $t : Q \to Q$, $\forall t \in \Gamma$, that each preserves the target measure,
\[
t_* \varpi = \varpi,
\]
where the pushforward measure, $t_* \varpi$, is defined as
\[
(t_* \varpi)(A) \equiv (\varpi \circ t^{-1})(A), \quad \forall A \in \mathcal{B}(Q).
\]
If we can define a $\sigma$-algebra, $\mathcal{G}$, over this space of maps, then the choice of a distinguished measure, $\gamma$, over $\mathcal{G}$ defines a probability space, $(\Gamma, \mathcal{G}, \gamma)$, which induces a Markov kernel by
\[
\tau(q, A) \equiv \int_\Gamma \gamma(\mathrm{d}t) \, \mathbb{I}_A(t(q)), \tag{1}
\]
where $\mathbb{I}$ is the indicator function,
\[
\mathbb{I}_A(q) = \begin{cases} 0, & q \notin A \\ 1, & q \in A \end{cases}, \qquad q \in Q, \ A \in \mathcal{B}(Q).
\]
In other words, the kernel assigns a probability to a set, $A \in \mathcal{B}(Q)$, by computing the measure of the preimage of that set, $t^{-1}(A)$, averaged over all isomorphisms in $\Gamma$. Because each $t$ preserves the target measure, so too will their convolution and, consequently, the Markov transition induced by the kernel.

This construction provides a new perspective on the limited performance of existing algorithms.

Example 1. We can consider Gaussian Random Walk Metropolis, for example, as being generated by random, independent translations of each point in the sample space,
\[
t_{\epsilon, \eta} : q \mapsto q + \epsilon \cdot \mathbb{I}\!\left[ \eta < \frac{f(q + \epsilon)}{f(q)} \right],
\qquad \epsilon \sim \mathcal{N}(0, \Sigma), \quad \eta \sim U[0, 1].
\]
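As a rough illustration of how the kernel in (1) is simulated in practice, the following Python sketch draws a map $t \sim \gamma$ and applies it to the current point, using the Gaussian Random Walk Metropolis maps of Example 1. This is a minimal sketch, not an implementation from the paper: the names (make_rwm_map, kernel_draw, log_f) and the standard-Gaussian target are illustrative assumptions.

import numpy as np

def make_rwm_map(rng, chol_Sigma, log_f):
    # Draw the map's parameters once: epsilon ~ N(0, Sigma), eta ~ U[0, 1].
    eps = chol_Sigma @ rng.standard_normal(chol_Sigma.shape[0])
    eta = rng.uniform()
    # t_{eps,eta}: translate by eps only when eta < f(q + eps) / f(q),
    # evaluated on the log scale for numerical stability.
    def t(q):
        if np.log(eta) < log_f(q + eps) - log_f(q):
            return q + eps
        return q
    return t

def kernel_draw(q, rng, chol_Sigma, log_f):
    # One draw from tau(q, .) in (1): sample t ~ gamma, then return t(q).
    t = make_rwm_map(rng, chol_Sigma, log_f)
    return t(q)

# Usage with an assumed standard 2-d Gaussian target, log f(q) = -|q|^2/2 + const.
rng = np.random.default_rng(1)
log_f = lambda q: -0.5 * float(q @ q)
chol_Sigma = np.linalg.cholesky(0.5 * np.eye(2))

q = np.zeros(2)
chain = [q]
for _ in range(5000):
    q = kernel_draw(q, rng, chol_Sigma, log_f)
    chain.append(q)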
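Note the structure this construction imposes on the sketch: all of the randomness lives in the draw of the map itself, so once $\epsilon$ and $\eta$ are fixed, $t_{\epsilon,\eta}$ is deterministic, and iterating kernel_draw reproduces the usual Random Walk Metropolis chain.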