Construction of Transport Maps

Transport maps are nonlinear random variable transformations that can be used in many ways, including the acceleration of Bayesian inference and uncertainty quantification. In many cases, a complicated non-Gaussian distribution (called the target distribution) can be transformed into a Gaussian distribution (or something that is approximately Gaussian) which is easier to sample. This is illustrated in the figure below.

Transport maps are primarily constructed in one of two ways; the appropriate choice depends on how the target distribution, $\pi(\theta)$, is characterized. One approach uses target density evaluations (as in [1]) while the other approach requires only samples of the target density (as in [2] or [3]). MUQ currently has capabilities for sample-based construction, which is illustrated in this example notebook. For demonstration, we will use MUQ to construct a transport map from samples of a commonly used non-Gaussian distribution often called the "banana" or "boomerang" distribution.

  1. El Moselhy, T.A., & Marzouk, Y.M. (2012). Bayesian inference with optimal maps. Journal of Computational Physics, 231(23), 7815-7850.
  2. Parno, M., Moselhy, T., & Marzouk, Y. (2015). A multiscale strategy for Bayesian inference using transport maps. arXiv preprint arXiv:1507.07024.
  3. Parno, M., & Marzouk, Y. (2014). Transport map accelerated Markov chain Monte Carlo. arXiv preprint arXiv:1412.5492.

The mathematical problem

Let $\theta$ be a two dimensional random variable with density $\pi(\theta)$ given by [ \pi(\theta) \propto \exp\left(-0.5\theta_1^2 + (\theta_2-\theta_1^2)^2\right). ] Samples of $\pi(\theta)$ can be easily generated by sampling two independent standard normal random variables $[r_1,r_2]$ and recognizing that \begin{eqnarray} \theta_1 &=& r_1\ \theta_2 &=& r_1^2 + r_2 \end{eqnarray}

Our goal is to use samples of $\theta=[\theta_1,\theta_2]$ to construct a transformation $S(\theta)$ such that $$ S(\theta) = r, $$ where the equality is in distribution. In practice, we can only expect to recover an approximate map $\tilde{S}(\theta)$.

We will represent the approximate map $\tilde{S}(\theta)$ with a finite set of multivariate Hermite polynomials. Multivariate polynomials are generally expressed as a product of one dimensional polynomials. For example, let $h_p(\theta_i)$ be a scalar Hermite polynomial of order $p$. Then, a $d$-dimensional Hermite polynomial $H_{\mathbf{j}}(\theta)$ can be defined as $$ H_{\mathbf{j}}(\theta) = \prod_{i=1}^dh_{j_i}(\theta_i), $$ where $\mathbf{j} = [j_1,j_2,\dots,j_D]$ is the multiindex that contains the polynomial order in each dimension. We represent each dimension of transport map $\tilde{S}_d(\theta)$ with a finite basis of multivariate polynomials, yielding a transformation of the form $$ \tilde{S}_d(\theta) = \sum_{\mathbf{j}\in \mathcal{J}_d} c_{\mathbf{j}}H_{\mathbf{j}}(\theta), $$ where $c_{\mathbf{j}}\in \mathbb{R}$ is a coefficent for the polynomial defined by the multiindex $\mathbf{j}$. In this setting, constructing the map $\tilde{S}(\theta)$ boils down to finding coefficients in this expansion $\tilde{S}(\theta) \approx r$. See [2] or [3] for details on how this can be done by minimizing $d$ convex optimization problems.

The MUQ solution

To construct $\tilde{S}$, we first need to define the multiindex sets $\mathcal{J}_d$ for each dimension of the map ($d=2$ in this example). Then we can find each coefficient $c_{\mathbf{j}}$. Once constructed, we can evaluate the map to judge the accuracy of $\tilde{S}(\theta)$.

To achieve these steps in our c++ program, we will:

  1. Include necessary header files.
  2. Generate $N$ samples $\{\theta_1,\theta_2,\dots,\theta_N\}$ of $\pi(\theta)$.
  3. Create the multiindex sets $\mathcal{J}_1$ and $\mathcal{J}_2$.
  4. Construct $\tilde{S}(r)$ (MUQ will internally find each coefficient $c_{\mathbf{j}}$).
  5. Evaluate $r_i=\tilde{S}(\theta_i)$ for $i\in\{1,2,\dots,N\}$ and save the samples to text files.
  6. Evaluate the map-induced approximation of the target density and save to text file.

After compiling and executing the c++ code (which actually does all the work), we will use a bit of python to plot the results.

The c++ code

Necessary include files

The transport map capabilities in MUQ are part of the Inference module. The main file needed is MapFactory.h. In addition, the multivariate polynomials used to characterize the transport map in this example are defined by multiindices from the MultiIndex class. This class and tools for constructing sets of multiindices is included in the MultiIndexFactory.h file.


#include "MUQ/Inference/TransportMaps/MapFactory.h"
#include "MUQ/Utilities/MultiIndex/MultiIndexFactory.h"

#include "MUQ/Utilities/RandomGenerator.h"

#include <iostream>
#include <fstream>

using namespace std;
using namespace muq::Utilities;
using namespace muq::Inference;

Some initial setup

Here we start the main function and set some parameters. While the problem dimension is fixed in this example, you can play with the maximum order and number of samples; each will directly impact the map accuracy. In particular, try changing maxOrder to 1 or numSamps to 100000.

int main(){

  const int maxOrder = 3;    // Maximum polynomial order used in polynomial basis
  const int dim      = 2;    // Parameter dimension (stored in a variable for code readability)
  const int numSamps = 5000; // The number of samples used to construct the map
  

Generate target samples

Here we generate the initial $\theta$ samples and store them in a matrix called samps. Each column of samps corresponds to a sample.

  Eigen::MatrixXd samps = RandomGenerator::GetNormalRandomMatrix(dim,numSamps);
  samps.row(1) += samps.row(0).array().pow(2).matrix();
  

The samples are store in a file called TargetSamps.txt. This file will be read from python later on to plot the results.

  ofstream fout("TargetSamps.txt");
  fout << samps;
  fout.close();
  

Define the multiindex sets

For each dimension, we need to define the terms in the map's polynomial basis by defining a set of multiindices. MUQ will create lower trianagular transport maps, meaning that $\tilde{S}_d(\theta)$ only depends on the first $d$ components of $\theta$, i.e., $[\theta_1,\theta_2,\dots,\theta_d]$. In this example, we will also limit the total order of a multiindex, defined by $\sum_{i=1}^d \mathbf{j}_i$.

In this cell, we use the CreateTriTotalOrder function to construct multiindex sets for a lower triangular map with maximum total order defined by the variable maxOrder.

Total order maps can contain many terms (and subsequently many coefficients to estimate). More parsimonious multiindex sets can be constructed using the CreateTriHyperbolic function. It is also possible to define custom multiindex sets by defining a custom MultiIndexLimiter, which filters out unnecesary terms from a larger set formed by CreateTriTotalOrder or CreateTriHyperbolic.

Note that we explicitly define the typ of the multis variable to be vector<shared_ptr<MultiIndexSet>>, but more readable code may be possible using the c++11 auto keyword.

  vector< shared_ptr< MultiIndexSet > >  multis = MultiIndexFactory::CreateTriTotalOrder(dim,maxOrder);
  

Construct the map, i.e. do the work!

The BuildToNormal function is used here to find the coefficients $c_{\mathbf{j}}$ that define the map; see [2] or [3] for more details on how this is done. This function requires the target distribution samples and multiindex sets as input. It is also possible to pass additional optional parameters using a boost ptree. See the doxygen documentation for details.

The result of BuildToNormal is a smart pointer to a TransportMap instance. The TransportMap class is a child of the ModPiece class, but with additional functionality for computing the map's inverse.

  shared_ptr< TransportMap > map = MapFactory::BuildToNormal(samps, multis);
  

Test the accuracy: Gaussian samples

Now that we have constructed the map, let's evaluate it at all of our target samples to assess the map's accuracy. For an exact map, the output samples (gaussSamps in this cell) will be from an isotropic standard normal distribution.

  Eigen::MatrixXd gaussSamps = map->EvaluateMulti(samps);
  
  fout.open("GaussSamps.txt");
  fout << gaussSamps;
  fout.close();
  

Test the accuracy: Map-Induced density

The map $\tilde{S}(\theta)$ can be used to "pull back" the Gaussian reference density to define a map-induced approximation to the true target $\pi(\theta)$. This map-induced density, call it $q(\theta)$, has the form: $$ q(\theta) \propto \exp\left(-0.5\|\tilde{S}(\theta)\|\right)\left|\text{det}D\tilde{S}(\theta)\right|, $$ where \left|\text{det}D\tilde{S}(\theta)\right| is the determinant of the map Jacobian. The following cells set up a grid over $\theta$ (the x and y vectors below), and evaluate this map-induced density.

Later, we will compare contour plots of the map-induced density with the true target density.

  const int N = 40;
  Eigen::VectorXd x = Eigen::VectorXd::LinSpaced(N,-3,3);
  Eigen::VectorXd y = Eigen::VectorXd::LinSpaced(N,-3,5);

  Eigen::MatrixXd logTarget(N,N);
  Eigen::MatrixXd X(N,N);
  Eigen::MatrixXd Y(N,N);
  Eigen::VectorXd targetVec(2);
  Eigen::VectorXd refVec(2);

  for(int i=0; i<N; ++i){
    for(int j=0; j<N; ++j){
      X(j,i) = x(i);
      Y(j,i) = y(j);
      targetVec << x(i), y(j);
      refVec = map->Evaluate(targetVec);
      logTarget(j,i) = -0.5*refVec.squaredNorm() + map->Jacobian(targetVec).diagonal().array().log().sum();
    }
  }
  fout.open("ApproximateTarget.txt");
  fout << logTarget;
  fout.close();

  fout.open("TargetX.txt");
  fout << X;
  fout.close();

  fout.open("TargetY.txt");
  fout << Y;
  fout.close();

Finish the main function


  return 0;
}// end of int main()

Compile the executable

See the CMakeLists.txt file for more details.

cd build; cmake ../ > BuildLog.txt; make; cd ../
Scanning dependencies of target BasicConstruction
[100%] Building CXX object CMakeFiles/BasicConstruction.dir/BasicConstruction.cpp.o
Linking CXX executable BasicConstruction
[100%] Built target BasicConstruction

Run the executable

build/BasicConstruction

Look at the results with Python

Throughout the c++ code above, we wrote information to simple text files. Here we will read in those text files with Python and use matplotlib to visualize the results.

import matplotlib.pyplot as plt
import math
import numpy as np

Sample comparison

First, read the reference samples. These were produced by evaluating the the map on all of the original target samples.

f = open('GaussSamps.txt', 'r')
sampStrings = [temp.split() for temp in f.read().split('\n')]
gsamps = [[float(s) for s in sampStrings[0]], [float(s) for s in sampStrings[1]]]
f.close()

Now read the original target samples.

f = open('TargetSamps.txt', 'r')
sampStrings = [temp.split() for temp in f.read().split('\n')]
samps = [[float(s) for s in strRow] for strRow in sampStrings]

Make scatter plots of both the target samples and reference samples. If the map is accurate, the reference samples should resemble a standard normal distribution.

plt.figure(figsize=(15,7))

plt.subplot(1,2,1)
plt.plot(samps[0],samps[1],'.')
plt.title('Target Samples')
plt.xlim([-3,3])
plt.ylim([-3,6])

plt.subplot(1,2,2)
plt.plot(gsamps[0],gsamps[1],'.')
plt.title('Reference samples from map output')
plt.ylim([-3,3])
plt.xlim([-3,3])
plt.show()

Density comparison

Now let's read in the map-induced density and compare it to the true target density.

f = open('ApproximateTarget.txt', 'r')
logEvalStrings = [temp.split() for temp in f.read().split('\n')]
targetEvals = [[math.exp(float(s)) for s in strRow] for strRow in logEvalStrings]

f = open('TargetX.txt', 'r')
xStrings = [temp.split() for temp in f.read().split('\n')]
X = [[float(s) for s in strRow] for strRow in xStrings]

f = open('TargetY.txt', 'r')
yStrings = [temp.split() for temp in f.read().split('\n')]
Y = [[float(s) for s in strRow] for strRow in yStrings]
plt.figure(figsize=(15,7))

plt.subplot(1,2,1)
truePdf = np.exp(-0.5*(np.power(X,2) + np.power(Y-np.power(X,2.0),2.0)))
plt.contour(X,Y,truePdf)
plt.xlim([-3,3])
plt.ylim([-3,5])
plt.title('True target density')

plt.subplot(1,2,2)
plt.contour(X,Y,targetEvals)
plt.xlim([-3,3])
plt.ylim([-3,5])
plt.title('Map-Induced approximation to target')
plt.show()

Completed code:

Python

import matplotlib.pyplot as plt
import math
import numpy as np

f = open('GaussSamps.txt', 'r')
sampStrings = [temp.split() for temp in f.read().split('\n')]
gsamps = [[float(s) for s in sampStrings[0]], [float(s) for s in sampStrings[1]]]
f.close()

f = open('TargetSamps.txt', 'r')
sampStrings = [temp.split() for temp in f.read().split('\n')]
samps = [[float(s) for s in strRow] for strRow in sampStrings]

plt.figure(figsize=(15,7))

plt.subplot(1,2,1)
plt.plot(samps[0],samps[1],'.')
plt.title('Target Samples')
plt.xlim([-3,3])
plt.ylim([-3,6])

plt.subplot(1,2,2)
plt.plot(gsamps[0],gsamps[1],'.')
plt.title('Reference samples from map output')
plt.ylim([-3,3])
plt.xlim([-3,3])
plt.show()

f = open('ApproximateTarget.txt', 'r')
logEvalStrings = [temp.split() for temp in f.read().split('\n')]
targetEvals = [[math.exp(float(s)) for s in strRow] for strRow in logEvalStrings]

f = open('TargetX.txt', 'r')
xStrings = [temp.split() for temp in f.read().split('\n')]
X = [[float(s) for s in strRow] for strRow in xStrings]

f = open('TargetY.txt', 'r')
yStrings = [temp.split() for temp in f.read().split('\n')]
Y = [[float(s) for s in strRow] for strRow in yStrings]

plt.figure(figsize=(15,7))

plt.subplot(1,2,1)
truePdf = np.exp(-0.5*(np.power(X,2) + np.power(Y-np.power(X,2.0),2.0)))
plt.contour(X,Y,truePdf)
plt.xlim([-3,3])
plt.ylim([-3,5])
plt.title('True target density')

plt.subplot(1,2,2)
plt.contour(X,Y,targetEvals)
plt.xlim([-3,3])
plt.ylim([-3,5])
plt.title('Map-Induced approximation to target')
plt.show()


BasicConstruction.cpp


#include "MUQ/Inference/TransportMaps/MapFactory.h"
#include "MUQ/Utilities/MultiIndex/MultiIndexFactory.h"

#include "MUQ/Utilities/RandomGenerator.h"

#include <iostream>
#include <fstream>

using namespace std;
using namespace muq::Utilities;
using namespace muq::Inference;


int main(){

  const int maxOrder = 3;    // Maximum polynomial order used in polynomial basis
  const int dim      = 2;    // Parameter dimension (stored in a variable for code readability)
  const int numSamps = 5000; // The number of samples used to construct the map
  

  Eigen::MatrixXd samps = RandomGenerator::GetNormalRandomMatrix(dim,numSamps);
  samps.row(1) += samps.row(0).array().pow(2).matrix();
  

  ofstream fout("TargetSamps.txt");
  fout << samps;
  fout.close();
  

  vector< shared_ptr< MultiIndexSet > >  multis = MultiIndexFactory::CreateTriTotalOrder(dim,maxOrder);
  

  shared_ptr< TransportMap > map = MapFactory::BuildToNormal(samps, multis);
  

  Eigen::MatrixXd gaussSamps = map->EvaluateMulti(samps);
  

  fout.open("GaussSamps.txt");
  fout << gaussSamps;
  fout.close();
  

  const int N = 40;
  Eigen::VectorXd x = Eigen::VectorXd::LinSpaced(N,-3,3);
  Eigen::VectorXd y = Eigen::VectorXd::LinSpaced(N,-3,5);


  Eigen::MatrixXd logTarget(N,N);
  Eigen::MatrixXd X(N,N);
  Eigen::MatrixXd Y(N,N);
  Eigen::VectorXd targetVec(2);
  Eigen::VectorXd refVec(2);

  for(int i=0; i<N; ++i){
    for(int j=0; j<N; ++j){
      X(j,i) = x(i);
      Y(j,i) = y(j);
      targetVec << x(i), y(j);
      refVec = map->Evaluate(targetVec);
      logTarget(j,i) = -0.5*refVec.squaredNorm() + map->Jacobian(targetVec).diagonal().array().log().sum();
    }
  }

  fout.open("ApproximateTarget.txt");
  fout << logTarget;
  fout.close();

  fout.open("TargetX.txt");
  fout << X;
  fout.close();

  fout.open("TargetY.txt");
  fout << Y;
  fout.close();


  return 0;
}// end of int main()


Test Status
  • Develop Branch
    OS Compiler
    OSX Clang Test Status
    Ubuntu Clang Test Status
    Ubuntu g++ 4.7 Test Status
    Ubuntu g++ 4.8 Test Status
    Ubuntu g++ 4.9 Test Status