Multiscale inference with transport maps

This notebook uses the latest version of MUQ to illustrate concepts from "A multiscale strategy for Bayesian inference using transport maps" by Matthew Parno, Tarek Moselhy, and Youssef Marzouk.

The general multiscale setting

The idea is to take advantage of multiscale structure to decompose the usual Bayesian inference problem into two components: a coarse problem and a fine problem. To illustrate this mathematically, let $\theta$ be a random variable taking values in $\mathbb{R}^{d_\theta}$ and let $y$ be an observable random variable (e.g., the data) taking values in $\mathbb{R}^{d_y}$. Our goal is then to generate samples of the Bayesian posterior

$$ \pi(\theta | y) \propto \pi(y|\theta)\pi(\theta). $$

However, as [Parno et al.] outlines, additional structure can be exploited when the relationship between $\theta$ and $y$ exhibits multiscale behavior. Let $\gamma$ be an intermediate coarse random variable taking values in $\mathbb{R}^{d_\gamma}$, and assume that $\theta$ and $y$ are conditionally independent given $\gamma$. That is,

$$ \pi(\theta,y|\gamma) = \pi(\theta|\gamma)\pi(y|\gamma). $$

Applying this fact to the joint posterior $\pi(\theta,\gamma | y)$ yields

\begin{eqnarray*} \pi(\theta,\gamma | y) &\propto& \pi(y | \theta,\gamma)\pi(\theta,\gamma)\\ &=& \pi(y|\gamma)\pi(\gamma)\pi(\theta|\gamma), \end{eqnarray*}

where the second line uses the conditional independence assumption (so that $\pi(y|\theta,\gamma) = \pi(y|\gamma)$) together with the factorization $\pi(\theta,\gamma) = \pi(\theta|\gamma)\pi(\gamma)$.

Looking at this expression more closely, we can see that $\pi(\theta,\gamma | y)$ contains a coarse posterior, proportional to $\pi(y|\gamma)\pi(\gamma)$, and a coarse-to-fine distribution $\pi(\theta|\gamma)$. Following [Parno et al.], we will use MCMC to sample the coarse posterior and then "prolongate" those coarse samples back to the fine scale using $\pi(\theta|\gamma)$.
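Before setting this up in MUQ, here is a minimal standalone sketch of the two-stage strategy. The Gaussians below are invented purely for illustration; they stand in for a coarse posterior that we would normally sample with MCMC and a coarse-to-fine conditional that we would normally sample through a transport map.

// Toy two-stage sampler (hypothetical closed-form densities):
//   stage 1: gamma ~ pi(gamma | y)      (here N(1.0, 0.5^2))
//   stage 2: theta ~ pi(theta | gamma)  (here N(gamma, 0.2^2) per component)
#include <iostream>
#include <random>

int main()
{
  std::mt19937 gen(2012);
  std::normal_distribution<double> coarsePost(1.0, 0.5);

  for (int i = 0; i < 5; ++i) {
    double gamma = coarsePost(gen);                    // coarse posterior draw
    std::normal_distribution<double> c2f(gamma, 0.2);  // coarse-to-fine conditional
    double theta1 = c2f(gen);
    double theta2 = c2f(gen);
    std::cout << "gamma = " << gamma << "  ->  theta = ("
              << theta1 << ", " << theta2 << ")\n";
  }
  return 0;
}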

A solution strategy

There are two challenges in performing multiscale inference as described above: first, we need to know the coarse-scale prior $\pi(\gamma)$ so that we can sample $\pi(\gamma|y) \propto \pi(y|\gamma)\pi(\gamma)$, and second, we need to be able to sample
$\pi(\theta|\gamma)$. Both of these challenges can be addressed with an appropriately constructed transport map.

Consider the map

$$ S(\gamma,\theta) = \left[\begin{array}{l} S_c(\gamma) \\ S_f(\gamma,\theta)\end{array}\right] = \left[\begin{array}{c}r_c\\r_f\end{array}\right], $$

and its inverse

$$ T(r_c,r_f) = \left[\begin{array}{l} T_c(r_c) \\ T_f(r_c,r_f)\end{array}\right] = \left[\begin{array}{c}\gamma\\\theta\end{array}\right], $$

where $r_c$ and $r_f$ are IID standard normal random variables with the same dimensions as $\gamma$ and $\theta$, respectively. Using $T_c$, we can sample the posterior $\pi(r_c|y)$ (instead of $\pi(\gamma|y)$) and then use the fine map $T_f(r_c,r_f)$ to generate samples of $\pi(\theta | \gamma)$ with $\gamma = T_c(r_c)$. See [Parno et al.] for more details of this approach.
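To see why the triangular structure matters, consider the hand-picked linear lower-triangular map below (not one produced by MUQ). Because the first output depends only on $r_c$, fixing $r_c$ and redrawing $r_f$ produces repeated conditional samples of $\theta$ for the same value of $\gamma$.

// A hand-picked linear lower-triangular map T(r_c, r_f) = L*r + shift:
//   gamma  = 1.0 + 0.5*r_c                      (T_c: depends on r_c only)
//   theta1 = 0.3*r_c + 0.8*r_f1                 (T_f: depends on r_c and r_f)
//   theta2 = 0.3*r_c + 0.2*r_f1 + 0.7*r_f2
#include <iostream>
#include <random>
#include <Eigen/Core>

int main()
{
  Eigen::MatrixXd L(3,3);
  L << 0.5, 0.0, 0.0,
       0.3, 0.8, 0.0,
       0.3, 0.2, 0.7;
  Eigen::VectorXd shift(3);
  shift << 1.0, 0.0, 0.0;

  std::mt19937 gen(2012);
  std::normal_distribution<double> stdNorm(0.0, 1.0);

  // Fix one coarse reference sample r_c (in the real problem it would come
  // from the coarse-scale MCMC chain) and prolongate it three times with
  // independent draws of r_f: gamma stays fixed while theta varies.
  double rc = stdNorm(gen);
  for (int i = 0; i < 3; ++i) {
    Eigen::VectorXd r(3);
    r << rc, stdNorm(gen), stdNorm(gen);
    Eigen::VectorXd x = L * r + shift;  // x = [gamma, theta1, theta2]
    std::cout << "gamma = " << x(0) << "  ->  theta = ("
              << x(1) << ", " << x(2) << ")\n";
  }
  return 0;
}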

A specific example

In this notebook, we are concerned with a specific example of multiscale inference, first used for illustration in [Parno et al.]. In this example, $\theta$ is two-dimensional, $\gamma$ is one-dimensional, and the two are deterministically related through

$$ \gamma = \log\left(\frac{1}{\exp(-\theta_1) + \exp(-\theta_2)}\right). $$

Also, $\theta$ has a multivariate Gaussian prior defined by

$$ \theta \sim N\left(\left[\begin{array}{c}0 \\ 0\end{array}\right],\left[\begin{array}{cc}2.0 & 0.6 \\ 0.6 & 2.0\end{array}\right]\right) $$

The data, $y$, is related to $\gamma$ by

$$ y = \exp(3\gamma) - 2 + \eta $$

where $\eta\sim N(0,0.1)$ is a scalar Gaussian random variable with variance $0.1$. Notice that $y$ does not explicitly depend on $\theta$; if $\gamma$ is known, $\theta$ is not needed to calculate $y$. Thus, this problem satisfies the conditional independence assumption used in the derivation of $\pi(\theta,\gamma | y)$ above.
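As a quick sanity check of this structure, the following standalone snippet (plain C++ with Eigen; not part of the MUQ example itself) draws $\theta$ from the prior defined above, maps it to $\gamma$, and generates a noisy observation that touches $\theta$ only through $\gamma$.

#include <cmath>
#include <iostream>
#include <random>
#include <Eigen/Core>
#include <Eigen/Cholesky>

int main()
{
  std::mt19937 gen(2012);
  std::normal_distribution<double> stdNorm(0.0, 1.0);

  // Prior: theta ~ N(0, [[2.0, 0.6], [0.6, 2.0]])
  Eigen::MatrixXd cov(2,2);
  cov << 2.0, 0.6,
         0.6, 2.0;
  Eigen::MatrixXd cholL = cov.llt().matrixL();

  Eigen::VectorXd z(2);
  z << stdNorm(gen), stdNorm(gen);
  Eigen::VectorXd theta = cholL * z;

  // Fine-to-coarse: gamma = log(1 / (exp(-theta_1) + exp(-theta_2)))
  double gamma = std::log(1.0 / (std::exp(-theta(0)) + std::exp(-theta(1))));

  // Coarse-to-data: y = exp(3*gamma) - 2 + eta, with eta ~ N(0, 0.1)
  double eta = std::sqrt(0.1) * stdNorm(gen);
  double y   = std::exp(3.0 * gamma) - 2.0 + eta;

  std::cout << "theta = (" << theta(0) << ", " << theta(1) << "),  gamma = "
            << gamma << ",  y = " << y << std::endl;
  return 0;
}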

Setting up the problem in MUQ

There are three main components of MUQ needed to sample the posterior $\pi(\theta|y)$ with this multiscale approach. First is the modeling module, which we use to define ModPieces and ModGraphs describing the relationships between $\theta$, $\gamma$, and $y$. Second is the transport-map functionality, which is needed to set up the coarse-scale inference problem and to generate fine-scale samples from coarse-scale samples. Third is the MCMC module for coarse-scale sampling.

Below, we will:

  1. Define children of the ModPiece class describing the relationships between $\theta$, $\gamma$, and $y$.
  2. Sample the prior to generate samples of the joint distribution $\pi(\gamma,\theta)$.
  3. Construct a transport map from $(\gamma,\theta)$ to $(r_c,r_f)$ and use regression to approximate its inverse. This will create the maps $S(\gamma,\theta)$ and $T(r_c,r_f)$.
  4. Set up the coarse reference posterior $\pi(r_c|y)$ and sample it using a MALA MCMC approach. The result will be a set of coarse posterior samples $\{r_c^{(1)},r_c^{(2)},...,r_c^{(K)}\}$.
  5. Evaluate the fine map $\theta^{(i)}=T_f(r_c^{(i)},r_f)$ at each coarse posterior sample, with a fresh random draw of $r_f$, to generate a set of fine-scale samples $\{\theta^{(1)},\theta^{(2)},...,\theta^{(K)}\}$.

Include the necessary header files


// std library includes
#include <fstream>
#include <iostream>

// boost includes
#include <boost/property_tree/ptree.hpp>

// Eigen includes
#include <Eigen/Core>
#include <Eigen/Sparse>

// muq utilities includes
#include "MUQ/Utilities/HDF5File.h"
#include "MUQ/Utilities/RandomGenerator.h"
#include "MUQ/Utilities/MultiIndex/MultiIndexFactory.h"

// muq inference includes
#include "MUQ/Inference/MCMC/MCMCBase.h"
#include "MUQ/Inference/ProblemClasses/InferenceProblem.h"
#include "MUQ/Inference/MAP/MAPbase.h"
#include "MUQ/Inference/TransportMaps/MapFactory.h"

// muq modelling includes
#include "MUQ/Modelling/ModPieceTemplates.h"
#include "MUQ/Modelling/ModGraphPiece.h"
#include "MUQ/Modelling/GaussianPair.h"
#include "MUQ/Modelling/DensityProduct.h"
#include "MUQ/Modelling/EmpiricalRandVar.h"
#include "MUQ/Modelling/VectorPassthroughModel.h"

// namespaces
using namespace std;
using namespace muq::Modelling;
using namespace muq::Utilities;
using namespace muq::Inference;

Define a fine-to-coarse model

Here we define the fine-to-coarse model by creating a child of the ModPiece class. Notice that there is only one input, $\theta$, and one output, $\gamma$, so we use the OneInputNoDerivModPiece template class.

class Fine2Coarse : public OneInputNoDerivModPiece {
public:

  Fine2Coarse() : OneInputNoDerivModPiece(2, 1) {}

  virtual Eigen::VectorXd EvaluateImpl(const Eigen::VectorXd& x) override
  {
    Eigen::VectorXd y(1);

    y(0) = log(1.0 / (exp(-1.0 * x(0)) + exp(-1.0 * x(1))));
    return y;
  }
};

Define a coarse-to-data model

The following cell creates a child of ModPiece describing the deterministic relationship between the coarse variable $\gamma$ and the data $y$. The additive error $\eta$ comes into play later, in the likelihood definition.


class Coarse2Data : public OneInputNoDerivModPiece {
public:

  Coarse2Data(double bIn = 1.0) : OneInputNoDerivModPiece(1, 1), b(bIn) {}

  virtual Eigen::VectorXd EvaluateImpl(const Eigen::VectorXd& x) override
  {
    Eigen::VectorXd y(1);

    y(0) = pow(exp(x(0)), 3) - b;
    return y;
  }

private:

  double b;
};

A function to generate fine and coarse samples


Eigen::MatrixXd GenerateSamples(int numSamps, shared_ptr<RandVar> priorRv, shared_ptr<ModPiece> f2c)
{
   
    Eigen::MatrixXd allSamples(3,numSamps);
    
    // first generate the prior samples
    allSamples.bottomRows(2) = priorRv->Sample(numSamps);
    
    // now generate the coarse samples
    allSamples.topRows(1) = f2c->EvaluateMulti(allSamples.bottomRows(2));

    return allSamples;
}

A function to construct the transport maps

A total-order multi-index set is used here to construct the transport map. Notice that the multi-index set returned by CreateTriTotalOrder guarantees that the map is lower triangular, which allows us to split it into $T_c$ and $T_f$. The CreateTriHyperbolic function is an alternative that produces more parsimonious multi-index sets.
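For intuition, a total-order multi-index set of order $p$ in $d$ dimensions contains every multi-index $\mathbf{j} \in \mathbb{N}^d$ with $j_1 + \ldots + j_d \leq p$. The standalone sketch below enumerates such a set; MUQ's MultiIndexFactory builds an analogous (triangular) set internally, so this is only meant to show what the polynomial terms look like.

// Enumerate {j in N^d : j_1 + ... + j_d <= p}. For d = 3 and p = 2 this
// prints the 10 multi-indices of the corresponding total-order set.
#include <iostream>
#include <vector>

void Enumerate(int dim, int remaining, std::vector<int>& j, int pos)
{
  if (pos == dim) {
    for (int ji : j)
      std::cout << ji << " ";
    std::cout << "\n";
    return;
  }
  for (int val = 0; val <= remaining; ++val) {
    j[pos] = val;
    Enumerate(dim, remaining - val, j, pos + 1);
  }
}

int main()
{
  const int dim = 3, maxOrder = 2;
  std::vector<int> j(dim, 0);
  Enumerate(dim, maxOrder, j, 0);
  return 0;
}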

To construct $T$ from $S$, we use regression. In this cell, we first build $S$ with a call to BuildToNormal. Then we evaluate $S$ at all the prior samples with the EvaluateMulti member function. Finally, the RegressSampsToSamps function uses the inputs and outputs of $S$ in a regression framework to approximate $T$.


shared_ptr<TransportMap> BuildMap(int maxOrder, Eigen::MatrixXd const& allSamples)
{

    auto multis = MultiIndexFactory::CreateTriTotalOrder(allSamples.rows(),maxOrder);
    auto invmap = MapFactory::BuildToNormal(allSamples, multis);
    
    Eigen::MatrixXd refSamps = invmap->EvaluateMulti(allSamples);
    
    return MapFactory::RegressSampsToSamps(refSamps, allSamples, multis);
}

Function to perform coarse MCMC

The following function sets up the coarse inference problem and then samples it using MCMC.

Step 1: Create a ModGraph to describe the posterior.

Notice that the coarse map $T_c$ is separated from the full joint map $T$ using the head function. Like the Eigen::VectorXd class, the TransportMap class provides head, tail, and segment functions; all of these slices act on the map output.


Eigen::MatrixXd RunCoarseMCMC(Eigen::VectorXd const& data, shared_ptr<TransportMap> jointMap)
{
  auto graph = make_shared<ModGraph>();
  
  graph->AddNode(make_shared<VectorPassthroughModel>(1),"inferenceTarget");
  graph->AddNode(jointMap->head(1),"CoarseMap");
  graph->AddEdge("inferenceTarget","CoarseMap",0);
  
  graph->AddNode(make_shared<Coarse2Data>(2.0), "forwardModel");
  graph->AddEdge("CoarseMap","forwardModel",0);
  
  double dataVar = 0.1;
  graph->AddNode(make_shared<GaussianDensity>(data,dataVar),"likelihood");
  graph->AddEdge("forwardModel","likelihood",0);
  
  Eigen::VectorXd priorMu = Eigen::VectorXd::Ones(1);
  graph->AddNode(make_shared<GaussianDensity>(priorMu,1.0),"prior");
  graph->AddEdge("inferenceTarget","prior",0);
  
  graph->AddNode(make_shared<DensityProduct>(2),"posterior");
  graph->AddEdge("prior","posterior",0);
  graph->AddEdge("likelihood","posterior",1);
  

Step 2: Create an inference problem

Here the ModGraph is translated into an InferenceProblem, which the MCMC algorithms know how to work with.


  auto problem = make_shared<InferenceProblem>(graph);

Step 3: Define the MCMC parameters

See our list of parameters for a more comprehensive set of options. Note that these options can also be conveniently stored in an XML or JSON file and read with Boost's XML or JSON parsers, as sketched after the following cell.


  // define the properties and tuning parameters for preconditioned MALA
  boost::property_tree::ptree params;
  params.put("MCMC.Method", "SingleChainMCMC");
  params.put("MCMC.Kernel", "MHKernel");
  params.put("MCMC.Proposal", "PreMALA");
  params.put("MCMC.Steps", 20000);
  params.put("MCMC.BurnIn", 5000);
  params.put("MCMC.MH.PropSize", 0.01);
  params.put("Verbose", 3);

Step 4: Construct the MCMC sampler

Use the options defined in the params ptree to construct an MCMC sampler.


  // construct a MCMC sampling task from the parameters and the inference problem.
  auto mcmcSampler = MCMCBase::ConstructMCMC(problem, params);
  
  

Step 5: Sample the coarse posterior

Here is where all the work actually happens. Starting from the initial point $r_c=0.5$, this cell runs the MCMC chain and returns an Eigen::MatrixXd with the samples. Each column of the matrix is a sample.


  Eigen::VectorXd startingPoint(1);
  startingPoint << 0.5;
  
  auto mcmcChain = mcmcSampler->Sample(startingPoint);

  return mcmcChain->GetAllSamples();
}

A function to generate fine-scale samples

After running the coarse-scale MCMC, this function generates realizations of the fine-scale posterior $\pi(\theta|y)$ using $T_f$.


Eigen::MatrixXd SampleFinescale(Eigen::MatrixXd const& coarseSamples, shared_ptr<TransportMap> jointMap)
{
  int numSamps = coarseSamples.cols();
  Eigen::MatrixXd refSamps(3,numSamps);
  
  // set r_c to come from the MCMC samples
  refSamps.row(0) = coarseSamples;
  
  // generate independent samples of r_f
  refSamps.bottomRows(2) = RandomGenerator::GetNormalRandomMatrix(2,numSamps);
  
  return jointMap->EvaluateMulti(refSamps);
}

Put all the functions together

Here is where the rubber meets the road! First we define the prior $\pi(\theta)$. Then we use the classes and functions above to generate samples of $\pi(\theta|y)$.


int main(){
  
  // Define the prior
  Eigen::VectorXd mu = Eigen::VectorXd::Zero(2);
  Eigen::MatrixXd cov(2,2);
  cov << 2.0, 0.6,
         0.6, 2.0;
  auto priorRv = make_shared<GaussianRV>(mu,cov);
  

  auto f2c = make_shared<Fine2Coarse>();

  int numSamps = 50000;
  Eigen::MatrixXd allSamps = GenerateSamples(numSamps, priorRv, f2c);

  int maxOrder = 7;
  auto map = BuildMap(maxOrder, allSamps);

  Eigen::VectorXd data(1);
  data << -1.8;
  Eigen::MatrixXd coarsePostSamps = RunCoarseMCMC(data, map);

  Eigen::MatrixXd finePostSamps = SampleFinescale(coarsePostSamps, map);

Save the samples

The following cell uses an instance of the HDF5File class to store the samples. Note that older versions of MUQ only had the HDF5Wrapper class, which creates a persistent interface to a single global HDF5 file. The HDF5File class, on the other hand, allows several HDF5 files to be opened and closed simultaneously within a single scope.


  HDF5File fout("MultiscaleResults.h5");
  fout.WriteMatrix("/Training/Samples",allSamps);
  fout.WriteScalarAttribute("/Training", "Number of Samples", numSamps);
  fout.WriteScalarAttribute("/Training", "Maximum Polynomial Order", maxOrder);
  
  fout.WriteMatrix("/Posterior/CoarseSamples", coarsePostSamps);
  fout.WriteMatrix("/Posterior/FineSamples",finePostSamps);
  
  fout.CloseFile();
  return 0;
}

If you build it, the samples will come...

See the cmake file for more details.

cd build; cmake ..; make; cd ../
-- Configuring done
-- Generating done
-- Build files have been written to: /Users/mparno/Documents/Repositories/MUQ/MUQ/MUQ/examples/Inference/TransportMaps/MultiscaleInference/build
Scanning dependencies of target SmallMultiscale
[100%] Building CXX object CMakeFiles/SmallMultiscale.dir/SmallMultiscale.cpp.o
Linking CXX executable SmallMultiscale
[100%] Built target SmallMultiscale

Run the code

build/SmallMultiscale
10% Complete
  MH: Acceptance rate =  47.9%
20.0% Complete
  MH: Acceptance rate =  48.5%
30.0% Complete
  MH: Acceptance rate =  48.1%
40.0% Complete
  MH: Acceptance rate =  50.9%
50.0% Complete
  MH: Acceptance rate =  51.7%
60.0% Complete
  MH: Acceptance rate =  53.1%
70.0% Complete
  MH: Acceptance rate =  54.2%
80.0% Complete
  MH: Acceptance rate =  54.7%
90.0% Complete
  MH: Acceptance rate =  54.8%
100.0% Complete
  MH: Acceptance rate =  55.0%

Analyze the results

Now we switch over to Python to plot the results. The fine-scale samples are read from the HDF5 file created in C++, and matplotlib is used for plotting.

import h5py
import matplotlib.pyplot as plt

f = h5py.File('MultiscaleResults.h5', 'r')
postSamps = f['/Posterior/FineSamples']

# row 0 holds gamma; rows 1 and 2 hold the two components of theta
plt.plot(postSamps[1,0:15000:2],postSamps[2,0:15000:2],'.')
plt.xlim([-2,2])
plt.ylim([-2,2])
plt.show()

Completed code:

Python

import h5py
import matplotlib.pyplot as plt

f = h5py.File('MultiscaleResults.h5', 'r')
postSamps = f['/Posterior/FineSamples']

# row 0 holds gamma; rows 1 and 2 hold the two components of theta
plt.plot(postSamps[1,0:15000:2],postSamps[2,0:15000:2],'.')
plt.xlim([-2,2])
plt.ylim([-2,2])
plt.show()


SmallMultiscale.cpp


// std library includes
#include <fstream>
#include <iostream>

// boost includes
#include <boost/property_tree/ptree.hpp>

// Eigen includes
#include <Eigen/Core>
#include <Eigen/Sparse>

// muq utilities includes
#include "MUQ/Utilities/HDF5File.h"
#include "MUQ/Utilities/RandomGenerator.h"
#include "MUQ/Utilities/MultiIndex/MultiIndexFactory.h"

// muq inference includes
#include "MUQ/Inference/MCMC/MCMCBase.h"
#include "MUQ/Inference/ProblemClasses/InferenceProblem.h"
#include "MUQ/Inference/MAP/MAPbase.h"
#include "MUQ/Inference/TransportMaps/MapFactory.h"

// muq modelling includes
#include "MUQ/Modelling/ModPieceTemplates.h"
#include "MUQ/Modelling/ModGraphPiece.h"
#include "MUQ/Modelling/GaussianPair.h"
#include "MUQ/Modelling/DensityProduct.h"
#include "MUQ/Modelling/EmpiricalRandVar.h"
#include "MUQ/Modelling/VectorPassthroughModel.h"

// namespaces
using namespace std;
using namespace muq::Modelling;
using namespace muq::Utilities;
using namespace muq::Inference;

class Fine2Coarse : public OneInputNoDerivModPiece {
public:

  Fine2Coarse() : OneInputNoDerivModPiece(2, 1) {}

  virtual Eigen::VectorXd EvaluateImpl(const Eigen::VectorXd& x) override
  {
    Eigen::VectorXd y(1);

    y(0) = log(1.0 / (exp(-1.0 * x(0)) + exp(-1.0 * x(1))));
    return y;
  }
};


class Coarse2Data : public OneInputNoDerivModPiece {
public:

  Coarse2Data(double bIn = 1.0) : OneInputNoDerivModPiece(1, 1), b(bIn) {}

  virtual Eigen::VectorXd EvaluateImpl(const Eigen::VectorXd& x) override
  {
    Eigen::VectorXd y(1);

    y(0) = pow(exp(x(0)), 3) - b;
    return y;
  }

private:

  double b;
};


Eigen::MatrixXd GenerateSamples(int numSamps, shared_ptr<RandVar> priorRv, shared_ptr<ModPiece> f2c)
{
   
    Eigen::MatrixXd allSamples(3,numSamps);
    
    // first generate the prior samples
    allSamples.bottomRows(2) = priorRv->Sample(numSamps);
    
    // now generate the coarse samples
    allSamples.topRows(1) = f2c->EvaluateMulti(allSamples.bottomRows(2));

    return allSamples;
}


shared_ptr<TransportMap> BuildMap(int maxOrder, Eigen::MatrixXd const& allSamples)
{

    auto multis = MultiIndexFactory::CreateTriTotalOrder(allSamples.rows(),maxOrder);
    auto invmap = MapFactory::BuildToNormal(allSamples, multis);
    
    Eigen::MatrixXd refSamps = invmap->EvaluateMulti(allSamples);
    
    return MapFactory::RegressSampsToSamps(refSamps, allSamples, multis);
}


Eigen::MatrixXd RunCoarseMCMC(Eigen::VectorXd const& data, shared_ptr<TransportMap> jointMap)
{
  auto graph = make_shared<ModGraph>();
  
  graph->AddNode(make_shared<VectorPassthroughModel>(1),"inferenceTarget");
  graph->AddNode(jointMap->head(1),"CoarseMap");
  graph->AddEdge("inferenceTarget","CoarseMap",0);
  
  graph->AddNode(make_shared<Coarse2Data>(2.0), "forwardModel");
  graph->AddEdge("CoarseMap","forwardModel",0);
  
  double dataVar = 0.1;
  graph->AddNode(make_shared<GaussianDensity>(data,dataVar),"likelihood");
  graph->AddEdge("forwardModel","likelihood",0);
  
  Eigen::VectorXd priorMu = Eigen::VectorXd::Ones(1);
  graph->AddNode(make_shared<GaussianDensity>(priorMu,1.0),"prior");
  graph->AddEdge("inferenceTarget","prior",0);
  
  graph->AddNode(make_shared<DensityProduct>(2),"posterior");
  graph->AddEdge("prior","posterior",0);
  graph->AddEdge("likelihood","posterior",1);
  


  auto problem = make_shared<InferenceProblem>(graph);


  // define the properties and tuning parameters for preconditioned MALA
  boost::property_tree::ptree params;
  params.put("MCMC.Method", "SingleChainMCMC");
  params.put("MCMC.Kernel", "MHKernel");
  params.put("MCMC.Proposal", "PreMALA");
  params.put("MCMC.Steps", 20000);
  params.put("MCMC.BurnIn", 5000);
  params.put("MCMC.MH.PropSize", 0.01);
  params.put("Verbose", 3);


  // construct a MCMC sampling task from the parameters and the inference problem.
  auto mcmcSampler = MCMCBase::ConstructMCMC(problem, params);
  
  


  Eigen::VectorXd startingPoint(1);
  startingPoint << 0.5;
  
  auto mcmcChain = mcmcSampler->Sample(startingPoint);

  return mcmcChain->GetAllSamples();
}


Eigen::MatrixXd SampleFinescale(Eigen::MatrixXd const& coarseSamples, shared_ptr<TransportMap> jointMap)
{
  int numSamps = coarseSamples.cols();
  Eigen::MatrixXd refSamps(3,numSamps);
  
  // set r_c to come from the MCMC samples
  refSamps.row(0) = coarseSamples;
  
  // generate independent samples of r_f
  refSamps.bottomRows(2) = RandomGenerator::GetNormalRandomMatrix(2,numSamps);
  
  return jointMap->EvaluateMulti(refSamps);
}


int main(){
  
  // Define the prior
  Eigen::VectorXd mu = Eigen::VectorXd::Zero(2);
  Eigen::MatrixXd cov(2,2);
  cov << 2.0, 0.6,
         0.6, 2.0;
  auto priorRv = make_shared<GaussianRV>(mu,cov);
  


  auto f2c = make_shared<Fine2Coarse>();


  int numSamps = 50000;
  Eigen::MatrixXd allSamps = GenerateSamples(numSamps, priorRv, f2c);


  int maxOrder = 7;
  auto map = BuildMap(maxOrder, allSamps);


  Eigen::VectorXd data(1);
  data << -1.8;
  Eigen::MatrixXd coarsePostSamps = RunCoarseMCMC(data, map);


  Eigen::MatrixXd finePostSamps = SampleFinescale(coarsePostSamps, map);


  HDF5File fout("MultiscaleResults.h5");
  fout.WriteMatrix("/Training/Samples",allSamps);
  fout.WriteScalarAttribute("/Training", "Number of Samples", numSamps);
  fout.WriteScalarAttribute("/Training", "Maximum Polynomial Order", maxOrder);
  
  fout.WriteMatrix("/Posterior/CoarseSamples", coarsePostSamps);
  fout.WriteMatrix("/Posterior/FineSamples",finePostSamps);
  
  fout.CloseFile();
  return 0;
}

