I'm still learning the basics of working with PyMC3, so hopefully this isn't already covered plainly in the documentation. The basic idea is that I have put together my model, sampled it a bunch to build up my posterior distribution, and saved the chains. If I follow the suggestion of the Backends page and load the chains with

trace = pm.backends.text.load('test_txt')

then I get

TypeError: No context on context stack

What I'd expect is to be able to point the text.load method at the saved database and get back numpy arrays with all the trace values, i.e. the database would contain all the information needed to access the chain values.
A little hunting turned up only one example of loading a trace in PyMC3, here, and it shows the same model variable being used to load the trace as was used to create it. If I want one script to run my chains and a separate script to load and analyze the traces, it appears the only way is to initialize the model with the same commands in both files. That seems prone to creating inconsistencies between the files, though, since I would have to keep the models identical manually.
Here's an example, taken from the PyMC3 getting-started page, where I save the chain. I saved the following code in a short script.
import numpy as np
import pymc3 as pm
from scipy import optimize
# Initialize random number generator
np.random.seed(123)
# True parameter values
alpha, sigma = 1, 1
beta = [1, 2.5]
# Size of dataset
size = 100
# Predictor variable
X1 = np.random.randn(size)
X2 = np.random.randn(size) * 0.2
# Simulate outcome variable
Y = alpha + beta[0]*X1 + beta[1]*X2 + np.random.randn(size)*sigma
basic_model = pm.Model()
with basic_model:
    # Priors for unknown model parameters
    alpha = pm.Normal('alpha', mu=0, sd=10)
    beta = pm.Normal('beta', mu=0, sd=10, shape=2)
    sigma = pm.HalfNormal('sigma', sd=1)

    # Expected value of outcome
    mu = alpha + beta[0]*X1 + beta[1]*X2

    # Likelihood (sampling distribution) of observations
    Y_obs = pm.Normal('Y_obs', mu=mu, sd=sigma, observed=Y)

    # obtain starting values via MAP
    start = pm.find_MAP(fmin=optimize.fmin_powell)

    # instantiate sampler
    step = pm.Slice(vars=[sigma])

    # instantiate database
    db = pm.backends.Text('so_save')

    # draw 5000 posterior samples
    trace = pm.sample(5000, step=step, start=start, trace=db)
Then running these next lines (at the Python CLI or in a separate script) gives
trace = pm.backends.text.load('so_save')
# TypeError: No context on context stack
trace = pm.backends.text.load('so_save', model=pm.Model())
print(trace)
print(trace.varnames)
# <MultiTrace: 1 chains, 5000 iterations, 0 variables>
# []
# run same first 36 lines from the big code block above
trace = pm.backends.text.load('so_save', model=basic_model)
print(trace)
print(trace.varnames)
# <MultiTrace: 1 chains, 5000 iterations, 4 variables>
# ['alpha', 'beta', 'sigma_log_', 'sigma']
For a little more motivation/context, I'm experimenting with modelling the same data in a couple of slightly different ways. I would like to have nice long chains for each model on disk that I only have to generate once. Then I can compare them down the line as I think of new ways to analyze the traces.