Manipulating mcmc.list object in R
Asked Answered
S

2

6

I have used JAGS called via rjags to produce the mcmc.list object foldD_samples, which contains trace monitors for a large number of stochastic nodes (>800 nodes).

I would now like to use R to compute a fairly complicated, scalar-valued function of these nodes, and write the output to an mcmc object so that I can use coda to summarize the posterior and run convergence diagnostics.

However, I haven't been able to figure out how get the posterior draws from foldD_samples into a dataframe. Any help much appreciated.

Here is the structure of the mcmc.list:

str(foldD_samples)
List of 3
 $ : mcmc [1:5000, 1:821] -0.667 -0.197 -0.302 -0.204 -0.394 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : chr [1:821] "beta0" "beta1" "beta2" "dtau" ...
  ..- attr(*, "mcpar")= num [1:3] 4100 504000 100
 $ : mcmc [1:5000, 1:821] -0.686 -0.385 -0.53 -0.457 -0.519 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : chr [1:821] "beta0" "beta1" "beta2" "dtau" ...
  ..- attr(*, "mcpar")= num [1:3] 4100 504000 100
 $ : mcmc [1:5000, 1:821] -0.492 -0.679 -0.299 -0.429 -0.421 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : chr [1:821] "beta0" "beta1" "beta2" "dtau" ...
  ..- attr(*, "mcpar")= num [1:3] 4100 504000 100
 - attr(*, "class")= chr "mcmc.list"

Cheers, Jacob

Sideward answered 15/11, 2015 at 15:54 Comment(4)
There may be method in rjags, but i seem to remember that you can do do.call(rbind.data.frame, foldD_samples). Perhaps fatser more efficient to use data.table::rbindlistSynchroscope
ps you can apply code.samples on the list, without coercing to a dataframeSynchroscope
Thanks user20650 ! do.call(rbind.data.frame, foldD_samples) worked well. I'd be happy to accept this as an answer if posted as such. data.table::rbindlist doesn't accept mcmc.list objects as input. Also note the presumed typo "code" for "coda" in the postscript.Sideward
rbindlist(lapply(foldD_samples,as.data.frame)) might work ... if the efficiency matters to you.Belief
S
8

As it's a list structure you can use either of these methods to bind the matrices together.

do.call(rbind.data.frame, foldD_samples)

or

rbindlist(lapply(foldD_samples, as.data.frame)) # thanks to BenBolker

A mwe

library(rjags)
library(coda)
library(data.table)

mod <- textConnection("model {
  A ~ dnorm(0, 1)
  B ~ dnorm(0, 1)
}")

# evaluate
mod <- jags.model(mod, n.chains = 4, n.adapt = 50000) 
pos <- coda.samples(mod,  c("A", "B"),  10000)

out <- do.call(rbind.data.frame, pos)
out2 <- rbindlist(lapply(pos, as.data.frame))
all.equal(out, out2, check.attributes=FALSE)
Synchroscope answered 15/11, 2015 at 20:9 Comment(0)
P
2

The answer given by user20650 will certainly work, but it can be quite slow. Also note that as of this writing, rbind_list() is deprecated in place of bind_rows().

Something I've written for my own purposes converts the mcmc.list to a "long format" data.frame. On my machine, it is about 4-7x faster than the above methods, and adds two additional columns: one for the chain number, and one for the step number.

parameter_names <- varnames(mcmc_list)
saved_steps <- as.integer(row.names(mcmc_list[[1]]))
out <- data.frame("chain" = factor(rep(1 : length(mcmc_list), each = length(saved_steps))),
                  "step" = rep(saved_steps, length(mcmc_list)) )
for (param in parameter_names) {
    out[param] <- NA
}
for (a_chain in 1 : length(mcmc_list)) {
    out[out$chain == a_chain, parameter_names ] <- as.data.frame(mcmc_list[[a_chain]])
}
return(out)

Working with an mcmc.list object of 3 chains, 50,000 rows total, rbind_list/do.call method: 0.71 sec avg elapsed time my method: 0.12 sec avg elapsed time

Edit: some further reading of code in the library "coda" shows that as.matrix() is much faster.

parameter_names <- varnames(mcmc_list)
saved_steps <- as.integer(row.names(mcmc_list[[1]]))
out <- data.frame("chain" = factor(rep(1 : length(mcmc_list), each = length(saved_steps))),
                  "step" = rep(saved_steps, length(mcmc_list)) )
out <- cbind(out, as.data.frame(as.matrix(chain_samples)))

Takes on overage 0.03 seconds elapsed time.

Paries answered 16/7, 2017 at 15:54 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.