I am trying to visualize my data flow with a Sankey Diagram in R.
I found this blog post linking to an R script that produces a Sankey Diagram; unfortunately, it's quite raw and somewhat limited (see below for sample code and data).
Does anyone know of other scripts—or maybe even a package—that is more developed? My end goal is to visualize both data flow and percentages by relative size of diagram components, like in these examples of Sankey Diagrams.
I posted a somewhat similar question on the r-help list, but after two weeks without any responses I'm trying my luck here on stackoverflow.
Thanks, Eric
PS. I'm aware of the Parallel Sets Plot, but that is not what I'm looking for.
# thanks to, https://tonybreyal.wordpress.com/2011/11/24/source_https-sourcing-an-r-script-from-github/
sourc.https <- function(url, ...) {
# install and load the RCurl package
if (match('RCurl', nomatch=0, installed.packages()[,1])==0) {
install.packages(c("RCurl"), dependencies = TRUE)
require(RCurl)
} else require(RCurl)
# parse and evaluate each .R script
sapply(c(url, ...), function(u) {
eval(parse(text = getURL(u, followlocation = TRUE,
cainfo = system.file("CurlSSL", "cacert.pem",
package = "RCurl"))), envir = .GlobalEnv)
} )
}
# from https://gist.github.com/1423501
sourc.https("https://raw.github.com/gist/1423501/55b3c6f11e4918cb6264492528b1ad01c429e581/Sankey.R")
# My example (there is another example inside Sankey.R):
inputs = c(6, 144)
losses = c(6,47,14,7, 7, 35, 34)
unit = "n ="
labels = c("Transfers",
"Referrals\n",
"Unable to Engage",
"Consultation only",
"Did not complete the intake",
"Did not engage in Treatment",
"Discontinued Mid-Treatment",
"Completed Treatment",
"Active in \nTreatment")
SankeyR(inputs,losses,unit,labels)
# Clean up my mess
rm("inputs", "labels", "losses", "SankeyR", "sourc.https", "unit")
Sankey Diagram produced with the above code,