how to create a heatmap with a fixed external hierarchical cluster
Asked Answered
M

6

6

I have a matrix data, and want to visualize it with heatmap. The rows are species, so I want visualize the phylogenetic tree aside the rows and reorder the rows of the heatmap according the tree. I know the heatmap function in R can create the hierarchical clustering heatmap, but how can I use my phylogenetic clustering instead of the default created distance clustering in the plot?

Minnieminnnie answered 1/3, 2013 at 8:8 Comment(4)
What is the format of your phylogenetic tree? Can you provide some sample data?Ducan
I wonder if argument reorderfun in heatmap can aid in this...Majestic
In case you aren't familiar with it pasting the output from dput(head(mymatrixdata)) will let people reconstruct a portion of your data easily and will make it easier for them to help you.Ergot
@Ducan it's newick format, for example:(A:0.1,B:0.2,(C:0.3,D:0.4):0.5);Minnieminnnie
D
13

First you need to use package ape to read in your data as a phylo object.

library(ape)
dat <- read.tree(file="your/newick/file")
#or
dat <- read.tree(text="((A:4.2,B:4.2):3.1,C:7.3);")

The following only works if your tree is ultrametric.

The next step is to transform your phylogenetic tree into class dendrogram.

Here is an example:

data(bird.orders) #This is already a phylo object
hc <- as.hclust(bird.orders) #Compulsory step as as.dendrogram doesn't have a method for phylo objects.
dend <- as.dendrogram(hc)
plot(dend, horiz=TRUE)

Plot of a phylogenetic tree, using plot.dendrogram

mat <- matrix(rnorm(23*23),nrow=23, dimnames=list(sample(bird.orders$tip, 23), sample(bird.orders$tip, 23))) #Some random data to plot

First we need to order the matrix according to the order in the phylogenetic tree:

ord.mat <- mat[bird.orders$tip,bird.orders$tip]

Then input it to heatmap:

heatmap(ord.mat, Rowv=dend, Colv=dend)

Heatmap with two-way phylogenetic tree indexing

Edit: Here is a function to deal with ultrametric and non-ultrametric trees.

heatmap.phylo <- function(x, Rowp, Colp, ...){
    # x numeric matrix
    # Rowp: phylogenetic tree (class phylo) to be used in rows
    # Colp: phylogenetic tree (class phylo) to be used in columns
    # ... additional arguments to be passed to image function
    x <- x[Rowp$tip, Colp$tip]
    xl <- c(0.5, ncol(x)+0.5)
    yl <- c(0.5, nrow(x)+0.5)
    layout(matrix(c(0,1,0,2,3,4,0,5,0),nrow=3, byrow=TRUE),
                  width=c(1,3,1), height=c(1,3,1))
    par(mar=rep(0,4))
    plot(Colp, direction="downwards", show.tip.label=FALSE,
               xlab="",ylab="", xaxs="i", x.lim=xl)
    par(mar=rep(0,4))
    plot(Rowp, direction="rightwards", show.tip.label=FALSE, 
               xlab="",ylab="", yaxs="i", y.lim=yl)
    par(mar=rep(0,4), xpd=TRUE)
    image((1:nrow(x))-0.5, (1:ncol(x))-0.5, x, 
           xaxs="i", yaxs="i", axes=FALSE, xlab="",ylab="", ...)
    par(mar=rep(0,4))
    plot(NA, axes=FALSE, ylab="", xlab="", yaxs="i", xlim=c(0,2), ylim=yl)
    text(rep(0,nrow(x)),1:nrow(x),Rowp$tip, pos=4)
    par(mar=rep(0,4))
    plot(NA, axes=FALSE, ylab="", xlab="", xaxs="i", ylim=c(0,2), xlim=xl)
    text(1:ncol(x),rep(2,ncol(x)),Colp$tip, srt=90, pos=2)
    }

Here is with the previous (ultrametric) example:

heatmap.phylo(mat, bird.orders, bird.orders)

Heatmap with ultrametric phylogenies as index

And with a non-ultrametric:

cat("owls(((Strix_aluco:4.2,Asio_otus:4.2):3.1,Athene_noctua:7.3):6.3,Tyto_alba:13.5);",
    file = "ex.tre", sep = "\n")
tree.owls <- read.tree("ex.tre")
mat2 <- matrix(rnorm(4*4),nrow=4, 
             dimnames=list(sample(tree.owls$tip,4),sample(tree.owls$tip,4)))
is.ultrametric(tree.owls)
[1] FALSE
heatmap.phylo(mat2,tree.owls,tree.owls)

Heatmap with non-ultrametric phylogenies as index

Ducan answered 1/3, 2013 at 8:32 Comment(6)
interesting heatmap.phylo function! new approach independent from the deprogram concept! I am pretty sure transpose it to the grid world! +10! I am pretty sure I can trsnpose it to the grid package(lattice and grid not sure for ggplot2)Burgener
yes. without limitation to ultrametric trees. This is excellent. thanks, plannapus!Minnieminnnie
a following question, I met error: Error in image.default((1:ncol(x)) - 0.5, (1:nrow(x)) - 0.5, x, xaxs = "i", : dimensions of z are not length(x)(-1) times length(y)(-1) running heatmap.phylo(c, d1, d2) on my own data. I checked that the matrix dimension and the lengths of tips of the two trees definitely agree. Do you have idea what could be the problem? Thanks.Minnieminnnie
i actually sovled it. In your heatmap.phylo(), the ncol and nrow were switched in the image function. I corrected it.Minnieminnnie
Nice heatmap.phylo() function. However, when I tried to follow your worked example I receive the following error message: Error in plot.default(0, type = "n", xlim = x.lim, ylim = y.lim, xlab = "", : formal argument "xlab" matched by multiple actual arguments. I have no idea where the error can came from, because I just copy/paste your example here for the non-ultrametric tree...Any Ideas?Outrank
@AntonioCanepa i wrote that function 7 years ago, I imagine package ape changed their code to plot phylogenies since then. Just get rid of both xlab='', ylab='' in lines 13 and 16 of the heatmap.phylo code and it should work.Ducan
B
3

First, I create a reproducible example. Without data we can just guess what you want. So please try to do better next time(specially you are confirmed user). For example you can do this to create your tree in newick format:

tree.text='(((XXX:4.2,ZZZ:4.2):3.1,HHH:7.3):6.3,AAA:13.6);'

Like @plannpus, I am using ape to converts this tree to a hclust class. Unfortunatlty, it looks that we can do the conversion only for ultrametric tree: the distance from the root to each tip is the same.

library(ape)
tree <- read.tree(text='(((XXX:4.2,ZZZ:4.2):3.1,HHH:7.3):6.3,AAA:13.6);')
is.ultrametric(tree)
hc <- as.hclust.phylo(tree)

Then I am using dendrogramGrob from latticeExtra to plot my tree. and levelplot from lattice to draw the heatmap.

library(latticeExtra)
dd.col <- as.dendrogram(hc)
col.ord <- order.dendrogram(dd.col)
mat <- matrix(rnorm(4*4),nrow=4)
colnames(mat) <- tree$tip.label
rownames(mat) <- tree$tip.label
levelplot(mat[tree$tip,tree$tip],type=c('g','p'),
          aspect = "fill",
          colorkey = list(space = "left"),
          legend =
            list(right =
                   list(fun = dendrogramGrob,
                        args =
                          list(x = dd.col, 
                               side = "right",
                               size = 10))),
          panel=function(...){
            panel.fill('black',alpha=0.2)
            panel.levelplot.points(...,cex=12,pch=23)
          }
)

enter image description here

Burgener answered 1/3, 2013 at 10:40 Comment(2)
+1 As usual, very nice. I'll try to see if I can find an easy workaround for non-ultrametric trees, when I'll have some time later on.Ducan
How do you adjust the space between the phylogeny and the panel?Deterrent
A
1

I adapted plannapus' answer to deal with more than one tree (also cutting out some options I didn't need in the process):

Heatmap with three trees

library(ape)

heatmap.phylo <- function(x, Rowp, Colp, breaks, col, denscol="cyan", respect=F, ...){
    # x numeric matrix
    # Rowp: phylogenetic tree (class phylo) to be used in rows
    # Colp: phylogenetic tree (class phylo) to be used in columns
    # ... additional arguments to be passed to image function

    scale01 <- function(x, low = min(x), high = max(x)) {
        x <- (x - low)/(high - low)
        x
    }

    col.tip <- Colp$tip
    n.col <- 1
    if (is.null(col.tip)) {
        n.col <- length(Colp)
        col.tip <- unlist(lapply(Colp, function(t) t$tip))
        col.lengths <- unlist(lapply(Colp, function(t) length(t$tip)))
        col.fraction <- col.lengths / sum(col.lengths)
        col.heights <- unlist(lapply(Colp, function(t) max(node.depth.edgelength(t))))
        col.max_height <- max(col.heights)
    }

    row.tip <- Rowp$tip
    n.row <- 1
    if (is.null(row.tip)) {
        n.row <- length(Rowp)
        row.tip <- unlist(lapply(Rowp, function(t) t$tip))
        row.lengths <- unlist(lapply(Rowp, function(t) length(t$tip)))
        row.fraction <- row.lengths / sum(row.lengths)
        row.heights <- unlist(lapply(Rowp, function(t) max(node.depth.edgelength(t))))
        row.max_height <- max(row.heights)
    }

    cexRow <- min(1, 0.2 + 1/log10(n.row))
    cexCol <- min(1, 0.2 + 1/log10(n.col))

    x <- x[row.tip, col.tip]
    xl <- c(0.5, ncol(x)+0.5)
    yl <- c(0.5, nrow(x)+0.5)

    screen_matrix <- matrix( c(
        0,1,4,5,
        1,4,4,5,
        0,1,1,4,
        1,4,1,4,
        1,4,0,1,
        4,5,1,4
    ) / 5, byrow=T, ncol=4 )

    if (respect) {
        r <- grconvertX(1, from = "inches", to = "ndc") / grconvertY(1, from = "inches", to = "ndc")
        if (r < 1) {
            screen_matrix <- screen_matrix * matrix( c(r,r,1,1), nrow=6, ncol=4, byrow=T)
        } else {
            screen_matrix <- screen_matrix * matrix( c(1,1,1/r,1/r), nrow=6, ncol=4, byrow=T)
        }
    }


    split.screen( screen_matrix )

    screen(2)
    par(mar=rep(0,4))

    if (n.col == 1) {
        plot(Colp, direction="downwards", show.tip.label=FALSE,xaxs="i", x.lim=xl)
    } else {
        screens <- split.screen( as.matrix(data.frame( left=cumsum(col.fraction)-col.fraction, right=cumsum(col.fraction), bottom=0, top=1)))
        for (i in 1:n.col) {
            screen(screens[i])
            plot(Colp[[i]], direction="downwards", show.tip.label=FALSE,xaxs="i", x.lim=c(0.5,0.5+col.lengths[i]), y.lim=-col.max_height+col.heights[i]+c(0,col.max_height))
        }
    }

    screen(3)
    par(mar=rep(0,4))

    if (n.col == 1) {
        plot(Rowp, direction="rightwards", show.tip.label=FALSE,yaxs="i", y.lim=yl)
    } else {
        screens <- split.screen( as.matrix(data.frame( left=0, right=1, bottom=cumsum(row.fraction)-row.fraction, top=cumsum(row.fraction))) )
        for (i in 1:n.col) {
            screen(screens[i])
            plot(Rowp[[i]], direction="rightwards", show.tip.label=FALSE,yaxs="i", x.lim=c(0,row.max_height), y.lim=c(0.5,0.5+row.lengths[i]))
        }
    }


    screen(4)
    par(mar=rep(0,4), xpd=TRUE)
    image((1:nrow(x))-0.5, (1:ncol(x))-0.5, x, xaxs="i", yaxs="i", axes=FALSE, xlab="",ylab="", breaks=breaks, col=col, ...)

    screen(6)
    par(mar=rep(0,4))
    plot(NA, axes=FALSE, ylab="", xlab="", yaxs="i", xlim=c(0,2), ylim=yl)
    text(rep(0,nrow(x)),1:nrow(x),row.tip, pos=4, cex=cexCol)

    screen(5)
    par(mar=rep(0,4))
    plot(NA, axes=FALSE, ylab="", xlab="", xaxs="i", ylim=c(0,2), xlim=xl)
    text(1:ncol(x),rep(2,ncol(x)),col.tip, srt=90, adj=c(1,0.5), cex=cexRow)

    screen(1)
    par(mar = c(2, 2, 1, 1), cex = 0.75)

    symkey <- T
    tmpbreaks <- breaks
    if (symkey) {
        max.raw <- max(abs(c(x, breaks)), na.rm = TRUE)
        min.raw <- -max.raw
        tmpbreaks[1] <- -max(abs(x), na.rm = TRUE)
        tmpbreaks[length(tmpbreaks)] <- max(abs(x), na.rm = TRUE)
    } else {
        min.raw <- min(x, na.rm = TRUE)
        max.raw <- max(x, na.rm = TRUE)
    }
    z <- seq(min.raw, max.raw, length = length(col))

    image(z = matrix(z, ncol = 1), col = col, breaks = tmpbreaks, 
          xaxt = "n", yaxt = "n")
    par(usr = c(0, 1, 0, 1))
    lv <- pretty(breaks)
    xv <- scale01(as.numeric(lv), min.raw, max.raw)
    axis(1, at = xv, labels = lv)

    h <- hist(x, plot = FALSE, breaks = breaks)
    hx <- scale01(breaks, min.raw, max.raw)
    hy <- c(h$counts, h$counts[length(h$counts)])
    lines(hx, hy/max(hy) * 0.95, lwd = 1, type = "s", 
          col = denscol)
    axis(2, at = pretty(hy)/max(hy) * 0.95, pretty(hy))
    par(cex = 0.5)
    mtext(side = 2, "Count", line = 2)

    close.screen(all.screens = T)

}

tree <- read.tree(text = "(A:1,B:1);((C:1,D:2):2,E:1);((F:1,G:1,H:2):5,((I:1,J:2):2,K:1):1);", comment.char="")
N <- sum(unlist(lapply(tree, function(t) length(t$tip))))

set.seed(42)
m <- cor(matrix(rnorm(N*N), nrow=N))
rownames(m) <- colnames(m) <- LETTERS[1:N]
heatmap.phylo(m, tree, tree, col=bluered(10), breaks=seq(-1,1,length.out=11), respect=T) 
Anabelle answered 11/12, 2013 at 12:18 Comment(1)
Hi @Michael Kuhn. Thanks for your code. However, when I tried to follow it and specifically when I tried to create the tree object by executing: tree <- read.tree(text = "(A:1,B:1);((C:1,D:2):2,E:1);((F:1,G:1,H:2):5,((I:1,J:2):2,K:1):1);", comment.char=""), I have this error message Error in if (z[i]) { : missing value where TRUE/FALSE needed. Any Ideas?Outrank
V
0

This exact application of a heatmap is already implemented in the plot_heatmap function (based on ggplot2) in the phyloseq package, which is openly/freely developed on GitHub. Examples with complete code and results are included here:

http://joey711.github.io/phyloseq/plot_heatmap-examples

One caveat, and not what you are explicitly asking for here, but phyloseq::plot_heatmap does not overlay a hierarchical tree for either axis. There is a good reason not to base your axis ordering on hierarchical clustering -- and this is because of the way indices at the end of long branches can still be next to each other arbitrarily depending on how branches are rotated at the nodes. This point, and an alternative based on non-metric multidimensional scaling is explained further in an article about the NeatMap package, which is also written for R and uses ggplot2. This dimension-reduction (ordination) approach to ordering the indices in a heatmap is adapted for phylogenetic abundance data in phyloseq::plot_heatmap.

Virelay answered 22/6, 2013 at 20:57 Comment(3)
It seems like plot_heatmap will make a heatmap without a hierarchical clustering tree next to it, but can-not (as OP requested) cluster by phylogeny (or put a phylogenetic tree next to the plot to indicate that phylogeny). Does that sound right or am I missing something?Tremain
Actually since 2014 phyloseq::plot_heatmap can order the taxa in the heatmap according to their order in the tree. This is accomplished through the taxa.order command, which can take either a taxonomic rank to cluster the indices, or an arbitrary order of the indices themselves. rdocumentation.org/packages/phyloseq/versions/1.16.2/topics/… github.com/joey711/phyloseq/issues/230Owing
Secondly, my point was that the OP probably over-specified their request, not understanding why ordering by a heatmap by a hierarchical or phylogenetic tree is probably not what they want, since is demonstrably less effective way to display structural patterns in the data. Hence my reference to NeatMap, etc.Owing
V
0

While my suggestion for phlyoseq::plot_heatmap would get you part of the way there, the powerful "ggtree" package can do this, or more, if representing data on trees is really what you are going for.

Some examples are shown on the top of the following ggtree documentation page:

http://www.bioconductor.org/packages/3.7/bioc/vignettes/ggtree/inst/doc/advanceTreeAnnotation.html

Note that I am not affiliated with ggtree dev at all. Just a fan of the project and what it can already do.

Virelay answered 8/1, 2018 at 20:1 Comment(0)
O
0

After communication with @plannapus, I've modified (just a few) the code to remove some extra xlab="" information on the above code. Here you will find the code. You can see the commented lines having the extra code and now the new lines just erasing them. Hope this can help new users like me! :)

heatmap.phylo <- function(x, Rowp, Colp, ...){
    # x numeric matrix
    # Rowp: phylogenetic tree (class phylo) to be used in rows
    # Colp: phylogenetic tree (class phylo) to be used in columns
    # ... additional arguments to be passed to image function
    x <- x[Rowp$tip, Colp$tip]
    xl <- c(0.5, ncol(x) + 0.5)
    yl <- c(0.5, nrow(x) + 0.5)
    layout(matrix(c(0,1,0,2,3,4,0,5,0),nrow = 3, byrow = TRUE),
                  width = c(1,3,1), height = c(1,3,1))
    par(mar = rep(0,4))
    # plot(Colp, direction = "downwards", show.tip.label = FALSE,
    #            xlab = "", ylab = "", xaxs = "i", x.lim = xl)
      plot(Colp, direction = "downwards", show.tip.label = FALSE,
               xaxs = "i", x.lim = xl)
    par(mar = rep(0,4))
    # plot(Rowp, direction = "rightwards", show.tip.label = FALSE, 
    #            xlab = "", ylab = "", yaxs = "i", y.lim = yl)
    plot(Rowp, direction = "rightwards", show.tip.label = FALSE, 
               yaxs = "i", y.lim = yl)
    par(mar = rep(0,4), xpd = TRUE)
    image((1:nrow(x)) - 0.5, (1:ncol(x)) - 0.5, x, 
           #xaxs = "i", yaxs = "i", axes = FALSE, xlab = "", ylab = "", ...)
           xaxs = "i", yaxs = "i", axes = FALSE, ...)
    par(mar = rep(0,4))
    plot(NA, axes = FALSE, ylab = "", xlab = "", yaxs = "i", xlim = c(0,2), ylim = yl)
    text(rep(0, nrow(x)), 1:nrow(x), Rowp$tip, pos = 4)
    par(mar = rep(0,4))
    plot(NA, axes = FALSE, ylab = "", xlab = "", xaxs = "i", ylim = c(0,2), xlim = xl)
    text(1:ncol(x), rep(2, ncol(x)), Colp$tip, srt = 90, pos = 2)
}
Outrank answered 16/9, 2020 at 7:46 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.