Partial Row Labels Heatmap - R
Asked Answered
T

1

7

I was wondering if anyone knows of a package that allows partial row labeling of heatmaps. I am currently using pheatmap() to construct my heatmaps, but I can use any package that has this functionality.

I have plots with many rows of differentially expressed genes and I would like to label a subset of them. There are two main things to consider (that I can think of):

  • The placement of the text annotation depends on the height of the row. If the rows are too narrow, then the text label will be ambiguous without some sort of pointer.
  • If multiple adjacent rows are significant (i.e. will be labelled), then these will need to be offset, and again, a pointer will be needed.

Below is an example of a partial solution that really only gets maybe halfway there, but I hope illustrates what I'd like to be able to do.

set.seed(1)
require(pheatmap)
require(RColorBrewer)
require(grid)

### Data to plot
data_mat <- matrix(sample(1:10000, 300), nrow = 50, ncol = 6)
rownames(data_mat) <- paste0("Gene", 1:50)
colnames(data_mat) <- c(paste0("A", 1:3), paste0("B", 1:3))

### Set how many genes to annotate
  ### TRUE - make enough labels that some overlap
  ### FALSE - no overlap
tooMany <- T

### Select a few genes to annotate
if (tooMany) {
  sigGenes_v <- paste0("Gene", c(5,20,26,42,47,16,28))
  newMain_v <- "Too Many Labels"
} else {
  sigGenes_v <- paste0("Gene", c(5,20,26,42))
  newMain_v <- "OK Labels"
}

### Make color list
colors_v <- brewer.pal(8, "Dark2")
colors_v <- colors_v[c(1:length(sigGenes_v), 8)]
names(colors_v) <- c(sigGenes_v, "No")
annColors_lsv <- list("Sig" = colors_v)

### Column Metadata
colMeta_df <- data.frame(Treatment = c(rep("A", 3), rep("B", 3)),
                      Replicate = c(rep(1:3, 2)),
                      stringsAsFactors = F, 
                      row.names = colnames(data_mat))

### Row metadata
rowMeta_df <- data.frame(Sig = rep("No", 50), 
                      stringsAsFactors = F,
                      row.names = rownames(data_mat))
for (gene_v in sigGenes_v) rowMeta_df[rownames(rowMeta_df) == gene_v, "Sig"] <- gene_v

### Heatmap
heat <- pheatmap(data_mat,
                 annotation_row = rowMeta_df,
                 annotation_col = colMeta_df,
                 annotation_colors = annColors_lsv,
                 cellwidth = 10,
                 main = "Original Heat")

### Get order of genes after clustering
genesInHeatOrder_v <- heat$tree_row$labels[heat$tree_row$order]
whichSigInHeatOrder_v <- which(genesInHeatOrder_v %in% sigGenes_v)
whichSigInHeatOrderLabels_v <- genesInHeatOrder_v[whichSigInHeatOrder_v]

sigY <- 1 - (0.02 * whichSigInHeatOrder_v)

### Change title
whichMainGrob_v <- which(heat$gtable$layout$name == "main")
heat$gtable$grobs[[whichMainGrob_v]] <- textGrob(label = newMain_v, 
                                                 gp = gpar(fontsize = 16))

### Remove rows
whichRowGrob_v <- which(heat$gtable$layout$name == "row_names")
heat$gtable$grobs[[whichRowGrob_v]] <- textGrob(label = whichSigInHeatOrderLabels_v,
                                                y = sigY,
                                                vjust = 1)
grid.newpage()
grid.draw(heat)

Here are a few outputs:

original heatmap: original heatmap

ok labels: ok labels

ok labels, with flags: ok labels, with flags

too many labels too many labels

too many labels, with flags too many labels, with flags

The "with flags" outputs are the desired final results.
I just saved these as images from the Rstudio plot viewer. I recognize that I could save them as pdfs and provide a larger file size to get rid of the label overlap, but then the individual cells would be larger than I want.

Trappings answered 1/10, 2018 at 21:32 Comment(2)
While the solution below is amazing and totally works, I thought it would be helpful to note that the ComplexHeatmap package natively supports partial row labeling.Aurar
How can you achieve partial row labelling in ComplexHeatmap? Any link/example please?Recce
S
7

Based on your code, you seem fairly comfortable with gtables & grobs. A (relatively) straightforward way to achieve the look you want is to zoom in on the row label grob, & make some changes there:

  1. replace unwanted labels with "";
  2. evenly spread out labels within the available space;
  3. add line segments joining the old and new label positions.

I wrote a wrapper function for this, which works as follows:

# heat refers to the original heatmap produced from the pheatmap() function
# kept.labels should be a vector of labels you wish to show
# repel.degree is a number in the range [0, 1], controlling how much the
# labels are spread out from one another

add.flag(heat,
         kept.labels = sigGenes_v,
         repel.degree = 0)

add.flag(heat,
         kept.labels = sigGenes_v,
         repel.degree = 0.5)

add.flag(heat,
         kept.labels = sigGenes_v,
         repel.degree = 1)

plot

Function (explanations in annotations):

add.flag <- function(pheatmap,
                     kept.labels,
                     repel.degree) {

  # repel.degree = number within [0, 1], which controls how much 
  #                space to allocate for repelling labels.
  ## repel.degree = 0: spread out labels over existing range of kept labels
  ## repel.degree = 1: spread out labels over the full y-axis

  heatmap <- pheatmap$gtable

  new.label <- heatmap$grobs[[which(heatmap$layout$name == "row_names")]] 

  # keep only labels in kept.labels, replace the rest with ""
  new.label$label <- ifelse(new.label$label %in% kept.labels, 
                            new.label$label, "")

  # calculate evenly spaced out y-axis positions
  repelled.y <- function(d, d.select, k = repel.degree){
    # d = vector of distances for labels
    # d.select = vector of T/F for which labels are significant

    # recursive function to get current label positions
    # (note the unit is "npc" for all components of each distance)
    strip.npc <- function(dd){
      if(!"unit.arithmetic" %in% class(dd)) {
        return(as.numeric(dd))
      }

      d1 <- strip.npc(dd$arg1)
      d2 <- strip.npc(dd$arg2)
      fn <- dd$fname
      return(lazyeval::lazy_eval(paste(d1, fn, d2)))
    }

    full.range <- sapply(seq_along(d), function(i) strip.npc(d[i]))
    selected.range <- sapply(seq_along(d[d.select]), function(i) strip.npc(d[d.select][i]))

    return(unit(seq(from = max(selected.range) + k*(max(full.range) - max(selected.range)),
                    to = min(selected.range) - k*(min(selected.range) - min(full.range)), 
                    length.out = sum(d.select)), 
                "npc"))
  }
  new.y.positions <- repelled.y(new.label$y,
                                d.select = new.label$label != "")
  new.flag <- segmentsGrob(x0 = new.label$x,
                           x1 = new.label$x + unit(0.15, "npc"),
                           y0 = new.label$y[new.label$label != ""],
                           y1 = new.y.positions)

  # shift position for selected labels
  new.label$x <- new.label$x + unit(0.2, "npc")
  new.label$y[new.label$label != ""] <- new.y.positions

  # add flag to heatmap
  heatmap <- gtable::gtable_add_grob(x = heatmap,
                                   grobs = new.flag,
                                   t = 4, 
                                   l = 4
  )

  # replace label positions in heatmap
  heatmap$grobs[[which(heatmap$layout$name == "row_names")]] <- new.label

  # plot result
  grid.newpage()
  grid.draw(heatmap)

  # return a copy of the heatmap invisibly
  invisible(heatmap)
}
Soemba answered 2/10, 2018 at 7:15 Comment(5)
Wow! Thanks so much, this is perfect. I have a passing familiarity with gtables and grobs, but it looks like I'll have to do some more learning because this is pretty powerful. Any favorite grid graphics resources you would recommend? Thanks again.Trappings
Hello, I am trying to use this function but I am running into an error: 'Error: Cannot create zero-length unit vector ("unit" subsetting) ' This error happens at this line of code: repelled.y(new.label$y, d.select = new.label$label != "") I am not sure if this has to do with pheatmap being updated or something, but any help would be greatly appreciatedAurar
@MaxF I just tried the code again (R 4.0.1, pheatmap 1.0.12), & couldn't trigger the error you encountered. Can you describe your package versions?Soemba
@Soemba -- I updated all packages and now it is working. Thank you so much for your timeAurar
Hi @Soemba I have a similar/related issue with pheatmap, could you please have a look and see if you can help me? thanks a lot #66893832G

© 2022 - 2024 — McMap. All rights reserved.