Creating grouped bar-plot of multi-column data in R
E

5

15

I have the following data

       Input Rtime Rcost Rsolutions  Btime Bcost 
1   12 proc.     1    36     614425     40    36 
2   15 proc.     1    51     534037     50    51 
3    18-proc     5    62    1843820     66    66 
4    20-proc     4    68    1645581 104400    73 
5 20-proc(l)     4    64    1658509  14400    65 
6    21-proc    10    78    3923623 453600    82 

I want to create a grouped bar chart from this data such that the x-axis shows the Input field (as groups) and the y-axis shows the Rtime and Btime fields (the two bars in each group) on a log scale.

All the solutions and examples I found online had similar data in a three-column layout. I do not know how to use the data I have to generate the grouped bar chart. Alternatively, is there a way to convert this data into a format compatible with R and ggplot? Converting it manually is not an option because it is a huge file with many rows.

Edit:

[Image: graph generated using gncs's solution]

Evolution answered 18/4, 2012 at 14:55 Comment(0)
C
39

As requested, a ggplot2 solution that also uses reshape2:

library(reshape2)
library(ggplot2)

df <- read.table(text = "       Input Rtime Rcost Rsolutions  Btime Bcost 
1   12-proc.     1    36     614425     40    36 
2   15-proc.     1    51     534037     50    51 
3    18-proc     5    62    1843820     66    66 
4    20-proc     4    68    1645581 104400    73 
5 20-proc(l)     4    64    1658509  14400    65 
6    21-proc    10    78    3923623 453600    82",header = TRUE,sep = "")

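# Keep only the columns to plot and reshape to long format:
# one row per (Input, variable) pair, with the number in `value`
dfm <- melt(df[, c('Input', 'Rtime', 'Btime')], id.vars = 1)

# stat = "identity" plots the values as given; "dodge" puts the bars side by side
ggplot(dfm, aes(x = Input, y = value)) + 
    geom_bar(aes(fill = variable), stat = "identity", position = "dodge") + 
    scale_y_log10()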


Note a style difference here: because log(1) = 0, ggplot2 treats a value of 1 as a bar of zero height and doesn't draw anything, whereas barplot draws a little stub (which in my opinion is a little misleading).
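The zero-height bars can be checked directly. Below is a minimal base-R sketch, assuming the Rtime and Btime values from the question; the vector names are only for illustration.

```r
# Rtime and Btime values copied from the question's data
rtime <- c(1, 1, 5, 4, 4, 10)
btime <- c(40, 50, 66, 104400, 14400, 453600)

# On a log10 axis the bar height corresponds to log10(value),
# so a value of 1 gives a height of log10(1) = 0.
log_rtime <- log10(rtime)
log_btime <- log10(btime)

# Which Rtime entries end up with zero height on the log scale?
zero_height <- which(log_rtime == 0)
print(zero_height)
```

Only the first two inputs have an Rtime of 1, so those are the two bars that appear to be missing in the ggplot2 version.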

Crayon answered 18/4, 2012 at 16:42 Comment(6)
Awesome. I wish I knew this before writing the stupid python script {Python is good though!} Thanks a lot joran - Evolution
Worth noting that melt is in the package reshape2 - Galvin
Also, needed to add stat = "identity" into geom_bar as it instead defaults to stat = "bin" - Galvin
@Serenthia Thanks, yes, this sort of thing is a recurring problem with my old ggplot2 answers which constantly need to be updated as new versions come out. - Crayon
This example does not run on R 3.3.3. It produces this error message: "Error: stat_count() must not be used with a y aesthetic." - Raconteur
Is it possible to add extra columns on top, that are less wide and have an alpha value of 0.5? - Starla
P
9

As requested, a ggplot2 solution that uses pivot_longer() (https://tidyr.tidyverse.org/reference/pivot_longer.html) to reshape the data into the long format that geom_bar() or geom_col() can plot directly. position = "dodge" places the bars side by side within each group instead of stacking them. geom_bar(stat = "identity") is equivalent to geom_col().

Update with cleaner code:

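library(tidyverse)

# df is the data frame created with read.table() in the original answer below.
# pivot_longer(-Input) reshapes every column except Input, so all five
# measures are plotted; the column names end up in `name`.
df %>% 
  pivot_longer(-Input) %>% 
  ggplot(aes(x = Input, y = value, fill = name)) + 
  geom_col(position = "dodge") + 
  # geom_bar(stat = "identity", position = "dodge") + 
  scale_y_log10()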


Original answer:

library(tidyr)    # pivot_longer() comes from tidyr, not dplyr
library(ggplot2)

df <- read.table(text = "Input Rtime Rcost Rsolutions  Btime Bcost 
                         1   12-proc.     1    36     614425     40    36 
                         2   15-proc.     1    51     534037     50    51 
                         3    18-proc     5    62    1843820     66    66 
                         4    20-proc     4    68    1645581 104400    73 
                         5 20-proc(l)     4    64    1658509  14400    65 
                         6    21-proc    10    78    3923623 453600    82", 
                         header = TRUE, sep = "")

dfm <- pivot_longer(df, -Input, names_to="variable", values_to="value")

## pivot_longer takes the input data frame, excludes the Input field from the transformation, turns the remaining column names into the variable "variable" (often called the "key"), and assigns the values to the variable "value". 

ggplot(dfm, aes(x = Input,y = value, fill = variable)) + 
    geom_bar(stat = "identity", position = "dodge") + 
    scale_y_log10()
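
The question asks for only two bars per group (Rtime and Btime), while the code above reshapes every column except Input. Below is a minimal sketch that restricts the reshape to those two columns; it assumes tidyr and ggplot2 are installed and types in just the three relevant columns from the question's data.

```r
library(tidyr)
library(ggplot2)

# The three relevant columns from the question, in wide format
df <- data.frame(
  Input = c("12-proc.", "15-proc.", "18-proc", "20-proc", "20-proc(l)", "21-proc"),
  Rtime = c(1, 1, 5, 4, 4, 10),
  Btime = c(40, 50, 66, 104400, 14400, 453600)
)

# Pivot only the two timing columns; Input stays as the identifier
dfm <- pivot_longer(df, c(Rtime, Btime), names_to = "variable", values_to = "value")

# 6 inputs x 2 variables = 12 rows in long format
print(dim(dfm))

# Build the grouped bar chart; call print(p) to draw it
p <- ggplot(dfm, aes(x = Input, y = value, fill = variable)) +
  geom_col(position = "dodge") +
  scale_y_log10()
```

Selecting the columns inside pivot_longer() avoids subsetting the data frame first, which matters little here but keeps the pipeline short on a wide file.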
Penknife answered 27/1, 2021 at 16:35 Comment(0)
S
7

I think I understand the problem, and this is what I would suggest (a quick option using base graphics):

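data <- read.table("data.txt", header = TRUE)
# One row per series (Rtime, Btime), one column per Input group
times <- t(data.frame(data$Rtime, data$Btime))
# beside = TRUE draws the series side by side; log = "y" gives a log y-axis
barplot(times, legend.text = c("Rtime", "Btime"), names.arg = data$Input, log = "y", beside = TRUE)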

Is that what you want? It is kind of dirty, but it does the job.
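
For reference, barplot() with beside = TRUE expects a matrix with one row per series and one column per group. Below is a minimal self-contained sketch of that layout, assuming the Rtime and Btime values from the question are typed in directly rather than read from data.txt; the variable names are only for illustration.

```r
# Values copied from the question's data
inputs <- c("12-proc.", "15-proc.", "18-proc", "20-proc", "20-proc(l)", "21-proc")
rtime  <- c(1, 1, 5, 4, 4, 10)
btime  <- c(40, 50, 66, 104400, 14400, 453600)

# rbind() builds a 2 x 6 matrix: one row per series, one column per group
times <- rbind(Rtime = rtime, Btime = btime)
colnames(times) <- inputs
print(dim(times))

# Column names label the groups; legend.text = TRUE takes the legend
# entries from the row names of the matrix
mids <- barplot(times, beside = TRUE, log = "y", legend.text = TRUE)
```

Naming the rows and columns of the matrix means the group labels and legend do not have to be passed separately.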

Update: code corrected.

Singsong answered 18/4, 2012 at 15:37 Comment(1)
You are the man! Thanks a lot. Also, do you know how to do this using ggplot? - Evolution
C
2

joran's answer helped me a lot, but I had to add stat="identity" to the geom_bar() call, like this:

ggplot(dfm, aes(x = Input,y = value)) + 
geom_bar(aes(fill = variable), position = "dodge", stat="identity") + 
scale_y_log10()

My version of R is 3.2.2 and ggplot2 version 1.0.1

Thanks.

Crochet answered 14/2, 2016 at 18:32 Comment(0)
Q
0

Following up on: Fusion Multiple ggplots graphs

The data frame d1:

id <- c(1, 2, 3, 3, 4, 5, 6, 7, 8, 9, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19)
observations <- c("vmbis", "vmbis", "vmbis", "vmbis", "vmbis", "vmbis", "vmbis", "vmbis", "vmbis", "vmbis", "vmbis", "vmbis", "vmbis", "vmbis", "vmbis", "vmbis", "vmbis", "vmbis", "vmbis", "vmbis", "vmbis")
reponse <- c("18_vsys", "18_vsys", "20_vsys", "23_vsys", "15_vsys", "14_vsys", "17_vsys", "14_vsys", "17_vsys", "17_vsys", "23_vsys", "24_vsys", "24_vsys", 
"17_vsys", "14_vsys", "16_vsys", "12_vsys", "12_vsys", "14_vsys", "14_vsys", "18_vsys")
inf_palu2 <- c("2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2")
inf_sympto2 <- c("1", "1", "0", "1", "1", "1", "1", "1", "0", "1", "1", "1", "0", "0", "1", "1", "1", "1", "1", 
"0", "1")

d1 <- data.frame(id, observations, reponse, inf_palu2, inf_sympto2)

Code for the graph:

library(dplyr)
library(tidyr)
library(ggplot2)

df <- d1 |>
  pivot_longer(-c(id, reponse))

ggplot() + 
  geom_bar(data = df, aes(x = reponse, fill = name),  colour = "#006ddb", position = "dodge") +
    scale_fill_manual(values = c("#DF536B", "#F0E442","#E69F00" )) +
    theme(axis.text.x = element_text(angle = 90, vjust = 1)) +
    ggtitle("Nb observations, infections palustres, infections symptomatiques / visite")


Quinlan answered 1/6 at 9:53 Comment(0)
