calculate row sum and product in data.frame
Asked Answered
I

6

15

I would like to append a columns to my data.frame in R that contain row sums and products Consider following data frame

x    y     z
1    2     3
2    3     4
5    1     2

I want to get the following

x    y     z    sum    prod
1    2     3    6       6  
2    3     4    9       24 
5    1     2    8       10

I have tried

 sum = apply(ages,1,add)

but it gives me a row vector. Can some one please show me an efficient command to sum and product and append them to original data frame as shown above?

Integrity answered 24/11, 2014 at 10:49 Comment(0)
P
26

Try

 transform(df, sum=rowSums(df), prod=x*y*z)
 #  x y z sum prod
 #1 1 2 3   6    6
 #2 2 3 4   9   24
 #3 5 1 2   8   10

Or

 transform(df, sum=rowSums(df), prod=Reduce(`*`, df))
 #   x y z sum prod
 #1 1 2 3   6    6
 #2 2 3 4   9   24
 #3 5 1 2   8   10

Another option would be to use rowProds from matrixStats

 library(matrixStats)
 transform(df, sum=rowSums(df), prod=rowProds(as.matrix(df)))

If you are using apply

 df[,c('sum', 'prod')] <-  t(apply(df, 1, FUN=function(x) c(sum(x), prod(x))))
 df
 #  x y z sum prod
 #1 1 2 3   6    6
 #2 2 3 4   9   24
 #3 5 1 2   8   10
Playroom answered 24/11, 2014 at 10:51 Comment(2)
Can you please advise me what Reduce(`*`, df) does?Integrity
@Khurram Majeed It is similar to Reduce(*,as.list(df)). As data.frame is a list, you don't need to use as.list. it multiplies each element in the correspoinding positions i.e. 1st row, 2nd row, etc.Playroom
H
7

Another approach.

require(data.table)

# Create data
dt <- data.table(x = c(1,2,5), y = c(2,3,1), z = c(3,4,2))

# Create index
dt[, i := .I]

# Compute sum and prod
dt[, sum := sum(x, y, z), by = i]
dt[, prod := prod(x, y, z), by = i]
dt


# Compute sum and prod using .SD
dt[, c("sum", "prod") := NULL]
dt
dt[, sum := sum(.SD), by = i, .SDcols = c("x", "y", "z")]
dt[, prod := prod(.SD), by = i, .SDcols = c("x", "y", "z")]
dt


# Compute sum and prod using .SD and list
dt[, c("sum", "prod") := NULL]
dt
dt[, c("sum", "prod") := list(sum(.SD), prod(.SD)), by = i,
   .SDcols = c("x", "y", "z")]
dt


# Compute sum and prod using .SD and lapply
dt[, c("sum", "prod") := NULL]
dt
dt[, c("sum", "prod") := lapply(list(sum, prod), do.call, .SD), by = i,
   .SDcols = c("x", "y", "z")]
dt
Hexosan answered 24/11, 2014 at 10:57 Comment(6)
You should demonstrate how to use lapply in conjunction with .SD and .SDcols.Aruabea
@Roland, don't think there is a need in lapply or .SDcols here, he should have just done something like setDT(df)[, ":="(sum = rowSums(.SD), prod = Reduce("*", .SD))]Fabien
I have added solution with .SD. I do not see how lapply could help here.Hexosan
@Roland, I have added lapply solution. Could be useful in some cases but I am considering it to be ugly.Hexosan
Why would I need to create an index in your first approach?Timbuktu
@Chrissl, because without index you will get sum and product of all rows and columns. This is not what is needed here.Hexosan
F
4

Following can also be done but column names need to be entered:

ddf$sum = with(ddf, x+y+z)
ddf$prod = with(ddf, x*y*z)
ddf
  x y z sum prod
1 1 2 3   6    6
2 2 3 4   9   24
3 5 1 2   8   10

With data.table, another form can be:

library(data.table)    
cbind(dt, dt[,list(sum=x+y+z, product=x*y*z),])
   x y z sum product
1: 1 2 3   6       6
2: 2 3 4   9      24
3: 5 1 2   8      10

A simpler version is suggested by @David Arenberg in comments:

dt[, ":="(sum = x+y+z, product = x*y*z)]
Francophobe answered 24/11, 2014 at 11:45 Comment(2)
I wonder if you could just assign by reference instead of using cbind... Something like dt[, ":="(sum = x+y+z, product = x*y*z)] maybe?Fabien
I was not aware of this simpler version.Francophobe
O
3

Only a partial answer, but if all values are greater than or equal to 0, rowSums/rowsum can be used to calculate products:

df <- data.frame(x = c(1, 2, 5), y = c(2, 3, 1), z = c(3, 4, 2))

# custom row-product-function
my_rowprod <- function(x) exp(rowSums(log(x)))

df$prod <- my_rowprod(df)
df

The generic version is (including negatives):

my_rowprod_2 <- function(x) {
  sign <- ifelse((rowSums(x < 0) %% 2) == 1, -1, 1)
  prod <- exp(rowSums(log(abs(x)))) * sign
  prod
}
df$prod <- my_rowprod_2(df)
df
Osher answered 25/7, 2018 at 12:44 Comment(0)
A
0

Here is a quick way to do it using base R.

# create your example data frame
df <- data.frame(x=c(1,2,5), y=c(2,3,1), z=c(3,4,1))

# add the sum column
df$sum <- apply(df, MARGIN = 1, sum)

# add the product column (don't include the sum column from the last operation)
df$prod <- apply(df[,-4], MARGIN = 1, prod)

# print result to console
df

  x y z sum prod
1 1 2 3   6    6
2 2 3 4   9   24
3 5 1 1   7    5
Anteroom answered 29/3, 2024 at 0:53 Comment(0)
H
0

With base R, you can use Reduce like below

cbind(
    df,
    list2DF(
        lapply(
            c(sum = `+`, prod = `*`),
            \(op) Reduce(op, df)
        )
    )
)

which gives

  x y z sum prod
1 1 2 3   6    6
2 2 3 4   9   24
3 5 1 2   8   10

Data

df <- data.frame(x = c(1, 2, 5), y = c(2, 3, 1), z = c(3, 4, 2))
Hydatid answered 29/3, 2024 at 9:14 Comment(0)

© 2022 - 2025 — McMap. All rights reserved.