Multiply values across each column by weight in another data.frame in R
Asked Answered
C

5

6

I have two data.frames: df and weights (code below).

df looks like this:

    id             a           b           d           EE           f
 1  this           0.23421153  -0.02324956 0.5457353   0.73068586   0.5642554
 2  is             0.28378641  0.36346241  1.0190496   1.97715019   -1.190179
 3  an             -0.04372133  0.07412557  -0.4510299  1.8442713   -0.5301328
 4  example        -0.18139887  0.10404329  -1.0871962  1.46920108  0.4685703
 5  data.frame     0.24235498   -0.1501064  -1.1183967  0.22884407  0.4272259
 6  for             -0.72718239 0.16337997  1.2635683   0.44206945  0.7303647
 7  stackoverflow  0.25203942   -0.1772715  -0.3371532  -0.29167792 -0.7065494
 8  please         -0.11047364  -0.06631552 0.4342659   -1.49584522 0.2837016
 9  help           -0.1136639   0.22414253  0.4284864   1.59096047  0.2915938
 10 me             -0.3677288   0.05974474  -0.1136177  0.02322094  -0.6533994

How can I multiply the values in each of the columns by the corresponding weights in the weights data.frame?

Expected result:

    id            a             b           d            EE   f
 1  this          0.749476896   -0.1743717  5.29363241   NA   4.17548996
 2  is            0.908116512   2.725968075 9.88478112   NA   -8.8073246
 3  an            -0.139908256  0.555941775 -4.37499003  NA   -3.92298272
 4  example       -0.580476384  0.780324675 -10.54580314 NA   3.46742022
 5  data.frame    0.775535936   -1.125798   -10.84844799 NA   3.16147166
 6  for           -2.326983648  1.225349775 12.25661251  NA   5.40469878
 7  stackoverflow 0.806526144   -1.32953625 -3.27038604  NA   -5.22846556
 8  please        -0.353515648  -0.4973664  4.21237923   NA   2.09939184
 9  help          -0.36372448   1.681068975 4.15631808   NA   2.15779412
 10 me            -1.17673216   0.44808555  -1.10209169  NA   -4.83515556

Code:

 set.seed(12345)
 df <- data.frame(id=c("this", "is", "an", "example", "data.frame", "for", 
      "stackoverflow", "please", "help", "me"), a=rnorm(10,0,0.4), b=rnorm(10,0,0.2), 
      d=rnorm(10,0,0.7), EE=rnorm(10,0,0.9), f=rnorm(10,0,0.5))
 weights <- data.frame(V1=as.numeric(c("3.2", "7.5", "2.2", "9.7", "5.4", "7.4", "2.1", 
      "5.0", "3.3", "7.6", "3.6", "7.7", "7.1", "3.3", "9.8", "9.2", "2.5", "6.2", "4.1", "8.7", 
      "3.3", "9.3", "8.3")))
 rownames(weights) <- paste(letters[1:23])
Canaday answered 18/7, 2021 at 5:56 Comment(0)
K
5

You can use sweep and match -

df[-1] <- sweep(df[-1],2, weights$V1[match(names(df[-1]),rownames(weights))],`*`)

df
#              id          a          b          d EE         f
#1           this  0.7494769 -0.1743717   5.293633 NA  4.175490
#2             is  0.9081165  2.7259681   9.884781 NA -8.807325
#3             an -0.1399082  0.5559418  -4.374990 NA -3.922983
#4        example -0.5804764  0.7803247 -10.545803 NA  3.467420
#5     data.frame  0.7755359 -1.1257980 -10.848448 NA  3.161471
#6            for -2.3269836  1.2253498  12.256612 NA  5.404699
#7  stackoverflow  0.8065261 -1.3295363  -3.270386 NA -5.228465
#8         please -0.3535157 -0.4973664   4.212379 NA  2.099392
#9           help -0.3637245  1.6810690   4.156319 NA  2.157794
#10            me -1.1767322  0.4480855  -1.102092 NA -4.835156
Kozak answered 18/7, 2021 at 6:21 Comment(0)
C
4

Here is an alternative solution:

  1. Create column of rownames called Names in weights
  2. match column names from df with column Names of weights dataframe
  3. multiply with the corresponding V1 based on the numeric index by replicating the weights$V1
  4. Finally cbind to get id in df, which was removed by df[-1] for calculation purposes.
library(tibble)
# rownames to column
weights <- weights %>% 
  rownames_to_column(var = "Names")

df_result <- df[-1]*weights$V1[match(names(df[-1]), weights$Names)][col(df[-1])]
df_result <- cbind(id=df$id, df_result)
df_result

Output:

id          a          b          d EE         f
1           this  0.7494769 -0.1743717   5.293633 NA  4.175490
2             is  0.9081165  2.7259681   9.884781 NA -8.807325
3             an -0.1399082  0.5559418  -4.374990 NA -3.922983
4        example -0.5804764  0.7803247 -10.545803 NA  3.467420
5     data.frame  0.7755359 -1.1257980 -10.848448 NA  3.161471
6            for -2.3269836  1.2253498  12.256612 NA  5.404699
7  stackoverflow  0.8065261 -1.3295363  -3.270386 NA -5.228465
8         please -0.3535157 -0.4973664   4.212379 NA  2.099392
9           help -0.3637245  1.6810690   4.156319 NA  2.157794
10            me -1.1767322  0.4480855  -1.102092 NA -4.835156
Clueless answered 18/7, 2021 at 8:29 Comment(0)
F
3

Using lapply. Should be much faster than matching and sweeping.

df[-1] <- lapply(names(df)[-1], \(x) df[, x]*weights[x, ])
#               id          a          b          d EE         f
# 1           this  0.7494769 -0.1743717   5.293633 NA  4.175490
# 2             is  0.9081165  2.7259681   9.884781 NA -8.807325
# 3             an -0.1399082  0.5559418  -4.374990 NA -3.922983
# 4        example -0.5804764  0.7803247 -10.545803 NA  3.467420
# 5     data.frame  0.7755359 -1.1257980 -10.848448 NA  3.161471
# 6            for -2.3269836  1.2253498  12.256612 NA  5.404699
# 7  stackoverflow  0.8065261 -1.3295363  -3.270386 NA -5.228465
# 8         please -0.3535157 -0.4973664   4.212379 NA  2.099392
# 9           help -0.3637245  1.6810690   4.156319 NA  2.157794
# 10            me -1.1767322  0.4480855  -1.102092 NA -4.835156
Faeroese answered 18/7, 2021 at 15:16 Comment(0)
Y
3

With tidyverse, we can loop across the columns, get the corresponding column name (cur_column()), use that to subset the 'weight's data by specifying the row/column, names and multiply

library(dplyr)
df %>%
    mutate(across(where(is.numeric), ~ . * weights[cur_column(), 'V1']))
              id          a          b          d EE         f
1           this  0.7494769 -0.1743717   5.293633 NA  4.175490
2             is  0.9081165  2.7259681   9.884781 NA -8.807325
3             an -0.1399082  0.5559418  -4.374990 NA -3.922983
4        example -0.5804764  0.7803247 -10.545803 NA  3.467420
5     data.frame  0.7755359 -1.1257980 -10.848448 NA  3.161471
6            for -2.3269836  1.2253498  12.256612 NA  5.404699
7  stackoverflow  0.8065261 -1.3295363  -3.270386 NA -5.228465
8         please -0.3535157 -0.4973664   4.212379 NA  2.099392
9           help -0.3637245  1.6810690   4.156319 NA  2.157794
10            me -1.1767322  0.4480855  -1.102092 NA -4.835156
Yellowtail answered 18/7, 2021 at 20:10 Comment(6)
This is excellent. I tried all day to make it with across. And I was sure that it is possible. Thank you for this solution akrun!Clueless
Dear @akrun, many thanks for the answer. But, when I tried to run it again, I have the error message that variable 'V1' doesn't exists...¿Do you know what 'V1' is used for? Thanks!Crazed
@AntonioCanepa if you check the OP's data, V1 is a column in 'weights' i.e. weights[, 'V1']# [1] 3.2 7.5 2.2 9.7 5.4 7.4 2.1 5.0 3.3 7.6 3.6 7.7 7.1 3.3 9.8 9.2 2.5 6.2 4.1 8.7 3.3 9.3 8.3Yellowtail
@akrun, thanks for your tip. However, what about if you need to multiply (or operate) over more than one column? I mean, I have 7 variables in one data frame and other 7 variables in the other data frame; if I need to, let's say, multiply, each value of the column one from DF1 with the value of column 1 from the DF2, and so on with the rest of the 7 variables, should I just left df1 %>% mutate(across(where(is.numeric), ~ . * df2[cur_column()])) ?....thanks!Crazed
@Antonio Canepa. Best would be to start a followup question referring to the first question.Clueless
@AntonioCanepa if there are corresponding column names, that code should workYellowtail
D
1

You can try the base R code below

df[-1] <- df[-1] * weights[names(df)[-1], "V1"][col(df[-1])]

which gives

> df
              id          a          b          d EE         f
1           this  0.7494769 -0.1743717   5.293633 NA  4.175490
2             is  0.9081165  2.7259681   9.884781 NA -8.807325
3             an -0.1399082  0.5559418  -4.374990 NA -3.922983
4        example -0.5804764  0.7803247 -10.545803 NA  3.467420
5     data.frame  0.7755359 -1.1257980 -10.848448 NA  3.161471
6            for -2.3269836  1.2253498  12.256612 NA  5.404699
7  stackoverflow  0.8065261 -1.3295363  -3.270386 NA -5.228465
8         please -0.3535157 -0.4973664   4.212379 NA  2.099392
9           help -0.3637245  1.6810690   4.156319 NA  2.157794
10            me -1.1767322  0.4480855  -1.102092 NA -4.835156
Dougie answered 18/7, 2021 at 20:18 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.