Reverse Scoring Items
A

7

15

I have a survey of about 80 items. Most are positively valenced (higher scores indicate a better outcome), but about 20 are negatively valenced, and I need to reverse score those in R. I am completely lost on how to do so. I am definitely an R beginner, and this is probably a dumb question, but could someone point me in a direction code-wise?

Americium answered 12/11, 2014 at 1:51 Comment(4)
what do you mean by valenced?Longs
I just mean that higher scores indicate a better outcome (e.g. a 1-5 Likert scale where "5" = strongly agree with a statement). But for some items a "1" on the Likert scale indicates strongly agree instead (this would be negatively valenced). I reverse score them so that for each item a higher score indicates stronger agreement. @ScottChamberlainAmericium
I meant to say "I need to" reverse score them.Americium
Newcomers to this question may wish to scroll down a bit to a tidyverse based solutionRamsgate
F
18

Here's an example with some fake data that you can adapt to your data:

# Fake data: Three questions answered on a 1 to 5 scale
set.seed(1)
dat = data.frame(Q1=sample(1:5,10,replace=TRUE), 
                 Q2=sample(1:5,10,replace=TRUE),
                 Q3=sample(1:5,10,replace=TRUE))

dat
   Q1 Q2 Q3
1   2  2  5
2   2  1  2
3   3  4  4
4   5  2  1
5   2  4  2
6   5  3  2
7   5  4  1
8   4  5  2
9   4  2  5
10  1  4  2

# Say you want to reverse questions Q1 and Q3
cols = c("Q1", "Q3")

dat[ ,cols] = 6 - dat[ ,cols]

dat
   Q1 Q2 Q3
1   4  2  1
2   4  1  4
3   3  4  2
4   1  2  5
5   4  4  4
6   1  3  4
7   1  4  5
8   2  5  4
9   2  2  1
10  5  4  4

If you have a lot of columns, you can use tidyverse functions to select multiple columns to recode in a single operation.

library(tidyverse)

# Reverse code columns Q1 and Q3 (the $ anchor keeps Q10, Q13, Q30, ... from matching)
dat %>% mutate(across(matches("^Q[13]$"), ~ 6 - .))

# Reverse code all columns named Q followed by one or two digits
dat %>% mutate(across(matches("^Q[0-9]{1,2}$"), ~ 6 - .))

# Reverse code columns Q11 through Q20 (assumes your data has these columns)
dat %>% mutate(across(Q11:Q20, ~ 6 - .))

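If different columns could have different maximum values, you can (adapting @HellowWorld's suggestion) customize the reverse-coding to the maximum value of each column. This assumes every scale starts at 1 and that the highest category was chosen at least once in each column:

# Reverse code columns Q11 through Q20 using each column's observed maximum
dat %>% mutate(across(Q11:Q20, ~ max(., na.rm = TRUE) + 1 - .))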
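
If the scale bounds are known in advance, it may be safer to reverse against them than against the observed maximum, since a column where nobody chose the top category would otherwise be reversed against the wrong value. Here is a minimal base R sketch. The helper name reverse_score and the 1 to 5 default bounds are assumptions for illustration, not part of any package:

```r
# Hypothetical helper (not from any package): reverse x on a scale from lo to hi
reverse_score <- function(x, lo = 1, hi = 5) {
  # Fail early if a response falls outside the declared scale (NAs are allowed)
  stopifnot(all(x >= lo & x <= hi, na.rm = TRUE))
  lo + hi - x
}

# Small example data; NA marks a skipped item
dat_example <- data.frame(Q1 = c(1, 2, NA, 5),
                          Q2 = c(3, 3, 4, 1),
                          Q3 = c(5, 4, 2, 2))

# Reverse only the negatively worded items
rev_cols <- c("Q1", "Q3")
dat_example[rev_cols] <- lapply(dat_example[rev_cols], reverse_score)
dat_example
```

Using lo + hi - x rather than hi + 1 - x means the same helper also works for scales that do not start at 1, for example 0 to 4.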
Faria answered 12/11, 2014 at 2:23 Comment(3)
Thanks a lot, this works and is something I can actually understand. @FariaAmericium
for anyone finding this in the future the psych package has a function called reverse.code() which does this.Sericin
replacing 6 by 'max(dat[, cols]) + 1' extends the code to other situations and prevents loading a library for one function callUntouchable
M
9

Here is an alternative approach using the psych package. If you are working with survey data this package has lots of good functions. Building on @eipi10 data:

# Fake data: Three questions answered on a 1 to 5 scale
set.seed(1)
original_data = data.frame(Q1=sample(1:5,10,replace=TRUE), 
                 Q2=sample(1:5,10,replace=TRUE),
                 Q3=sample(1:5,10,replace=TRUE))
original_data

# Say you want to reverse questions Q1 and Q3. Set those keys to -1 and Q2 to 1.
# install.packages("psych") # Uncomment this if you haven't installed the psych package
library(psych)
keys <- c(-1,1,-1)

# Use the handy function from the psych package
# mini is the minimum value and maxi is the maximum value
# mini and maxi can also be vectors if you have different scales
new_data <- reverse.code(keys,original_data,mini=1,maxi=5)
new_data

The advantage of this approach is that you can recode your entire survey in one function call. The disadvantage is that it requires an extra package, and the base R approach is arguably more elegant.

FYI, this is my first post on stack overflow. Long time listener, first time caller. So please give me feedback on my response.

Motherwort answered 24/1, 2017 at 3:1 Comment(0)
H
8

Just converting @eipi10's answer using tidyverse:

# Create same fake data: Three questions answered on a 1 to 5 scale
set.seed(1)
dat <- data.frame(Q1 = sample(1:5,10, replace=TRUE), 
                  Q2 = sample(1:5,10, replace=TRUE),
                  Q3 = sample(1:5,10, replace=TRUE))

# Reverse scores in the desired columns (Q2 and Q3),
# keeping the originals and adding new columns
library(dplyr)

dat <- dat %>% 
  mutate(Q2Reversed = 6 - Q2,
         Q3Reversed = 6 - Q3)
Halliday answered 31/3, 2020 at 9:55 Comment(0)
H
4

Another option is recode() from the car package.

# Example data
data = data.frame(Q1=sample(1:5,10, replace=TRUE))

# Say you want to reverse question Q1
library(car)
# car::recode is called explicitly because dplyr also exports a recode() function
data$Q1reversed <- car::recode(data$Q1, "1=5; 2=4; 3=3; 4=2; 5=1")
data
Hapless answered 6/6, 2018 at 12:11 Comment(0)
A
1

The psych package has the intuitive reverse.code() function that can be helpful. Using the dataset started by @eipi10 and the same goal of reversing q1 and q3:

set.seed(1)
dat <- data.frame(q1 =sample(1:5,10,replace=TRUE), 
                 q2=sample(1:5,10,replace=TRUE),
                 q3 =sample(1:5,10,replace=TRUE))

You can use the reverse.code() function. The first argument is the keys. This is a vector of 1 and -1. -1 means that you want to reverse that item. These go in the same order as your data.

The second argument, called items, is simply the name of your dataset. That is, where are these items located?

Last, the mini and maxi arguments are the smallest and largest values that a participant could possibly score. You can also leave these arguments as NULL, and the function will use the lowest and highest values in your data.

library(psych)
keys <- c(-1, 1, -1)
dat1 <- reverse.code(keys = keys, items = dat, mini = 1, maxi = 5)

dat1

Alternatively, your keys can also contain the specific names of the variables that you want to reverse score. This is helpful if you have many variables to reverse score and yields the same answer:

library(psych)
keys <- c("q1", "q3")
dat2 <- reverse.code(keys = keys, items = dat, mini = 1, maxi = 5)

dat2

Note that, after reverse scoring, reverse.code() slightly modifies the variable name to have a - behind it (i.e., q1 becomes q1- after being reverse scored).
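
For readers who want to see the arithmetic behind a keys vector without loading a package, here is a rough base R sketch. It is not how psych implements reverse.code() internally. The names dat_keys, lo, and hi are illustrative assumptions, and the code simply applies lo + hi - x to every column whose key is -1:

```r
# Illustrative data on an assumed 1 to 5 scale
dat_keys <- data.frame(q1 = c(1, 5, 3),
                       q2 = c(2, 2, 4),
                       q3 = c(4, 1, 5))

keys <- c(q1 = -1, q2 = 1, q3 = -1)  # -1 marks an item to reverse
lo <- 1
hi <- 5

# Pick out the names of the items flagged for reversal
to_reverse <- names(keys)[keys == -1]

# Copy the data, then overwrite only the flagged columns
reversed <- dat_keys
reversed[to_reverse] <- lo + hi - dat_keys[to_reverse]
reversed
```

Unlike reverse.code(), this sketch leaves the column names unchanged, so it is worth keeping the keys vector around as a record of which items were reversed.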

Aleciaaleck answered 12/2, 2020 at 12:10 Comment(0)
S
0

The solutions above assume wide data (one score per column). This reverse scores specific rows in long data (one score per row).

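library(dplyr)     # for mutate()
library(magrittr)  # for the %<>% assignment pipe

max_score <- 5     # named max_score to avoid reusing the name of base::max
df <- data.frame(score = sample(1:max_score, 20, replace = TRUE))
df <- mutate(df, question = rownames(df))
df
# Reverse only the rows holding negatively worded items
df[c(4, 13, 17), ] %<>% mutate(score = max_score + 1 - score)
df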
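
In real long-format data, the rows to reverse are usually identified by a question ID rather than by row position. Here is a minimal base R sketch, assuming hypothetical column names id, question, and score and a 1 to 5 scale:

```r
# Two respondents, three questions each, one score per row
long <- data.frame(
  id       = rep(1:2, each = 3),
  question = rep(c("Q1", "Q2", "Q3"), times = 2),
  score    = c(1, 4, 5, 2, 3, 3)
)

reverse_items <- c("Q1", "Q3")  # items that are negatively worded
max_score <- 5

# Logical index of the rows that belong to a reversed item
is_rev <- long$question %in% reverse_items
long$score[is_rev] <- max_score + 1 - long$score[is_rev]
long
```

Selecting by question ID keeps the code correct even if the rows are later sorted or filtered.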
Shopper answered 8/1, 2020 at 13:35 Comment(0)
C
0

Here is another attempt that will generalize to any number of columns. Let's use some made up data to illustrate the function.

# create a df
{
A = c(3, 3, 3, 3, 3, 3, 3, 3, 3, 3)
B = c(9, 2, 3, 2, 4, 0, 2, 7, 2, 8)
C = c(2, 4, 1, 0, 2, 1, 3, 0, 7, 8)

df1 = data.frame(A, B, C)
print(df1)
}
   A B C
1  3 9 2
2  3 2 4
3  3 3 1
4  3 2 0
5  3 4 2
6  3 0 1
7  3 2 3
8  3 7 0
9  3 2 7
10 3 8 8

The columns to reverse code

# variables to reverse code
vtcode = c("A", "B")

The function to reverse-code the selected columns

reverseCode <- function(data, rev){
  
  # get maximum value per desired col: lapply(data[rev], max)
  # subtract values in cols to reverse-code from max value plus 1
  data[, rev] = mapply("-", lapply(data[rev], max), data[, rev]) + 1
  
  return(data)
  
}


reverseCode(df1, vtcode)

   A  B C
1  1  1 2
2  1  8 4
3  1  7 1
4  1  8 0
5  1  6 2
6  1 10 1
7  1  8 3
8  1  3 0
9  1  8 7
10 1  2 8
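
One thing to watch in the output above: max + 1 - x assumes the scale starts at 1 and that the top category appears in the data. Column B contains 0, so its values move from the range 0 to 9 to the range 1 to 10, and the constant column A collapses to 1. If the true scale bounds are known, a variant along these lines may be closer to what is wanted. The function name reverse_with_bounds and the bounds used here are assumptions for illustration only:

```r
# Hypothetical variant: reverse each column against declared scale bounds
reverse_with_bounds <- function(data, bounds) {
  # bounds: named list with one c(min, max) pair per column to reverse
  for (col in names(bounds)) {
    data[[col]] <- sum(bounds[[col]]) - data[[col]]
  }
  data
}

df_example <- data.frame(A = c(3, 3, 3),
                         B = c(9, 2, 0),
                         C = c(2, 4, 1))

# Assumed scales for illustration: A runs 1 to 5, B runs 0 to 9
reverse_with_bounds(df_example, list(A = c(1, 5), B = c(0, 9)))
```

With declared bounds, the midpoint value 3 on a 1 to 5 scale stays 3, and B stays within 0 to 9 after reversal.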

This code was inspired by a response from @catastrophic-failure to the question "subtract max of column from all entries in column R".

Cardio answered 19/6, 2021 at 19:26 Comment(0)
