How can I create a distance matrix containing the mean absolute scores between each row?

Given the data frame,

df <- read.table(text="
 X1 X2 X3 X4 X5
  1  2  3  2  1
  2  3  4  4  3
  3  4  4  6  2
  4  5  5  5  4
  2  3  3  3  6
  5  6  2  8  4", header=TRUE)

I want to create a distance matrix containing the mean absolute difference between each pair of columns, where the differences are taken row by row. For example, the distance between X1 and X3 should be 1.67, given that:

(abs(1-3) + abs(2-4) + abs(3-4) + abs(4-5) + abs(2-3) + abs(5-2)) / 6 = 10 / 6 = 1.67

I have tried using the designdist() function in the vegan package this way:

designdist(t(df), method = "abs(A-B)/6", terms = "minimum")

The resulting distance for columns 1 and 3 is 0.666. The problem is that, used this way, the function first sums all the values in each column and then takes the difference of the two totals. What I need is to take the absolute difference row by row, sum those absolute differences, and then divide by N (the number of rows).
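To make the gap between the two calculations concrete, here is a minimal base R sketch. It does not call designdist(); it only mimics the arithmetic described above, and the names `totals_diff` and `rowwise` are introduced here for illustration.

```r
df <- read.table(text="
 X1 X2 X3 X4 X5
  1  2  3  2  1
  2  3  4  4  3
  3  4  4  6  2
  4  5  5  5  4
  2  3  3  3  6
  5  6  2  8  4", header=TRUE)

# Difference of the column totals: |17 - 21| / 6, about 0.667
totals_diff <- abs(sum(df$X1) - sum(df$X3)) / nrow(df)

# Row-by-row absolute differences, then averaged: 10 / 6, about 1.67
rowwise <- sum(abs(df$X1 - df$X3)) / nrow(df)

totals_diff
rowwise
```

The first value matches what I get from designdist(); the second is the value I am after.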

Cumberland answered 22/5, 2012 at 17:47 Comment(0)

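Here's a one-line solution. It uses dist()'s method argument to calculate the L1 norm (also known as city block or Manhattan distance). Note that dist() computes distances between the rows of its input, so the call below returns a 6 x 6 matrix comparing the rows of your data.frame. To compare columns instead, as in your X1 vs X3 example, pass t(df) rather than df and keep the division by nrow(df).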

as.matrix(dist(df, "manhattan", diag=TRUE, upper=TRUE)/nrow(df))

To make it reproducible:

df <- read.table(text="
 X1 X2 X3 X4 X5
  1  2  3  2  1
  2  3  4  4  3
  3  4  4  6  2
  4  5  5  5  4
  2  3  3  3  6
  5  6  2  8  4", header=TRUE)

dmat <- as.matrix(dist(df, "manhattan", diag=TRUE, upper=TRUE)/nrow(df))
print(dmat, digits=3)
#      1     2     3    4     5    6
# 1 0.00 1.167 1.667 2.33 1.333 3.00
# 2 1.17 0.000 0.833 1.17 0.833 2.17
# 3 1.67 0.833 0.000 1.00 1.667 1.67
# 4 2.33 1.167 1.000 0.00 1.667 1.33
# 5 1.33 0.833 1.667 1.67 0.000 2.33
# 6 3.00 2.167 1.667 1.33 2.333 0.00
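
Since dist() measures distances between rows, a column-by-column version (the X1 vs X3 comparison from the question) can be sketched by transposing first. This is an illustrative variant rather than part of the original one-liner, and `col_dmat` is a name introduced here.

```r
df <- read.table(text="
 X1 X2 X3 X4 X5
  1  2  3  2  1
  2  3  4  4  3
  3  4  4  6  2
  4  5  5  5  4
  2  3  3  3  6
  5  6  2  8  4", header=TRUE)

# dist() compares rows, so transpose to compare the columns X1..X5.
# Dividing the Manhattan distance by the number of rows of df turns
# the summed absolute differences into a mean absolute difference.
col_dmat <- as.matrix(dist(t(df), method = "manhattan")) / nrow(df)

dim(col_dmat)           # 5 x 5, one row and column per variable
col_dmat["X1", "X3"]    # 10 / 6, about 1.67

# Cross-check against the direct calculation for one pair
mean(abs(df$X1 - df$X3))
```

The result is symmetric with zeros on the diagonal, so either triangle can be read off for reporting.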
Ringnecked answered 22/5, 2012 at 17:52 Comment(0)
