How to calculate the Euclidean norm of a vector in R?

I tried norm, but I think it gives the wrong result: the norm of c(1, 2, 3) should be sqrt(1*1+2*2+3*3), but it returns 6.

x1 <- 1:3
norm(x1)
# Error in norm(x1) : 'A' must be a numeric matrix
norm(as.matrix(x1))
# [1] 6
as.matrix(x1)
#      [,1]
# [1,]    1
# [2,]    2
# [3,]    3

Does anyone know which function calculates the Euclidean norm of a vector in R?

Raguelragweed answered 7/6, 2012 at 14:34 Comment(2)
"norm" is not quite what you think it is. Try sqrt(sum(x^2)) . R does "what you expect." norm and dist are designed to provide generalized distance calculations among rows of a matrix.Regression
This returns a vector with the square roots of each of the components to the square, thus 1 2 3 instead of the Euclidean NormSussi
D
60

This is a trivial function to write yourself:

norm_vec <- function(x) sqrt(sum(x^2))
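
As a quick check, here is a minimal sketch that applies this function to the vector from the question and compares it with the hand-computed value sqrt(1*1+2*2+3*3). The variable name x1 is taken from the question; nothing else is assumed.

```r
# Euclidean norm written by hand, as in the answer above
norm_vec <- function(x) sqrt(sum(x^2))

x1 <- 1:3
norm_vec(x1)                        # the value the question expected
all.equal(norm_vec(x1), sqrt(14))   # compare with sqrt(1*1 + 2*2 + 3*3)
```

Note that this version squares the components directly, so it can overflow for very large values, as the comments below point out.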
Dennard answered 7/6, 2012 at 14:43 Comment(6)
Hey, you violated my copyright from the comment above! I'm sending a team of RIAA lawyers after you. :-)Regression
@CarlWitthoft I just went and paid some royalties, so hopefully we're all square. :)Dennard
Highly disagree with this answer. In R you almost always want to use a built in function if one is available. They are highly optimized. The answer by Bernd is the correct answer. If you come across this please scroll down and use the proper R function to perform this.Publea
@Publea You should bother to benchmark the two solutions before you make such strong statements. You might find the results surprising. When I do I find that my version is around 5x faster. (Which is not to say that there may not be reasons to use norm. But you should actually check before speaking strongly.)Dennard
It is not really a question of speed. Try norm_vec(c(10^200)) and norm(c(10^200), type="2") to see the difference.Potentiate
See my answer below, which guards against overflow or underflow problems while computing the k-norm.Petrography
T
84
norm(c(1,1), type="2")     # 1.414214
norm(c(1, 1, 1), type="2")  # 1.732051
Toxoid answered 10/12, 2014 at 16:16 Comment(4)
This is the correct answer which also allows R to use its internal optimizations.Publea
This takes more time than jorah's answer for me. See AbdealiJK's answer to check timings.Hyperaesthesia
It's not about speed, it's about avoiding over/underflow.Subdiaconate
For destructive overflow or underflow problems, use a norm with scaling. See my answer below and look at the function definition of knorm().Petrography
E
26

I was surprised that nobody had tried benchmarking the methods suggested above, so I did that. I used a random uniform function to generate a list of vectors and timed each method over it (just a simple back-of-the-envelope benchmark):

> uut <- lapply(1:100000, function(x) {runif(1000, min=-10^10, max=10^10)})
> norm_vec <- function(x) sqrt(sum(x^2))
> norm_vec2 <- function(x){sqrt(crossprod(x))}
> 
> system.time(lapply(uut, norm_vec))
   user  system elapsed 
   0.58    0.00    0.58 
> system.time(lapply(uut, norm_vec2))
   user  system elapsed 
   0.35    0.00    0.34 
> system.time(lapply(uut, norm, type="2"))
   user  system elapsed 
   6.75    0.00    6.78 
> system.time(lapply(lapply(uut, as.matrix), norm))
   user  system elapsed 
   2.70    0.00    2.73 

It seems that taking the power and then sqrt manually is faster than the builtin norm, for real-valued vectors at least. This is probably because norm internally does an SVD:

> norm
function (x, type = c("O", "I", "F", "M", "2")) 
{
    if (identical("2", type)) {
        svd(x, nu = 0L, nv = 0L)$d[1L]
    }
    else .Internal(La_dlange(x, type))
}

and the SVD function internally converts the vector into a matrix, and does more complicated stuff:

> svd
function (x, nu = min(n, p), nv = min(n, p), LINPACK = FALSE) 
{
    x <- as.matrix(x)
    ...
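
To connect the SVD to the Euclidean norm: a vector treated as a one-column matrix has exactly one singular value, and that value equals its 2-norm. The sketch below checks this on a small vector of my own choosing; it only uses base R's svd().

```r
x <- c(1, 2, 3)

# A 3x1 matrix has a single singular value
sv <- svd(as.matrix(x), nu = 0L, nv = 0L)$d
length(sv)

# That singular value equals the Euclidean norm of x
all.equal(sv[1], sqrt(sum(x^2)))
```

This is why norm(x, type = "2") returns the right value for a vector, while doing far more work than a sum of squares.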

EDIT (20 Oct 2019):

There have been some comments to point out the correctness issue which the above test case doesn't bring out:

> norm_vec(c(10^155))
[1] Inf
> norm(c(10^155), type="2")
[1] 1e+155

This happens because squaring 10^155 gives 10^310, which exceeds the largest finite double (about 1.8e308), so the intermediate result is stored as Inf in R:

> 10^309
[1] Inf

So the conclusion becomes:

Taking the power and then sqrt manually is faster than the builtin norm for real-valued vectors, as long as the components are small enough.

How small? Small enough that the sum of squares doesn't overflow.
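
To put a rough number on "small enough", the sketch below uses .Machine$double.xmax, the largest finite double in R. The name limit is mine. For a single component the square stays finite up to about sqrt(.Machine$double.xmax); for a vector of length n the safe bound per component is lower, roughly limit / sqrt(n), because the squares are summed.

```r
norm_vec <- function(x) sqrt(sum(x^2))

# Largest magnitude whose square is still a finite double
limit <- sqrt(.Machine$double.xmax)
limit

is.finite(norm_vec(1e154))   # square is 1e308, still representable
is.finite(norm_vec(1e155))   # square is 1e310, overflows to Inf
```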

Eldon answered 6/9, 2016 at 4:27 Comment(2)
I love how this question went from "How do I get R to do this?" to optimization. I also just made my own distance function in R then became curious what the built-in function is and why just trying "norm(v)" doesn't work haha. I also checked the speed of my function..Spare
library(pracma) and system.time(lapply(uut, Norm)) gives 0.852 0.000 0.854 for times (my norm_vec2 times are very close to above). Also gives Inf for the large number test.Dragon
C
15
norm(x, type = c("O", "I", "F", "M", "2"))

The default is "O".

"O", "o" or "1" specifies the one norm, (maximum absolute column sum);

"F" or "f" specifies the Frobenius norm (the Euclidean norm of x treated as if it were a vector);

norm(as.matrix(x1),"o")

The result is 6, same as norm(as.matrix(x1))

norm(as.matrix(x1),"f")

The result is sqrt(1*1+2*2+3*3)

So norm(as.matrix(x1),"f") is the answer.
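
A short sketch of the difference between the two types on a one-column matrix, using the same values as the question (written as doubles here):

```r
m <- as.matrix(c(1, 2, 3))   # 3x1 column matrix

one_norm  <- norm(m, "O")    # maximum absolute column sum: 1 + 2 + 3
frobenius <- norm(m, "F")    # square root of the sum of squared entries

one_norm
all.equal(frobenius, sqrt(1*1 + 2*2 + 3*3))
```

With a single column, the one norm is just the sum of absolute values, which explains the 6 seen in the question.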

Corotto answered 12/6, 2014 at 19:12 Comment(0)
E
4

The norm can also be computed as:

Result<-sum(abs(x)^2)^(1/2)

Or, equivalently:

Result<-sqrt(t(x)%*%x)

Both give the same value.
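
The abs() in the first form only matters when the components are complex. The sketch below uses a small complex vector z of my own choosing to show that dropping abs() then gives a complex number rather than a length.

```r
z <- c(3 + 4i, 0 + 0i)

with_abs    <- sum(abs(z)^2)^(1/2)   # abs() of a complex number is its modulus
without_abs <- sqrt(sum(z^2))        # squares the complex values themselves

with_abs                  # a real, non-negative length
is.complex(without_abs)   # TRUE: not a valid norm
```

For real-valued vectors the two expressions agree, so abs() can be dropped there.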

Expatriate answered 15/2, 2013 at 9:32 Comment(1)
Two simplifications: if the components of x are real numbers, you can replace abs(x)^2 with x^2. Similarly, %*% transposes vectors as needed, so you can simplify t(x)%*%x to x%*%x.Raggedy
H
3

I'mma throw this out there too as an equivalent R expression

norm_vec <- function(x){sqrt(crossprod(x))}

Don't confuse R's crossprod with a similarly named vector/cross product. That naming is known to cause confusion especially for those with a physics/mechanics background.
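
To make the distinction concrete: for a single vector, crossprod(x) is t(x) %*% x, that is, the sum of squares, returned as a 1x1 matrix. The sketch below shows this; the helper name norm_vec_drop is hypothetical and simply adds drop() so that a plain number comes back instead of a matrix.

```r
x <- c(1, 2, 3)

cp <- crossprod(x)   # 1x1 matrix holding sum(x^2), not a vector cross product
dim(cp)

# Hypothetical variant that strips the matrix dimensions from the result
norm_vec_drop <- function(x) sqrt(drop(crossprod(x)))
norm_vec_drop(x)
```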

Hippy answered 12/12, 2014 at 22:31 Comment(1)
Absolutely right, the code I wrote would do what you say but I was really just trying to highlight the vector norm computation. I'll follow Joran's naming convention here. Good suggestion.Hippy
P
3

A k-norm of a vector (the Euclidean length is the case k = 2) that uses scaling to avoid destructive underflow and overflow is:

norm <- function(x, k) { max(abs(x))*(sum((abs(x)/max(abs(x)))^k))^(1/k) }

See below for explanation.

1. Euclidean length of a vector with no scaling:


The norm() defined below computes the k-norm (length) of each column of a matrix. It takes two arguments: x, a matrix, and k, the order of the norm (a single number from 1 to Inf). Note that defining it masks the built-in norm().

norm <- function(x, k) {
  # x = matrix with column vector and with dimensions mx1 or mxn
  # k = type of norm with integer from 1 to +Inf
  stopifnot(k >= 1) # k must be at least 1
  stopifnot(length(k) == 1) # check for length of k to be 1. The variable k is not vectorized.
  if(k == Inf) {
    # infinity norm
    return(apply(x, 2, function(vec) max(abs(vec)) ))
  } else {
    # k-norm
    return(apply(x, 2, function(vec) (sum((abs(vec))^k))^(1/k) ))
  }
}

x <- matrix(c(1,-2,3,-4)) # column matrix
sapply(c(1:4, Inf), function(k) norm(x = x, k = k))
# [1] 10.000000  5.477226  4.641589  4.337613  4.000000
  • As k increases, the k-norm decreases from the 1-norm (10.0) toward the infinity-norm (4.0).
  • The 2-norm is the Euclidean norm in n-dimensional Euclidean space; the general k-norm is usually called the p-norm.

Note: In the norm() function definition, for vectors with real components, the absolute values can be dropped when k is even (2, 4, 6, ...).

If the norm() function definition is confusing, each case can be read individually as given below.

norm_1 <- function(x) sum(abs(x))
norm_2 <- function(x) (sum((abs(x))^2))^(1/2)
norm_3 <- function(x) (sum((abs(x))^3))^(1/3)
norm_4 <- function(x) (sum((abs(x))^4))^(1/4)
norm_k <- function(x, k) (sum((abs(x))^k))^(1/k)
norm_inf <- function(x) max(abs(x))

2. Euclidean length of a vector with scaling to avoid destructive overflow and underflow issues:


Note-2: The only problem with this norm() is that it does not guard against overflow or underflow, as alluded to here and here.

Fortunately, this problem has already been solved for the 2-norm (Euclidean length) in the BLAS (Basic Linear Algebra Subprograms) Fortran library. A description of the problem can be found in the textbook "Numerical Methods and Software" by Kahaner, Moler and Nash, Chapter 1, Section 1.3, pages 7-9.

The Fortran subroutine is dnrm2.f, which handles destructive overflow and underflow by scaling with the largest absolute value among the vector components. The overflow and underflow arise because the components are raised to a power before the root is taken, so the intermediate sum can leave the representable range even when the final result would not.

The scaling idea behind dnrm2.f can be written in R as follows.

#1. find the maximum absolute value among the components of vector x
max_x <- max(abs(x))
#2. scale or divide the components of vector by max_x
scaled_x <- x/max_x
#3. take square of the scaled vector-x
sq_scaled_x <- (scaled_x)^2
#4. sum the square of scaled vector-x
sum_sq_scaled_x <- sum(sq_scaled_x)
#5. take square root of sum_sq_scaled_x
rt_sum_sq_scaled_x  <- sqrt(sum_sq_scaled_x)
#6. multiply the maximum of vector x with rt_sum_sq_scaled_x
max_x*rt_sum_sq_scaled_x

A one-liner for the above 6 steps in R is:

# Euclidean length of a vector (2-norm)
max(abs(x))*sqrt(sum((x/max(abs(x)))^2))

Let's try some example vectors to compute the 2-norm (see other solutions in this thread).

x = c(-8e+299, -6e+299, 5e+299, -8e+298, -5e+299)
max(abs(x))*sqrt(sum((x/max(abs(x)))^2))
# [1] 1.227355e+300

x <- (c(1,-2,3,-4))
max(abs(x))*sqrt(sum((x/max(abs(x)))^2))
# [1] 5.477226

Therefore, the recommended way to implement a general k-norm in R is the single line below, which guards against destructive overflow and underflow. One possible improvement is to use norm() without scaling when the components are neither too small nor too large, and knorm() with scaling otherwise, because scaling every vector adds unnecessary computation. I did not implement this improvement in knorm() given below.

# one-liner for k-norm - generalized form for all norms including infinity-norm:
max(abs(x))*(sum((abs(x)/max(abs(x)))^k))^(1/k)

# knorm() function using the above one-liner.
knorm <- function(x, k) { 
  # x = matrix with column vector and with dimensions mx1 or mxn
  # k = type of norm with integer from 1 to +Inf
  stopifnot(k >= 1) # k must be at least 1
  stopifnot(length(k) == 1) # check for length of k to be 1. The variable k is not vectorized.
  # convert the elements of the matrix to their absolute values
  x <- abs(x)
  if(k == Inf) { # infinity-norm
    return(apply(x, 2, function(vec) max(vec)))
  } else { # k-norm
    return(apply(x, 2, function(vec) {
      max_vec <- max(vec)
      return(max_vec*(sum((vec/max_vec)^k))^(1/k))
    }))
  }
}

# 2-norm
x <- matrix(c(-8e+299, -6e+299, 5e+299, -8e+298, -5e+299))
sapply(2, function(k) knorm(x = x, k = k))
# [1] 1.227355e+300

# 1-norm, 2-norm, 3-norm, 4-norm, and infinity-norm
sapply(c(1:4, Inf), function(k) knorm(x = x, k = k))
# [1] 2.480000e+300 1.227355e+300 9.927854e+299 9.027789e+299 8.000000e+299

x <- matrix(c(1,-2,3,-4))
sapply(c(1:4, Inf), function(k) knorm(x = x, k = k))
# [1] 10.000000  5.477226  4.641589  4.337613  4.000000

x <- matrix(c(1,-2,3,-4, 0, -8e+299, -6e+299, 5e+299, -8e+298, -5e+299), nc = 2)
sapply(c(1:4, Inf), function(k) knorm(x = x, k = k))
#           [,1]          [,2]          [,3]          [,4]   [,5]
# [1,]  1.00e+01  5.477226e+00  4.641589e+00  4.337613e+00  4e+00
# [2,] 2.48e+300 1.227355e+300 9.927854e+299 9.027789e+299 8e+299
Petrography answered 6/9, 2020 at 11:49 Comment(1)
Good answer. Your one-liner is just missing the case that x is the zero-vector.Windjammer
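
A minimal sketch of how the zero-vector case mentioned in this comment could be handled for the 2-norm. The function name safe_norm2 is hypothetical and not part of the answer above; the only change from the scaled one-liner is an early return when the largest absolute component is 0, which would otherwise produce 0/0 = NaN.

```r
# Scaled 2-norm with a guard for the zero vector (hypothetical helper)
safe_norm2 <- function(x) {
  s <- max(abs(x))
  if (s == 0) return(0)        # zero vector: skip the 0/0 division
  s * sqrt(sum((x / s)^2))
}

safe_norm2(c(0, 0, 0))
safe_norm2(c(1, -2, 3, -4))
safe_norm2(c(-8e+299, -6e+299, 5e+299, -8e+298, -5e+299))
```

The same guard could be added inside knorm() before the division by max_vec.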
J
2

If you have a data.frame or a data.table 'DT', and want to compute the Euclidean norm (2-norm) across each row, the apply function can be used.

apply(X = DT, MARGIN = 1, FUN = norm, '2')

Example:

>DT 

        accx       accy       accz
 1: 9.576807 -0.1629486 -0.2587167
 2: 9.576807 -0.1722938 -0.2681506
 3: 9.576807 -0.1634264 -0.2681506
 4: 9.576807 -0.1545590 -0.2681506
 5: 9.576807 -0.1621254 -0.2681506
 6: 9.576807 -0.1723825 -0.2682434
 7: 9.576807 -0.1723825 -0.2728810
 8: 9.576807 -0.1723825 -0.2775187

> apply(X = DT, MARGIN = 1, FUN = norm, '2')
 [1] 9.581687 9.582109 9.581954 9.581807 9.581932 9.582114 9.582245 9.582378
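
As an alternative sketch, the same row-wise norms can be computed without apply by squaring the whole table and using rowSums. This assumes a plain data.frame with only numeric columns; the two rows below are copied from the example above, and the name row_norms is mine. Like any direct sum of squares, it does not guard against overflow.

```r
# First two rows of the example table, as a plain data.frame
DT <- data.frame(
  accx = c(9.576807, 9.576807),
  accy = c(-0.1629486, -0.1722938),
  accz = c(-0.2587167, -0.2681506)
)

# Square every entry, sum across each row, then take the square root
row_norms <- sqrt(rowSums(DT^2))
row_norms
```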
Jaban answered 16/8, 2016 at 4:58 Comment(0)
R
1

Following AbdealiJK's answer, I experimented further to gain some insight. Here is one result.

x = c(-8e+299, -6e+299, 5e+299, -8e+298, -5e+299)
sqrt(sum(x^2))
norm(x, type='2')

The first result is Inf and the second is 1.227355e+300, which is correct, as the arbitrary-precision check below shows.

library(Rmpfr)
y <- mpfr(x, 120)
sqrt(sum(y*y))    

The result is 1227354879.... I didn't count the number of trailing digits, but it looks all right. I know there is another way around this overflow problem, which is to first apply the log function to all the numbers and then sum, but I have not had time to implement it.
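
One possible reading of the log-based idea is the log-sum-exp trick: work with the logs of the squared components, subtract the largest before exponentiating, and only leave the log scale at the end. The sketch below is my own interpretation, not the answerer's code, and the name log_norm2 is hypothetical. It still returns Inf if the true norm itself exceeds the largest double.

```r
# 2-norm computed on the log scale (hypothetical helper)
log_norm2 <- function(x) {
  l <- 2 * log(abs(x))         # log of each squared component
  m <- max(l)
  if (m == -Inf) return(0)     # all components are zero
  # log(sum(x^2)) = m + log(sum(exp(l - m))); halve it to take the square root
  exp(0.5 * (m + log(sum(exp(l - m)))))
}

x <- c(-8e+299, -6e+299, 5e+299, -8e+298, -5e+299)
log_norm2(x)
log_norm2(c(1, -2, 3, -4))
```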

Recover answered 14/6, 2018 at 21:15 Comment(0)
I
0

Create your vector as a one-column matrix using cbind; the norm function then works with the Frobenius norm (the Euclidean norm) as its type argument.

x1<-cbind(1:3)

norm(x1,"f")

[1] 3.741657

sqrt(1*1+2*2+3*3)

[1] 3.741657

Imperturbable answered 18/11, 2014 at 20:22 Comment(0)
C
-1

The Euclidean norm is calculated as the square root of the sum of the squares of the vector's elements. It is also known as the L2 or second norm.

You can calculate it using the norm function

x <- c(1, 3, 5, 7, 9, 11, 14)

vector_magnitude <- norm(x, type = "2")
Calvados answered 13/6, 2023 at 22:51 Comment(0)
