Let's create some factors first:
F1 <- factor(c(1,2,20,10,25,3))
F2 <- factor(paste0(F1, " years"))
F3 <- F2
levels(F3) <- paste0(sort(F1), " years")
F4 <- factor(paste0(F1, " years"), levels=paste0(sort(F1), " years"))
Then take a look at them:
> F1
[1] 1 2 20 10 25 3
Levels: 1 2 3 10 20 25
> F2
[1] 1 years 2 years 20 years 10 years 25 years 3 years
Levels: 1 years 10 years 2 years 20 years 25 years 3 years
> F3
[1] 1 years 3 years 10 years 2 years 20 years 25 years
Levels: 1 years 2 years 3 years 10 years 20 years 25 years
> F4
[1] 1 years 2 years 20 years 10 years 25 years 3 years
Levels: 1 years 2 years 3 years 10 years 20 years 25 years
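First, I note that the order of the levels in F2 differs from the order in F1, which is not what I expected. A look at the factor documentation reveals why: by default the levels are created by sorting the unique values of the input. In the case of F2 the input is a character vector, so the sort is alphabetical (character by character) rather than numeric, which is why "10 years" ends up before "2 years".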
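To illustrate the sorting behaviour described above, here is a minimal sketch that sorts the same values once as numbers and once as strings. The variable names are only for illustration, and method = "radix" is used on the assumption that it makes the string sort independent of the locale; the default sort may order strings slightly differently depending on the collation settings.

```r
# Same values as in F1, once as numbers and once as strings.
x <- c(1, 2, 20, 10, 25, 3)

# Numeric sort: compares the values themselves.
num_sorted <- sort(x)

# Character sort: compares character by character, so "10" sorts
# before "2" because "1" comes before "2".
# method = "radix" is assumed here to avoid locale-dependent collation.
chr_sorted <- sort(as.character(x), method = "radix")

print(num_sorted)
print(chr_sorted)
```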
What is harder for me to understand is the difference in setting the levels between F3 and F4. In F3 I set the levels after the factor is created, while in F4 I set them explicitly when creating the factor. In F3, assigning with levels()<- changes the displayed values themselves (the second element becomes "3 years" instead of "2 years"), so it seems to neither simply relabel the levels nor reorder them the way I expected.
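One way to probe the difference is to compare the underlying integer codes of the factors. The sketch below assumes only what ?factor states, namely that a factor is stored as integer codes that index into its levels attribute. The names new_levels and F5 are introduced here for illustration and are not part of the code above.

```r
F1 <- factor(c(1, 2, 20, 10, 25, 3))
F2 <- factor(paste0(F1, " years"))
new_levels <- paste0(sort(F1), " years")

# Assigning levels after creation: the integer codes stay as they
# were in F2, only the labels attached to those codes are replaced.
F3 <- F2
levels(F3) <- new_levels

# Passing levels at creation: each value is matched against the
# given levels, so the values keep their meaning.
F4 <- factor(paste0(F1, " years"), levels = new_levels)

# Codes of F3 are the same as those of F2, even though the labels differ.
same_codes <- identical(as.integer(F3), as.integer(F2))

# Values of F4 are the same as those of F2, even though the level order differs.
same_values <- identical(as.character(F4), as.character(F2))

# Re-creating a factor from an existing one with a new level order
# should reorder the levels without changing the values.
F5 <- factor(F2, levels = new_levels)

print(same_codes)
print(same_values)
print(identical(as.character(F5), as.character(F2)))
```

If same_codes is TRUE while the printed values of F3 differ from F2, that would suggest levels()<- replaces labels by position rather than matching them by name.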
Can someone explain the difference?