Why does testthat 2.3.2 use a different sort()

(This question is also asked on GitHub here)

After upgrading R to 4.0.2, tests fail because the sort algorithm used inside testthat seems to have changed. The following shows that base::sort() and browser() behave as expected in R 4.0.2 (see this question for why this check is included):

y <- c("Schaffhausen", "Schwyz", "Seespital", "SRZ")
print(sort(y))
# [1] "Schaffhausen" "Schwyz"       "Seespital"    "SRZ"
browser()
print(sort(y))
# [1] "Schaffhausen" "Schwyz"       "Seespital"    "SRZ"

But if you create a package called testsort, set up the test environment with usethis::use_testthat(), and add a file "test-sort.R" in /testsort/tests/testthat/ containing

test_that("test sort", {
  xx <- c("Schaffhausen", "Schwyz", "Seespital", "SRZ")
  print("")
  # browser()
  print(sort(xx))
  expect_equal(sort(xx), c("Schaffhausen", "Schwyz", "Seespital", "SRZ"))
})

you get

==> devtools::test()

Loading testsort
Testing testsort
v |  OK F W S | Context
/ |   0       | sort[1] ""
[1] "SRZ"          "Schaffhausen" "Schwyz"       "Seespital"   
v |   1       | sort

== Results =============================================================================
OK:       1
Failed:   0
Warnings: 0
Skipped:  0

I used debug(sort) and devtools::test() in the RStudio console(!) but was not able to figure out what happens.

R.version
platform       x86_64-w64-mingw32          
arch           x86_64                      
os             mingw32                     
system         x86_64, mingw32             
status                                     
major          4                           
minor          0.2                         
year           2020                        
month          06                          
day            22                          
svn rev        78730                       
language       R                           
version.string R version 4.0.2 (2020-06-22)
nickname       Taking Off Again   

At present, testthat 2.3.2 is the latest version; there is no newer release of testthat.

Thanks to @Ulugbek Umirov, from r-pkgs.org/tests.html:

10.5 CRAN notes

CRAN will run your tests on all CRAN platforms: Windows, Mac, Linux and Solaris. There are a few things to bear in mind:

Note that tests are always run in the English language (LANGUAGE=EN) and with C sort order (LC_COLLATE=C). This minimises spurious differences between platforms.
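
The effect of C sort order can be reproduced outside of testthat with base R alone. The following is a minimal sketch, relying on the documented behaviour that sort(method = "radix") collates character vectors in C-locale order regardless of the session's LC_COLLATE; the variable name c_sorted is only illustrative.

```r
y <- c("Schaffhausen", "Schwyz", "Seespital", "SRZ")

# method = "radix" sorts character vectors in C-locale order,
# independent of the LC_COLLATE setting of the session.
c_sorted <- sort(y, method = "radix")
print(c_sorted)
# [1] "SRZ"          "Schaffhausen" "Schwyz"       "Seespital"

# C collation compares code points: "R" (82) comes before "c" (99),
# which is why "SRZ" ends up ahead of "Schaffhausen".
utf8ToInt("R") < utf8ToInt("c")
# [1] TRUE
```

This matches the order printed by devtools::test() above, which suggests the difference comes from the collation rather than from sort() itself.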

Veterinary answered 10/8, 2020 at 8:32 Comment(3)
According to r-pkgs.org/tests.html, tests run with C sort order (LC_COLLATE=C), which means that A-Z comes before a-z, so it is expected to have SRZ at the beginning of the sorted list. Also, this topic discusses a breaking change in default behavior: r.789695.n4.nabble.com/… - Coleencolella
@UlugbekUmirov So according to the nabble post, I would consider the test behaviour a bug, because everything is correct in R 4.0.2? But thanks for your comment; with my very limited knowledge about encoding, this seems to explain everything. - Veterinary
@UlugbekUmirov If I understand correctly, testthat now uses withr::local_collate("C", .local_envir = .env). Furthermore, this commit caused the change. - Veterinary

Cross-platform reproducibility is more important. Setting the collation to C ensures that tests give the same result across all platforms.


Options to deal with this change: if sort() caused the problems (sort() depends on the collation), you have at least 3 different options:

  1. Use stringr::str_sort(): adds a new dependency on the stringr package

  2. Customize your function without additional packages

    myfun <- function(my_collation = "German_Switzerland.1252", ...) {
      # Remember the current collation and restore it when the function exits
      my_locale <- Sys.getlocale("LC_COLLATE")
      on.exit(expr = Sys.setlocale("LC_COLLATE", my_locale))
    
      # Sort under the requested collation
      Sys.setlocale("LC_COLLATE", my_collation)
      r <- sort(...)
      return(r)
    }
    

    No new packages are used thanks to on.exit()

  3. Use the withr package, which takes care of the on.exit() part

    myfun <- function(my_collation = "German_Switzerland.1252", ...) {
      withr::local_collate(my_collation)
    
      r <- sort(...)
      return(r)
    }
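
For option 1, a minimal sketch might look like the following. It assumes the stringr package is installed (hence the requireNamespace() guard) and that "de" is an acceptable locale for the data; both are assumptions for illustration, not something the original tests prescribe.

```r
xx <- c("Schaffhausen", "Schwyz", "Seespital", "SRZ")

if (requireNamespace("stringr", quietly = TRUE)) {
  # str_sort() collates via ICU using the locale passed explicitly,
  # so the result does not depend on LC_COLLATE, which testthat sets to "C".
  sorted <- stringr::str_sort(xx, locale = "de")
  print(sorted)
}
```

Because the locale is passed as an argument instead of being read from the session, the same call should behave identically inside and outside of devtools::test().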
    
Veterinary answered 17/9, 2020 at 9:6 Comment(1)
There is one bracket too many in my_locale <- Sys.getlocale("LC_COLLATE")) :) - Unassailable
