How can I test the contents of a data frame?

Asked 20/9, 2019 at 10:49 Answered 13/1, 2021 at 5:58

I have a routine that returns a data frame, and I want to check whether its values are reasonable. testthat provides expect_equal, but I am not sure whether it applies to data.frames. I tried the following, but it doesn't work:

testthat::expect_equal(result$ORs[1,1:3], c(1.114308, 0.5406599, 2.296604), tolerance=1.0e-6)

This is the message I get:

─────────────────────────────────────────────────
test-xxx.R:19: failure: basic functionality
result$ORs[1, 1:3] not equal to c(1.114308, 0.5406599, 2.296604).
Modes: list, numeric
names for target but not for current
Attributes: < Modes: list, NULL >
Attributes: < Lengths: 2, 0 >
Attributes: < names for target but not for current >
Attributes: < current is not list-like >
─────────────────────────────────────────────────

══ Results ══════════════════════════════════════
Duration: 0.1 s

Condenser asked 20/9, 2019 at 10:49 Comment(4)

Do you want to test the whole data frame or just specific rows? Or map a whole row to several values? – Neritic 20/9, 2019 at 10:54

@Neritic Ideally and in this case, I want to check if every value is within a given tolerance. This is trivial for small data frames, but for larger ones I will use a subset or sum along the columns or rows to check the final value. – Condenser 20/9, 2019 at 10:58

Not really sure but could you explain how it fails? Testing with the "lower level" compare and all.equal seems to work. – Neritic 20/9, 2019 at 11:11

@Neritic Edited with the error – Condenser 20/9, 2019 at 11:52

If you only need to test the value of a single cell (e.g. df[1,1]), you can unlist and unname the value:

expect_equal(unname(unlist(df[1,1])), "Value")

Alita answered 7/12, 2020 at 21:1 Comment(0)

The problem is that ...$ORs[1, 1:3] is itself a data.frame, because you take a row with more than one column:

# data example
ORs <- data.frame(1:3, 0:2, 2:4)

# show that it is a data.frame
str(ORs[1, 1:3])
#R> 'data.frame':   1 obs. of  3 variables:
#R>  $ X1.3: int 1
#R>  $ X0.2: int 0
#R>  $ X2.4: int 2
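
To make the contrast concrete, here is a minimal base R sketch using the same ORs example. It only illustrates how `[` behaves on a data.frame (single cell versus a row across several columns); the variable names single_cell, row_slice and flat are made up for the illustration.

```r
# Same example data as above
ORs <- data.frame(1:3, 0:2, 2:4)

# One cell: `[` drops the result to a plain integer vector
single_cell <- ORs[1, 1]
is.data.frame(single_cell)   # FALSE
is.integer(single_cell)      # TRUE

# One row, several columns: still a data.frame (i.e. a list underneath)
row_slice <- ORs[1, 1:3]
is.data.frame(row_slice)     # TRUE
typeof(row_slice)            # "list"

# unlist() flattens the row into a named integer vector
flat <- unlist(row_slice)
is.data.frame(flat)          # FALSE
names(flat)                  # "X1.3" "X0.2" "X2.4"
```

This is why the failure message in the question reports "Modes: list, numeric": a list-like data.frame is being compared with a bare numeric vector.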

The same thing happens when you select multiple rows across more than one column. A few options are:

Call dput on the result and use its output as the expected value (after changing the entries to the expected values, if needed). That way you are sure to have the correct attributes:

dput(ORs[1, 1:3])
#R> structure(list(X1.3 = 1L, X0.2 = 0L, X2.4 = 2L), row.names = 1L, class = "data.frame")

expect_equal(
  ORs[1, 1:3], 
  structure(list(X1.3 = 1L, X0.2 = 0L, X2.4 = 2L), row.names = 1L, class = "data.frame"))

# You can skip the structure(...) wrapper if you pass check.attributes = FALSE;
# the expected value is then just list(...) with the expected entries
expect_equal(ORs[1, 1:3], list(1L, 0L, 2L), check.attributes = FALSE)

Use expect_known_value to store an .RDS file with the expected output to test against.

As suggested by anpami, unlist the data and test. This works particularly well if all the data.frame entries in the columns you are looking at (here the first three) are of the same type:

expect_equal(unlist(ORs[1, 1:3]), c(1L, 0L, 2L), 
             check.attributes = FALSE)
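
Applied to the numeric values from the question, the same idea might look like the sketch below. The result object here is a hypothetical stand-in, since the asker's routine is not shown: the column names OR, lower and upper are invented, and the tiny perturbations are only there to exercise the tolerance. Dropping the names with unname avoids the need for check.attributes.

```r
library(testthat)

# Hypothetical stand-in for the asker's `result`; the numbers come from the
# question, the column names and the 1e-9 perturbations are assumptions.
result <- list(ORs = data.frame(OR    = 1.114308 + 1e-9,
                                lower = 0.5406599,
                                upper = 2.296604 - 1e-9))

# Flatten the one-row slice to a plain numeric vector, drop the column
# names, and compare against the expected values within a tolerance.
expect_equal(unname(unlist(result$ORs[1, 1:3])),
             c(1.114308, 0.5406599, 2.296604),
             tolerance = 1.0e-6)
```

If the selected columns had mixed types, unlist would coerce them to a common type, so this approach is best kept to all-numeric slices.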

Test each value separately. This does not seem like a nice option, though:

expect_equal(ORs[1, 1], 1L)
expect_equal(ORs[1, 2], 0L)
expect_equal(ORs[1, 3], 2L)

Note that expect_equal uses all.equal, so it is a matter of getting all.equal to pass.
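
Since all.equal is in base R, a candidate comparison can be tried at the console before it goes into a test. A minimal sketch with the ORs example (the names mismatch and flattened are made up for the illustration):

```r
# Same example data as above
ORs <- data.frame(1:3, 0:2, 2:4)

# data.frame vs. bare vector: all.equal returns a character vector
# describing the differences instead of TRUE
mismatch <- all.equal(ORs[1, 1:3], c(1L, 0L, 2L))
isTRUE(mismatch)    # FALSE
print(mismatch)     # shows the reported differences

# Flattened row vs. bare vector, ignoring the names attribute
flattened <- all.equal(unlist(ORs[1, 1:3]), c(1L, 0L, 2L),
                       check.attributes = FALSE)
isTRUE(flattened)   # TRUE
```

all.equal returns either TRUE or a description of the differences, so wrap it in isTRUE when a plain logical is needed.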

Tifanytiff answered 13/1, 2021 at 5:58 Comment(0)
