bit64 integers with fst

I have data in a csv containing long integers. I am exchanging this data between csvs and fst files.

For example,

library(bit64)
library(data.table)
library(fst)
library(magrittr)

# Prepare example csvs
DT64_orig <- data.table(x = (c(2345612345679, 1234567890, 8714567890)))
fwrite(DT64_orig, "DT64_orig.csv")

# Read and move to fst
DT64 <- fread("DT64_orig.csv")
write.fst(DT64, "DT64_fst.fst")

DT_fst2 <- 
  read.fst("DT64_fst.fst") %>%
  setDT

# bit64 integers not preserved:
identical(DT_fst2, DT64)

Is there a way to use fst files for data.tables containing bit64 integers?

Pals answered 21/2, 2017 at 23:32 Comment(2)
Looks to me that you should be complaining to the maintainer of fst. - Persas
Filed: github.com/fstpackage/fst/issues/28 - Pals

It looks like fst is dropping column attributes, either when saving or when loading (please file this as an issue on the fst package). In the meantime you can put the column class back yourself. A bit64::integer64 column is stored as a plain double vector under the hood, so no bits have been lost; only the class attribute that tells R how to interpret and print the column is missing.

> DT_fst2
               x
1: 1.158886e-311
2: 6.099576e-315
3: 4.305569e-314
> setattr(DT_fst2$x, "class", "integer64")
> DT_fst2
               x
1: 2345612345679
2:    1234567890
3:    8714567890
> identical(DT_fst2, DT64)
[1] TRUE
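
The same idea can be checked without fst at all. The following is a minimal sketch that simulates an attribute-dropping round trip with unclass() and then restores the class. The helper name restore_integer64 and its cols argument are hypothetical, introduced only for illustration; they are not part of fst, bit64, or data.table.

```r
library(bit64)

x <- as.integer64(c("2345612345679", "1234567890", "8714567890"))

# Simulate a round trip that drops attributes: the 64-bit payload
# survives as raw doubles, only the class is gone.
stripped <- unclass(x)
typeof(stripped)  # the underlying storage type is "double"

# Putting the class back makes R interpret the payload as integer64 again.
restored <- stripped
oldClass(restored) <- "integer64"
identical(restored, x)

# Hypothetical helper: restore the class on several columns of a
# data.frame whose integer64 columns came back as plain doubles.
restore_integer64 <- function(df, cols) {
  for (col in cols) {
    oldClass(df[[col]]) <- "integer64"
  }
  df
}

df <- data.frame(x = stripped)
fixed <- restore_integer64(df, "x")
```

This only works because the caller already knows which columns were integer64; a plain double column cannot be told apart from a stripped integer64 column by its storage type alone.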
Mantilla answered 22/2, 2017 at 0:37 Comment(0)

Matt is absolutely right: fst currently does not serialize any column attributes. It will in the next version, which is due in a few weeks. At that point, classes such as Date and POSIXt will also be supported. Supporting custom attributes will be a challenge, however, because fst provides random access to the data and some attributes are modified upon subsetting (think of time series, for example).
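
To illustrate the time series point with base R only (this is a sketch of the general problem, not of how fst handles it): the tsp attribute of a ts object describes start, end, and frequency, so it cannot simply be copied onto a subset of the rows.

```r
# A yearly series covering 2000 to 2009.
x <- ts(1:10, start = 2000)
tsp(x)  # start, end, frequency of the full series

# window() keeps the object a time series, but has to rewrite tsp
# so that it describes the selected rows.
w <- window(x, start = 2003, end = 2005)
tsp(w)

# Plain index subsetting drops the time series attributes entirely.
s <- x[4:6]
is.ts(s)
```

A file format that reads arbitrary row ranges would need attribute-specific logic like this for every class it wants to restore correctly.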

Cooley answered 22/2, 2017 at 8:47 Comment(0)
