Reading-in and converting raw binary data to integers in R

I have a binary file containing numeric values coded as signed or unsigned integers of different lengths (mostly 2 or 4 bytes). To process this data, I read the desired section of the file as a raw vector with readBin() and then try to convert it to decimal. The issue is that R's built-in functions have restrictions I do not fully understand (such as no long unsigned ints); please see the example below.

How can I read custom-length unsigned ints from raw data? Is there a more appropriate and elegant approach than the one shown below?

library(dplyr)  # used here only for the %>% pipe

###############################################################################
# create an example raw vector of 24 bytes
set.seed(1)
raw <- sample(0:0xff, 24, replace = TRUE) %>% as.raw %>% print


###############################################################################
# approach with readBin() - not working
# read 2-byte unsigned integers left-to-right, not an issue
readBin(raw, size = 2, n = length(raw) / 2, integer(), endian = 'big', signed = FALSE)

# read 4-byte signed integers left-to-right, it's ok
readBin(raw, size = 4, n = length(raw) / 4, integer(), endian = 'big', signed = TRUE)

# first issue: readBin can't read in 4-byte unsigned integers
#   (per ?readBin, 'signed' is only used for integers of sizes 1 and 2)
readBin(raw, size = 4, n = length(raw) / 4, integer(), endian = 'big', signed = FALSE)

# second issue: readBin can't read in custom-size integers (e.g. 3 bytes)
readBin(raw[1:3], size = 3, n = 1, integer(), endian = 'big')

###############################################################################
# approach with rawToBits() and packBits() - does not work either
# packBits() also treats an integer as signed
raw[1:4] %>% rawToBits %>% packBits('integer')
# and expects a length of 32 bits, so 2 bytes (16 bits) are rejected
raw[1:2] %>% rawToBits %>% packBits('integer')

###############################################################################
# manual approach - working
# please note this requires reversing the byte order of the raw vector,
#   as rawToBits() returns the bits of each byte least significant first
# this approach correctly converts the 32-bit unsigned int to decimal
#   but would be difficult to vectorize for multiple ints
#   (I guess the summing must be done in loops)
raw[4:1] %>% rawToBits %>% as.logical %>% which %>% {2^(. - 1)} %>% sum
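
One possible way to vectorize the manual approach is to reshape the raw vector into a matrix with one column per integer and weight each byte by a power of 256. The sketch below is only an illustration under stated assumptions: `raw_to_uint` is a hypothetical helper name (not a base R function), the input length is assumed to be a multiple of `size`, and the result is returned as doubles, which are exact only up to 2^53 (so widths of up to 6 bytes are safe).

```r
# Hypothetical helper (not part of base R): convert a raw vector holding
# consecutive fixed-width unsigned integers into a numeric (double) vector.
raw_to_uint <- function(x, size, endian = c("big", "little")) {
  endian <- match.arg(endian)
  stopifnot(is.raw(x), size >= 1, length(x) %% size == 0)

  # one column per integer, one row per byte (filled column by column)
  bytes <- matrix(as.integer(x), nrow = size)

  # byte weights 256^(size - 1), ..., 256^0 for big-endian input
  weights <- 256^((size - 1):0)
  if (endian == "little") weights <- rev(weights)

  # weighted column sums; the result is double, not integer, because
  # R's integer type is 32-bit signed and cannot hold values >= 2^31
  as.vector(weights %*% bytes)
}

x <- as.raw(c(0xff, 0xff, 0xff, 0xfe, 0x00, 0x00, 0x01, 0x00))
raw_to_uint(x, size = 4)                        # 4294967294 256
raw_to_uint(x[1:6], size = 3)                   # 16777215 16646144
raw_to_uint(x[7:8], size = 2, endian = "little") # 1
```

With the example vector from the question, a call such as `raw_to_uint(raw, size = 4)` would treat the 24 bytes as six big-endian unsigned 32-bit values, and `size = 3` would cover the custom-width case without any explicit loop.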
Trimer answered 29/8, 2017 at 12:6