Read in 4-byte words from binary file in Julia

I have a simple binary file that contains 32-bit floats adjacent to each other.

Using Julia, I would like to read each number (i.e. each 32-bit word) and store them sequentially in an array of Float32.

I've tried a few different things through looking at the documentation, but all have yielded impossible values (I am using a binary file with known values as dummy input). It appears that:

  1. Julia is reading the binary file one byte at a time.

  2. Julia is putting each byte into a Uint8 array.

For example, readbytes(f, 4) gives a 4-element array of unsigned 8-bit integers. read(f, Float32, DIM) also gives strange values.

Anyone have any idea how I should proceed?

Maje answered 11/8, 2014 at 21:24 Comment(0)

Julia has changed a lot over the past five years. read() no longer accepts a type and a length together, and reinterpret() now returns a view over the byte array rather than a new array of the desired type. The best approach now seems to be to preallocate the destination array and fill it with read!:

data = Array{Float32, 1}(undef, 128)
read!(io, data)

This fills data with the Float32 values read from io.
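
A minimal self-contained sketch of this approach, assuming the file holds nothing but Float32 values in the machine's native byte order. The temporary file and the sample values are made up for illustration, and the element count is derived from the file size instead of being hard-coded:

```julia
# Write a few known Float32 values to a temporary file (illustrative data only).
expected = Float32[1.5, -2.25, 3.0, 4.125]
path = tempname()
write(path, expected)

# Derive the element count from the file size, then fill a preallocated vector.
n = filesize(path) ÷ sizeof(Float32)
data = Vector{Float32}(undef, n)
open(path) do io
    read!(io, data)
end

println(data)
rm(path)
```

If the file was produced on a machine with a different byte order, the values would still need to be byte-swapped afterwards; that case is not handled here.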

Goodfornothing answered 14/1, 2020 at 0:0 Comment(0)

(EDIT 2020: Outdated, see newest answer.) I found the issue. The correct way of importing binary data in single precision floating point format is read(f, Float32, NUM_VALS), where f is the file stream, Float32 is the data type, and NUM_VALS is the number of words (values or data points) in the binary data file.

It turns out that every call to read(f, [...]) advances the stream position past the data it just read.

This makes it simple to read the data one value at a time:

f = open("my_file.bin")
first_item = read(f, Float32)
second_item = read(f, Float32)
# etc ...

However, I wanted to load all the data in one line of code. While debugging, I had called read() on the same stream several times without reopening the file, so the stream was no longer positioned at the start. As a result, when I tried the correct call, namely read(f, Float32, NUM_VALS), I got unexpected values.
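
The effect described above can be reproduced with a small sketch. It uses current Julia (1.x) syntax, so read! stands in for the older three-argument read, and the temporary file and its three values are assumptions made for illustration. position reports how far the stream has advanced, and seekstart rewinds it:

```julia
# Illustrative data: three Float32 values written to a temporary file.
path = tempname()
write(path, Float32[10.0, 20.0, 30.0])

io = open(path)
first_item = read(io, Float32)   # consumes the first 4 bytes
pos_after_one = position(io)     # the stream is now 4 bytes in

# Reading from here only sees what is left after the first value.
rest = Vector{Float32}(undef, 2)
read!(io, rest)

# Rewind to the start to read the whole file again.
seekstart(io)
all_values = Vector{Float32}(undef, 3)
read!(io, all_values)

close(io)
rm(path)
```

Calling seekstart (or reopening the file) before the bulk read avoids the shifted values.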

Maje answered 12/8, 2014 at 0:20 Comment(0)

I'm not sure of the best way to read it in as Float32 directly, but given an array of 4*n Uint8s, I'd turn it into an array of n Float32s using reinterpret:

raw = rand(Uint8, 4*10)  # i.e. a vector of Uint8 aka bytes
floats = reinterpret(Float32, raw)  # now a vector of 10 Float32s

With output:

julia> raw = rand(Uint8, 4*2)
8-element Array{Uint8,1}:
 0xc8
 0xa3
 0xac
 0x12
 0xcd
 0xa2
 0xd3
 0x51

julia> floats = reinterpret(Float32, raw)
2-element Array{Float32,1}:
 1.08951e-27
 1.13621e11
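
In current Julia versions (1.x) the type is spelled UInt8, and reinterpret returns a lazy view over the original bytes rather than a new Array. A small sketch of the same idea under that assumption, using fixed values instead of random bytes so the result is predictable; collect is one way to copy the view into an ordinary Vector{Float32}:

```julia
# Start from known Float32 values and obtain their raw bytes (illustrative data).
original = Float32[1.0, 2.5]
raw = collect(reinterpret(UInt8, original))   # Vector{UInt8} of length 8

# Reinterpret the bytes as Float32; in Julia 1.x this is a view onto `raw`.
floats_view = reinterpret(Float32, raw)

# Copy the view into a standalone Vector{Float32}.
floats = collect(floats_view)
```

Because floats_view shares memory with raw, later changes to raw show up in the view but not in the collected copy.
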
Kinsman answered 11/8, 2014 at 21:58 Comment(0)
