Opening Binary Files in Fortran: Status, Form, Access

I have been working with Fortran for years, but the file I/O is still hazy to me. My understanding of status, form, access, recl is limited, because I only needed certain use-cases in grad school.
I know that Fortran binary files have extra information at the top of the file that describes the size of the file. That has never been an issue for me before, because I have only had to deal with Fortran files in Fortran code, where the extra information is necessary but invisible.

But how do I open a flat, binary file in Fortran?

In the past, I might open a Fortran binary using Fortran by doing something like this:

      open(id,file=file_name,status='old',
     +     form='unformatted',access='direct',recl=4,iostat=ok)
      if (ok .ne. 0) then
        write(1,20) id,ok,file_name
      else
        write(1,21) id,file_name
      end if

But how does this change for a flat, binary file that doesn't have the Fortran header information? More importantly, where is a good link to describe these terms in greater detail: status, form, access, recl?

Roca answered 3/4, 2012 at 17:22 Comment(7)
Well, when I say "C++" binary, I just mean a raw, flat binary file that happened to be written by a C++ program. The file format will be... whatever, a flat binary filled with rows and columns of 1-byte integers. Obviously, my major concern is trying to read a binary file in Fortran if the file doesn't have the usual Fortran header. - Roca
Then whatever language created it is completely irrelevant and distracts from your actual problem. Edited that out, and removed the C++ tag. - Lasser
That seems fair. I put that into the title and text to hopefully emphasize that these were not the usual Fortran binaries I was trying to read. As long as that is still clear, I'm happy. - Roca
There is no extra header information in Fortran direct access binary files! With sequential unformatted files you get record control words, but direct IO unformatted files are as plain as they can be. Writing an 8 byte real to disk will give you a file with exactly these eight bytes. - Brach
@Brach already beat me to it. Only sequential access files give you the extra bits, and not knowing this led me to a great deal of debugging pain. - Witkowski
Thank you, Haraldkl. I just tried that out and you are totally right. That was important information. If that was a post, you'd get a one-up. - Roca
See #8751685 - Dagnah
12

I hate to do this, but I feel that if I were hoping to find answers in this post, the way forward would not be clear. So here is the way forward.

The Short Version

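In Fortran 77/90, the simplest way to open a file relies entirely on the defaults, which give you a formatted (text), sequential-access file; for a standard Fortran binary file you would add form='unformatted':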

OPEN (5, FILE="myFile.txt")

But to open a flat, non-Fortran binary file you would have to write something more like this:

OPEN(5, file="myFile.txt", form='unformatted', access='direct', recl=1)

The difference matters because Fortran-style (sequential, unformatted) binary files have a header and footer around each "record" in the file, 4 bytes each with most current compilers, although the size is compiler-dependent. These headers/footers describe the size of the data contained in the record. (Each unformatted sequential WRITE statement produces one record.) Direct-access files have no such headers or footers.
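As a rough illustration of the direct-access approach, here is a minimal sketch that writes a small flat file of 1-byte integers and reads it back one row per record. The file name flat_demo.bin, the unit number 10, and the 2 x 4 grid are made up for the example. It assumes that selected_int_kind(2) gives a 1-byte integer and that recl counts bytes (the gfortran default); on compilers that count 4-byte words, the program still runs but the file will not be a flat 8-byte file.

```fortran
program read_flat_direct
  implicit none
  ! Assumption: this kind is a 1-byte integer (true on common compilers).
  integer, parameter :: i1 = selected_int_kind(2)
  ! Hypothetical layout: 2 rows of 4 one-byte integers each.
  integer, parameter :: ncols = 4, nrows = 2
  integer(i1) :: grid(ncols, nrows), back(ncols, nrows)
  integer :: u, r, c, ios

  u = 10   ! avoid units 5 and 6, which are usually stdin/stdout

  ! Fill the demo data; Fortran is column-major, so grid(:, r) is one row
  ! of the file stored contiguously in memory.
  do r = 1, nrows
     do c = 1, ncols
        grid(c, r) = int(10*r + c, i1)
     end do
  end do

  ! Create the flat file: one record per row, no record markers.
  ! Assumption: recl is measured in bytes.
  open(u, file='flat_demo.bin', status='replace', form='unformatted', &
       access='direct', recl=ncols, iostat=ios)
  if (ios /= 0) stop 'could not create flat_demo.bin'
  do r = 1, nrows
     write(u, rec=r) grid(:, r)
  end do
  close(u)

  ! Read it back. Direct access allows any record order, so the rows are
  ! read last to first here just to show that.
  open(u, file='flat_demo.bin', status='old', form='unformatted', &
       access='direct', recl=ncols, iostat=ios)
  if (ios /= 0) stop 'could not open flat_demo.bin'
  do r = nrows, 1, -1
     read(u, rec=r) back(:, r)
  end do
  close(u)

  do r = 1, nrows
     print *, back(:, r)
  end do
end program read_flat_direct
```

Choosing one row per record keeps the number of read statements small; with recl=1 you would instead issue one read per byte.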

The Long Version

Fortran assumes a lot of default open arguments. In fact, our original example can be written in the following verbose form to show all the defaults that were assumed.

C     Short form, relying on the defaults:
      OPEN (5, FILE="myFile.txt")
C     Equivalent verbose form:
      OPEN (5, FILE="myFile.txt", FORM="FORMATTED",
     +   ACCESS="SEQUENTIAL", STATUS="UNKNOWN")

Let us look at each argument:

  • FORM defines if a file consists of text (form='formatted') or binary data (form='unformatted').

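  • ACCESS defines if you are reading data from the file in order (access='sequential') or in any order you want, by record number (access='direct').

  • RECL defines the length of each record and is required for direct access. The unit is compiler-dependent: many compilers count bytes, but some count 4-byte words for unformatted files. For instance, on a compiler that counts bytes, recl=1 just says that the record lengths are 1 byte each; perhaps they are 1-byte integers.

  • STATUS defines whether the file is expected to exist already. The STATUS="UNKNOWN" argument (the default) means that the file might not exist yet; the exact behavior is compiler-dependent, but typically an existing file is opened and a missing one is created. STATUS="OLD" requires that the file already exist, so the open fails if it is missing; use it when you intend to read an existing file. STATUS="NEW" requires that the file not exist yet, so use it if you want to protect against the possibility of writing over an old file.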
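The STATUS and RECL behavior described above can be checked with a small sketch like the following. The file name status_demo.bin and the unit number 10 are made up for the example. It uses inquire(iolength=...) and status='replace', both Fortran 90 features, so it assumes a Fortran 90 or later compiler.

```fortran
program status_and_recl
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  character(len=*), parameter :: fname = 'status_demo.bin'  ! hypothetical name
  real(dp) :: x
  integer :: u, ios, rl

  u = 10
  x = 0.0_dp

  ! Ask the compiler how long a record holding one double must be,
  ! in whatever units this compiler uses for recl.
  inquire(iolength=rl) x
  print *, 'recl for one double, in compiler units:', rl

  ! Make sure the file does not exist: create or replace it, then
  ! delete it when closing.
  open(u, file=fname, status='replace', form='unformatted', &
       access='direct', recl=rl)
  close(u, status='delete')

  ! status='old' on a missing file: the open should fail (iostat /= 0).
  open(u, file=fname, status='old', form='unformatted', &
       access='direct', recl=rl, iostat=ios)
  print *, 'status=old on a missing file,   iostat =', ios
  if (ios == 0) close(u)

  ! status='new' on a missing file: the open should succeed (iostat == 0).
  open(u, file=fname, status='new', form='unformatted', &
       access='direct', recl=rl, iostat=ios)
  print *, 'status=new on a missing file,   iostat =', ios
  if (ios == 0) close(u)

  ! status='new' on an existing file: the open should fail, which is
  ! what protects the existing file from being overwritten.
  open(u, file=fname, status='new', form='unformatted', &
       access='direct', recl=rl, iostat=ios)
  print *, 'status=new on an existing file, iostat =', ios
  if (ios == 0) close(u)

  ! Clean up the demo file.
  open(u, file=fname, status='old', form='unformatted', &
       access='direct', recl=rl, iostat=ios)
  if (ios == 0) close(u, status='delete')
end program status_and_recl
```

The nonzero iostat values themselves differ between compilers, so only the zero/nonzero distinction is meaningful.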

For More Information:

These open statements also have an impact on the read/write/close statements that will follow. In my original post, I needed to know that a file opened for direct access must be read and written with direct-access statements, that is, with a record number, as in read(5, rec=1). (Such a file contains no Fortran headers/footers.) However, Fortran's default is sequential access, and sequential unformatted files do include the Fortran headers and footers.
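One way to see the headers and footers for yourself is to write the same value to a sequential file and to a direct-access file and compare the file sizes. The sketch below does that; the file names and the unit number are made up for the example, and inquire(size=...) is a Fortran 2003 feature, so a reasonably modern compiler is assumed. With a compiler that uses 4-byte record markers, one would typically expect the sequential file to be 8 bytes larger than the direct-access one.

```fortran
program record_markers
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp) :: x
  integer :: u, rl, size_seq, size_dir

  u = 10
  x = 3.14_dp

  ! Sequential unformatted: the compiler wraps the record in markers.
  open(u, file='seq_demo.bin', status='replace', form='unformatted', &
       access='sequential')
  write(u) x
  close(u)

  ! Direct unformatted: only the data itself is written.
  inquire(iolength=rl) x          ! record length in compiler units
  open(u, file='dir_demo.bin', status='replace', form='unformatted', &
       access='direct', recl=rl)
  write(u, rec=1) x
  close(u)

  ! Compare the sizes on disk (Fortran 2003 inquire specifier).
  inquire(file='seq_demo.bin', size=size_seq)
  inquire(file='dir_demo.bin', size=size_dir)
  print *, 'sequential unformatted file size:', size_seq
  print *, 'direct unformatted file size:    ', size_dir
end program record_markers
```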

For more information on open statements in Fortran 77/90, there are good resources online:

A nice page by Lin Jinsen of Bishop University (thank you so much).

Slightly more official documentation by IBM for its compilers.

Roca answered 4/4, 2012 at 19:34 Comment(1)
This is an old question, but expanding the answer to include an example of how to read a record from the "non-Fortran binary file" would be useful. Particularly one involving reading a derived type, which I guess is a common case when dealing with binary files. Another question is whether an entire array of derived types can be read at once (i.e., I read the whole file into an array at once). - Bookmaker
5

One caveat is that the unit of the record length given in recl is compiler-dependent. With Intel compilers it defaults to 4-byte words for unformatted records (use the byterecl option, e.g. -assume byterecl, to count bytes instead), so you may have to specify a compiler option or use recl=1 for a 4-byte record.

As your code stands, using unformatted and direct, all you need to do to ensure you read the data properly is to choose an appropriate record length. However, Fortran compilers do not always play nicely with unformatted binary files, and I would suggest adopting HDF5 going forward.

If available, your compiler may allow access='stream' (standard since Fortran 2003):

open (id, file=file_name, status='old', form='unformatted' &
        , access='stream', iostat=ios)
! read (id, pos=1) someValue
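
For a fuller picture, here is a minimal sketch that reads an entire flat file into an array in one read using stream access. The file name stream_demo.bin, the unit number 10, and the sample data are made up for the example. It assumes a compiler with Fortran 2003 stream I/O and inquire(size=...), and that selected_int_kind(2) gives a 1-byte integer, so that the file size in bytes equals the number of elements.

```fortran
program read_stream_whole
  implicit none
  ! Assumption: this kind is a 1-byte integer (true on common compilers).
  integer, parameter :: i1 = selected_int_kind(2)
  integer(i1), allocatable :: buf(:)
  integer(i1) :: sample(5)
  integer :: u, ios, nbytes

  u = 10
  sample = (/ 10_i1, 20_i1, 30_i1, 40_i1, 50_i1 /)

  ! Create a small flat file; stream access writes no record markers.
  open(u, file='stream_demo.bin', status='replace', form='unformatted', &
       access='stream', iostat=ios)
  if (ios /= 0) stop 'could not create stream_demo.bin'
  write(u) sample
  close(u)

  ! Ask for the file size so the buffer can be allocated to fit.
  inquire(file='stream_demo.bin', size=nbytes)
  if (nbytes < 0) stop 'file size could not be determined'
  allocate(buf(nbytes))

  ! Read the whole file in one statement, starting at byte position 1.
  open(u, file='stream_demo.bin', status='old', form='unformatted', &
       access='stream', iostat=ios)
  if (ios /= 0) stop 'could not open stream_demo.bin'
  read(u, pos=1) buf
  close(u)

  print *, 'bytes read:', nbytes
  print *, buf
end program read_stream_whole
```

Unlike direct access, no recl is needed, so the question of whether record lengths are counted in bytes or words does not arise.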
Revulsive answered 3/4, 2012 at 17:41 Comment(1)
Thanks for specifying that recl defaults to 4-byte words for Intel compilers. Strangely, I could not find it mentioned in the Intel documentation. - Trinitarianism
1

You can tell open to use the new Stream IO mode in Fortran 2003 with access='stream'.

Dioptrics answered 3/4, 2012 at 17:38 Comment(9)
Ah, that would be awesome, yes. But I am stuck using Fortran 77. - Roca
How can you be stuck with Fortran 77? There is hardly any compiler around anymore that can't compile F90. - Brach
@Brach - There are a lot of ignorant users, however, who refuse to learn the newer revisions and insist on F77, however silly that may sound. Why not Fortran IV, I always ask? - Sundial
Well, yes, I am sure I could compile using Fortran 90, but we have maybe 100,000 lines of code compiled with Fortran 77 right now and no Fortran 90. I think it might bother people a little. BUT, I can ask around. I can ask, but that may turn out to be a decision that's above my level, so to speak. - Roca
Okay, preliminary word around the lab is that if I give concrete evidence that there are no backward compatibility issues, I can write any new files in Fortran 90. (Though all existing files will be kept in Fortran 77.) So F90 is now a possibility. - Roca
You could always compile the existing code using F77 into a static library, then link that into an F90 project. - Burra
There is practically no backwards compatibility problem. By using the stream keyword in the open statement, you do not even have to use free-form source files and can stick to fixed form. The only problem is that F77 compilers will not be able to handle the open statement. However, I do not see on which machine this might cause an issue; F90 is now quite old, and compilers had time to adapt ;) If you need evidence, point people to the standard. I think the only removed feature in F90 was the H-Format descriptor or so. Anyway, compilers are likely to still support even that. - Brach
In F2003 computed goto and arithmetic if got deleted, I think. For an overview of which compilers support which features, have a look at fortranwiki.org/fortran/show/Fortran+2003+status. There are indeed three compilers indicated which do not support stream IO: Absoft, HP and PathScale... - Brach
@Brach Thanks again! I now think it's possible to start moving to more modern versions of the language (relatively speaking), and will be able to convince folks of that. It's just hard, I think, for professors who have been writing in Fortran 77 for decades to want to learn more (it's not their field, after all). - Roca
1

If you cannot use stream access, you have to use direct access. See links in answer to this question.

Dilapidate answered 3/4, 2012 at 19:53 Comment(0)
