C File descriptor duplication without sharing offset or flags

Asked 23/2, 2017 at 20:53 Answered 27/2, 2017 at 21:26

I need to concurrently read from a file at different offsets using C. dup unfortunately creates a file descriptor that shares its offset and flags with the original.

Is there a function like dup that does not share the offset and flags?

EDIT I only have access to the file pointer FILE* fp; I do not have the file path

EDIT This program is compiled for windows in addition to mac and many flavors of linux

SOLUTION We can use pread on POSIX systems, and I wrote a pread function for Windows, which solves this problem: https://github.com/Storj/libstorj/blob/master/src/utils.c#L227

Dragonet answered 23/2, 2017 at 20:53 Comment(2)

open() the file twice? – Magnetochemistry 23/2, 2017 at 20:54

I only have access to a file pointer. – Dragonet 23/2, 2017 at 20:56

I was able to use pread and pwrite on POSIX systems, and on Windows I wrapped ReadFile/WriteFile into pread and pwrite functions:

#ifdef _WIN32
#include <windows.h>
#include <io.h>         // _get_osfhandle
#include <errno.h>
#include <stdint.h>
#include <string.h>     // memset
#include <sys/types.h>  // ssize_t (MinGW; MSVC needs its own typedef)

// Positional read built on ReadFile. Unlike POSIX pread, this may still move
// the handle's file pointer on handles opened without FILE_FLAG_OVERLAPPED.
ssize_t pread(int fd, void *buf, size_t count, uint64_t offset)
{
    DWORD read_bytes = 0;

    OVERLAPPED overlapped;
    memset(&overlapped, 0, sizeof(OVERLAPPED));

    // Split the 64-bit offset into the two 32-bit OVERLAPPED fields
    overlapped.OffsetHigh = (DWORD)(offset >> 32);
    overlapped.Offset = (DWORD)(offset & 0xFFFFFFFFu);

    HANDLE file = (HANDLE)_get_osfhandle(fd);
    SetLastError(0);
    // ReadFile takes a DWORD count, so a single call reads less than 4 GiB
    BOOL ok = ReadFile(file, buf, (DWORD)count, &read_bytes, &overlapped);

    // When an OVERLAPPED offset is supplied, ReadFile can report end of file
    // as a failure with ERROR_HANDLE_EOF; treat that as a short read
    if (!ok && GetLastError() != ERROR_HANDLE_EOF) {
        errno = GetLastError();  // note: a Win32 error code, not a C errno value
        return -1;
    }

    return read_bytes;
}

// Positional write built on WriteFile; same caveats as pread above.
ssize_t pwrite(int fd, const void *buf, size_t count, uint64_t offset)
{
    DWORD written_bytes = 0;

    OVERLAPPED overlapped;
    memset(&overlapped, 0, sizeof(OVERLAPPED));

    overlapped.OffsetHigh = (DWORD)(offset >> 32);
    overlapped.Offset = (DWORD)(offset & 0xFFFFFFFFu);

    HANDLE file = (HANDLE)_get_osfhandle(fd);
    SetLastError(0);
    BOOL ok = WriteFile(file, buf, (DWORD)count, &written_bytes, &overlapped);
    if (!ok) {
        errno = GetLastError();  // note: a Win32 error code, not a C errno value
        return -1;
    }

    return written_bytes;
}
#endif

Dragonet answered 27/2, 2017 at 21:26 Comment(0)

On Linux, you can recover the filename from /proc/self/fd/N, where N is the integral value of the file descriptor:

sprintf( linkname, "/proc/self/fd/%d", fd );

Then use readlink() on the resulting link name.

If the file has been renamed or deleted, you may be out of luck.
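For reference, here is a minimal sketch of the two steps above combined, assuming Linux with /proc mounted. The helper name fd_to_path is made up for illustration. Note that readlink() does not null-terminate its result, so the sketch does that itself and treats a full buffer as truncation.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper (not from the original answer): stores the path that
 * /proc/self/fd/<fd> points to in `out`.
 * Returns 0 on success, -1 on error or if the path did not fit. */
int fd_to_path(int fd, char *out, size_t outsize)
{
    char linkname[64];

    if (outsize < 2)
        return -1;

    snprintf(linkname, sizeof linkname, "/proc/self/fd/%d", fd);

    /* readlink() returns the number of bytes stored and adds no '\0' */
    ssize_t len = readlink(linkname, out, outsize - 1);
    if (len < 0 || (size_t)len >= outsize - 1)
        return -1;              /* error, or result possibly truncated */

    out[len] = '\0';
    return 0;
}
```

Starting from a FILE *fp, you would call it as fd_to_path(fileno(fp), path, sizeof path) and then open() the returned path to get an independent descriptor. For a deleted file the link target typically ends in " (deleted)", so the result is not always usable as a path.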

But why do you need another file descriptor? You can use pread() and/or pwrite() on the original file descriptor to read from or write to the file at an explicit offset without affecting the current offset. One caveat: on Linux, pwrite() on a file opened in append mode (O_APPEND) is buggy. POSIX states that pwrite() writes at the offset specified in the call even in append mode, but the Linux implementation ignores the offset and appends the data to the end of the file. See the BUGS section of the Linux pwrite() man page.
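As a rough sketch of the pread() approach starting from a FILE *, assuming a POSIX system: the helper name read_at is made up for illustration. It loops because pread() may return fewer bytes than requested.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper (not from the original answer): reads up to `count`
 * bytes starting at absolute `offset` from the file behind `fp`.
 * Returns the number of bytes read (less than `count` only at end of file)
 * or -1 on error. The descriptor's current offset is left untouched. */
ssize_t read_at(FILE *fp, void *buf, size_t count, off_t offset)
{
    int fd = fileno(fp);
    size_t done = 0;

    while (done < count) {
        ssize_t n = pread(fd, (char *)buf + done, count - done,
                          offset + (off_t)done);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal, retry */
            return -1;
        }
        if (n == 0)
            break;              /* end of file */
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```

Several threads can call read_at() on the same FILE * with different offsets, since no call depends on or changes the shared offset. Keep in mind that pread() goes straight to the descriptor and bypasses the stdio buffer, so data written through fp but not yet flushed will not be visible.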

Sadi answered 23/2, 2017 at 21:25 Comment(2)

Is there a windows equivalent for this too? I compile the program for many different OS's – Dragonet 23/2, 2017 at 21:27

@AlexanderLeitner This might be relevant (I don't have enough in-depth Windows programming experience to be sure): GetFinalPathNameByHandle – Sadi 23/2, 2017 at 21:36

No, neither C nor POSIX (since you mention dup()) has a function for opening a new, independent file handle based on an existing file handle. As you observed, you can dup() a file descriptor, but the result refers to the same underlying open file description.
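To make the shared open file description concrete, here is a small sketch, assuming a POSIX system. The helper name offset_after_read_on_dup is made up for illustration: it reads through a dup()'d descriptor and then reports the offset as seen through the original one.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper (illustration only): reads up to `n` bytes through a
 * duplicate of `fd`, then returns the current offset of the ORIGINAL `fd`.
 * Returns -1 on failure. */
off_t offset_after_read_on_dup(int fd, size_t n)
{
    char buf[64];
    int copy = dup(fd);
    if (copy < 0)
        return -1;

    if (n > sizeof buf)
        n = sizeof buf;

    ssize_t got = read(copy, buf, n);   /* advances the shared offset */
    close(copy);                        /* the original fd stays open */
    if (got < 0)
        return -1;

    return lseek(fd, 0, SEEK_CUR);      /* query offset via the original */
}
```

With the original descriptor positioned at offset 0 in a file of at least 4 bytes, offset_after_read_on_dup(fd, 4) returns 4: the read on the duplicate moved the offset of the original as well.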

To get an independent handle, you need to open() or fopen() the same path (which is possible only if the FILE refers to an object accessible through the file system). If you don't know what path that is, or if there isn't any in the first place, then you'll need a different approach.

Some alternatives to consider:

buffer some or all of the file contents in memory, and read as needed from the buffer to serve your needs for independent file offsets;
build an internal equivalent of the tee command; this will probably require a second thread, and you'll probably not be able to read one file too far ahead of the other, or to seek in either one;
copy the file contents to a temp file with a known name, and open that as many times as you want;
if the FILE corresponds to a regular file, map it into memory and access its contents there. The POSIX function fmemopen() could be useful in this case to adapt the memory mapping to your existing stream-based usage.
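The last alternative could look roughly like this, assuming a POSIX system and a non-empty regular file. The helper names map_stream and open_view are made up for illustration. Each stream returned by open_view() keeps its own position over the same mapping.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>

/* Hypothetical helper: maps the whole regular file behind `fp` read-only.
 * Returns the mapping and stores its length in *len_out, or NULL on failure
 * (not a regular file, empty file, or mmap error). */
void *map_stream(FILE *fp, size_t *len_out)
{
    struct stat st;
    int fd = fileno(fp);

    if (fstat(fd, &st) != 0 || !S_ISREG(st.st_mode) || st.st_size == 0)
        return NULL;

    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED)
        return NULL;

    *len_out = (size_t)st.st_size;
    return p;
}

/* Hypothetical helper: wraps the mapping in a read-only stream that has an
 * independent file position. Close it with fclose(). */
FILE *open_view(void *base, size_t len)
{
    return fmemopen(base, len, "r");
}
```

Call munmap() on the mapping once all views are closed. For files of many gigabytes this needs enough address space, so it is realistically a 64-bit-only option.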

Krieger answered 23/2, 2017 at 21:18 Comment(1)

A temporary Linux-only solution also seems to be something like sprintf(path, "/proc/self/fd/%d", fileno(state->original_file)); The main reason I am doing this is that I stopped writing any files to disk. These are files of up to many gigabytes of data, so I'll have to look into memory-mapped files, it seems. – Dragonet 23/2, 2017 at 21:23

On Windows (assuming Visual Studio), you can get access to the OS file handle from the stdio FILE handle. From there, reopen it and convert the result back to a new FILE handle.

This is Windows only, but I think Andrew's answer will work for Linux and probably the Mac as well. Unfortunately, there is no portable way to make this work on all systems.

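#include <Windows.h>
#include <fcntl.h>
#include <io.h>
#include <stdint.h>
#include <stdio.h>

FILE *jreopen(FILE* f)
{
    int n = _fileno(f);
    HANDLE h = (HANDLE)_get_osfhandle(n);
    if (h == INVALID_HANDLE_VALUE)
        return NULL;

    // ReOpenFile gives a new handle with its own file pointer
    HANDLE h2 = ReOpenFile(h, GENERIC_READ, FILE_SHARE_READ, 0);
    if (h2 == INVALID_HANDLE_VALUE)
        return NULL;

    int n2 = _open_osfhandle((intptr_t)h2, _O_RDONLY);
    if (n2 == -1) {
        CloseHandle(h2);
        return NULL;
    }

    FILE* g = _fdopen(n2, "r");
    if (g == NULL)
        _close(n2);  // also closes the underlying handle h2
    return g;
}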

Jurado answered 23/2, 2017 at 23:25 Comment(0)
