How to seek forward in portable C when reading from a pipe
C

2

10

Since fseek() does not work on pipes, what methods exist for simulating a forward seek? The naive approach is to call fread() and throw away the contents read into the memory buffer. For huge seeks, to avoid huge buffers, you would reuse the same buffer over and over, with the final read filling only part of it.

But is this the only approach? Is there another way that avoids the buffer and the potentially multiple reads?

Cully answered 27/4, 2011 at 15:22 Comment(0)
C
5

Yes, it is the only way. I would use a buffer of roughly 1k-8k. With a much smaller buffer, the syscall overhead of read will come into play; with a much larger one, you'll evict useful data from the cache.
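
For illustration, the read-and-discard approach could be sketched roughly as follows. The helper name `skip_forward` and the 4096-byte scratch size are assumptions chosen for this example, not anything prescribed above; only standard `<stdio.h>` calls are used, so it should work on any `FILE *`, including one attached to a pipe.

```c
#include <stdio.h>

/* Hypothetical helper: discard up to `count` bytes from `stream`.
 * Returns the number of bytes actually skipped, which is less than
 * `count` only if end-of-file or a read error occurs first. */
size_t skip_forward(FILE *stream, size_t count)
{
    char scratch[4096];   /* assumed size, inside the 1k-8k range */
    size_t skipped = 0;

    while (skipped < count) {
        /* Ask for a full buffer, or only the remainder on the last pass. */
        size_t want = count - skipped;
        if (want > sizeof scratch)
            want = sizeof scratch;

        size_t got = fread(scratch, 1, want, stream);
        skipped += got;
        if (got < want)   /* short read: EOF or error, stop here */
            break;
    }
    return skipped;
}
```

A caller would compare the return value against the requested count and, on a short result, use feof() or ferror() to tell end-of-input from a read error.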

Catechist answered 27/4, 2011 at 15:38 Comment(0)
E
6

Seeking doesn't make sense on pipes because the input is produced dynamically (not stored on disk). The lseek system call is not implemented for pipes; it fails with ESPIPE.

Also keep in mind that a pipe is essentially a producer-consumer buffer of limited, fixed size. When it fills up, the producer is suspended until the consumer reads the oldest data.
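
As a rough sketch of how a program might detect this case before deciding how to skip: the helper below probes a descriptor with a zero-offset lseek. Note that this relies on POSIX (`<unistd.h>`), not ISO C alone, and the name `fd_is_seekable` is made up for this example.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if `fd` accepts lseek, 0 if lseek
 * fails with ESPIPE (pipe, FIFO, or socket), and -1 on any other
 * error such as an invalid descriptor. */
int fd_is_seekable(int fd)
{
    /* A zero-offset relative seek does not move the file position. */
    if (lseek(fd, 0, SEEK_CUR) != (off_t)-1)
        return 1;
    return errno == ESPIPE ? 0 : -1;
}
```

When the result is 0, the program would fall back to reading and discarding data; when it is 1, a real seek is available and no data needs to be transferred.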

Extension answered 27/4, 2011 at 15:26 Comment(6)
@hippietrail: if there are concerns about a buffer and multiple read() calls to skip the data, perhaps it is better to not use a pipe at all. Have the source write to a disk file, then the sink end of the pipe can use lseek() family calls. - Schechinger
Of course but sometimes the dynamically produced output is in a known format. - Cully
@wallyk: Some reasons I have used pipes in the past include processing XML from huge compressed archives, and processing XML on the fly as it is arriving over the internet. Sometimes what you are looking for requires only a portion of the entire data, sometimes you don't have the disk space to have all such archives lying around uncompressed. - Cully
@hippietrail: here is an attempt to implement seekable pipes in Linux that you might find interesting: lkml.indiana.edu/hypermail/linux/kernel/0411.3/0739.html - Extension
@Blagovest Buyukliev: alas, that thread concludes without a solution. - Schechinger
@hippietrail: Then the only reasonable solution is to re-implement the source end of the pipe yourself in a way that is useful for your purposes. - Schechinger
