Why does this C program generate SIGPIPE later than expected?

Asked 5/5, 2013 at 0:30 Answered 5/5, 2013 at 3:45

When I pipe this program into "head -n 1", it gets SIGPIPE only after a seemingly random delay. I understand why SIGPIPE is expected: we keep feeding "head -n 1" after it already has its first line. But I would expect the signal right after that first line; instead the program gets through a random number of iterations (usually > 20 and < 200) before exiting. Any idea why?

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
  int i;
  char *s = "ABCDEFGHIJKLMNOPQRSTUVWXYZ\n";

  i = 0;
  while (1) {
    fputs(s, stdout);   /* one 27-byte line per iteration */
    fflush(stdout);     /* push the line out to the pipe right away */
    fprintf(stderr, "Iteration %d done\n", i);
    i++;
  }
}

This is not homework, just something in my professor's notes that I do not understand.

Firebox answered 5/5, 2013 at 0:30 Comment(2)

Ehm, const char *s = ... – Untouchable 5/5, 2013 at 4:44

It takes time to stuff the tobacco. – Iamb 5/5, 2013 at 5:3

It's the vagaries of scheduling.

Your producer, let's call it alphabeta, is able to run for some amount of time before head gets to read and exit, which is what breaks the pipe.

That "some amount of time", of course, is variable.

Sometimes alphabeta runs 20 times before head can read stdin and exit. Sometimes 200 times. On my system, sometimes 300 or 1000 or 2000 times. Indeed, it can theoretically loop up to the capacity of the pipe connecting producer and consumer.
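If you want to see that upper bound on your own machine, here is a rough sketch that fills a pipe nobody is reading and counts the bytes it accepts. The function name pipe_capacity is mine, not anything standard, and it assumes a POSIX system where one-byte writes to a pipe are packed together.

```c
#define _POSIX_C_SOURCE 200809L  /* ask for the POSIX declarations used below */
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: returns how many bytes a fresh pipe accepts
 * before a write would block, or -1 on a setup error. */
long pipe_capacity(void)
{
    int fds[2];
    int flags;
    long total = 0;
    char byte = 'x';

    if (pipe(fds) == -1)
        return -1;

    /* Make the write end non-blocking so a full pipe reports EAGAIN
     * instead of putting us to sleep the way alphabeta is. */
    flags = fcntl(fds[1], F_GETFL);
    if (flags == -1 || fcntl(fds[1], F_SETFL, flags | O_NONBLOCK) == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    for (;;) {
        ssize_t n = write(fds[1], &byte, 1);
        if (n == 1) {
            total++;            /* the pipe took another byte */
            continue;
        }
        if (n == -1 && errno == EINTR)
            continue;           /* interrupted by a signal: retry */
        break;                  /* EAGAIN (pipe full) or another error */
    }

    close(fds[0]);
    close(fds[1]);
    return total;
}
```

Print the result from your own main and divide by 27, the length of one output line, to get the most iterations alphabeta can complete while head is not reading at all. The Linux pipe(7) man page gives a default capacity of 65,536 bytes, which works out to roughly 2427 lines; other systems may report something different.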

For demonstration, let's introduce some delay so we can be reasonably sure that head is stuck in a read() before alphabeta produces a single line of output:

so$ { sleep 5; ./alphabeta; } | head -n 1
ABCDEFGHIJKLMNOPQRSTUVWXYZ
Iteration 0 done

(N.B. it's not guaranteed that alphabeta will only iterate once in the above. However, on an unloaded system, this is more-or-less always going to be the case: head will be ready, and its read/exit will happen more-or-less immediately.)

Watch instead what happens when we artificially delay head:

so$ ./alphabeta | { sleep 2; head -n 1; }
Iteration 0 done
...
Iteration 2415 done    # <--- My system *pauses* here as pipe capacity is reached ...
Iteration 2416 done    # <--- ... then it resumes as head completes its first read()
...
Iteration 2717 done    # <--- pipe capacity reached again; head didn't drain the pipe 
ABCDEFGHIJKLMNOPQRSTUVWXYZ

As an aside, @R.. is quite right in his remarks that SIGPIPE is synchronous. In your case, the first fflush-induced write to a broken pipe (after head has exited) will synchronously generate the signal. This is documented behavior.
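To convince yourself of the synchronous part, a small sketch along these lines should do. It breaks a pipe on purpose by closing the read end, much as head does when it exits, and checks that the handler has already run by the time write() returns. The names on_sigpipe and sigpipe_is_synchronous are made up for this example, and it assumes a single-threaded process that does not have SIGPIPE blocked.

```c
#define _POSIX_C_SOURCE 200809L  /* ask for the POSIX declarations used below */
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <unistd.h>

static volatile sig_atomic_t got_sigpipe = 0;

/* Handler only records that SIGPIPE arrived. */
static void on_sigpipe(int signo)
{
    (void)signo;
    got_sigpipe = 1;
}

/* Hypothetical helper: returns 1 if the handler had run by the time
 * write() returned EPIPE, 0 if not, -1 on a setup error. */
int sigpipe_is_synchronous(void)
{
    int fds[2];
    struct sigaction sa, old;
    ssize_t n;
    int saved_errno;

    sa.sa_handler = on_sigpipe;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGPIPE, &sa, &old) == -1)
        return -1;

    if (pipe(fds) == -1) {
        sigaction(SIGPIPE, &old, NULL);
        return -1;
    }

    close(fds[0]);              /* no reader left: the pipe is now broken */
    got_sigpipe = 0;
    n = write(fds[1], "x", 1);  /* the very first write after the break */
    saved_errno = errno;

    close(fds[1]);
    sigaction(SIGPIPE, &old, NULL);  /* put the previous disposition back */

    return (n == -1 && saved_errno == EPIPE && got_sigpipe) ? 1 : 0;
}
```

So there is no lag between the broken pipe and the signal. The "delay" in the question is entirely the time before head exits, during which the writes are still succeeding.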

Orchidectomy answered 5/5, 2013 at 3:45 Comment(0)

I think it is simply because signals are asynchronous.

Update: As others pointed out they (more precisely many, including SIGPIPE) are not. This was an unthoughtful answer :)

Thanos answered 5/5, 2013 at 0:50 Comment(3)

SIGPIPE is a synchronous signal. – Neoteric 5/5, 2013 at 3:16

@R.. stdio doesn't buffer past an fflush. – Pterous 5/5, 2013 at 3:21

Not this process's stdio. head's. – Neoteric 5/5, 2013 at 3:23

Socket writes are buffered and asynchronous, so you won't generally get an error arising from a specific write until a following read or write.

Briticism answered 5/5, 2013 at 0:55 Comment(1)

I'm not sure why socket writes are material to a question about piped output. – Birnbaum 5/5, 2013 at 1:43

The head command is using stdio to read stdin and thus the first getc does not return until the buffer is full or EOF, whichever happens first.

Neoteric answered 5/5, 2013 at 3:18 Comment(4)

You can verify this with strace. – Neoteric 5/5, 2013 at 3:21

No, it's pipe capacity and scheduling. See my answer. – Orchidectomy 5/5, 2013 at 3:46

As pilcrow said, it's scheduling. The kernel's pipe buffer (typically 64-128 KiB) is not filling up in this instance, since only about 200*27 = 5.4 KB of data is getting written.

Indeed, that's correct. The read in head does not block until it fills the buffer; it returns immediately if there is some data available. – Neoteric 5/5, 2013 at 15:3
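For anyone who wants to check the short-read behavior without strace, here is a minimal sketch. It puts one line into a pipe, keeps the write end open so there is no EOF, and asks read() for far more than is there. The name short_read_demo and the 4096-byte request size are arbitrary choices for the example, not what head actually uses.

```c
#define _POSIX_C_SOURCE 200809L  /* ask for the POSIX declarations used below */
#include <unistd.h>

/* Hypothetical helper: returns the byte count read() reported,
 * or -1 on a setup error. */
long short_read_demo(void)
{
    static const char line[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ\n";
    const size_t len = sizeof line - 1;   /* 27 bytes, without the NUL */
    char buf[4096];
    int fds[2];
    ssize_t n;

    if (pipe(fds) == -1)
        return -1;

    if (write(fds[1], line, len) != (ssize_t)len) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    /* The write end is still open, so this is not EOF. If read() waited
     * for a full buffer it would hang here; it returns what is available. */
    n = read(fds[0], buf, sizeof buf);

    close(fds[0]);
    close(fds[1]);
    return (long)n;
}
```

The call returns 27 rather than hanging, so a reader gets its first line as soon as it has been scheduled and one line is sitting in the pipe.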
