Named Pipes (FIFOs) on Unix with multiple readers

I have two programs, Writer and Reader.

I have a FIFO from Writer to Reader so when I write something to stdin in Writer, it gets printed out to stdout from Reader.

I tried doing this with TWO Readers open, and I got output to stdout from only one of the two Reader programs. Which Reader received the data seemed to be arbitrary from run to run, but once one of them was chosen, all further output was printed by that same Reader.

Does anyone know why this happens?

If I have two WRITER programs, they both write to the same pipe okay.

Sandhog answered 28/10, 2009 at 0:46 Comment(4)
Are you wanting to know why the data isn't "broadcast" to each reader or why the data isn't evenly distributed among each reader? – Glans
I trust Writer is writing to its stdout (not stdin), which is the FIFO; each Reader is presumably reading it from its own stdin, which is the FIFO, and then writing the data to its own stdout. – Spondaic
Jacob – I'm wanting to know why the data isn't going to both Readers, only one. – Sandhog
+1, this is good to have archived on SO. I've been asked the same thing by a few of my co-workers in the last year when Linux became the default development platform. – Unstoppable

The O in FIFO means "out". Once your data is "out", it's gone. :-) So naturally if another process comes along and someone else has already issued a read, the data isn't going to be there twice.
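
To see this in practice, here is a minimal shell sketch (not the asker's Writer and Reader programs) in which two cat readers compete for one FIFO. The temporary directory, the descriptor numbers 3 to 5, and the file names are arbitrary choices for illustration. It also assumes that opening a FIFO read-write does not block, which holds on Linux but is not guaranteed by POSIX. However the bytes end up divided between the readers, the two counts should add up to exactly what was written:

```shell
#!/bin/sh
# Sketch: two readers compete for one FIFO; every byte is delivered once.
dir=$(mktemp -d)
fifo="$dir/fifo"
mkfifo "$fifo"

# Assumption: a read-write open of a FIFO does not block (true on Linux,
# unspecified by POSIX). It gives this shell a write end to use below.
exec 3<>"$fifo"

# Open one independent read end per reader before any data is written.
exec 4<"$fifo"
exec 5<"$fifo"

# Each reader keeps only its own read end and closes the other descriptors.
cat <&4 3>&- 5<&- > "$dir/out1" &
cat <&5 3>&- 4<&- > "$dir/out2" &

# Write six bytes, then close every descriptor held by this shell so the
# readers see end-of-file once the data has been drained.
printf 'a\nb\nc\n' >&3
exec 3>&- 4<&- 5<&-
wait

n1=$(wc -c < "$dir/out1")
n2=$(wc -c < "$dir/out2")
total=$((n1 + n2))
echo "reader 1: $n1 bytes, reader 2: $n2 bytes, total: $total"
rm -rf "$dir"
```

How the six bytes are split depends on scheduling; often one reader gets all of them, which matches what the question describes.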

To accomplish what you suggest, you should look into Unix domain sockets. You can write a server that binds to a filesystem path and writes to each client process. See also socket(), bind(), listen(), accept(), and connect(), all of which you'll want to use with PF_UNIX, AF_UNIX, and struct sockaddr_un.

Nival answered 28/10, 2009 at 0:50 Comment(0)

The Linux tee() system call may suit your needs; see the tee(2) man page.

NOTE: this function is Linux-specific.
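
If the goal is simply to give every reader its own copy, one portable alternative is the tee utility (the command, not the tee() function above) with one FIFO per reader. A rough sketch, assuming two sed readers and the illustrative FIFO names r1 and r2:

```shell
#!/bin/sh
# Sketch: broadcast one stream to two readers, one FIFO per reader.
dir=$(mktemp -d)
mkfifo "$dir/r1" "$dir/r2"

# Each reader tags the lines it receives and saves them to its own file.
sed 's/^/1: /' < "$dir/r1" > "$dir/out1" &
sed 's/^/2: /' < "$dir/r2" > "$dir/out2" &

# tee copies its input to r1 and, through its stdout, to r2.
printf 'alpha\nbeta\n' | tee "$dir/r1" > "$dir/r2"

# Both readers see end-of-file when tee exits and closes the FIFOs.
wait

out1=$(cat "$dir/out1")
out2=$(cat "$dir/out2")
printf '%s\n%s\n' "$out1" "$out2"
rm -rf "$dir"
```

Because each reader has a private FIFO, nothing is competed for and both see the full stream.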

Bushing answered 28/10, 2009 at 6:23 Comment(4)
tee is not Linux-specific (opengroup.org/onlinepubs/9699919799/utilities/tee.html), but I'm not sure it will help with the use case in the original question. – Howling
I was referring to the C function tee, not the command/utility tee. But yes, I'm not sure whether this function has been implemented on other platforms or libraries. – Bushing
tee is great. The hard part is, after you've teed your data stream to 30 processes so they can each process 1/30th of the data, how do you assemble the results? For that you need 1 reader and many writers. The trick is to make 30 FIFOs and have the reader "select" on them, reading in whole outputs. Hadoop is supposed to do this for you, but it's a terrible, bloated framework. Tools like 0mq do lightweight, clean IPC that works with most languages. – Thayer
If you use tee on the readers, doesn't at least the last reader need to consume the data or else the data never gets consumed? – Ekg

I don't think that the behaviour you observed is more than coincidental. Consider this trace, which uses 'sed' as the two readers and a loop as the writer:

Osiris JL: mkdir fifo
Osiris JL: cd fifo
Osiris JL: mkfifo fifo
Osiris JL: sed 's/^/1: /' < fifo &
[1] 4235
Osiris JL: sed 's/^/2: /' < fifo &
[2] 4237
Osiris JL: while read line ; do echo $line; done > fifo < /etc/passwd
1: ##
1: # User Database
1: #
1: # Note that this file is consulted directly only when the system is running
1: # in single-user mode. At other times this information is provided by
1: # Open Directory.
1: #
1: # This file will not be consulted for authentication unless the BSD local node
1: # is enabled via /Applications/Utilities/Directory Utility.app
1: #
1: # See the DirectoryService(8) man page for additional information about
1: # Open Directory.
1: ##
1: nobody:*:-2:-2:Unprivileged User:/var/empty:/usr/bin/false
1: root:*:0:0:System Administrator:/var/root:/bin/sh
1: daemon:*:1:1:System Services:/var/root:/usr/bin/false
1: _uucp:*:4:4:Unix to Unix Copy Protocol:/var/spool/uucp:/usr/sbin/uucico
1: _lp:*:26:26:Printing Services:/var/spool/cups:/usr/bin/false
2: _postfix:*:27:27:Postfix Mail Server:/var/spool/postfix:/usr/bin/false
2: _mcxalr:*:54:54:MCX AppLaunch:/var/empty:/usr/bin/false
2: _pcastagent:*:55:55:Podcast Producer Agent:/var/pcast/agent:/usr/bin/false
2: _pcastserver:*:56:56:Podcast Producer Server:/var/pcast/server:/usr/bin/false
2: _serialnumberd:*:58:58:Serial Number Daemon:/var/empty:/usr/bin/false
2: _devdocs:*:59:59:Developer Documentation:/var/empty:/usr/bin/false
2: _sandbox:*:60:60:Seatbelt:/var/empty:/usr/bin/false
2: _mdnsresponder:*:65:65:mDNSResponder:/var/empty:/usr/bin/false
2: _ard:*:67:67:Apple Remote Desktop:/var/empty:/usr/bin/false
2: _www:*:70:70:World Wide Web Server:/Library/WebServer:/usr/bin/false
2: _eppc:*:71:71:Apple Events User:/var/empty:/usr/bin/false
2: _cvs:*:72:72:CVS Server:/var/empty:/usr/bin/false
2: _svn:*:73:73:SVN Server:/var/empty:/usr/bin/false
2: _mysql:*:74:74:MySQL Server:/var/empty:/usr/bin/false
2: _sshd:*:75:75:sshd Privilege separation:/var/empty:/usr/bin/false
2: _qtss:*:76:76:QuickTime Streaming Server:/var/empty:/usr/bin/false
2: _cyrus:*:77:6:Cyrus Administrator:/var/imap:/usr/bin/false
2: _mailman:*:78:78:Mailman List Server:/var/empty:/usr/bin/false
2: _appserver:*:79:79:Application Server:/var/empty:/usr/bin/false
2: _clamav:*:82:82:ClamAV Daemon:/var/virusmails:/usr/bin/false
2: _amavisd:*:83:83:AMaViS Daemon:/var/virusmails:/usr/bin/false
2: _jabber:*:84:84:Jabber XMPP Server:/var/empty:/usr/bin/false
2: _xgridcontroller:*:85:85:Xgrid Controller:/var/xgrid/controller:/usr/bin/false
2: _xgridagent:*:86:86:Xgrid Agent:/var/xgrid/agent:/usr/bin/false
2: _appowner:*:87:87:Application Owner:/var/empty:/usr/bin/false
2: _windowserver:*:88:88:WindowServer:/var/empty:/usr/bin/false
2: _spotlight:*:89:89:Spotlight:/var/empty:/usr/bin/false
2: _tokend:*:91:91:Token Daemon:/var/empty:/usr/bin/false
2: _securityagent:*:92:92:SecurityAgent:/var/empty:/usr/bin/false
2: _calendar:*:93:93:Calendar:/var/empty:/usr/bin/false
2: _teamsserver:*:94:94:TeamsServer:/var/teamsserver:/usr/bin/false
2: _update_sharing:*:95:-2:Update Sharing:/var/empty:/usr/bin/false
2: _installer:*:96:-2:Installer:/var/empty:/usr/bin/false
2: _atsserver:*:97:97:ATS Server:/var/empty:/usr/bin/false
2: _unknown:*:99:99:Unknown User:/var/empty:/usr/bin/false
Osiris JL:  jobs
[1]-  Running                 sed 's/^/1: /' < fifo &
[2]+  Done                    sed 's/^/2: /' < fifo
Osiris JL: echo > fifo
1: 
Osiris JL: jobs
[1]+  Done                    sed 's/^/1: /' < fifo
Osiris JL: 

As you can see, both readers got to read some of the data. Which reader was scheduled at any time depended on the whim of the o/s. Note that I carefully used an echo to print each line of the file; those were atomic writes that were read atomically.

Had I used a Perl script with, for example, a delay after reading and echoing a line, then I might well have seen more determinate behaviour with (generally) two lines from Reader 1 for every 1 line from Reader 2.

perl -e 'while(<>){ print "1: $_"; sleep 1; }' < fifo &
perl -e 'while(<>){ print "2: $_"; sleep 2; }' < fifo &

Experimentation done on Mac OS X 10.5.8 (Leopard), but the behaviour is likely to be similar in most places.

Spondaic answered 28/10, 2009 at 2:0 Comment(3)
Oh, and for what it is worth, when I tried the Perl variant with 'sleep 1' in both reader scripts, everything got processed by reader 2 on one of the runs. I put the asymmetric sleeps in place to force the system's hand. – Spondaic
That's interesting... it seems like after one reader reads from a FIFO, the data is erased and so the other reader can't read the same data. – Sandhog
Of course - once the data is read, it is consumed and gone. That's the point. Same with terminals - if several processes are competing for the data, one gets it and the other doesn't. For some confusion, try doing 'more somebigfile | more'. – Spondaic

I would like to add to the above explanations that writes (and presumably reads, though I couldn't confirm this from the manpages) to pipes are atomic up to a certain size (PIPE_BUF, which is 4 KiB on Linux). So suppose we start with an empty pipe, and the writer writes <= 4 KiB of data to the pipe. Here's what I think happens:

a) The writer writes all data in one go. While this is happening no other process has a chance to read from (or write to) the pipe.

b) One of the readers is scheduled to do its I/O.

c) The chosen reader reads all the data from the pipe in one go, and at some later time prints them to its stdout.

I think this could explain why you are seeing output from only one of the readers. Try writing in smaller chunks, and perhaps sleeping after each write.

Of course, others have answered why each datum is read by only one process.
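
The atomic write limit mentioned above is PIPE_BUF. As a quick check, assuming getconf is available, the value can be queried from the shell. The path argument (here /) is just an example; POSIX requires the limit to be at least 512 bytes, and Linux uses 4096:

```shell
#!/bin/sh
# Query the largest write that is guaranteed to be atomic on a pipe or FIFO.
# getconf needs a path for PIPE_BUF because the limit can vary by filesystem.
pipe_buf=$(getconf PIPE_BUF /)
echo "writes of up to $pipe_buf bytes to a pipe or FIFO are atomic here"
```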

Wrongdoing answered 16/1, 2013 at 11:50 Comment(1)
"The writer writes all data in one go. While this is happening no other process has a chance to read from (or write to) the pipe." Definitely false on Linux. The writer will block until a reader reads from the pipe. So the reads and writes always overlap in time (in a way that does not damage atomicity, however). – Sienese

The sockets solution works, but becomes complicated if the server crashes. To allow any process to be the server, I use record locks at the end of a temporary file that contains location/length/data changes to the given file. I use a temporary named pipe to communicate append requests to whichever process has the write lock at the end of the temporary file.

Jackknife answered 7/6, 2019 at 17:5 Comment(0)
