Is the Unix Philosophy falling out of favor in the Ruby community? [closed]

David Korn, a proponent of the Unix philosophy, chided Perl programmers a few years ago in a Slashdot interview for writing monolithic Perl scripts without making use of the Unix toolkit through pipes, redirection, etc. "Unix is not just an operating system," he said, "it is a way of doing things, and the shell plays a key role by providing the glue that makes it work."

It seems that reminder could apply equally to the Ruby community. Ruby has great features for working together with other Unix tools through popen, STDIN, STDOUT, STDERR, ARGF, etc., yet it seems that increasingly, Rubyists are opting to use Ruby bindings and Ruby libraries and build monolithic Ruby programs.
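
For example, a script written in the classic filter style (a hypothetical sketch, not code from the question) needs nothing beyond ARGF and STDOUT to take its place in a pipeline:

    # filter.rb -- a grep-like filter: reads lines from files named on the
    # command line (or from STDIN) and prints the lines matching a pattern.
    pattern = Regexp.new(ARGV.shift || abort("usage: filter.rb PATTERN [FILE...]"))
    ARGF.each_line do |line|
      print line if line =~ pattern
    end

Invoked as, say, ruby filter.rb 'TODO' *.rb | sort | uniq -c, it composes with the rest of the Unix toolkit like any other filter.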

I understand that there may be performance reasons in certain cases for going monolithic and doing everything in one Ruby process, but surely there are a lot of offline and asynchronous tasks that could be well handled by Ruby programs working together with other small programs each doing one thing well in the Unix fashion, with all the advantages that this approach offers.

Maybe I'm just missing something obvious. Is the Unix Philosophy still as relevant today as it was 10 years ago?

Damaris asked 21/12, 2009 at 3:49 Comment(2)
Why make a library OS-dependent? – Jubal
Sorry to omit that. Here it is: news.slashdot.org/article.pl?sid=01/02/06/… – Damaris

The Unix philosophy of pipes and simple tools is for text. It is still relevant, but perhaps not as relevant as it used to be:

  • We are seeing more tools whose output is not designed to be easily parseable by other programs.

  • We are seeing much more XML, where there is no particular advantage to piping text through filters, and where regular expressions are a risky gamble.

  • We are seeing more interactivity, whereas in Unix pipes information flows in one direction only.

But although the world has changed a little bit, I still agree with Korn's criticism. It is definitely poor design to create large, monolithic programs that cannot interoperate with other programs, no matter what the language. The rules are the same as they have always been:

  • Remember your own program's output may be another program's input.

  • If your program deals in a single kind of data (e.g., performance of code submitted by students, which is what I've been doing for the last week), make sure to use the same format for both input and output of that data.

  • For interoperability with existing Unix tools, inputs and outputs should be ASCII and line-oriented. Many IETF Internet protocols (SMTP, NNTP, HTTP) are sterling examples. (A short sketch follows this list.)

  • Instead of writing a big program, consider writing several small programs connected with existing programs by shell pipelines. For example, a while back the xkcd blog had a scary pipeline for finding anagrams in /usr/share/dict/words.

  • Work up to shell scripts gradually by making your interactive shell one you can also script with. (I use ksh but any POSIX-compatible shell is a reasonable choice.)
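
As a small, hypothetical illustration of the line-oriented rules above (the "name TAB score" record format is invented for the example):

    # normalize.rb -- reads "name<TAB>score" lines and writes the same format
    # back out, so its output can be piped to sort/awk or fed to another
    # instance of itself.
    ARGF.each_line do |line|
      name, score = line.chomp.split("\t")
      next unless name && score
      printf "%s\t%.2f\n", name, score.to_f.clamp(0.0, 100.0)
    end

Because input and output share one format, the output can be sorted with sort, summarized with awk, or piped straight back into normalize.rb.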

In conclusion, there are really two highly relevant ways of reusing code:

  • Write small programs that fit together well when connected by shell pipelines (Unix).

  • Write small libraries that fit together well when connected by import, #include, load, require, or use (Ruby, C++ STL, C Interfaces and Implementations, and many others).

In the first paradigm, dependency structure is simple (always linear) and therefore easy to understand, but you're more limited in what you can express. In the second paradigm, your dependency structure can be any acyclic graph—lots more expressive power, but that includes the power to create gratuitous complexity.

Both paradigms are still relevant and important; for any particular project, which one you pick has more to do with your clients and your starting point than with any intrinsic merit of the paradigm. And of course they are not mutually exclusive!
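
To make the contrast concrete, here is the same small task done both ways (an illustrative sketch, not taken from the answer):

    require 'shellwords'

    # Unix paradigm: reuse existing programs by gluing them into a pipeline.
    def word_counts_via_pipeline(path)
      `tr -cs 'A-Za-z' '\\n' < #{Shellwords.escape(path)} | tr 'A-Z' 'a-z' | sort | uniq -c | sort -rn`
    end

    # Library paradigm: reuse code in-process via require and method calls.
    def word_counts_via_library(path)
      File.read(path).downcase.scan(/[a-z]+/).tally
    end

Both produce word frequencies, but in different shapes (a text report versus a Hash); which one is preferable depends largely on what consumes the result next.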

Agonize answered 21/12, 2009 at 4:22 Comment(11)
Isn't XML still a text stream that could be handled by a Unix-style pipeline? For a recent project, I wrote a Ruby program that was basically a filter that took a stream of XML (Atom/RSS content) from curl as input and used a REXML streaming parser to generate YAML to STDOUT, which was then input into another small Ruby program for further processing. That YAML output could just as easily have been piped into sed or awk in useful ways. – Damaris
Could you give an example of a Ruby project that uses a "dependency graph"? I'm not sure what a dependency graph is. – Damaris
@dan: yes, XML is text, but it really requires an XML parser to do anything, so the panoply of standard Unix tools (grep, sort, sed, awk, uniq) is not very useful. Of course we all hope that an ecology of equally useful Unix-like tools will grow up around XML, but we're still waiting... – Agonize
@dan: Regarding the dependency graph, it's the graph created between modules by import directives, which in Ruby are notated require or load. I've edited the answer to try to clarify where the graph comes from and why it is important. – Agonize
@dan: program A depends on library B and library C, which depends on libraries D and E, both of which depend on library F, etc. – Zurkow
There is no text-only limitation to pipes on Unix. Pipes handle binary just fine. Consider the PBM series of graphic conversion tools, as one example. – Peeve
@xcramps, true, but the point is rather the loose coupling in the Unix pipeline -- granted, by design; i.e., you don't define an XML schema in the pipeline. You could with extra filters. I'm not sure binary vs. ASCII vs. EBCDIC matters, just that any data can be passed along the pipeline regardless of filter interface. – Coeval
@xcramps: Although the PBM formats have been retrofitted to binary formats, the original formats were text, and programs were explicitly urged to accept anything remotely resembling an image. For my class I still write shell scripts and awk scripts to generate images in this text format (nothing susses out bugs in students' code like randomly generated graymaps). The prosecution rests :-) – Agonize
@NormanRamsey, I thought the P?M (pbmplus|netpbm) formats were always either text or binary, as indicated by the magic format characters -- or were the P1, P2, P3 ASCII formats the predecessors to P4, P5 & P6? – Coeval
@Xepoch: Correct. But back in the dawn ages when dinosaurs roamed the earth, there were only the P1, P2, P3 ASCII formats. The binary formats were added later. For most purposes the ASCII formats are deprecated, but when you want to use existing tools to filter, there is pnmtoplainpnm, which specifically converts any PNM file to ASCII text format. – Agonize
back @NormanR, LOL count me in as one of the last dinosaurs, but I guess by the time I was doing any image processing binary was always an option (never used). I too have processed massive amounts of PNM (satellite and thumper data) in awk. – Coeval

I think that the Unix philosophy started falling out of favor with the creation of Emacs.

Messer answered 21/12, 2009 at 4:10 Comment(3)
FYI David Korn said he prefers vim because it's more consistent with the Unix Philosophy. – Damaris
Emacs is almost a decade older than Unix, and comes out of the anti-Unix community. I don't see how Emacs is relevant wrt. the Unix philosophy. – Mosher
I'd love to know what David Korn said, but I couldn't find it on Google :( Though as a consolation I stumbled upon a couple of interesting quotes by him here: arp242.net/the-art-of-unix-programming – Wychelm

My vote is yes. Subjective, but excellent programming question.

Just a personal anecdote from a time when we were re-writing a mass print output program for insurance carriers. We were literally scolded by advisors for "programming" in the shell. We were told it was too disconnected and that the languages were too disparate to be complete.

Maybe.

All of a sudden multi-processor Intel boxen became commonplace, and fork() didn't really perform as horribly as everyone had always been warned during the new age of applications (think VB days). The bulk print programs (which queried a db, transformed the results to troff and then to PostScript, handed off via msgsnd, and went out to the LPD queue in hundreds of thousands) scaled perfectly well across all those systems and didn't require rewrites when the VB runtimes changed.
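
A minimal Ruby sketch of that fork-and-pipe fan-out pattern (the "job" here is a stand-in for real work such as rendering one print batch):

    # Fork one child per job; each child writes its result back to the
    # parent over a pipe, and the kernel spreads the children across cores.
    jobs = (1..8).to_a

    readers = jobs.map do |job|
      reader, writer = IO.pipe
      fork do
        reader.close
        writer.puts "job #{job} finished in pid #{Process.pid}"  # real work goes here
        writer.close
      end
      writer.close
      reader
    end

    readers.each { |r| puts r.read; r.close }
    Process.waitall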

To your question: I'm with Mr. Korn, but it is not Perl's fault; it is the Perl programmers who decide that Perl alone is sufficient. In multi-process systems maybe it is good enough.

I hope that Ruby, Perl, Python, and (gasp) even Java developers can keep their edge in the shell pipeline. There is inherent value in the development philosophy for implicit scaling and interfacing, separation of duties, and modular design.

Approached properly, with our massively-cored processing units on the horizon, the Unix philosophy may again gain ground.

Blindage answered 21/12, 2009 at 4:11 Comment(1)
+1 Excellent answer. It is nice to see someone else that understands why fork(), multi-process programming, and multiple cores go hand in hand. I have been considering the inherent problems with multithreaded programming and multicore machines lately... quite an eye opening experience that made me dust off my UNIX fork and pipe method of doing things. – Turgent

It does not appear to be completely lost. I read a recent blog entry by Ryan Tomayko that extolled the UNIX philosophy and how it is embraced by the Unicorn HTTP Server. However, he did have the same general feeling that the Ruby community is ignoring the UNIX philosophy in general.

Turgent answered 21/12, 2009 at 4:9 Comment(2)
Unicorn is a good example: it is completely and utterly broken on Windows, which, after all, is the most popular operating system. Which IMO totally defeats the point of writing it in a high-level language in the first place. – Mosher
@Jorg - the UNIX philosophy is less about running on UNIX and more about constructing applications by wiring together a number of small and simple programs using pipes and filters. This does become difficult with the Win32 process model, but the idea can be extended to share-nothing threads on Windows. – Turgent

I guess the rather simple explanation is that Unix tools are only available on Unix. The vast majority of users, however, run Windows.

I remember the last time I tried to install Nokogiri, it failed because it couldn't run uname -p. There was absolutely no valid reason to do that. All the information that can be obtained by running uname -p is also available from within Ruby. Plus, uname -p is actually not even Unix, it's a non-standard GNU extension which isn't even guaranteed to work on Unix, and is for example completely broken on several Linux distributions.
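
For what it's worth, that information really is reachable from within Ruby without shelling out (an illustrative snippet, not what Nokogiri's build actually does):

    require 'rbconfig'

    puts RUBY_PLATFORM                  # e.g. "x86_64-linux"
    puts RbConfig::CONFIG['host_cpu']   # e.g. "x86_64"
    puts RbConfig::CONFIG['host_os']    # e.g. "linux-gnu"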

So, you could either use Unix and lose 90% of your users, or use Ruby.

Louanneloucks answered 21/12, 2009 at 4:8 Comment(3)
I think the question is not so much about programming to specific Unices, but rather about the Unix Philosophy in general. – Coeval
Depends on who "your users" are. If I programmed against Windows-only libraries, I would lose about 90% of my users, plus 100% of my servers, which are more important. – Headrick
If you're writing a Ruby library, your users are Ruby programmers. A hell of a lot more than 10% of Ruby programmers use a Unix. – Dosimeter

No. Unix and Ruby are alive and well

Unix and Ruby are both alive and well. Unix vendor Apple's stock is headed into orbit, Linux has an irrevocably dug-in position running servers, development, and lab systems, and Microsoft's desktop-software empire is surrounded by powerful SaaS barbarians like Google and an army of allies.

Unix has never had a brighter future, and Ruby is one of its key allies.

I'm not sure the software-tools pattern is such a key element of Unix anyway. It was awesome in its day, given the generally clunky, borderline-worthless quality of competing CLI tools, but Unix introduced many other things, including an elegant process and I/O model.

Also, I think you will find that many of those Ruby programs use various Unix and software-tools interfaces internally. Check for popen() and various Process methods.
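
A few hedged examples of the kind of plumbing to look for (illustrative calls, not from any particular project):

    require 'open3'

    # Pipe data through an external filter and read the result back.
    sorted = IO.popen('sort', 'r+') do |io|
      io.puts %w[cherry apple banana]
      io.close_write
      io.read
    end

    # Spawn a process with explicit redirection, then wait for it.
    pid = Process.spawn('ls', '-l', out: '/dev/null')
    Process.wait(pid)

    # Capture stdout, stderr, and exit status of a command in one call.
    out, err, status = Open3.capture3('uname', '-s')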

I think Ruby simply has its own sphere of influence.

Deglutition answered 22/12, 2009 at 7:27 Comment(2)
My question is whether it's a good thing for Ruby programmers to program more in the Unix small-tools-acting-as-filters-connected-by-pipes style. I think your answer is no, not really. – Damaris
Agreed with dan here, Ruby /can/ be part of the pipeline, but you're rather referencing C lib calls. IMO the Unix scheduler is granular enough to favor an IO pipeline (vs. granting any given process a greater time between interrupts), though quite frankly I've never seen Windows or os/400 kernel code :O – Coeval
