Bloated echo command
Look at the following implementations of the "echo" command:

As you go down the list, I'm sure you'll notice the increasing bloat in each implementation. What is the point of a 272 line echo program?

Yuki answered 20/7, 2010 at 13:57 Comment(0)
14

In their article 'Program Design in the UNIX Environment', Pike & Kernighan discuss how the cat program accreted control arguments. Somewhere, though not in that article, there was a comment that 'cat came back from Berkeley waving flags'. This is a similar issue to the problem of echo developing options. (I found a reference to the relevant article in the BSD (Mac OS X) man page for cat: Rob Pike, "UNIX Style, or cat -v Considered Harmful", USENIX Summer Conference Proceedings, 1983. See also http://quotes.cat-v.org/programming/)

In their book 'The UNIX Programming Environment', Kernighan & Pike (yes, those two again) quote Doug McIlroy on the subject of what 'echo' should do with no arguments (circa 1984):

Another question of philosophy is what echo should do if given no arguments - specifically, should it print a blank line or nothing at all. All the current echo implementations we know print a blank line, but past versions didn't, and there were great debates on the subject. Doug McIlroy imparted the right feelings of mysticism in his discussion on the topic:

The UNIX and the Echo

There dwelt in the land of New Jersey the UNIX, a fair maid whom savants travelled far to admire. Dazzled by her purity, all sought to espouse her, one for her virginal grace, another for her polished civility, yet another for her agility in performing exacting tasks seldom accomplished even in much richer lands. So large of heart and accommodating of nature was she that the UNIX adopted all but the most insufferably rich of her suitors. Soon, many offspring grew and prospered and spread to the ends of the earth.

Nature herself smiled and answered to the UNIX more eagerly than to other mortal beings. Humbler folk, who knew little of more courtly manners, delighted in her echo, so precise and crystal clear they scarce believed she could be answered by the same rocks and woods that so garbled their own shouts into the wilderness. And the compliant UNIX obliged with perfect echoes of whatever she was asked.

When one impatient swain asked the UNIX, 'Echo nothing', the UNIX obligingly opened her mouth, echoed nothing, and closed it again.

'Whatever do you mean,' the youth demanded, 'opening your mouth like that? Henceforth never open your mouth when you are supposed to echo nothing!' And the UNIX obliged.

'But I want a perfect performance, even when you echo nothing,' pleaded a sensitive youth, 'and no perfect echoes can come from a closed mouth.' Not wishing to offend either one, the UNIX agreed to say different nothings for the impatient youth and the sensitive youth. She called the sensitive nothing '\n'.

Yet now when she said '\n', she was not really saying nothing so she had to open her mouth twice, once to say '\n' and once to say nothing, and so she did not please the sensitive youth, who said forthwith, 'The \n sounds like a perfect nothing to me, but the second one ruins it. I want you to take back one of them.' So the UNIX, who could not abide offending, agreed to undo some echoes, and called that '\c'. Now the sensitive youth could hear a perfect echo of nothing by asking for '\n' and '\c' together. But they say that he died of a surfeit of notation before he ever heard one.


The Korn shell introduced (or, at least, included) a printf command that was based on the C language printf() function, and that uses a format string to control how the material should appear. It is a better tool for complicated formatting than echo. But, because of the history outlined in the quote, echo doesn't just echo any more; it interprets what it is given to echo.

And interpreting the command line arguments to echo indubitably requires more code than not interpreting them. A basic echo command is:

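#include <stdio.h>

/* Print the arguments separated by single spaces, then a newline. */
int main(int argc, char **argv)
{
    const char *pad = "";   /* nothing is printed before the first argument */
    (void)argc;             /* unused: argv is terminated by a null pointer */
    while (*++argv)         /* skip argv[0], stop at the terminating null */
    {
        fputs(pad, stdout);
        fputs(*argv, stdout);
        pad = " ";          /* a space precedes every later argument */
    }
    fputc('\n', stdout);
    return 0;
}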

There are other ways to achieve that. But the more complex versions of echo have to scrutinize their arguments before printing anything - and that takes more code. And different systems have decided that they want to do different amounts of interpretation of their arguments, leading to different amounts of code.
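As a rough illustration of that extra scrutiny, here is a minimal sketch in C of an echo that recognises a single leading -n before printing. The helper names first_operand and echo_args are made up for this example and are not taken from any of the implementations discussed; the rule "only one leading -n counts" is likewise an assumption.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: return the index of the first argument to print.
   Sets *newline to 0 if the first argument is exactly "-n", else to 1. */
int first_operand(int argc, char **argv, int *newline)
{
    int i = 1;                      /* argv[0] is the program name */
    *newline = 1;
    if (i < argc && strcmp(argv[i], "-n") == 0) {
        *newline = 0;               /* suppress the trailing newline */
        i++;
    }
    return i;
}

/* Hypothetical helper: write the operands to out, separated by spaces. */
void echo_args(FILE *out, int argc, char **argv)
{
    int newline;
    int i = first_operand(argc, argv, &newline);

    for (; i < argc; i++) {
        fputs(argv[i], out);
        if (i + 1 < argc)
            fputc(' ', out);
    }
    if (newline)
        fputc('\n', out);
}
```

A main() that simply calls echo_args(stdout, argc, argv) would turn this into a complete program. Even this one option nearly doubles the code of the basic version.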

Johen answered 20/7, 2010 at 15:28 Comment(0)
8

You will notice that the bloat does not really grow that much.

  1. Most of the lines of code are comments.
  2. Most of the lines that are not comments are usage documentation, so that when somebody runs 'echo --help' it does something.
  3. The remaining code appears largely to handle the arguments echo can take, as well as the "special" expansion of sequences such as \n and \t into their equivalent characters instead of echoing them literally.
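To give a feel for what the expansion in point 3 involves, here is a minimal sketch in C that handles only \n, \t and \\. The function name expand_simple_escapes is invented for this example; the real implementations recognise more sequences and are structured differently.

```c
#include <stddef.h>

/* Hypothetical helper: copy in to out, turning the two-character
   sequences \n, \t and \\ into the single characters they name.
   out must have room for strlen(in) + 1 bytes; the result is never
   longer than the input. Returns the number of bytes written,
   not counting the terminating NUL. */
size_t expand_simple_escapes(const char *in, char *out)
{
    size_t n = 0;

    while (*in) {
        if (in[0] == '\\' && in[1] != '\0') {
            switch (in[1]) {
            case 'n':  out[n++] = '\n'; in += 2; continue;
            case 't':  out[n++] = '\t'; in += 2; continue;
            case '\\': out[n++] = '\\'; in += 2; continue;
            default:   break;   /* unknown escape: copy it literally below */
            }
        }
        out[n++] = *in++;
    }
    out[n] = '\0';
    return n;
}
```

Sequences the sketch does not know, such as \q, pass through unchanged, which is one of several choices an implementation has to make.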

Also, most of the time you are not even running the echo command: 'echo' usually invokes a shell built-in. At least on my machine, you have to type /bin/echo --help to get the advanced functionality and documentation, because echo --help merely echoes --help.

For a good example, run this in your shell.

 echo '\e[31mhello\e[0m'

Then run this:

 echo -e '\e[31mhello\e[0m'

And you will note vastly different results.

The former will just emit the input as-is, but the latter will print hello coloured red.

Another example, using the 'hex' code:

$ echo '\x64\x65\x66'
\x64\x65\x66

$ echo -e '\x64\x65\x66'
def

Vastly different behaviour. The OpenBSD implementation cannot do this =).
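The hex example is why an echo that supports -e has to convert hexadecimal digits to integers: each \xHH sequence names a byte by its numeric value. A minimal sketch in C follows, assuming an ASCII-compatible character set and at most two hex digits per escape; hex_value and decode_hex_escapes are names made up for this illustration, not functions from the GNU source.

```c
#include <stddef.h>

/* Hypothetical helper: value of one hex digit, or -1 if c is not one.
   Assumes the letters a-f and A-F are contiguous, as in ASCII. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Hypothetical helper: copy in to out, replacing \xH or \xHH with the
   byte of that value. out must have room for strlen(in) + 1 bytes.
   Returns the number of bytes written, not counting the final NUL. */
size_t decode_hex_escapes(const char *in, char *out)
{
    size_t n = 0;

    while (*in) {
        if (in[0] == '\\' && in[1] == 'x' && hex_value(in[2]) >= 0) {
            int value = hex_value(in[2]);
            in += 3;                          /* past the backslash, x, first digit */
            if (hex_value(*in) >= 0) {        /* optional second digit */
                value = value * 16 + hex_value(*in);
                in++;
            }
            out[n++] = (char)value;
            continue;
        }
        out[n++] = *in++;                     /* ordinary character */
    }
    out[n] = '\0';
    return n;
}
```

With the input \x64\x65\x66 the three escapes become the bytes 0x64, 0x65 and 0x66, which are d, e and f in ASCII.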

Stomatology answered 20/7, 2010 at 14:1 Comment(7)
I don't know any C, but some things clearly look a bit bonkers. E.g. lines 90~103 in the GNU echo convert some hexadecimal stuff to integers. But why? OpenBSD can echo text without messing around with hexadecimal numbers or anything else that's not text. As I said, I don't know any C, so please correct me if I'm wrong. - Yuki
@Yuki: the OpenBSD version just copies its command arguments to stdout; the GNU one can actually process the data. See my example. Then try it on OpenBSD using the absolute path to 'echo' instead of relying on the shell builtin. - Stomatology
So GNU is implementing some simple regex in under 300 lines including comments & license. I'm impressed! Thanks. - Yuki
@Kent Fredric: Still, it's quite nice considering the size. I can't believe I never knew! - Yuki
echo --help should only ever echo '--help' and a newline. Anything else is a misinterpretation of what 'echo' is for. - Johen
The GNU version actually does almost exactly the same (minus the valid BSD -n extension) when run with the POSIXLY_CORRECT environment variable defined and in a normal shell that already processes escaped characters. - Phosphorus
And I wouldn't call the FreeBSD version "bloated". They are obviously trying to have the fastest and smallest version at the expense of readability. - Phosphorus
1

I'm not sure if I like the first implementation: it has one option too many!

If a -n option was required, then also adding a -- option to stop option processing would be helpful. That way if you are writing a shell script that reads and prints user input, you don't get inconsistent behavior if the user types -n.
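A minimal sketch in C of the option handling proposed here, with an optional -n followed by an optional -- that ends option processing. This is hypothetical behaviour for the sake of the argument, not a description of any of the implementations linked in the question, and parse_echo_options is an invented name.

```c
#include <string.h>

/* Hypothetical parser: accept an optional leading "-n", then an optional
   "--" that marks the end of options. Returns the index of the first
   argument that should be printed; *newline is 0 if -n was given. */
int parse_echo_options(int argc, char **argv, int *newline)
{
    int i = 1;                      /* argv[0] is the program name */

    *newline = 1;
    if (i < argc && strcmp(argv[i], "-n") == 0) {
        *newline = 0;
        i++;
    }
    if (i < argc && strcmp(argv[i], "--") == 0)
        i++;                        /* everything after this is plain text */
    return i;
}
```

A script could then always pass -- before the user's text, so that input consisting of -n is printed rather than swallowed as an option.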

Creditable answered 22/2, 2011 at 10:11 Comment(0)
