What do people mean when they say “Perl is very good at parsing”? [closed]
G

5

6

What do people mean when they say "Perl is very good at parsing"?

How is Perl any better or more powerful than other scripting languages such as Python or Ruby?

Guillotine answered 11/12, 2009 at 14:53 Comment(3)
To me it suggests that they don't know much about parsing and probably know less about languages like Python and Ruby ... much less about tools such as lex/flex and yacc/bison. It suggests that they are fixated on regular expressions and the extraction of patterns from simple data formats (which they conflate with "parsing"). Finally, it strongly suggests that those people will, when faced with a real parsing problem, create half-baked and fragile code which passes their simple, concocted test cases while causing pain for those who depend on that code for real work.Vowel
Half-baked, fragile code is perfect for the sort of tedious, one-off tasks that one might reasonably expect to approach with perl in hand... Writing a BNF grammar to parse log files just doesn't sound like a good use of time.Participate
There are a lot of things that don't have a grammar, and Perl gives you a lot of tools to deal with that. Regexes aren't the only thing in Perl's toolbox.Trimmer
P
19

They mean that Perl was originally designed for processing text files and has many features that make it easy:

  • Perl has many functions for string processing: substr, index, chomp, length, grep, sort, reverse, lc, ucfirst, ...
  • Perl automatically converts between numbers and strings depending on how a value is used. (e.g. you can read the character string '100' from a file and add one to it without needing to do a string-to-integer conversion first)
  • Perl automatically handles conversion between the platform newline sequence (e.g. CRLF on Windows) and a logical newline ("\n") within your program.
  • Regular expressions are integrated into the syntax instead of being a separate library.
  • Perl's regular expressions are the "gold standard" for power and functionality.
  • Perl has full Unicode support.

Python and Ruby also have good facilities for text processing. (Ruby in particular took much inspiration from Perl, much as Perl has shamelessly borrowed from many other languages.) There's little point in asking which is better. Use what you like.
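
For comparison, here is a minimal Python sketch of the kind of pattern extraction the list above describes. The log line format, the field names, and the `parse_line` helper are all made up for illustration; they don't come from any real tool. Note the explicit `int()` calls, which is the step Perl's automatic string/number conversion lets you skip.

```python
import re

# Hypothetical log line format, invented for this example.
LINE = "2009-12-11 14:53:02 GET /index.html 200 5120"

# Named groups pull the individual fields out of the line.
PATTERN = re.compile(
    r"(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<method>[A-Z]+) (?P<path>\S+) (?P<status>\d{3}) (?P<size>\d+)"
)


def parse_line(line):
    """Return a dict of fields, or None if the line does not match."""
    match = PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Unlike Perl, Python needs an explicit conversion before arithmetic.
    fields["status"] = int(fields["status"])
    fields["size"] = int(fields["size"])
    return fields


record = parse_line(LINE)
```

In Perl the regex would be part of the syntax rather than a call into the `re` module, but the overall shape is much the same.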

Parmenter answered 11/12, 2009 at 15:57 Comment(4)
Although some people frown on $_, I think it belongs on that list. The idea that you have a "current topic" or thing that you're working on and applying various steps to it is very nice.Trimmer
I wouldn't say that Perl automatically handles line endings. I think you're confusing that with writing to a text file in Windows. Reading data coming back doesn't do anything special unless you tell Perl what to do.Trimmer
@brian: Conversion between the platform newline sequence and a logical "\n" happens on both reading and writing (ignoring binmode, of course). I know that you're well aware of this so I find your comment confusing. I suppose I could have said that "Perl lets you think in terms of logical newlines instead of worrying about whatever sequence your OS uses" without mentioning how it does that.Parmenter
@Michael: you're confusing the behavior of what a DOSish perl does and what the rest of the world does. Reading a file with Windows line endings on a unix machine still gives you Windows line endings. It's only a special feature of Perl on Windows and when Perl knows it's writing to a tty. The issue of what "\n" is is an entirely different matter. See perlport for the details.Trimmer
T
12

Don't take a statement of Perl's strengths to be a statement of another language's failings. Perl is good for text processing, but that doesn't mean Ruby or Python suck.

When people talk about Perl being "good for parsing", they're mainly echoing Perl's history; it was invented in the day when heavy-duty text processing wasn't easy. Try doing some of that in C or C++ (Java hadn't been invented yet, either!). Back in the day, Larry was trying to do his work with sed and awk, but running into their limitations. He made a tool that made text even easier to work with.

Perl is still very good for text manipulation tasks, but now so are a lot of other languages.

Trimmer answered 11/12, 2009 at 19:17 Comment(0)
B
4

Perl is good for ETL and batch processing jobs as well. It takes a minimal amount of code to pick up the file, push each line through split to get a map, perform some business logic on the record, and write it back out to disk.

I suppose that's more data processing than data parsing, but data processing is bulk data parsing.
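
The read, split, transform, write loop described above can be sketched roughly like this. This is a Python illustration of the pattern, not Perl and not anyone's production code; the column layout, the `|` delimiter, and the "business rule" are assumptions invented for the example.

```python
import io

FIELDS = ["id", "name", "amount"]  # hypothetical column layout


def process(source, sink, delimiter="|"):
    """Read delimited records from source, transform them, write to sink."""
    count = 0
    for line in source:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        # Split the line and pair the pieces with column names to get a map.
        record = dict(zip(FIELDS, line.split(delimiter)))
        # Hypothetical business rule: tidy the name, add 10% to the amount.
        record["name"] = record["name"].strip().title()
        record["amount"] = f"{float(record['amount']) * 1.10:.2f}"
        sink.write(delimiter.join(record[f] for f in FIELDS) + "\n")
        count += 1
    return count


# In-memory stand-ins for the input and output files.
source = io.StringIO("1|alice smith|100\n\n2|BOB JONES|250.50\n")
sink = io.StringIO()
written = process(source, sink)
```

With real files you would pass handles from `open(...)` instead of the `io.StringIO` objects; the function only needs something it can iterate over and something with a `write` method.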

Beneficent answered 11/12, 2009 at 17:38 Comment(0)
T
1

Perl is very good at text parsing when compared to C/C++/Java.

Teddi answered 11/12, 2009 at 14:55 Comment(1)
Igor should probably expand his answer to note that when Perl came along, text processing wasn't a trivial task. 20 years later, people don't appreciate that pain now that everything has PCRE, etc.Trimmer
S
0

It's probably because people are used to what it was built for, as described in the perl documentation, so it has become commonplace for many people to associate parsing of text files with Perl. Not to exclude Ruby or Python, it's just more of a household name IMHO.

Perl is a language optimized for scanning arbitrary text files, extracting information from those text files, and printing reports based on that information. It's also a good language for many system management tasks. The language is intended to be practical (easy to use, efficient, complete) rather than beautiful (tiny, elegant, minimal).

Signac answered 12/12, 2009 at 1:33 Comment(0)
