Hidden features of Perl?

78

143

What are some really useful but esoteric language features in Perl that you've actually been able to employ to do useful work?

Guidelines:

  • Try to limit answers to the Perl core and not CPAN
  • Please give an example and a short description

Hidden Features also found in other languages' Hidden Features:

(These are all from Corion's answer)

  • C
    • Duff's Device
    • Portability and Standardness
  • C#
    • Quotes for whitespace delimited lists and strings
    • Aliasable namespaces
  • Java
    • Static Initializers
  • JavaScript
    • Functions are First Class citizens
    • Block scope and closure
    • Calling methods and accessors indirectly through a variable
  • Ruby
    • Defining methods through code
  • PHP
    • Pervasive online documentation
    • Magic methods
    • Symbolic references
  • Python
    • One line value swapping
    • Ability to replace even core functions with your own functionality

Other Hidden Features:

Operators:

Quoting constructs:

Syntax and Names:

Modules, Pragmas, and command-line options:

Variables:

Loops and flow control:

Regular expressions:

Other features:

Other tricks, and meta-answers:


See Also:

Grating answered 2/10, 2008 at 11:49 Comment(1)
Most of these features are in everyday use, some occur in the majority of Perl scripts, and most listed under "Other" still stem from other languages. Calling these "hidden" changes the intent of the question.Immoderate
M
54

The flip-flop operator is useful for skipping the first iteration when looping through the records (usually lines) returned by a file handle, without using a flag variable:

while(<$fh>)
{
  next if 1..1; # skip first record
  ...
}

Run perldoc perlop and search for "flip-flop" for more information and examples.
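The same operator also accepts two patterns instead of constants, which gives the Awk-style "from this line to that line" selection. A minimal sketch, assuming some made-up sample lines (the @lines data and the BEGIN/END markers are only for illustration):

```perl
use strict;
use warnings;

# Hypothetical sample input: only the lines from BEGIN to END are wanted.
my @lines = ("header", "BEGIN", "alpha", "beta", "END", "footer");

my @block;
for (@lines) {
    # False until a line matches /^BEGIN$/, then true up to and
    # including the line that matches /^END$/.
    push @block, $_ if /^BEGIN$/ .. /^END$/;
}

print "$_\n" for @block;   # BEGIN, alpha, beta, END
```

Both marker lines are included in the result; filter them out afterwards if you only want what lies between them.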

Municipal answered 2/10, 2008 at 11:49 Comment(2)
Actually that's taken from Awk, where you can do flip-flop between two patterns by writing pattern1, pattern2Addax
To clarify, the "hidden" aspect of this is that if either operand to scalar '..' is a constant the value is implicitly compared to the input line number ($.)Docilla
W
47

There are many non-obvious features in Perl.

For example, did you know that there can be a space after a sigil?

 $ perl -wle 'my $x = 3; print $ x'
 3

Or that you can give subs numeric names if you use symbolic references?

$ perl -lwe '*4 = sub { print "yes" }; 4->()' 
yes

There's also the "bool" quasi operator, which returns 1 for true expressions and the empty string for false:

$ perl -wle 'print !!4'
1
$ perl -wle 'print !!"0 but true"'
1
$ perl -wle 'print !!0'
(empty line)

Other interesting stuff: with use overload you can overload string literals and numbers (and for example make them BigInts or whatever).

Many of these things are actually documented somewhere, or follow logically from the documented features, but nonetheless some are not very well known.

Update: Another nice one. Below the q{...} quoting constructs were mentioned, but did you know that you can use letters as delimiters?

$ perl -Mstrict  -wle 'print q bJet another perl hacker.b'
Jet another perl hacker.

Likewise you can write regular expressions:

m xabcx
# same as m/abc/
Wobble answered 2/10, 2008 at 11:49 Comment(4)
“Did you know that there can be a space after a sigil?” I am utterly flabbergasted. Wow.Bursary
Cool! !!$undef_var doesn't create a warning.Multiply
I think your example of using letters to delimit strings should be "Just another perl hacker" rather than "Jet another perl hacker" =PCompellation
The worst part is that you can use other things as delimiters, too. Even closing brackets. The following are valid: s}regex}replacement}xsmg; q]string literal];Tetrapody
B
46

Add support for compressed files via magic ARGV:

s{ 
    ^            # make sure to get whole filename
    ( 
      [^'] +     # at least one non-quote
      \.         # extension dot
      (?:        # now either suffix
          gz
        | Z 
       )
    )
    \z           # through the end
}{gzcat '$1' |}xs for @ARGV;

(The quotes around $1 are necessary to handle filenames containing shell metacharacters.)

Now the <> feature will decompress any @ARGV files that end with ".gz" or ".Z":

while (<>) {
    print;
}
Bituminous answered 2/10, 2008 at 11:49 Comment(3)
I don't think you need to escape the | in the replacement.Compellation
I'm staring at this and I can't figure out how it works. At what point is zcat | parsed as a command to pipe through?Surly
@Surly => detecting pipes is a feature of the two argument open, which the diamond operator uses as it opens each file in @ARGVCellulose
T
40

One of my favourite features in Perl is using the boolean || operator to select between a set of choices.

 $x = $a || $b;

 # $x = $a, if $a is true.
 # $x = $b, otherwise

This means one can write:

 $x = $a || $b || $c || 0;

to take the first true value from $a, $b, and $c, or a default of 0 otherwise.

In Perl 5.10, there's also the // operator, which returns the left hand side if it's defined, and the right hand side otherwise. The following selects the first defined value from $a, $b, $c, or 0 otherwise:

$x = $a // $b // $c // 0;

These can also be used with their short-hand forms, which are very useful for providing defaults:

$x ||= 0;   # If $x was false, it now has a value of 0.

$x //= 0;   # If $x was undefined, it now has a value of zero.

Cheerio,

Paul

Trinomial answered 2/10, 2008 at 11:49 Comment(7)
This is such a common idiom that it hardly qualifies as a "hidden" feature.Docilla
shame the pretty printer thinks // is a comment :)Abscissa
Question, is there a "use feature" to use these new operators, or are they default enabled? I am still learning Perl 5.10's features.Episternum
// is in there by default, no special tweaks needed. You can also backport it into 5.8.x with the dor-patch... see the authors/id/H/HM/HMBRAND/ directory on any CPAN mirror. FreeBSD 6.x and beyond does this for you in their perl package.Wellbred
When || or // is combined with do { }, you can encapsulate a more complex assignment, ie $x = $a || do { my $z; 3 or 4 lines of derivation; $z };Sailesh
@Episternum to clarify, the feature pragma guards access to new keywords that might otherwise step on user-defined subs -- like say, state, and given/when. Since that issue doesn't apply to symbolic operators like // and ~~ they're accessible even without feature (but you might want to throw in a use 5.010 or similar declaration on code that uses them, so as to produce a more useful error message if someone tries to run that code on older perls).Tifanie
@John Ferguson: fixed the ugly printer.Slimy
M
39

The operators ++ and unary - don't only work on numbers, but also on strings.

$_ = "a";
print -$_;

prints -a

print ++$_;

prints b

$_ = 'z';
print ++$_;

prints aa

Monney answered 2/10, 2008 at 11:49 Comment(6)
To quote perlop: "The auto-decrement operator is not magical." So -- doesn't work on strings.Wobble
"aa" doesn't seem to be the natural element following "z". I would expect the next highest ascii value, which is "{".Surly
Don't ask a programmer what comes after "z"; ask a human. This feature is great for numbering items in a long list.Scrap
When new to Perl I implemented this feature myself with the exact z to aa behavior, then showed it to a co-worker who laughed at me and said "let me show you something". I cried a bit but learned something.Neoma
I wish I had this in C#, great featureSomewhat
@Surly - If you want that, use numbers and autoconvert them to ASCII with chr(). Or, write a small class and overload the operators to do it for you.Compellation
B
36

As Perl has almost all "esoteric" parts from the other lists, I'll tell you the one thing that Perl can't:

The one thing Perl can't do is have bare arbitrary URLs in your code, because the // operator is used for regular expressions.

Just in case it wasn't obvious to you what features Perl offers, here's a selective list of the maybe not totally obvious entries:

Duff's Device - in Perl

Portability and Standardness - There are likely more computers with Perl than with a C compiler

A file/path manipulation class - File::Find works on even more operating systems than .Net does

Quotes for whitespace delimited lists and strings - Perl allows you to choose almost arbitrary quotes for your list and string delimiters

Aliasable namespaces - Perl has these through glob assignments:

*My::Namespace:: = \%Your::Namespace::;

Static initializers - Perl can run code in almost every phase of compilation and object instantiation, from BEGIN (code parse) to CHECK (after code parse) to import (at module import) to new (object instantiation) to DESTROY (object destruction) to END (program exit)

Functions are First Class citizens - just like in Perl

Block scope and closure - Perl has both

Calling methods and accessors indirectly through a variable - Perl does that too:

my $method = 'foo';
my $obj = My::Class->new();
$obj->$method( 'baz' ); # calls $obj->foo( 'baz' )

Defining methods through code - Perl allows that too:

*foo = sub { print "Hello world" };

Pervasive online documentation - Perl documentation is online and likely on your system too

Magic methods that get called whenever you call a "nonexisting" function - Perl implements that in the AUTOLOAD function
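A minimal sketch of such a catch-all, assuming a made-up Greeter class (the class name and the returned message are only for illustration):

```perl
use strict;
use warnings;

{
    package Greeter;            # hypothetical class, for illustration only
    our $AUTOLOAD;              # Perl stores the full name of the missing sub here

    sub new { return bless {}, shift }

    sub AUTOLOAD {
        my $self = shift;
        (my $name = $AUTOLOAD) =~ s/.*:://;   # strip the package prefix
        return if $name eq 'DESTROY';         # object destruction lands here too
        return "no method '$name', called with (@_)";
    }
}

my $msg = Greeter->new->hello('world');
print "$msg\n";   # no method 'hello', called with (world)
```

The DESTROY check matters: without it, every object going out of scope would also be routed through the fallback.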

Symbolic references - you are well advised to stay away from these. They will eat your children. But of course, Perl allows you to offer your children to blood-thirsty demons.

One line value swapping - Perl allows list assignment
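For instance (variable names are arbitrary), the right-hand list is evaluated in full before anything is assigned, so no temporary is needed, and the same trick works on array slices:

```perl
use strict;
use warnings;

my ($x, $y) = (1, 2);
($x, $y) = ($y, $x);          # swap two scalars

my @items = qw(a b c);
@items[0, 2] = @items[2, 0];  # swap first and last element via slices

print "$x $y @items\n";       # 2 1 c b a
```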

Ability to replace even core functions with your own functionality

use subs 'unlink'; 
sub unlink { print 'No.' }

or

BEGIN{
    *CORE::GLOBAL::unlink = sub {print 'no'}
};

unlink($_) for @ARGV
Bascinet answered 2/10, 2008 at 11:49 Comment(7)
I'm a fan of Perl's documentation compared to other languages, but I still think that for Regexes and references it could be rationalised a whole lot. e.g. the best primer for regexes is not Perlre, but PerlopAbscissa
"The one thing Perl can't do is have bare arbitrary URLs in your code, because the // operator is used for regular expressions." - this is utter nonsense.Dicot
Thanks for your insight.I've looked at some ways to have a bare http://... URL in Perl code without using a source filter,and didn't find a way.Maybe you can show how this is possible? // is used for regular expressions in Perl versions up to 5.8.x.In 5.10 it's repurposed for defined-or assignment.Bascinet
bare URL: why would you expect example.com to be a single token, let alone a string, in any language? (besides one delimited solely by whitespace and parenthetheth :)Vadim
Why/where would you want bare URLs in your code? I can't think of an example.Rentschler
Nobody would want that, it's just a Java meme. "foo.com" is the label http: and then "foo.com" in a comment. Some people find this interesting because... they are dumb.Amboina
Well, .... actually metacpan.org/module/Acme::URL , seems you can have bareword URLs =)Jetport
E
35

Autovivification. AFAIK no other language has it.
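A small illustration, with made-up keys: intermediate hashes and arrays spring into existence the moment you dereference through them. The last statement shows the well-known flip side, where merely testing a nested key creates its parent.

```perl
use strict;
use warnings;

my %data;

# No need to create $data{fruit} or $data{fruit}{apple} first.
$data{fruit}{apple}{count}++;

# The array reference is created on demand as well.
push @{ $data{veg}{roots} }, 'carrot';

# Gotcha: looking at a nested key vivifies the intermediate level.
my $seen = exists $data{meat}{beef};            # false ...
my $side_effect = exists $data{meat} ? 1 : 0;   # ... but $data{meat} now exists

print "count=$data{fruit}{apple}{count} roots=@{ $data{veg}{roots} } side_effect=$side_effect\n";
```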

Episternum answered 2/10, 2008 at 11:49 Comment(9)
I had no idea that Python, et al, didn't support this.Affecting
@davidnicol: Really? Can you provide a link? My quick search on google didn't return anything. For those that don't know ECMAscript is the correct name for Javascript. en.wikipedia.org/wiki/ECMAScriptEpisternum
I thought I would miss this more when I moved to Python, but I think it's a blessing in disguise there.Appleby
And there is a module to disable autovivificationNuclei
@Gregg Lind - Given that Python automatically creates variables whenever you first assign to them, autovivification would create monstrous problems out of a single typo.Compellation
I find myself relieved that this feature has been confined to perl.Shortie
@Omnifarious, @Chris Lutz, @Gregg Lind: Autovivification is not a bug but a feature! I challenge you to write this in Python just as easily: for $i (1..10) { for $j (1 .. 10) { $a[$i][$j] = $i * $j } }.Slimy
@Slimy - a = [ [x*y for y in xrange(1,11)] for x in xrange(1,11) ]Shortie
Matlab actually has this, to a significant degree.Staid
B
31

It's simple to quote almost any kind of strange string in Perl.

my $url = q{http://my.url.com/any/arbitrary/path/in/the/url.html};

In fact, the various quoting mechanisms in Perl are quite interesting. The Perl regex-like quoting mechanisms allow you to quote anything, specifying the delimiters. You can use almost any special character like #, /, or open/close characters like (), [], or {}. Examples:

my $var  = q#some string where the pound is the final escape.#;
my $var2 = q{A more pleasant way of escaping.};
my $var3 = q(Others prefer parens as the quote mechanism.);

Quoting mechanisms:

q : a literal quote; the only character that needs to be escaped is the closing delimiter.

qq : an interpolating quote; processes variables and escape sequences. Great for strings that themselves contain quote characters:

my $var4 = qq{This "$mechanism" is broken.  Please inform "$user" at "$email" about it.};

qx : Works like qq, but then executes it as a system command, non interactively. Returns all the text generated from the standard out. (Redirection, if supported in the OS, also comes out) Also done with back quotes (the ` character).

my $output  = qx{type "$path"};      # get just the output
my $moreout = qx{type "$path" 2>&1}; # get stuff on stderr too

qr : Interprets like qq, but then compiles it as a regular expression. Works with the various options on the regex as well. You can now pass the regex around as a variable:

sub MyRegexCheck {
    my ($string, $regex) = @_;
    if ($string)
    {
       return ($string =~ $regex);
    }
    return; # undef in scalar context, empty list in list context
}

my $regex = qr{http://\w+\.com/(\w+/)+};
my @results = MyRegexCheck(q{http://myurl.com/subpath1/subpath2/}, $regex);

qw : A very, very useful quote operator. Turns a quoted set of whitespace separated words into a list. Great for filling in data in a unit test.


   my @allowed = qw(A B C D E F G H I J K L M N O P Q R S T U V W X Y Z { });
   my @badwords = qw(WORD1 word2 word3 word4);
   my @numbers = qw(one two three four 5 six seven); # works with numbers too
   my @list = ('string with space', qw(eight nine), "a $var"); # works in other lists
   my $arrayref = [ qw(and it works in arrays too) ]; 

They're great to use whenever they make things clearer. For qx, qq, and q, I mostly use {} as the delimiters. The most common choice for qw is (), but sometimes you also see qw//.

Bunnybunow answered 2/10, 2008 at 11:49 Comment(5)
I sometimes use qw"" so that syntax highlighters will highlight it correctly.Prophylactic
Works for me in SlickEdit. :)Bunnybunow
@fengshaun, The editors I generally use do highlight these correctly. I was referring, in part to the syntax highlighter on StackOverflow.Prophylactic
@Brad Gilbert: Stack Overflow can’t (well, (doesn’t) parse Perl worth diddly squat. ☹Slimy
my $moreout = qx{type "$path" 2>&1}; ... I didn't know you could do that! [TM]Wellbred
T
27

Not really hidden, but many everyday Perl programmers don't know about CPAN. This especially applies to people who aren't full time programmers or don't program in Perl full time.

Turbit answered 2/10, 2008 at 11:49 Comment(0)
B
27

The "for" statement can be used the same way "with" is used in Pascal:

for ($item)
{
    s/&‎nbsp;/ /g;
    s/<.*?>/ /g;
    $_ = join(" ", split(" ", $_));
}

You can apply a sequence of s/// operations, etc. to the same variable without having to repeat the variable name.

NOTE: the non-breaking space above (&‎nbsp;) has hidden Unicode in it to circumvent the Markdown. Don't copy paste it :)

Bituminous answered 2/10, 2008 at 11:49 Comment(3)
And "map" does the same trick as well... map { .... } $item; One advantage of using "for" over "map" would be that you could use next to break out.Shumpert
Also, for has the item being manipulated listed before the code doing the manipulating, leading to better readability.Bunnybunow
@RobertP: That’s quite right. A topicalizer is useful in discourse.Slimy
D
26

The ability to parse data directly pasted into a DATA block. No need to save to a test file to be opened in the program or similar. For example:

my @lines = <DATA>;
for (@lines) {
    print if /bad/;
}

__DATA__
some good data
some bad data
more good data 
more good data 
Decalcify answered 2/10, 2008 at 11:49 Comment(5)
And very useful in little tests!Cervelat
@peter mortensen how would you have multiple blocks? And how do you end a block?Philomenaphiloo
@Toad: it is allan's answer (see the revision list). It is better to address that user. Or, as that user has left Stack Overflow, maybe address no one in particular (so a real Perl expert can straighten it out later).Meredi
@Hai: No it is not ugly — in fact, it’s precisely the opposite of ugly: it’s clean, svelte, minimal, and beautiful; in a word, it’s wonderful, and languages without it are a PITA. @peter mortensen, @toad: One answer to how to have multiple data blocks in the same program is to use the Inline::Files module off CPAN.Slimy
Inline::Files is implemented using source filters. There's also Data::Section that provides multiple inline blocks and does not use source filters.Predominant
W
26

The quoteword operator is one of my favourite things. Compare:

my @list = ('abc', 'def', 'ghi', 'jkl');

and

my @list = qw(abc def ghi jkl);

Much less noise, easier on the eye. Another really nice thing about Perl, that one really misses when writing SQL, is that a trailing comma is legal:

print 1, 2, 3, ;

That looks odd, but not if you indent the code another way:

print
    results_of_foo(),
    results_of_xyzzy(),
    results_of_quux(),
    ;

Adding an additional argument to the function call does not require you to fiddle around with commas on previous or trailing lines. The single line change has no impact on its surrounding lines.

This makes it very pleasant to work with variadic functions. This is perhaps one of the most under-rated features of Perl.

Wellbred answered 2/10, 2008 at 11:49 Comment(7)
An interesting corner case of Perl's syntax is that the following is valid: for $_ qw(a list of stuff) {...}Friesen
You can even abuse glob syntax for quoting words, as long as you don't use special characters such as *?. So you can write for (<a list of stuff>) { ... }Wobble
@ephemient: nearly. That only works with lexicals: for my $x qw(a b c) {...} For instance: for $_ qw(a b c) {print} # prints nothingWellbred
why add that extra lexical when you can enjoy perl's favourite default? for (qw/a b c d/) { print; }Cervelat
@fengshaun: it depends on how large the block is. A named lexical helps document the narrative. I tend to only use an implicit $_ as a statement modifier: print for qw(a b c);Wellbred
@ephemient, @fengshaun, @moritz, @dland: That’s “fixed” in blead; see this p5p thread.Slimy
And now released as part of 5.14.Bunnybunow
T
24

Taint checking. With taint checking enabled, perl will die (or warn, with -t) if you try to pass tainted data (roughly speaking, data from outside the program) to an unsafe function (opening a file, running an external command, etc.). It is very helpful when writing setuid scripts or CGIs or anything where the script has greater privileges than the person feeding it data.

Magic goto. goto &sub does an optimized tail call.
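A sketch of what goto &sub changes compared with an ordinary call (all sub names here are made up): the current sub's frame is replaced, the callee receives the current @_, and caller() no longer sees the sub that did the goto.

```perl
use strict;
use warnings;

# Reports its arguments and the name of the sub one frame up.
sub report   { my $parent = (caller(1))[3]; return "args=(@_) parent=$parent" }

sub via_call { return report(@_) }   # ordinary call: adds a stack frame
sub via_goto { goto &report }        # replaces this frame, passes @_ along

sub outer_call { return via_call('a', 'b') }
sub outer_goto { return via_goto('a', 'b') }

print outer_call(), "\n";   # args=(a b) parent=main::via_call
print outer_goto(), "\n";   # args=(a b) parent=main::outer_goto
```

This is why the construct shows up in AUTOLOAD handlers and wrappers that want to stay invisible to the code they dispatch to.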

The debugger.

use strict and use warnings. These can save you from a bunch of typos.

Throne answered 2/10, 2008 at 11:49 Comment(1)
Why don't other languages have this feature? This feature used makes perl web scripts an order of magnitude more secure.Disincentive
M
24

New Block Operations

I'd say the ability to expand the language, creating pseudo block operations is one.

  1. You declare the prototype for a sub indicating that it takes a code reference first:

    sub do_stuff_with_a_hash (&\%) {
        my ( $block_of_code, $hash_ref ) = @_;
        while ( my ( $k, $v ) = each %$hash_ref ) { 
            $block_of_code->( $k, $v );
        }
    }
    
  2. You can then call it in the body like so

    use feature 'say';   # say is not enabled by default
    use Data::Dumper;
    
    do_stuff_with_a_hash {
        local $Data::Dumper::Terse = 1;
        my ( $k, $v ) = @_;
        say qq(Hey, the key   is "$k"!);
        say sprintf qq(Hey, the value is "%s"!), Dumper( $v );
    
    } %stuff_for
    ;
    

(Data::Dumper::Dumper is another semi-hidden gem.) Notice how you don't need the sub keyword in front of the block, or the comma before the hash. It ends up looking a lot like: map { } @list

Source Filters

Also, there are source filters, where Perl passes you the source code so you can manipulate it. Both this and the block operations are pretty much don't-try-this-at-home type of things.

I have done some neat things with source filters, for example like creating a very simple language to check the time, allowing short Perl one-liners for some decision making:

perl -MLib::DB -MLib::TL -e 'run_expensive_database_delete() if $hour_of_day < AM_7';

Lib::TL would just scan for both the "variables" and the constants, create them and substitute them as needed.

Again, source filters can be messy, but are powerful. But they can mess debuggers up something terrible--and even warnings can be printed with the wrong line numbers. I stopped using Damian's Switch because the debugger would lose all ability to tell me where I really was. But I've found that you can minimize the damage by modifying small sections of code, keeping them on the same line.

Signal Hooks

It's often enough done, but it's not all that obvious. Here's a die handler that piggy backs on the old one.

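my $old_die_handler = $SIG{__DIE__};
$SIG{__DIE__}       
    = sub {
        say q(Hey! I'm DYIN' over here!);
        # chain to the previous handler only if one was actually installed
        goto &$old_die_handler if ref($old_die_handler) eq 'CODE';
    }
    ;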

That means whenever some other module in the code wants to die, they gotta come to you (unless someone else does a destructive overwrite on $SIG{__DIE__}). And you can be notified that somebody thinks something is an error.

Of course, for enough things you can just use an END { } block, if all you want to do is clean up.

overload::constant

You can inspect literals of a certain type in packages that include your module. For example, if you use this in your import sub:

overload::constant 
    integer => sub { 
        my $lit = shift;
        return $lit > 2_000_000_000 ? Math::BigInt->new( $lit ) : $lit 
    };

it will mean that every integer greater than 2 billion in the calling packages will get changed to a Math::BigInt object. (See overload::constant).

Grouped Integer Literals

While we're at it. Perl allows you to break up large numbers into groups of three digits and still get a parsable integer out of it. Note 2_000_000_000 above for 2 billion.

Multiply answered 2/10, 2008 at 11:49 Comment(3)
When using $SIG{__DIE__} handlers, it's strongly recommended that you inspect $^S to see if your program is actually dying, or just throwing an exception which is going to be caught. Usually you don't want to interfere with the latter.Trinomial
The new block is very instructive ! I was thinking It was a language semantic! many thanks.Arbuthnot
An instructive use of the source filter is pdl's NiceSlice (pdl.perl.org/?docs=NiceSlice&title=PDL::NiceSlice) so that one doesn't need to use the ->slice as a method every time a slice is needed.Kassie
A
24

Binary "x" is the repetition operator:

print '-' x 80;     # print row of dashes

It also works with lists:

print for (1, 4, 9) x 3; # print 149149149
Addax answered 2/10, 2008 at 11:49 Comment(2)
This is one reason why Perl has been so popular with hackers. perl -e 'print 0x000 x 25';Episternum
My favorite use for this is generating placeholders for the last part of an SQL INSERT statement: @p = ('?') x $n; $p = join(", ", @p); $sql = "INSERT ... VALUES ($p)";Affecting
P
22

Based on the way the "-n" and "-p" switches are implemented in Perl 5, you can write a seemingly incorrect program including }{:

ls |perl -lne 'print $_; }{ print "$. Files"'

which is converted internally to this code:

LINE: while (defined($_ = <ARGV>)) {
    print $_; }{ print "$. Files";
}
Primaveras answered 2/10, 2008 at 11:49 Comment(2)
@martin clayton: Why is it called that?Slimy
@Slimy - because it, supposedly, looks like two people rubbing noses. In profile, if you see what I mean.Neisa
D
18

map - not only because it makes one's code more expressive, but because it gave me an impulse to read a little bit more about this "functional programming".

Driveway answered 2/10, 2008 at 11:49 Comment(0)
T
18

This is a meta-answer, but the Perl Tips archives contain all sorts of interesting tricks that can be done with Perl. The archive of previous tips is on-line for browsing, and can be subscribed to via mailing list or atom feed.

Some of my favourite tips include building executables with PAR, using autodie to throw exceptions automatically, and the use of the switch and smart-match constructs in Perl 5.10.

Disclosure: I'm one of the authors and maintainers of Perl Tips, so I obviously think very highly of them. ;)

Trinomial answered 2/10, 2008 at 11:49 Comment(1)
It's probably one of the best documented languages out there, and set the pattern for tools to search documentation. That the list in this question is probably not as needed as for other languages.Multiply
P
18

Let's start easy with the Spaceship Operator.

$a = 5 <=> 7;  # $a is set to -1
$a = 7 <=> 5;  # $a is set to 1
$a = 6 <=> 6;  # $a is set to 0
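Where this pays off is in sort blocks. Because the result is 0 for equal values, comparisons can be chained with || to get a primary and a secondary sort key. A small sketch with made-up records:

```perl
use strict;
use warnings;

# Hypothetical records, for illustration only.
my @people = (
    { name => 'Bob',   age => 30 },
    { name => 'Alice', age => 30 },
    { name => 'Carol', age => 25 },
);

# Numeric compare on age first; on a tie (<=> returns 0, which is false)
# fall through to the string compare on name.
my @sorted = sort {
    $a->{age} <=> $b->{age} || $a->{name} cmp $b->{name}
} @people;

my @names = map { $_->{name} } @sorted;
print "@names\n";   # Carol Alice Bob
```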
Primaveras answered 2/10, 2008 at 11:49 Comment(3)
@Leon: C/C++ doesn't do a 3 value return for numbers. If memory serves String compare functions are the only 3 value return that I know of in the whole STL language. AFAIK Python doesn't have a 3 return numeric compare. Java doesn't have a number specific 3 return compare either.Episternum
It's worth mentioning what's so useful about -1/0/1 comparison operators, since not everyone might know: you can chain them together with the or-operator to do primary/secondary/etc. sorts. So ($a->lname cmp $b->lname) || ($a->fname cmp $b->fname) sorts people by their last names, but if two people have the same last name then they will be ordered by their first name.Tifanie
@Episternum Python does have a 3-value compare: cmp() >>> print (cmp(5,7), cmp(6,6), cmp(7,5)) (-1, 0, 1)Errhine
S
15

The continue clause on loops. It will be executed at the bottom of every loop, even those which are next'ed.

while( <> ){
  print "top of loop\n";
  chomp;

  next if /next/i;
  last if /last/i;

  print "bottom of loop\n";
}continue{
  print "continue\n";
}
Sleeper answered 2/10, 2008 at 11:49 Comment(0)
M
15

My vote would go for the (?{}) and (??{}) groups in Perl's regular expressions. The first executes Perl code and ignores the return value; the second executes code and uses the return value as a regular expression.
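A minimal sketch of both constructs, with made-up input strings. The first counts how many times part of a pattern matched; the second builds the rest of the pattern from what was captured so far (a digit followed by exactly that many x characters). A package variable is used for the counter because older perls had trouble with lexicals inside these blocks.

```perl
use strict;
use warnings;

# (?{ ... }): run code each time the regex engine passes this point.
our $count = 0;
"aaaa" =~ /(?:a(?{ $count++ }))*/;

# (??{ ... }): the code returns a string that is compiled and matched
# as a pattern right here. For "3xxx" it returns "x{3}".
my $re = qr/^(\d)(??{ "x{$1}" })\z/;
my $ok  = "3xxx" =~ $re ? 1 : 0;
my $bad = "3xx"  =~ $re ? 1 : 0;

print "count=$count ok=$ok bad=$bad\n";   # count=4 ok=1 bad=0
```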

Monney answered 2/10, 2008 at 11:49 Comment(5)
perl invented so many regexp extensions that other programs now often use pcre (perl compatible regex) instead of the original regex language.Primaveras
Read the little blurb here perldoc.perl.org/… :-DEpisternum
Perl really has (as far as I know) led the pack when it comes to regexps.Prophylactic
This, as far as I'm aware, is still experimental, and may not work the same way in future Perls. Not to say that it isn't useful, but a slightly safer and just as useable version can be found in the s/// command's /e flag: s/(pattern)/reverse($1);/ge; # reverses all patterns.Compellation
@Chris Lutz, @Leon Timmerman: Note that those two constructs are now reëntrant. Also note that the second one need no longer be used to effect recursive patterns, now that we can recurse on capture groups. @Brad Gilbert: That’s right, although PCRE does a decent job of tracking us; one area of regex excellence where Perl is completely unchallenged is its access to Unicode properties; see my unitrio distribution of uninames, unichars, and especially uniprops to see just part of what I mean.Slimy
E
13
while(/\G(\b\w*\b)/g) {
     print "$1\n";
}

the \G anchor. It's hot.
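Combined with the /gc flags it makes a neat little lexer: each pattern is tried exactly where the previous match stopped, and /c keeps that position when a pattern fails. A sketch, assuming a toy input and made-up token names:

```perl
use strict;
use warnings;

my $src = "x = 42 + y";   # toy input, for illustration only
my @tokens;

for ($src) {              # alias $_ to $src so the matches can stay short
    while (1) {
        if    (/\G\s+/gc)    { next }                      # skip whitespace
        elsif (/\G(\d+)/gc)  { push @tokens, "NUM($1)" }   # before \w+, which also matches digits
        elsif (/\G(\w+)/gc)  { push @tokens, "ID($1)" }
        elsif (/\G([=+])/gc) { push @tokens, "OP($1)" }
        else                 { last }                      # end of input or unknown character
    }
}

print "@tokens\n";   # ID(x) OP(=) NUM(42) OP(+) ID(y)
```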

Episternum answered 2/10, 2008 at 11:49 Comment(3)
...and it indicates the position of the end of the previous match.Surfacetosurface
But you have to call your regex in scalar context.Regicide
@davidnicol: The above code works. Can you clarify what you mean?Episternum
D
13

The m// operator has some obscure special cases:

  • If you use ? as the delimiter it only matches once unless you call reset.
  • If you use ' as the delimiter the pattern is not interpolated.
  • If the pattern is empty it uses the pattern from the last successful match.
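A sketch of the first case, with a made-up helper sub and word lists: the m?...? in first_a succeeds once and then refuses to match again until reset is called in the same package.

```perl
use strict;
use warnings;

# Hypothetical helper: collects the arguments that start with "a".
sub first_a {
    my @hit;
    for (@_) { push @hit, $_ if m?^a? }   # succeeds only once between resets
    return @hit;
}

my @once  = first_a(qw(apple avocado));   # only the first match is seen
my @again = first_a(qw(apricot));         # nothing: the pattern is used up
reset;                                    # re-arm the m?? searches in this package
my @after = first_a(qw(apricot));

print "once=@once again=@again after=@after\n";   # once=apple again= after=apricot
```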
Docilla answered 2/10, 2008 at 11:49 Comment(6)
These are more like hidden gotchas than hidden features! I don't know anyone who likes them. A thread on p5p some time back discussed the usefulness of a putative m/$foo/r flag, where /r would mean no interpolation (the letter isn't important) since no-one can ever remember the single quotes thing.Wellbred
@dland: Agreed; I'd call these hidden misfeatures and would never use them in production code.Docilla
I can't imagine a Perl programmer being unable to remember (or even guess) that single quotes stand for no interpolation. Its usage with this semantics is almost universal in the language that I'd rather expect this to be so...Pm
and if the pattern is empty and the last successful match was compiled with the /o modifier, from then on it will be stuck on that pattern.Regicide
I think the empty pattern behaviour has been deprecated. Primarily because a pattern like m/$foo/ becomes a nasty bug when $foo is empty.Flummery
@sundar: yeah, pretty much. But qq'...' and q"..." will throw you for a loop because of the cognitive dissonance, and because the quotes do not there override what really happens. qx'...' behaves right though.Slimy
S
12

The null filehandle diamond operator <> has its place in building command line tools. It acts like <FH> to read from a handle, except that it magically selects whichever is found first: command line filenames or STDIN. Taken from perlop:

while (<>) {
...         # code for each line
}
Scruffy answered 2/10, 2008 at 11:49 Comment(2)
It also follows the UNIX semantics of using "-" to mean "read from stdin". So you could do perl myscript.pl file1.txt - file2.txt, and perl would process the first file, then stdin, then the second file.Tetrapody
You can overload the <> operator on your own objects (<$var>) to work like an iterator. However it does not work as you could expect in list context.Subtotal
B
11
rename("$_.part", $_) for "data.txt";

renames data.txt.part to data.txt without having to repeat myself.

Bituminous answered 2/10, 2008 at 11:49 Comment(0)
A
11

Special code blocks such as BEGIN, CHECK and END. They come from Awk, but work differently in Perl, because it is not record-based.

The BEGIN block can be used to specify some code for the parsing phase; it is also executed when you do the syntax-and-variable-check perl -c. For example, to load in configuration variables:

BEGIN {
    eval {
        require 'config.local.pl';
    };
    if ($@) {
        require 'config.default.pl';
    }
}
Addax answered 2/10, 2008 at 11:49 Comment(0)
P
10

A bit obscure is the tilde-tilde "operator" which forces scalar context.

print ~~ localtime;

is the same as

print scalar localtime;

and different from

print localtime;
Primaveras answered 2/10, 2008 at 11:49 Comment(5)
This is especially obscure because perl5.10.0 also introduces the "smart match operator" ~~, which can do regex matches, can look if an item is contained in an array and so on.Wobble
That's not obscure, that's obfuscated (and useful for golf and JAPHs).Docilla
This is not correct! ~~ is not safe on references! It stringifies them.Monney
Well, yes. Stringification is what happens to references when forced into scalar context. How does that make "~~ forces scalar context" incorrect?Surfacetosurface
@Nomad Dervish: Scalar context /= stringification. e.g. "$n = @a" is scalar context. "$s = qq'@a'" is stringification. With regard to references, "$ref1 = $ref2" is scalar context, but does not stringify.Docilla
O
9

The input record separator can be set to a reference to a number to read fixed length records:

$/ = \3; print $_,"\n" while <>; # output three chars on each line
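
A self-contained variant of the same idea, reading from an in-memory filehandle instead of STDIN (the sample string is just an assumption for the demo). Note that the last record can be shorter than the requested length:

```perl
use strict;
use warnings;

my $data = "abcdefgh";
open(my $fh, '<', \$data) or die "Cannot open in-memory handle: $!";

my @records;
{
    local $/ = \3;                    # a reference to an integer: read 3 bytes at a time
    @records = <$fh>;
}
close $fh;

print join('|', @records), "\n";      # prints: abc|def|gh
```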
Ornithorhynchus answered 2/10, 2008 at 11:49 Comment(0)
J
9

The goatse operator*:

$_ = "foo bar";
my $count =()= /[aeiou]/g; #3

or

sub foo {
    return @_;
}

$count =()= foo(qw/a b c d/); #4

It works because list assignment in scalar context yields the number of elements in the list being assigned.

* Note, not really an operator

Jilli answered 2/10, 2008 at 11:49 Comment(1)
That is the most (well, least) beautiful "operator" ever.Compellation
I
9

The "desperation mode" of Perl's loop control constructs, which causes them to look up the call stack for a matching label, allows some curious behaviors that Test::More takes advantage of, for better or worse.

SKIP: {
    skip() if $something;

    print "Never printed";
}

sub skip {
    no warnings "exiting";
    last SKIP;
}

There's the little known .pmc file. "use Foo" will look for Foo.pmc in @INC before Foo.pm. This was intended to allow compiled bytecode to be loaded first, but Module::Compile takes advantage of this to cache source filtered modules for faster load times and easier debugging.

The ability to turn warnings into errors.

local $SIG{__WARN__} = sub { die @_ };
$num = "two";
$sum = 1 + $num;
print "Never reached";

That's what I can think of off the top of my head that hasn't been mentioned.

Iaria answered 2/10, 2008 at 11:49 Comment(0)
R
9

tie, the variable tying interface.
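
For anyone who hasn't seen it, a minimal sketch of a tied scalar. The UpperCase class is hypothetical, invented for this example; only TIESCALAR, FETCH and STORE are the method names Perl actually looks for:

```perl
use strict;
use warnings;

package UpperCase;                      # hypothetical class name
sub TIESCALAR { my ($class) = @_; my $value = ''; return bless \$value, $class }
sub FETCH     { my ($self) = @_; return $$self }              # called on every read
sub STORE     { my ($self, $new) = @_; $$self = uc $new }     # called on every write

package main;

tie my $shout, 'UpperCase';
$shout = 'hello';                       # goes through STORE
print "$shout\n";                       # goes through FETCH, prints: HELLO
```

Arrays, hashes and filehandles can be tied the same way with their own sets of methods (see perldoc perltie).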

Regicide answered 2/10, 2008 at 11:49 Comment(0)
E
7

You can use @{[...]} to get an interpolated result of complex perl expressions

$a = 3;
$b = 4;

print "$a * $b = @{[$a * $b]}";

prints: 3 * 4 = 12

Eli answered 2/10, 2008 at 11:49 Comment(0)
I
7

This one isn't particularly useful, but it's extremely esoteric. I stumbled on this while digging around in the Perl parser.

Before there was POD, perl4 had a trick to allow you to embed the man page, as nroff, straight into your program so it wouldn't get lost. perl4 used a program called wrapman (see Pink Camel page 319 for some details) to cleverly embed an nroff man page into your script.

It worked by telling nroff to ignore all the code, and then putting the meat of the man page after an __END__ tag, which tells Perl to stop processing code. It looked something like this:

#!/usr/bin/perl
'di';
'ig00';

...Perl code goes here, ignored by nroff...

.00;        # finish .ig

'di         \" finish the diversion
.nr nl 0-1  \" fake up transition to first page
.nr % 0     \" start at page 1
'; __END__

...man page goes here, ignored by Perl...

The details of the roff magic escape me, but you'll notice that the roff commands are strings or numbers in void context. Normally a constant in void context produces a warning. There are special exceptions in op.c to allow void context strings which start with certain roff commands.

              /* perl4's way of mixing documentation and code
                 (before the invention of POD) was based on a
                 trick to mix nroff and perl code. The trick was
                 built upon these three nroff macros being used in
                 void context. The pink camel has the details in
                 the script wrapman near page 319. */
                const char * const maybe_macro = SvPVX_const(sv);
                if (strnEQ(maybe_macro, "di", 2) ||
                    strnEQ(maybe_macro, "ds", 2) ||
                    strnEQ(maybe_macro, "ig", 2))
                        useless = NULL;

This means that 'di'; doesn't produce a warning, but neither does 'die'; 'did you get that thing I sentcha?'; or 'ignore this line';.

In addition, there are exceptions for the numeric constants 0 and 1, which allow the bare .00;. The code claims this was for more general purposes.

            /* the constants 0 and 1 are permitted as they are
               conventionally used as dummies in constructs like
                    1 while some_condition_with_side_effects;  */
            else if (SvNIOK(sv) && (SvNV(sv) == 0.0 || SvNV(sv) == 1.0))
                useless = NULL;

And what do you know, 2 while condition does warn!

Iaria answered 2/10, 2008 at 11:49 Comment(1)
This one is really a hidden feature of Perl!Subtotal
O
7

I don't know how esoteric it is, but one of my favorites is the hash slice. I use it for all kinds of things. For example to merge two hashes:

my %number_for = (one => 1, two => 2, three => 3);
my %your_numbers = (two => 2, four => 4, six => 6);
@number_for{keys %your_numbers} = values %your_numbers;
print sort values %number_for; # 12346
Osset answered 2/10, 2008 at 11:49 Comment(1)
%number_for = ( %number_for, %your_numbers );Placative
B
6
use diagnostics;

If you are starting to work with Perl and have never done so before, this module will save you tons of time and hassle. For almost every basic error message you can get, this module will give you a lengthy explanation as to why your code is breaking, including some helpful hints as to how to fix it. For example:

use strict;
use diagnostics;

$var = "foo";

gives you this helpful message:

Global symbol "$var" requires explicit package name at - line 4.
Execution of - aborted due to compilation errors (#1)
    (F) You've said "use strict vars", which indicates that all variables
    must either be lexically scoped (using "my"), declared beforehand using
    "our", or explicitly qualified to say which package the global variable
    is in (using "::").

Uncaught exception from user code:
        Global symbol "$var" requires explicit package name at - line 4.
Execution of - aborted due to compilation errors.
 at - line 5

If you have this code:

use diagnostics;
use strict;

sub myname {
    print { " Some Error " };
};

you get this large, helpful chunk of text:

syntax error at - line 5, near "};"
Execution of - aborted due to compilation errors (#1)
(F) Probably means you had a syntax error.  Common reasons include:

    A keyword is misspelled.
    A semicolon is missing.
    A comma is missing.
    An opening or closing parenthesis is missing.
    An opening or closing brace is missing.
    A closing quote is missing.

Often there will be another error message associated with the syntax
error giving more information.  (Sometimes it helps to turn on -w.)
The error message itself often tells you where it was in the line when
it decided to give up.  Sometimes the actual error is several tokens
before this, because Perl is good at understanding random input.
Occasionally the line number may be misleading, and once in a blue moon
the only way to figure out what's triggering the error is to call
perl -c repeatedly, chopping away half the program each time to see
if the error went away.  Sort of the cybernetic version of 20 questions.

Uncaught exception from user code:
    syntax error at - line 5, near "};"
Execution of - aborted due to compilation errors.
at - line 7

From there you can go about deducing what might be wrong with your program (in this case, print is formatted entirely wrong). diagnostics has explanations for a large number of known errors. While it is not something to leave enabled in production, it can serve as a great learning aid for those who are new to Perl.

Bunnybunow answered 2/10, 2008 at 11:49 Comment(0)
B
6
sub load_file
{
    local(@ARGV, $/) = shift;
    <>;
}

and a version that returns an array as appropriate:

sub load_file
{
    local @ARGV = shift;
    local $/ = wantarray? $/: undef;
    <>;
}
Bituminous answered 2/10, 2008 at 11:49 Comment(0)
E
5

@Schwern mentioned turning warnings into errors by localizing $SIG{__WARN__}. You can also do this (lexically) with use warnings FATAL => "all";. See perldoc lexwarn.
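
A small sketch of the lexical version. I picked only the 'numeric' category here for illustration, and the variable names are made up. Once the warning is fatal, it can be caught like any other exception:

```perl
use strict;
use warnings FATAL => 'numeric';        # only this category becomes fatal

my $num = 'two';
my $sum = eval { 1 + $num };            # the warning is thrown as an exception
my $error = $@;

print defined $sum ? "sum is $sum\n" : "caught: $error";
```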

On that note, since Perl 5.12, you've been able to say perldoc foo instead of the full perldoc perlfoo. Finally! :)

Epicardium answered 2/10, 2008 at 11:49 Comment(0)
H
5

($x, $y) = ($y, $x) is what made me want to learn Perl.

The list constructor 1..99 or 'a'..'zz' is also very nice.
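
Both in one runnable snippet (variable names are just for the demo). The string range uses Perl's magic increment, so it runs from a to z and then from aa to zz:

```perl
use strict;
use warnings;

my ($x, $y) = (1, 2);
($x, $y) = ($y, $x);                    # swap without a temporary variable

my @labels = ('a' .. 'zz');             # magic string increment: a..z, then aa..zz

print "$x $y\n";                                        # prints: 2 1
print scalar(@labels), " $labels[26] $labels[-1]\n";    # prints: 702 aa zz
```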

Holmberg answered 2/10, 2008 at 11:49 Comment(0)
P
5

There is also $[, the variable that decides at which index arrays start. The default is 0, so arrays start at index 0. By setting

$[=1;

You can make Perl behave more like AWK (or Fortran) if you really want to.

Primaveras answered 2/10, 2008 at 11:49 Comment(3)
Although to quote from the perlvar documentation: "Its use is highly discouraged.". Not many people expect the starting subscript of an array to change.Trinomial
I would only use this feature in a one-liner, if ever.Prophylactic
be warned: if you do this in a CPAN module, chromatic will find you and spray-paint your car. (this is a joke; chromatic has not been consulted regarding it.)Regicide
W
4

This one-liner illustrates how to use glob to generate all word combinations of an alphabet (A, T, C, and G -> DNA) for words of a specified length (4):

perl -MData::Dumper -e '@CONV = glob( "{A,T,C,G}" x 4 ); print Dumper( \@CONV )'
Woodchopper answered 2/10, 2008 at 11:49 Comment(0)
C
4

One useful composite operator for conditionally adding strings or lists into other lists is the x!! operator:

 print 'the meaning of ', join ' ' =>  
     'life,'                x!! $self->alive,
     'the universe,'        x!! ($location ~~ Universe),
     ('and', 'everything.') x!! 42; # this is added as a list

this operator allows for a reversed syntax similar to

 do_something() if test();
Cellulose answered 2/10, 2008 at 11:49 Comment(0)
C
4

The Schwartzian Transform is a technique that allows you to efficiently sort by a computed, secondary index. Let's say that you wanted to sort a list of strings by their md5 sum. The comments below are best read backwards (that's the order I always end up writing these anyways):

my @strings = ('one', 'two', 'three', 'four');

my @md5sorted_strings =
    map { $_->[0] }               # 4) map back to the original value
    sort { $a->[1] cmp $b->[1] }  # 3) sort by the computed key (second element)
    map { [$_, md5sum_func($_)] } # 2) create a list of [string, key] array refs
    @strings;                     # 1) take strings

This way, you only have to do the expensive md5 computation N times, rather than N log N times.
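
Here is a runnable sketch of the same transform. It assumes md5_hex from the core Digest::MD5 module as a stand-in for md5sum_func, and the key_for helper and $calls counter are made up to show that each key is computed only once:

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);            # core module; stands in for md5sum_func

my @strings = ('one', 'two', 'three', 'four');

my $calls = 0;                          # counts how often a key is computed
sub key_for { $calls++; return md5_hex($_[0]) }

my @md5sorted_strings =
    map  { $_->[0] }                    # 3) unwrap the original string
    sort { $a->[1] cmp $b->[1] }        # 2) compare the cached keys
    map  { [ $_, key_for($_) ] }        # 1) compute each key exactly once
    @strings;

print "$calls key computations for ", scalar(@strings), " strings\n";
print "@md5sorted_strings\n";
```

Calling key_for directly inside the sort block would give the same order, but the key would be recomputed on every comparison.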

Clammy answered 2/10, 2008 at 11:49 Comment(0)
I
4

Use lvalues to make your code really confusing:

my $foo = undef ;
sub bar:lvalue{ return $foo ;}

# Then later

bar = 5 ;
print bar ;
Inkling answered 2/10, 2008 at 11:49 Comment(0)
M
4

All right. Here is another. Dynamic Scoping. It was talked about a little in a different post, but I didn't see it here on the hidden features.

Like autovivification, dynamic scoping is found in only a limited number of languages. Perl and Common Lisp are the only two I know of that use dynamic scoping.
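
A minimal sketch of what that means in practice (sub and variable names are invented for the example). The value seen inside show() depends on who called it, not on where it was defined:

```perl
use strict;
use warnings;

our $level = 'outer';                   # package variable: can be localized

sub show { return $level }              # sees whatever value is current at call time

sub with_local {
    local $level = 'inner';             # temporary value, restored on scope exit
    return show();
}

my @seen = (show(), with_local(), show());
print "@seen\n";                        # prints: outer inner outer
```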

Meredi answered 2/10, 2008 at 11:49 Comment(1)
Ohh ya, "local" designates a dynamically scoped variable, while "my" designates a variable in a static scope.Episternum
C
4

How about the ability to use

my @symbols = map { +{ 'key' => $_ } } @things;

to generate an array of hashrefs from an array -- the + in front of the hashref disambiguates the block so the interpreter knows that it's a hashref and not a code block. Awesome.

(Thanks to Dave Doyle for explaining this to me at the last Toronto Perlmongers meeting.)

Columelliform answered 2/10, 2008 at 11:49 Comment(0)
N
4

Core IO::Handle module. Most important thing for me is that it allows autoflush on filehandles. Example:

use IO::Handle;    
$log->autoflush(1);
Nuclei answered 2/10, 2008 at 11:49 Comment(1)
Not like we didn’t know how to do it before IO::Handle, you know.Slimy
C
4

Safe compartments.

With the Safe module you can build your own sandbox-style environment using nothing but perl. You would then be able to load perl scripts into the sandbox.
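
A rough sketch of how that looks, using the default operator mask (the code strings passed in are just examples). reval runs a string of Perl inside the compartment and sets $@ when something is refused:

```perl
use strict;
use warnings;
use Safe;

my $compartment = Safe->new;            # compartment with the default operator mask

my $sum = $compartment->reval('2 + 3'); # harmless code runs normally
print "sum: $sum\n";

$compartment->reval('system("echo unsafe")');   # system is not allowed by default
my $blocked = $@;                       # set when the compartment refuses an opcode
print "blocked: $blocked" if $blocked;
```

Keep in mind that the Safe documentation itself warns it is not a complete security solution, so treat this as one layer and not the whole sandbox.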

Best regards,

Callboy answered 2/10, 2008 at 11:49 Comment(0)
V
3

You can use different quotes on HEREDOCS to get different behaviors.

my $interpolation = "We will interpolate variables";
print <<"END";
With double quotes, $interpolation, just like normal HEREDOCS.
END

print <<'END';
With single quotes, the variable $foo will *not* be interpolated.
(You have probably seen this in other languages.)
END

## this is the fun and "hidden" one
my $shell_output = <<`END`;
echo With backticks, these commands will be executed in shell.
echo The output is returned.
ls | wc -l
END

print "shell output: $shell_output\n";
Ventricose answered 2/10, 2008 at 11:49 Comment(0)
I
3

use re debug
Doc on use re debug

and

perl -MO=Concise[,OPTIONS]
Doc on Concise

Besides being exquisitely flexible, expressive and amenable to programming in the style of C, Pascal, Python and other languages, Perl has several pragmas and command switches that make it my 'goto' language for initial noodling on an algorithm, regex, or quick problem that needs to be solved. These two are unique to Perl I believe, and are among my favorites.

use re debug: Most modern flavors of regular expressions owe their current form and function to Perl. While there are many Perl forms of regex that cannot be expressed in other languages, there are almost no forms of other languages' flavor of regex that cannot be expressed in Perl. Additionally, Perl has a wonderful regex debugger built in to show how the regex engine is interpreting your regex and matching against the target string.

Example: I recently was trying to write a simple CSV routine. (Yes, yes, I know, I should have been using Text::CSV...) but the CSV values were not quoted and simple.

My first take was /^(?:(.*?),){$i}/ to extract the i-th record of n CSV records. That works fine -- except for the last record, or n of n. I could see that without the debugger.

Next I tried /^(?:(.*?),|$){$i}/ This did not work, and I could not see immediately why. I thought I was saying (.*?) followed by a comma or EOL. Then I added use re 'debug'; at the top of a small test script. Ahh yes, the alternation between ,|$ was not being interpreted that way; it was being interpreted as ((.*?),) | ($) -- not what I wanted.

A new grouping was needed. So I arrived at the working /^(?:(.*?)(?:,|$)){$i}/. While I was in the regex debugger, I was surprised how many loops it took for a match towards the end of the string. It is the .*? term that is quite ambiguous and requires excessive backtracking to satisfy. So I tried /^(?:(?:^|,)([^,]*)){$i}/ This does two things: 1) it reduces backtracking because of the greedy match of all but a comma, and 2) it allows the regex optimizer to use the alternation only once, on the first field. Using Benchmark, this is 35% faster than the first regex. The regex debugger is wonderful and few use it.

perl -MO=Concise[,OPTIONS]: The B and Concise frameworks are tremendous tools to see how Perl is interpreting your masterpiece. Using -MO=Concise prints the result of the Perl interpreter's translation of your source code. There are many options to Concise, and with B you can write your own presentation of the OP codes.

As in this post, you can use Concise to compare different code structures. You can interleave your source lines with the OP codes those lines generate. Check it out.

Internuncio answered 2/10, 2008 at 11:49 Comment(0)
S
3

There is a more powerful way to check a program for syntax errors:

perl -w -MO=Lint,no-context myscript.pl

The most important thing it can do is report 'nonexistent subroutine' errors.

Salutatory answered 2/10, 2008 at 11:49 Comment(0)
M
3

Quantum::Superpositions

use Quantum::Superpositions;

if ($x == any($a, $b, $c)) { ...  }
Millipede answered 2/10, 2008 at 11:49 Comment(0)
C
3

I personally love the /e modifier to the s/// operation:

while(<>) {
  s/\b(\w{1,4})\b/scalar reverse($1)/ge; # reverses every word of 1 to 4 letters
  print;
}

Input:

This is a test of regular expressions
^D

Output:

sihT si a tset fo regular expressions
Compellation answered 2/10, 2008 at 11:49 Comment(0)
O
3

Very late to the party, but: attributes.

Attributes essentially let you define arbitrary code to be associated with the declaration of a variable or subroutine. The best way to use these is with Attribute::Handlers; this makes it easy to define attributes (in terms of, what else, attributes!).

I did a presentation on using them to declaratively assemble a pluggable class and its plugins at YAPC::2006, online here. This is a pretty unique feature.

Otey answered 2/10, 2008 at 11:49 Comment(0)
D
3

You can replace the delimiter in regexes and strings with just about anything else. This is particularly useful for avoiding "leaning toothpick syndrome", exemplified here:

$url =~ /http:\/\/www\.stackoverflow\.com\//;

You can eliminate most of the back-whacking by changing the delimiter. /bar/ is shorthand for m/bar/ which is the same as m!bar!.

$url =~ m!http://www\.stackoverflow\.com/!;

You can even use balanced delimiters like {} and []. I personally love these. q{foo} is the same as 'foo'.

$code = q{
    if( this is awesome ) {
        print "Look ma, no escaping!";
    }
};

To confuse your friends (and your syntax highlighter) try this:

$string = qq'You owe me $1,000 dollars!';
Dues answered 2/10, 2008 at 11:49 Comment(1)
You should explicitly mention that, when using {} (and friends) as quote delimiters, Perl will balance the delimiters.Compellation
M
3

My favorite semi-hidden feature of Perl is the eof function. Here's an example pretty much directly from perldoc -f eof that shows how you can use it to reset the file name and $. (the current line number) easily across multiple files loaded up at the command line:

while (<>) {
  print "$ARGV:$.\t$_";
} 
continue {
  close ARGV if eof
}
Multifold answered 2/10, 2008 at 11:49 Comment(0)
D
2

The feature I like the best is statement modifiers.

Don't know how many times I've wanted to do:

say 'This will output' if 1;
say 'This will not output' unless 1;
say 'Will say this 3 times. The first Time: '.$_ for 1..3;

in other languages. etc...

The 'etc' reminded me of another 5.12 feature, the Yada Yada operator.

This is great, for the times when you just want a place holder.

sub something_really_important_to_implement_later {
    ...
} 

Check it out: Perl Docs on Yada Yada Operator.

Dabchick answered 2/10, 2008 at 11:49 Comment(1)
It’s an ellipsis, actually.Slimy
N
2

You can expand function calls in a string, for example:

print my $foo = "foo @{[scalar(localtime)]} bar";

foo Wed May 26 15:50:30 2010 bar

Narwhal answered 2/10, 2008 at 11:49 Comment(0)
D
2

The new -E option on the command line:

> perl -e "say 'hello'" # does not work 

String found where operator expected at -e line 1, near "say 'hello'"
        (Do you need to predeclare say?)
syntax error at -e line 1, near "say 'hello'"
Execution of -e aborted due to compilation errors.

> perl -E "say 'hello'" 
hello
Discriminant answered 2/10, 2008 at 11:49 Comment(0)
T
2

The ability to use a hash as a seen filter in a loop. I have yet to see something quite as nice in a different language. For example, I have not been able to duplicate this in python.

For example, I want to print a line if it has not been seen before.

my %seen;

for (<LINE>) {
  print $_ unless $seen{$_}++;
}
Tomasz answered 2/10, 2008 at 11:49 Comment(0)
H
2

Two things that work well together: IO handles on in-core strings, and using function prototypes to enable you to write your own functions with grep/map-like syntax.

use IO::Handle;

sub with_output_to_string(&) {           # allows compiler to accept "yoursub {}" syntax.
  my $function = shift;
  my $string   = '';
  my $handle   = IO::Handle->new();
  open($handle, '>', \$string) || die $!; # IO handle on a plain scalar string ref
  my $old_handle = select $handle;
  eval { $function->() };
  select $old_handle;
  die $@ if $@;
  return $string;
}

my $greeting = with_output_to_string {
  print "Hello, world!";
};

print $greeting, "\n";
Hejira answered 2/10, 2008 at 11:49 Comment(0)
V
2

The following are just as short but more meaningful than "~~" since they indicate what is returned, and there's no confusion with the smart match operator:

print "".localtime;   # Request a string

print 0+@array;       # Request a number
Valoniah answered 2/10, 2008 at 11:49 Comment(0)
E
2

Axeman reminded me of how easy it is to wrap some of the built-in functions.

Before Perl 5.10 Perl didn't have a pretty print(say) like Python.

So in your local program you could do something like:

sub print {
     print @_, "\n";
}

or add in some debug.

sub print {
    exists $ENV{DEVELOPER} ?
    print Dumper(@_) :
    print @_;
}
Episternum answered 2/10, 2008 at 11:49 Comment(5)
It's also very easy to accidentally change the context! Your print subroutine (using . to concatenate) will print the number of items to be printed, rather than the items themselves. Using print @_, "\n" (note the comma) will preserve the context.Trinomial
:-D tks for the clarification. I will edit accordingly. :-D teach me to write code without running it :-PEpisternum
Er...except I don't think you can override print this way. Most other builtins I think you can, but not print. :|Bunnybunow
This is true - print cannot be overridden.Compellation
@Chris: It's true print cannot be overridden in this manner. But not that it cannot be overridden absolutely. Using some sub modules of B and some tricks you can find on perlmonks.com you can override it.Episternum
D
1

Next time you're at a geek party pull out this one-liner in a bash shell and the women will swarm you and your friends will worship you:

find . -name "*.txt"|xargs perl -pi -e 's/1:(\S+)/uc($1)/ge'

Process all *.txt files and do an in-place find and replace using perl's regex. This one converts text after a '1:' to upper case and removes the '1:'. Uses Perl's 'e' modifier to treat the second part of the find/replace regex as executable code. Instant one-line template system. Using xargs lets you process a huge number of files without running into bash's command line length limit.

Demulsify answered 2/10, 2008 at 11:49 Comment(0)
M
1

Add one for the unpack() and pack() functions, which are great if you need to import and/or export data in a format which is used by other programs.

Of course these days most programs will allow you to export data in XML, and many commonly used proprietary document formats have associated Perl modules written for them. But this is one of those features that is incredibly useful when you need it, and pack()/unpack() are probably the reason that people have been able to write CPAN modules for so many proprietary data formats.
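
To give a flavour of it, here is a tiny sketch with a made-up record layout (a 16-bit big-endian id, an 8-bit flag and a 4-byte space-padded tag). The layout is purely an assumption for the demo, not any real file format:

```perl
use strict;
use warnings;

# Hypothetical layout: n = 16-bit big-endian, C = unsigned byte, A4 = 4 bytes, space padded
my $template = 'n C A4';

my $record = pack($template, 513, 7, 'ab');             # binary string, 7 bytes long
my ($id, $flag, $tag) = unpack($template, $record);     # A strips the trailing padding

printf "%d bytes: id=%d flag=%d tag=%s\n", length($record), $id, $flag, $tag;
# prints: 7 bytes: id=513 flag=7 tag=ab
```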

Millsap answered 2/10, 2008 at 11:49 Comment(0)
T
1

Using hashes (where keys are unique) to obtain the unique elements of a list:

my %unique = map { $_ => 1 } @list;
my @unique = keys %unique;
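
One caveat: keys returns the elements in no particular order. If the original order matters, a common variation (sketched here with sample data of my own) combines grep with a %seen hash:

```perl
use strict;
use warnings;

my @list = qw(pear apple pear fig apple);   # sample data

my %seen;
my @unique = grep { !$seen{$_}++ } @list;   # keep an element only the first time it appears

print "@unique\n";                          # prints: pear apple fig
```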
Tribasic answered 2/10, 2008 at 11:49 Comment(0)
B
1

Interpolation of match regular expressions. A useful application of this is when matching on a blacklist. Without using interpolation it is written like so:

#detecting blacklist words in the current line
/foo|bar|baz/;

Can instead be written

@blacklistWords = ("foo", "bar", "baz");
$anyOfBlacklist = join "|", (@blacklistWords);
/$anyOfBlacklist/;

This is more verbose, but allows for population from a datafile. Also, if the list is maintained in the source for whatever reason, it is easier to maintain the array than the RegExp.
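
If the words come from a datafile they may contain regex metacharacters, so it is probably worth escaping them first. A sketch, with sample words and lines that are my own assumptions:

```perl
use strict;
use warnings;

my @blacklist_words = ('foo', 'a.b', 'c++');    # sample data with metacharacters

# quotemeta escapes each word so it matches literally; qr// compiles the pattern once
my $alternation  = join '|', map { quotemeta($_) } @blacklist_words;
my $blacklist_re = qr/(?:$alternation)/;

my @verdicts;
for my $line ('plain text', 'uses c++ here', 'axb') {
    push @verdicts, ($line =~ $blacklist_re ? 'blocked' : 'ok');
}
print "@verdicts\n";                            # prints: ok blocked ok
```

Without quotemeta, 'c++' would not even compile as a pattern, and 'a.b' would also match 'axb'.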

Bethanybethe answered 2/10, 2008 at 11:49 Comment(0)
S
1

The expression defined &DB::DB returns true if the program is running from within the debugger.

Selfwill answered 2/10, 2008 at 11:49 Comment(0)
I
1

You might think you can do this to save memory:

@is_month{qw(jan feb mar apr may jun jul aug sep oct nov dec)} = undef;

print "It's a month" if exists $is_month{lc $mon};

but it doesn't do that. Perl still assigns a different scalar value to each key. Devel::Peek shows this. PVHV is the hash. Elt is a key and the SV that follows is its value. Note that each SV has a different memory address indicating they're not being shared.

Dump \%is_month, 12;

SV = RV(0x81c1bc) at 0x81c1b0
  REFCNT = 1
  FLAGS = (TEMP,ROK)
  RV = 0x812480
  SV = PVHV(0x80917c) at 0x812480
    REFCNT = 2
    FLAGS = (SHAREKEYS)
    ARRAY = 0x206f20  (0:8, 1:4, 2:4)
    hash quality = 101.2%
    KEYS = 12
    FILL = 8
    MAX = 15
    RITER = -1
    EITER = 0x0
    Elt "feb" HASH = 0xeb0d8580
    SV = NULL(0x0) at 0x804b40
      REFCNT = 1
      FLAGS = ()
    Elt "may" HASH = 0xf2290c53
    SV = NULL(0x0) at 0x812420
      REFCNT = 1
      FLAGS = ()

An undef scalar takes as much memory as an integer scalar, so you might as well just assign them all to 1 and avoid the trap of forgetting to check with exists.

my %is_month = map { $_ => 1 } qw(jan feb mar apr may jun jul aug sep oct nov dec);

print "It's a month" if $is_month{lc $mon};
Iaria answered 2/10, 2008 at 11:49 Comment(3)
This doesn't save memory, and it generates a nice trap for the unsuspecting programmer. Perl still assigns an undef scalar value to each key and undef doesn't take less memory than 1. Use Devel::Peek to see.Iaria
You might be right that the "undef" construct doesn't save memory. However, in my opinion, it's better than your solution for several reasons: 1. the "undef" method tells the reader that the value isn't used 2. the "1" initializer is more complicated for no good reason 3. requiring "exists" is no more trap than many other things in PerlBituminous
Also, note that the "1" method does use more RAM than "undef"! Try creating a program that initializes a million elements this way and then look at the memory footprint using ps. You'll see that the "1" method uses more memory. I think it's true that the data structures are the same size, but the initializer uses more memory.Bituminous
A
1

I'm a bit late to the party, but a vote for the built-in tied-hash function dbmopen() -- it's helped me a lot. It's not exactly a database, but if you need to save data to disk it takes away a lot of the problems and Just Works. It helped me get started when I didn't have a database, didn't understand Storable.pm, but I knew I wanted to progress beyond reading and writing to text files.

Andesine answered 2/10, 2008 at 11:49 Comment(0)
P
0

B::Deparse - Perl compiler backend to produce perl code. Not something you'd use in your daily Perl coding, but could be useful in special circumstances.

If you come across some piece of code that is obfuscated, or a complex expression, pass it through Deparse. Useful to figure out a JAPH or a Perl code that is golfed.

$ perl -e '$"=$,;*{;qq{@{[(A..Z)[qq[0020191411140003]=~m[..]g]]}}}=*_=sub{print/::(.*)/};$\=$/;q<Just another Perl Hacker>->();'
Just another Perl Hacker

$ perl -MO=Deparse -e '$"=$,;*{;qq{@{[(A..Z)[qq[0020191411140003]=~m[..]g]]}}}=*_=sub{print/::(.*)/};$\=$/;q<Just another Perl Hacker>->();'
$" = $,;
*{"@{[('A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z')['0020191411140003' =~ /../g]];}";} = *_ = sub {
    print /::(.*)/;
}
;
$\ = $/;
'Just another Perl Hacker'->();
-e syntax OK

A more useful example is to use deparse to find out the code behind a coderef, that you might have received from another module, or

use B::Deparse;
my $deparse = B::Deparse->new;
$code = $deparse->coderef2text($coderef);
print $code;
Predominant answered 2/10, 2008 at 11:49 Comment(0)
F
0

"now"

sub _now { 
        my ($now) = localtime() =~ /([:\d]{8})/;
        return $now;
}

print _now(), "\n"; #  15:10:33
Fyn answered 2/10, 2008 at 11:49 Comment(1)
That's been answered at https://mcmap.net/q/48721/-hidden-features-of-perl/… already.Wernerwernerite
T
0

Perl is great as a flexible awk/sed.

For example, let's use a simple replacement for ls | xargs stat, naively done like:

$ ls | perl -pe 'print "stat "' | sh 

This doesn't work well when the input (filenames) have spaces or shell special characters like |$\. So single quotes are frequently required in the Perl output.

One complication with calling perl via the command line -ne is that the shell gets first nibble at your one-liner. This often leads to torturous escaping to satisfy it.

One 'hidden' feature that I use all the time is \x27 to include a single quote instead of trying to use shell escaping '\''

So:

$ ls | perl -nle 'chomp; print "stat '\''$_'\''"' | sh

can be more safely written:

$ ls | perl -pe 's/(.*)/stat \x27$1\x27/' | sh

That won't work with funny characters in the filenames, even quoted like that. But this will:

$ ls | perl -pe 's/\n/\0/' | xargs -0 stat
Tefillin answered 2/10, 2008 at 11:49 Comment(0)
C
0

using bare blocks with redo or other control words to create custom looping constructs.

traverse a linked list of objects returning the first ->can('print') method:

sub get_printer {
    my $self = shift;
    {$self->can('print') or $self = $self->next and redo}
}
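
A smaller sketch of the same mechanism, with made-up sample data: a bare block counts as a loop that runs once, so redo simply restarts it.

```perl
use strict;
use warnings;

my @queue = (0, 0, 7, 9);               # sample data: we want to skip the leading zeros
my $first;
{
    $first = shift @queue;
    redo if defined $first && $first == 0;   # restart the block without a loop keyword
}
print "first non-zero: $first\n";       # prints: first non-zero: 7
```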
Cellulose answered 2/10, 2008 at 11:49 Comment(0)
O
0

$0 is the name of the perl script being executed. It can be used to get the context from which a module is being run.

# MyUsefulRoutines.pl

sub doSomethingUseful {
  my @args = @_;
  # ...
}

if ($0 =~ /MyUsefulRoutines.pl/) {
  # someone is running  perl MyUsefulRoutines.pl [args]  from the command line
  &doSomethingUseful (@ARGV);
} else {
  # someone is calling  require "MyUsefulRoutines.pl"  from another script
  1;
}

This idiom is helpful for turning a standalone script with some useful subroutines into a library that can be imported into other scripts. Python has similar functionality with the __name__ == "__main__" idiom.
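
An alternative sketch of the same idea that does not depend on the file name: it checks caller instead of $0. The package and sub names are hypothetical, chosen to mirror the example above:

```perl
package MyUsefulRoutines;               # hypothetical package name
use strict;
use warnings;

sub do_something_useful {
    my @args = @_;
    return 'got ' . scalar(@args) . ' argument(s)';
}

# At file level, caller returns nothing when this file is the program being run,
# and the requiring package when it was loaded with require or use.
print do_something_useful(@ARGV), "\n" unless caller;

1;                                      # true value so that require succeeds
```

This keeps working if the file is renamed or run through a symlink, which the $0 pattern match would not.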

Ornithology answered 2/10, 2008 at 11:49 Comment(1)
What you need is modulinos.Subtotal
M
0

One more...

Perl cache:

my $processed_input = $records || process_inputs($records_file);

On Elpeleg Open Source, Perl CMS http://www.web-app.net/

Mascot answered 2/10, 2008 at 11:49 Comment(0)
S
0

Showing progress in the script by printing on the same line:

$| = 1; # autoflush: flush the selected output handle after every print

for $i(1..100) {
    print "Progress $i %\r"
}
Sweated answered 2/10, 2008 at 11:49 Comment(0)
L
0

@Corion - Bare URLs in Perl? Of course you can, even in interpolated strings. The only time it would matter is in a string that you were actually USING as a regular expression.

Lotson answered 2/10, 2008 at 11:49 Comment(1)
It comes from a joke where, in C++, you could embed, raw and without quotes or comments, a URL in your program: http://www.example.com (the http: is a label, and the // makes the rest a comment). This is what everyone is referring to.Compellation
M
-1

I like the way we can insert an element at any place in the array, such as

=> Insert $x in position $i in array @a

@a = ( 11, 22, 33, 44, 55, 66, 77 );
$x = 10;
$i = 3;

@a = ( @a[0..$i-1], $x, @a[$i..$#a] );
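
For comparison, the built-in splice does the same insertion in place, without rebuilding the array from slices. A sketch using the same values:

```perl
use strict;
use warnings;

my @a = (11, 22, 33, 44, 55, 66, 77);
my ($x, $i) = (10, 3);

splice(@a, $i, 0, $x);                  # remove 0 elements at index $i, insert $x there

print "@a\n";                           # prints: 11 22 33 10 44 55 66 77
```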
Moidore answered 2/10, 2008 at 11:49 Comment(0)
