perl: iterate over a typeglob
Asked Answered
A

7

6

Given a typeglob, how can I find which types are actually defined?

In my application, we user PERL as a simple configuration format. I'd like to require() the user config file, then be able to see which variables are defined, as well as what types they are.

Code: (questionable quality advisory)

#!/usr/bin/env perl

use strict;
use warnings;

my %before = %main::;
require "/path/to/my.config";
my %after = %main::;

foreach my $key (sort keys %after) {
    next if exists $before{$symbol}; 

    local *myglob = $after{$symbol};
    #the SCALAR glob is always defined, so we check the value instead
    if ( defined ${ *myglob{SCALAR} } ) {
        my $val = ${ *myglob{SCALAR} };
        print "\$$symbol = '".$val."'\n" ;
    }
    if ( defined *myglob{ARRAY} ) {
        my @val = @{ *myglob{ARRAY} };
        print "\@$symbol = ( '". join("', '", @val) . "' )\n" ;
    }
    if ( defined *myglob{HASH} ) {
        my %val = %{ *myglob{HASH} };
        print "\%$symbol = ( ";
        while(  my ($key, $val) = each %val )  {
            print "$key=>'$val', ";
        }
        print ")\n" ;
    }
}

my.config:

@A = ( a, b, c );
%B = ( b=>'bee' );
$C = 'see';

output:

@A = ( 'a', 'b', 'c' )
%B = ( b=>'bee', )
$C = 'see'
$_<my.config = 'my.config'
Artefact answered 2/8, 2010 at 20:42 Comment(10)
Is your current code snippet working for you? If not, do you have a simple sample config file you could post?Blase
@molecules I've added a sample config. It's just very simple perl.Artefact
@molecules: If I understand it correctly, it means that I'll always get false positives for scalars, but then I can check if the value is undef, and also I should still be able to detect ARRAY and HASH correctly.Artefact
@bukzor. Correct, except I don't know how you would "directly" check if the value is undef because any use of *foo{SCALAR} causes it to be defined, even the simple phrase if defined *foo{SCALAR}. However, you have other options as shown below.Blase
I hate to ask, but instead of messing around with typeglobs, why not switch to a real configuration format? e.g. check out Config::Simple and YAML.Antipodes
@Antipodes my users are supposed to know simple PERL. At the time, we thought this would be simple way to configure things, but perhaps we were mistaken.Artefact
@buzkor: yeah, I wouldn't recommend messing around with typeglobs and symbol tables.. Perl gives you a lot of rope to work with, but that rope is also quite happy to self-noosify and self-tie-around-your-neck if you're not careful :)Antipodes
@Antipodes I agree. I got so caught up in thinking about typeglobs that I missed the real underlying need. What you (bukzor) really need is just what Ether prescribed.Blase
@Ether. Shouldn't you submit that as an answer?Blase
@molecules: probably; was feeling lazy. Shall do so now.Antipodes
O
7

In the fully general case, you can't do what you want thanks to the following excerpt from perlref:

*foo{THING} returns undef if that particular THING hasn't been used yet, except in the case of scalars. *foo{SCALAR} returns a reference to an anonymous scalar if $foo hasn't been used yet. This might change in a future release.

But if you're willing to accept the restriction that any scalar must have a defined value to be detected, then you might use code such as

#! /usr/bin/perl

use strict;
use warnings;

open my $fh, "<", \$_;  # get DynaLoader out of the way

my %before = %main::;
require "my.config";
my %after = %main::;

foreach my $name (sort keys %after) {
  unless (exists $before{$name}) {
    no strict 'refs';
    my $glob = $after{$name};
    print "\$$name\n"             if defined ${ *{$glob}{SCALAR} };
    print "\@$name\n"             if defined    *{$glob}{ARRAY};
    print "%$name\n"              if defined    *{$glob}{HASH};
    print "&$name\n"              if defined    *{$glob}{CODE};
    print "$name (format)\n"      if defined    *{$glob}{FORMAT};
    print "$name (filehandle)\n"  if defined    *{$glob}{IO};
  }
}

will get you there.

With my.config of

$JACKPOT = 3_756_788;
$YOU_CANT_SEE_ME = undef;

@OPTIONS = qw/ apple cherries bar orange lemon /;

%CREDITS = (1 => 1, 5 => 6, 10 => 15);

sub is_jackpot {
  local $" = ""; # " fix Stack Overflow highlighting
  "@_[0,1,2]" eq "barbarbar";
}

open FH, "<", \$JACKPOT;

format WinMessage =
You win!
.

the output is

%CREDITS
FH (filehandle)
$JACKPOT
@OPTIONS
WinMessage (format)
&is_jackpot

Printing the names takes a little work, but we can use the Data::Dumper module to take part of the burden. The front matter is similar:

#! /usr/bin/perl

use warnings;
use strict;

use Data::Dumper;
sub _dump {
  my($ref) = @_;
  local $Data::Dumper::Indent = 0;
  local $Data::Dumper::Terse  = 1;
  scalar Dumper $ref;
}

open my $fh, "<", \$_;  # get DynaLoader out of the way

my %before = %main::;
require "my.config";
my %after = %main::;

We need to dump the various slots slightly differently and in each case remove the trappings of references:

my %dump = (
  SCALAR => sub {
    my($ref,$name) = @_;
    return unless defined $$ref;
    "\$$name = " . substr _dump($ref), 1;
  },

  ARRAY => sub {
    my($ref,$name) = @_;
    return unless defined $ref;
    for ("\@$name = " . _dump $ref) {
      s/= \[/= (/;
      s/\]$/)/;
      return $_;
    }
  },

  HASH => sub {
    my($ref,$name) = @_;
    return unless defined $ref;
    for ("%$name = " . _dump $ref) {
      s/= \{/= (/;
      s/\}$/)/;
      return $_;
    }
  },
);

Finally, we loop over the set-difference between %before and %after:

foreach my $name (sort keys %after) {
  unless (exists $before{$name}) {
    no strict 'refs';
    my $glob = $after{$name};
    foreach my $slot (keys %dump) {
      my $var = $dump{$slot}(*{$glob}{$slot},$name);
      print $var, "\n" if defined $var;
    }
  }
}

Using the my.config from your question, the output is

$ ./prog.pl 
@A = ('a','b','c')
%B = ('b' => 'bee')
$C = 'see'
Overhappy answered 2/8, 2010 at 20:50 Comment(4)
I just looked at what Package::Stash does, and it goes with the obvious workaround: when looking at SCALAR it dereferences the scalarref from the glob and sees if the scalar is defined. So if for some reason you create a scalar but leave undef in it, it won't show up, but at least fictitious scalars don't get in the way.Tawanda
@hobbs: the difference between an undefined scalar and a scalar with an undef value is tenuous at best. I'm ok with lumping them in the same category.Artefact
quite nice. If you'll add values to the output, I'll accept this answer and remove my ugly attempt above.Artefact
In Perl 5.18.2 I have a typeglob that is a list constant (use constant LIST => (1,2,3), and when I query the typeglob $y like print defined *{$y}{$_} ? "$_: yes\n" : "$_: no\n" foreach (qw(ARRAY CODE HASH SCALAR))) I get defined (yes) for all four (when running in the Perl debugger). 8-(Klipspringer
M
3

Beginning in 5.010, you can distinguish whether a SCALAR exists using the B introspection module; see Detecting declared package variables in perl

Update: example copied from that answer:

# package main;
our $f;
sub f {}
sub g {}

use B;
use 5.010;
if ( ${ B::svref_2object(\*f)->SV } ) {
    say "f: Thar be a scalar tharrr!";
}
if ( ${ B::svref_2object(\*g)->SV } ) {
    say "g: Thar be a scalar tharrr!";
}

1;
Menell answered 2/8, 2010 at 21:47 Comment(3)
I wasn't able to garner much from that thread or the B documentation. Do you have a brief example?Artefact
@bukzor: copied the example from the linked answer; was there something else? The SV method will return a B::SPECIAL object for the null value in the SV slot, but that class is also used for a few other special values and doesn't provide good methods for determining which it is, but since B objects are just blessed references to scalars storing the numeric actual address, you can deref and test if that's 0 or not.Menell
I'm really a python guy. I don't know what most of that means.Artefact
T
3

Working code using a CPAN module that gets some of the hair out of the way, Package::Stash. As noted in my comment to gbacon's answer, this is blind to the config file doing $someval = undef but that seems to be unavoidable, and at least the other cases are caught. It also limits itself to the SCALAR, ARRAY, HASH, CODE, and IO types -- getting GLOB and FORMAT is possible but it makes the code less pretty and also creates noise in the output :)

#!perl

use strict;
use warnings;

use Package::Stash;

sub all_vars_in {
  my ($package) = @_;
  my @ret;

  my $stash = Package::Stash->new($package);
  for my $sym ($stash->list_all_package_symbols) {
    for my $sigil (qw($ @ % &), '') {
          my $fullsym = "$sigil$sym";
      push @ret, $fullsym if $stash->has_package_symbol($fullsym);
    }
  }
  @ret;
}

my %before;
$before{$_} ++ for all_vars_in('main');

require "my.config";

for my $var (all_vars_in('main')) {
  print "$var\n" unless exists $before{$var};
}
Tawanda answered 2/8, 2010 at 21:50 Comment(0)
B
1

UPDATE:
gbacon is right. *glob{SCALAR} is defined.

Here is the output I get using your code:

Name "main::glob" used only once:
possible typo at
test_glob_foo_thing.pl line 13. 
'FOO1' (SCALAR)
'FOO1' (GLOB)
'FOO2' (SCALAR)
'FOO2' (GLOB)
'_<my.config' (SCALAR)
'_<my.config' (GLOB)

This is despite FOO2 being defined as a hash, but not as a scalar.

ORIGINAL ANSWER:

If I understand you correctly, you simply need to use the defined built-in.

#!/usr/bin/env perl

use strict;
use warnings;

my %before = %main::;
require "/path/to/my.config";
my %after = %main::;

foreach my $key (sort keys %after) {
    if (not exists $before{$key}) {
        if(defined($after{$key}){
             my $val = $after{$key};
             my $what = ref($val);
             print "'$key' ($what)\n";
        }
    }
}
Blase answered 2/8, 2010 at 20:51 Comment(0)
A
1

I hate to ask, but instead of messing around with typeglobs, why not switch to a real configuration format? e.g. check out Config::Simple and YAML.

I wouldn't recommend messing around with typeglobs and symbol tables in normal cases (some CPAN modules do that, but only at the bottom levels of large systems - e.g. Moose in the lowest levels of Class::MOP). Perl gives you a lot of rope to work with, but that rope is also quite happy to self-noosify and self-tie-around-your-neck if you're not careful :)

See also: How do you manage configuration files in Perl?

Antipodes answered 3/8, 2010 at 14:55 Comment(1)
my users are supposed to know simple PERL. At the time, we thought this would be simple way to configure things, but perhaps we were mistaken. On the surface the config looks quite nice, so it won't change unless I can present an authoritative argument to management.Artefact
M
1
no strict 'refs';
my $func_name = 'myfunc';
*{$func_name}{CODE}()
use strict 'refs';
Merrifield answered 5/4, 2011 at 8:32 Comment(0)
B
0

If you don't mind parsing Data::Dump output, you could use it to tease out the differences.

use strict;
use warnings;
use Data::Dump qw{ dump };

my %before = %main::;
require "my.config";
my %after = %main::;

foreach my $key ( sort keys %after ) {
    if ( not exists $before{$key} ) {
        my $glob = $after{$key};
        print "'$key' " . dump( $glob) . "\n";
    }
}

Using this code with the following config file:

$FOO1 = 3;
$FOO2 = 'my_scalar';
%FOO2 = ( a=>'b', c=>'d' );
@FOO3 = ( 1 .. 5);
$FOO4 = [ 1 .. 5 ];

I believe that this output provides enough information to be able to figure out which parts of each type glob are defined:

'FOO1' do {
  my $a = *main::FOO1;
  $a = \3;
  $a;
}
'FOO2' do {
  my $a = *main::FOO2;
  $a = \"my_scalar";
  $a = { a => "b", c => "d" };
  $a;
}
'FOO3' do {
  my $a = *main::FOO3;
  $a = [1 .. 5];
  $a;
}
'FOO4' do {
  my $a = *main::FOO4;
  $a = \[1 .. 5];
  $a;
}
'_<my.config' do {
  my $a = *main::_<my.config;
  $a = \"my.config";
  $a;
}
Blase answered 2/8, 2010 at 22:13 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.