Why is Moose code so slow?

Asked 12/10, 2010 at 13:14 Answered 12/10, 2010 at 19:40

I'm trying to parse a large XML file. I read it using XML::SAX (using Expat, not the perl implementation) and put all the second level and below nodes into my "Node" class:

package Node;
use Moose;

has "name" =>
(
    isa =>  "Str",
    reader  => 'getName'
);

has "text" =>
( 
  is    =>  "rw",
  isa   =>  "Str"
);

has "attrs" =>
(
    is  =>  "rw",
    isa =>  "HashRef[Str]"

);

has "subNodes"  =>
(
    is  =>  "rw",
    isa =>  "ArrayRef[Node]",
    default => sub { [] }
);

sub subNode
{   
  my ($self, $name) = @_;
  my $subNodeRef = $self->subNodes;
  my @matchingSubnodes = grep { $_->getName eq $name } @$subNodeRef;

  if (scalar(@matchingSubnodes) == 1)
  {

    return $matchingSubnodes[0];
  }
  return undef;
}


1;

In the "end_element" sub, I check if this is a node I care about, and if it is, I do some further processing.

This all worked fine on my test files, but the day before yesterday I threw it at my real file, all 13 million lines of it, and it's taking forever. It's been running for over 36 hours. How do I tell if it's Moose or XML::SAX that's the bottleneck? Is Moose always this slow, or am I using it wrong?

Update Doing a profile on a 20,000 line subset of the data shows that it is Moose that's the bottleneck - specifically in Class::MOP::Class::compute_all_applicable_attributes (13.9%) and other Class and Moose classes.

Antho answered 12/10, 2010 at 13:14 Comment(7)

Moose may look slow, but don't get him angry... – Awaken 12/10, 2010 at 13:33

Class::MOP::Class::compute_all_applicable_attributes is something that needs to be done a lot if you don't __PACKAGE__->meta->make_immutable your classes like I suggested in my answer. How does the profile change if you do that? – Carrico 12/10, 2010 at 14:2

@Ether, that one was about start-up cost. Mine is about running cost, especially object creation and destruction. – Antho 12/10, 2010 at 15:17

@Paul: yep you're right, my mistake. – Mufti 12/10, 2010 at 16:3

try installing Mouse and see if s/Moose/Mouse/ helps your speed significantly. – Smythe 12/10, 2010 at 17:1

I would love to have an example of your XML data to test this with, I do something similar in XML::Toolkit (using the bundled XML::Filter::Moose) and would be curious to profile it. – Rowlett 12/10, 2010 at 18:37

the code you have shown is fairly short, why not rewrite it using traditional perl5 objects and see how that changes things? write it in a very trusting way, not checking any type constraints or using any defensive programming practices. that will give you an upper bound on the speeds you can hope to get out of perl's object model – Ninos 13/10, 2010 at 0:36

While Moose does quite some work at startup time, which sometimes makes it appear a little slow, the code it generates, especially things like accessors for attributes, are generally quite a bit faster than what the average perl programmer would be able to write. So given the runtime of your process is quite long, I doubt any overhead induced by Moose will be relevant.

However, from the code you've shown, I can't really tell what you're bottleneck is, even though I firmly believe it isn't Moose. I also want to point out that doing __PACKAGE__->meta->make_immutable, to state that you're class is now "finalised" allows Moose to do some further optimisations, but still I doubt this is what's causing you trouble.

How about you take a smaller sample of your data, so your program will finish in a reasonable time, and have a look at that in a profiler such as Devel::NYTProf. That'll be able to tell you where exactly the time in your program is spent, so you can optimise specifically those parts to get the greatest possible gain.

One possibility is that the type constraints you're using slow things down quite a bit. Actually validating instance attributes as thoroughly on every single write access to them (or on class instanciation), isn't something most programmers would usually do. You could try using simpler constraints, such as ArrayRef instead of ArrayRef[Node], if you're certain enough about the validity of your data. That way, only the type of the attribute value itself will be checked, not the value of every element in that array reference.

But still, profile your code. Don't guess.

Carrico answered 12/10, 2010 at 13:25 Comment(8)

Is NYTProf that much better than DProf? – Antho 12/10, 2010 at 13:41

The __PACKAGE__->meta->make_immutable thing made my subset go from 11 seconds down to 6 seconds. Now the profiler says it's spending 26% of its time in Moose::Meta::TypeConstraint::ArrayRef[Node] so I'm going to try your suggestion to loosen the constraint on that next. – Antho 12/10, 2010 at 14:14

Loosening the constraints didn't have any appreciable difference - my 20,000 line test file went from 6.4 seconds to 6.3 seconds. – Antho 12/10, 2010 at 15:17

@Paul: what does the time breakdown look like now? it should be mostly in XML processing now, not in Moose innards. – Mufti 12/10, 2010 at 18:59

dprofpp -I -u shows 64% in XML::SAX::Expat::_handle_start, of which 51% is in EADHandler::start_element (where I "new" the Node class), 26% is Moose::Meta::TypeConstrant::ArrayRef[Node]. – Antho 12/10, 2010 at 19:9

@Paul: how does the percentage change if you switch to using a type constraint of just ArrayRef? – Mufti 12/10, 2010 at 19:12

@Ether, almost exactly the same run time, but now it's 43% in XML::SAX::Expat::_handle_start, 26.5% in EADHandler::start_element and only 16.8% in Node::new. – Antho 12/10, 2010 at 19:28

See also: ttphttps://mcmap.net/q/958444/-is-moose-really-this-slow/3162676 – Mufti 17/5, 2011 at 22:36

I highly suspect that your speed problem is not in Moose so much as it is in memory allocation and disk swapping. Even without doing ->meta->make_immutable, based on your times for the 20K subset, your script should finish in about 2 hours (((11 * (13_000_000 / 20_000)) / 60) == ~119 min). By doing ->meta->make_immutable it would have cut it down to approx. 65 min or so.

Try running your big script again and see what your memory and swap are doing, I suspect your giving your disk an awful thrashing.

Bole answered 12/10, 2010 at 19:40 Comment(1)

Munin says I was hardly swapping at all during the first 36 hour run. See xcski.com/munin/xcski.com/allhats2.xcski.com-cpu.html, xcski.com/munin/xcski.com/allhats2.xcski.com-memory.html and xcski.com/munin/xcski.com/allhats2.xcski.com-swap.html from the evening of 10Oct to the mid morning of 12Oct. You can see more swap in use during the second run (starting about 10am on 12Oct) after I did make_immutable, but that may be related to other things I was doing at the same time. – Antho 12/10, 2010 at 20:10

I have successfully written large XML processing apps using XML::Twig 745mb file take less then an hour to run on a reasonably sized box.

But as other users have already mentioned you need to profile your code to figure out what exactly is causing the issue.

Bloated answered 12/10, 2010 at 14:8 Comment(0)

Recommended topics

Hot tags