Can I use a module, and later unload it shrinking the optree?

Disclaimer: I'm not sure I'm using the right terms. It may not be the optree that is responsible for the bloat described below; it may be the symbols loaded by DynaLoader that are not freed.

Is it possible to use a module such as POSIX.pm, unload it, and shrink (or prune) the optree without either

  1. Re-executing perl
  2. Forking

Things I've tried:

  1. Class::Unload->unload('POSIX');
  2. Symbol::delete_package('POSIX');
  3. no POSIX;

Here is an easy test. Create a file test.pl:

$|++;
use Symbol;
use Class::Unload;
use POSIX;

print "GOT POSIX";
sleep(3);

no POSIX;
Class::Unload->unload('POSIX');
Symbol::delete_package('POSIX');
print "unloaded";

sleep(3);

Shell command

perl ./test.pl & watch -n1 'ps -C perl -o "cmd rss";'

You may or may not be able to see the RSS size increase (POSIX may load before watch spawns ps). But, I want to see it shrink back down.

Tracking down what exactly POSIX.pm does I see it uses XSLoader which uses DynaLoader.

Doing some quick comparative checks in /proc/$$/smaps, I've determined that using POSIX.pm causes a heap allocation that accounts for the difference in space. The first allocation on the heap is massively bigger when using POSIX.pm:

56122fe4c000-561230040000 rw-p 00000000 00:00 0                          [heap]
Size:               2000 kB
Rss:                1956 kB
Pss:                1956 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:      1956 kB
Referenced:         1956 kB
Anonymous:          1956 kB
AnonHugePages:         0 kB
ShmemPmdMapped:        0 kB
Shared_Hugetlb:        0 kB
Private_Hugetlb:       0 kB
Swap:                  0 kB
SwapPss:               0 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB
Locked:                0 kB
VmFlags: rd wr mr mw me ac sd

vs

560c9f6ba000-560c9f6fc000 rw-p 00000000 00:00 0                          [heap]
Size:                264 kB
Rss:                 220 kB
Pss:                 220 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:       220 kB
Referenced:          220 kB
Anonymous:           220 kB
AnonHugePages:         0 kB
ShmemPmdMapped:        0 kB
Shared_Hugetlb:        0 kB
Private_Hugetlb:       0 kB
Swap:                  0 kB
SwapPss:               0 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB
Locked:                0 kB
VmFlags: rd wr mr mw me ac sd
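
The same comparison can be made from inside the process instead of eyeballing smaps by hand. A minimal sketch, assuming Linux (/proc/self/smaps) and a single [heap] mapping; heap_rss_kb is a hypothetical helper name, not part of any module:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the Rss (in kB) of this process's [heap] mapping, or undef if
# /proc/self/smaps is unavailable (non-Linux) or has no [heap] entry.
sub heap_rss_kb {
    open my $fh, '<', '/proc/self/smaps' or return;
    my $in_heap = 0;
    while (my $line = <$fh>) {
        if ($line =~ /^[0-9a-f]+-[0-9a-f]+\s/) {       # header line of a mapping
            $in_heap = $line =~ /\[heap\]\s*$/ ? 1 : 0;
        }
        elsif ($in_heap && $line =~ /^Rss:\s+(\d+)\s+kB/) {
            return $1;
        }
    }
    return;
}

my $before = heap_rss_kb();
require POSIX;                                         # load at run time
my $after = heap_rss_kb();

printf "heap Rss before: %s kB, after require POSIX: %s kB\n",
    $before // 'n/a', $after // 'n/a';
```

The absolute numbers will differ between perl builds and malloc implementations, so only the difference between the two readings is meaningful.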

I've confirmed a few things: nuking the namespace does not drop the open file handles to POSIX.so and Fcntl.so (I determined this with lsof). That is in itself somewhat concerning. I would think it would make sense to allocate the handle on the callee's package. XSLoader also obscures that you can release that file handle, a feature available in DynaLoader.

Further, it seems that in libc / dlfcn.h I have

dlclose()

The function dlclose() decrements the reference count on the dynamically loaded shared object referred to by handle. If the reference count drops to zero, then the object is unloaded. All shared objects that were automatically loaded when dlopen() was invoked on the object referred to by handle are recursively closed in the same manner.

A successful return from dlclose() does not guarantee that the symbols associated with handle are removed from the caller's address space. In addition to references resulting from explicit dlopen() calls, a shared object may have been implicitly loaded (and reference counted) because of dependencies in other shared objects. Only when all references have been released can the shared object be removed from the address space.

So I'm guessing that may be the suspect. DynaLoader::dl_unload_file calls dlclose, and it does seem to work.

foreach my $dlref ( @DynaLoader::dl_librefs ) {
  print DynaLoader::dl_unload_file($dlref);
}

After I nuked all files loaded with DynaLoader and XSLoader by doing the above, the RSS still did not drop.
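
To unload one module rather than everything, the handles need to be matched to module names. A rough sketch, under the assumption that @DynaLoader::dl_modules and @DynaLoader::dl_librefs are pushed in step so that equal indexes belong together; %handle_for is a hypothetical name, and statically linked modules will not show up at all:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use DynaLoader ();
require POSIX;   # any XS module, so the loader has something to record

# Assumption: index i of @dl_modules corresponds to index i of @dl_librefs.
my %handle_for;
for my $i (0 .. $#DynaLoader::dl_modules) {
    my $libref = $DynaLoader::dl_librefs[$i];
    next unless defined $libref;
    $handle_for{ $DynaLoader::dl_modules[$i] } = $libref;
}

printf "%-20s handle %s\n", $_, $handle_for{$_} for sort keys %handle_for;

# Unloading a single module by name would then look like the line below.
# It is left commented out: calling into an unloaded XS module afterwards
# is likely to crash perl.
# DynaLoader::dl_unload_file($handle_for{'POSIX'});
```

This only releases the shared object; as observed above, it does not by itself bring the RSS back down.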

Triturable answered 13/10, 2017 at 21:27 Comment(10)
You might be interested in the core Perl modules, AutoSplit, AutoLoader, and DynaLoader, which you could employ to load portions of modules you've authored just in time, but once loaded and compiled-in, the memory is consumed.Kingsley
@Kingsley using POSIX requires 2 mb of rss. I want to release that after I no longer need POSIX.Triturable
Subs are reference counted too. Removing all reference to the sub (i.e. removing it from the sym tab) will free it.Mutule
@EvanCarroll: What sort of system are you using where you need to conserve 2MB of memory?Xylem
@Xylem imagine a multitenant service that has 5,000 clients running on the same server, and many many servers. In that case, it'd be 2MB/server. Now imagine it not being multitenant. You can engineer around this problem in other ways, but I want to address it head on if possible.Triturable
So you have 5,000 or so identical Perl processes running? Is POSIX a specific problem? There are alternatives to most of the functions in that module. I would have thought addressing the problem head on would involve removing duplicate data, rather than minimising a specific overhead within each one. Why do you want to avoid fork?Xylem
Yes, POSIX is a specific problem, as are other modules that bloat the optree. We're talking about megs of growth here. As I said, I want to avoid fork. I don't want to re-engineer the program. In Perl, modules can be dynamically loaded. I want to dynamically unload them too.Triturable
Your test appears to be broken because #1 and #2 do exactly what you asked.Mutule
@Mutule no, they don't. Try it. See if you can get the resident size back to pre-POSIX import.Triturable
I'm concerned with the RSS which I believe to be a result of the opcodes. I believe that because after I use all of the methods above the RSS does not return to pre-import levels. If it's not the opcodes bloating the RSS after I run those above, then please feel free to answer with an explanation. Perhaps I'm not right in stating it's the opcodes?Triturable

Generally speaking, no. The gritty details are that almost no one shrinks their own memory because almost everyone uses the C library malloc (and friends) call to allocate memory, either directly or indirectly. And there is no (standard) way to tell the C library to deallocate memory (to send it back to the OS). Perl is no different here - once malloced and freed, the C library upon which Perl depends keeps the memory for future use so that if you need to reuse the memory, no expensive kernel calls are required (specifically, brk), and it can simply be reused. In fact, this is what your unload scenarios are doing - when you come back and reuse that next 2MB in the rest of your server process, you'll be re-using the memory, not calling brk, and you'll be that much faster.

It is possible to do if you take over memory allocation ownership and call brk yourself, but it's rarely worth it. And getting perl to use that allocator would require some code changes to perl and recompilation. Probably not what you want to do.

Other options are to either bite the bullet, load POSIX prior to forking off any servers (which should leave all of that in shared copy-on-write memory, and thus only take up 2MB of memory for 5k servers), or fork, load POSIX in the child, do the dirty work, exit the child, and continue in the parent. This seems to be relatively slow to me.
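
The fork-and-exit variant can be sketched as follows. This is only an illustration, not a drop-in solution: run_in_child is a hypothetical helper, and the result is passed back as plain text over a pipe.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Run $code in a short-lived child and return whatever it produced.
# Everything the child loads (POSIX here) goes away when the child exits.
sub run_in_child {
    my ($code) = @_;
    pipe(my $reader, my $writer) or die "pipe failed: $!";
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {                  # child: do the work, write, exit
        close $reader;
        print {$writer} $code->();
        close $writer;
        exit 0;
    }

    close $writer;                    # parent: read the result, reap the child
    my $result = do { local $/; <$reader> };
    close $reader;
    waitpid($pid, 0);
    return $result;
}

my $answer = run_in_child(sub {
    require POSIX;                    # loaded in the child only
    return POSIX::floor(3.7);
});

print "child computed: $answer\n";
print "POSIX loaded in parent: ", (exists $INC{'POSIX.pm'} ? "yes" : "no"), "\n";
```

The parent never pays for POSIX, at the cost of one fork per batch of work.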

Hysterogenic answered 16/10, 2017 at 3:15 Comment(2)
I've compiled with -Uusemymalloc which is glibc and I've confirmed that Perl does in fact free from the heap back to the OS. It just doesn't free opcodes.Triturable
This answer is slightly misleading: free() may very well release memory back to the OS and does so in many implementations. This is expected in particular when large pages are mmap'ed. However, free() is not guaranteed to do this. If we do observe that subs do not appear to be freed, this either indicates that a freed sub does not result in a contiguous free memory region that could be released to the OS, or (and this is the problem in the question:) that Perl never frees the opcode structures.Insertion

Yes, you can.

But there are dragons, and practically not.

SV's and OP's are allocated in arenas. OP's hold pointers to their data, SV's. Those OP's and SV's can be freed via undef, with the malloc'ed parts being immediately freed and the arenas (~70 OPs) freed when all OPs therein are freed.

Then you have the globals, which can also be freed easily by walking the namespace. But beware of destroying data that is still referenced from somewhere else and whose DESTROY handler cannot deal with that. There's a lot of unsafe DESTROY code out there, because nobody does this.

And sometimes the globals to be deleted are referenced from somewhere else, so they will not be freed; only the refcount drops.
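
A small sketch of that last point, using a hypothetical package Demo with a sub answer: deleting the stash entry removes the name, but the sub itself survives for as long as something else still holds a reference to it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Define the sub at run time so that no compiled code in this file
# refers to its glob directly.
eval 'package Demo; sub answer { 42 } 1' or die $@;

my $kept = Demo->can('answer');   # extra reference to the CV, held outside the stash

delete $Demo::{answer};           # remove the glob from the Demo:: symbol table

print "stash entry exists: ", (exists $Demo::{answer} ? "yes" : "no"), "\n";
print "still callable through the kept reference: ", $kept->(), "\n";
```

Only once $kept goes away as well can the sub and its ops actually be freed.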

And then you have the external XS code, for which you have to call dl_unload_file().

In your case use POSIX creates tons of imports into the main:: namespace, GV aliases of all the imported functions. They need to be deleted also. use POSIX (); will skip the import, so will need MUCH less memory, and chances are it can be deleted fully.
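
A minimal sketch of the import-free style. require POSIX at run time is used here as the equivalent of use POSIX (); so that the symbol count can be taken before the module is loaded; the variable names are only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $before = keys %main::;
require POSIX;                    # like "use POSIX ();": import() is never called
my $after_require = keys %main::;

# Call functions by their fully qualified names instead of importing them.
my $floor = POSIX::floor(3.7);
my $stamp = POSIX::strftime('%Y-%m-%d', gmtime(0));

print "main:: symbols before: $before, after require: $after_require\n";
print "floor(3.7) = $floor, epoch date = $stamp\n";
print "floor imported into main: ", (main->can('floor') ? "yes" : "no"), "\n";
```

The symbol count in main:: should grow far less than it does after a full POSIX->import.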

To see what is not really undef'd, run:

#!/usr/bin/perl
$|++;
my $s = shift // 3;
sub rss { `ps -o "comm,rss,vsize" | grep perl` }
print "BEGIN ",scalar keys %main::," ",rss;
require Symbol;
#require Class::Unload;
require POSIX;

print "GOT POSIX ",scalar keys %main::," ",rss;
sleep($s);

POSIX->import;
print "IMPORT POSIX ",scalar keys %main::," ",rss;
sleep($s);

POSIX->unimport;
#Class::Unload->unload('POSIX');
Symbol::delete_package('POSIX');

for (keys %main::) {
  #print "$_\n";
  undef ${$_} unless /^(STD|!|0|1|2|\]|_)/;
  undef &{$_} unless /rss/;
  undef @{$_};
  # clear the GV
  undef *{$_} unless /^(STD...?|rss|main::|DynaLoader::|_|!)$/;
  # delete the GV
  delete $main::{$_} unless /^(STD...?|rss|main::|DynaLoader::|_|!)$/;
}
#Symbol::delete_package('main::'); # needs a patched Symbol
print "unloaded ",scalar keys %main::," ",rss;

sleep($s);

DynaLoader::dl_unload_file($_) for @DynaLoader::dl_librefs;
undef *DynaLoader::;
print "unload XS ",scalar keys %main::," ",rss;
#print "  $_\n" for keys %main::;
print "POSIX::$_\n" for keys %POSIX::;
print "freed ",scalar keys %main::," ",rss;
sleep($s);

Result:

=>
  BEGIN 45 /usr/src/perl/bl   3192  2451188
  GOT POSIX 70 /usr/src/perl/bl   6112  2468844
  IMPORT POSIX 645 /usr/src/perl/bl   6928  2468844
  unloaded 8 /usr/src/perl/bl   7120  2468844
  unload XS 8 /usr/src/perl/bl   7040  2468596
  freed 8 /usr/src/perl/bl   7048  2468596

which shows that

  1. Symbol is unreliable at deleting read-only, protected symbols.
  2. The global symbols (in main::) are not freed by undef, but by deleting the stash entry.
  3. Do not import from POSIX and similar old, import-heavy modules; use fully qualified names instead. Purging the imports is hard.
  4. You cannot free SVs, only OPs; memory will mostly increase, not shrink.

SV head and body arenas are never freed; they are just reused. So you can only shrink the optree, not the data.

The SVs behind the symbol are just set to TEMP if undef'd, so their memory is never freed, and the symbol itself (the GV) is only cleared by undef. OPs are deleted by undef'ing the CV, but the system malloc rarely returns that memory to the OS: only when a full page is free and, with glibc, after a call to malloc_trim(0). Perl's memory is too scattered for that; it is still a linked list without much compaction.

The RSS goes down a bit from unload XS to freed, but it is still higher than after the initial import.

My watcher is watch -n1 'ps -o "comm,rss,vsize" |grep perl;' because this works also on BSD/darwin.

I wrote an Internals::gc() for cperl to actually walk all arenas and free the empty ones, but it's pretty unstable and not recommended, as the VM can only "properly" deal with those free SVs during global destruction, not at run-time. See https://github.com/perl11/cperl/issues/336

Hockett answered 21/10, 2017 at 17:54 Comment(6)
"chances are it can be deleted fully" this is what I was thinking too. I tried both to clean up the exports into SIG and main that POSIX does, and I also tried to tell POSIX to export nothing. However, all of my attempts were fruitless. In this we go from 6.6mb, to 6.3mb RSS but perl without POSIX is 4.3MB rss. What accounts for the difference after we do all of this? BTW Great answer. And, you were the one person I thought could answer this. =)Triturable
symbols are just set to TEMP if undef'd, so its memory is never freed. this is exactly what I was suspicious of. Can Symbol.pm be modified/patched to clean up rather than set to TEMP, ie is that possible with XS? Or would that be a patch to perl? Is it possible? And, I'm assuming symbols in this context is also referring to opcodes?Triturable
Symbols hold GP slots to the &$%@ SV values. Those SV's are either freed or TEMP'd, but a GV (Symbol) is never freed, it is just cleared. That's why the number of %main:: keys stays that high. A Symbol is a GV. Opcodes are cleared by deleting the & (the function), which will clear also all attached non-global SVs.Hockett
So we're essentially concluding that the arena isn't free because of GV (Symbols) so it can't be released back to the OS. I'm guessing this is a patch to gv.c then if it is at all possible?Triturable
No. The GVs can be freed by deleting their hash entries. The problem is entirely in freeing the arenas, which hold the SV heads and bodies. I'm currently working on this: github.com/perl11/cperl/issues/336 (see the attached commit with "add Internals::gc() WIP") I now can delete all empty head arenas, but not yet the bodies. WIPHockett
You are seriously a god amongst men. Having you tackle this side of it, I wonder if I can hack DynaLoader to do something sensible with name => handle resolution so we can dynamically unload these? I'll give it a shot.Triturable
