I've created a PHP extension with SWIG and everything works fine, but I'm observing some strange garbage collection behavior when chaining method calls. For example, this works:
$results = $response->results();
$row = $results->get(0)->iterator()->next();
printf('%s %s' . "\n", $row->getString(0), $row->getString(1));
But this seg faults:
$row = $response->results()->get(0)->iterator()->next();
printf('%s %s' . "\n", $row->getString(0), $row->getString(1));
The only difference is that the first creates $results, while the second chains the calls together.
SWIG actually only exposes functions to PHP and generates PHP proxy classes to interact with them. These proxy classes basically hold a resource that is passed to each of the exposed functions along with whatever other arguments those functions would normally take. Thinking that maybe these proxy classes were the problem, I reworked the code to bypass them and instead use the exposed functions directly. As before, this works:
$results = InvocationResponse_results($response->_cPtr);
$row = TableIterator_next(Table_iterator(Tables_get($results, 0)));
printf('%s %s' . "\n", Row_getString($row, 0), Row_getString($row, 1));
And again, this seg faults:
$row = TableIterator_next(Table_iterator(Tables_get(InvocationResponse_results($response->_cPtr), 0)));
printf('%s %s' . "\n", Row_getString($row, 0), Row_getString($row, 1));
Again, the only difference is that the first creates $results, while the second chains the calls together.
At this point, I spent a while debugging in gdb/valgrind and determined that the destructor for what InvocationResponse_results returns is called too early when chaining calls together. To observe this, I inserted std::cout statements at the tops of the exposed C++ functions and their destructors. This is the output without chaining:
InvocationResponse_results()
Tables_get()
Table_iterator()
TableIterator_next()
__wrap_delete_TableIterator
Row_getString()
Row_getString()
Hola Mundo
---
__wrap_delete_InvocationResponse
__wrap_delete_Row
__wrap_delete_Tables
I printed --- at the end of the script to be able to differentiate what happens during the script's execution and what happens after. Hola Mundo is from printf. The rest is from C++. As you can see, everything gets called in the expected order. Destructors are only called after the script's execution, though the TableIterator destructor is called earlier than I would have expected. However, this has not caused any problems and is likely unrelated. Now compare this to the output with chaining:
InvocationResponse_results()
Tables_get()
__wrap_delete_Tables
Table_iterator()
TableIterator_next()
__wrap_delete_TableIterator
Row_getString()
Segmentation fault (core dumped)
Without the return value of InvocationResponse_results being saved into $results, it is happily garbage collected before execution even gets out of the call chain (between Tables_get and Table_iterator) and this quickly causes problems down the road, ultimately leading to a seg fault.
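To make the suspected lifetime problem concrete, here is a minimal C++ sketch of the same pattern, written independently of SWIG and PHP. Every name in it (Tables, Table, TableHandle, results, get, chained, unchained) is a hypothetical stand-in, not the real VoltDB or generated wrapper code, and the weak_ptr is only there so the stale access can be detected safely instead of crashing. It assumes the object returned by Tables_get points into storage owned by the Tables object and does not keep that owner alive.

```cpp
#include <cstddef>
#include <iostream>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins for the wrapped types; not the real VoltDB or SWIG code.
struct Table {
    std::string name;
};

struct Tables {
    std::vector<Table> items;
    Tables() {
        items.push_back(Table{"t0"});
        std::cout << "Tables()\n";
    }
    ~Tables() { std::cout << "~Tables()\n"; }
};

// A handle to one Table that does NOT keep its owning Tables alive,
// comparable to a raw pointer into a container owned by something else.
struct TableHandle {
    std::weak_ptr<Tables> owner;
    std::size_t index;

    // Returns the table name, or "<dangling>" if the owner was already destroyed.
    std::string name() const {
        if (auto t = owner.lock()) {
            return t->items[index].name;
        }
        return "<dangling>";
    }
};

// Plays the role of InvocationResponse_results(): returns a freshly owned Tables.
std::shared_ptr<Tables> results() { return std::make_shared<Tables>(); }

// Plays the role of Tables_get(): hands out a non-owning handle.
TableHandle get(const std::shared_ptr<Tables>& t, std::size_t i) {
    std::cout << "get()\n";
    return TableHandle{t, i};
}

// The owner is stored in a local variable, so it outlives the handle's use.
std::string unchained() {
    auto r = results();
    TableHandle h = get(r, 0);
    return h.name();
}

// The owner is a temporary that is destroyed at the end of the first statement,
// before the handle is used on the next line.
std::string chained() {
    TableHandle h = get(results(), 0);
    return h.name();
}
```

Calling unchained() prints Tables(), get(), ~Tables() and returns "t0", while chained() prints the same three traces but returns "<dangling>", because the owner is gone before the handle is read. With a raw pointer in place of the weak_ptr, that read would be a use-after-free, which would match the Row_getString() crash in the trace above if the assumption about ownership holds.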
I also inspected reference counts using xdebug_debug_zval() in various places, but didn't spot anything unusual. Here is its output on $results and $row without chaining:
results: (refcount=1, is_ref=0)=resource(18) of type (_p_std__vectorT_voltdb__Table_t)
row: (refcount=1, is_ref=0)=resource(21) of type (_p_voltdb__Row)
And on $row with chaining:
row: (refcount=1, is_ref=0)=resource(21) of type (_p_voltdb__Row)
I've spent a couple days on this now and I'm just about out of ideas, so really any insight on how to go about solving this would be greatly appreciated.
[...] _zend_list_delete and figure out why the calling code is deleting the resource. It may be the resource refcount hitting 0 or a direct delete. – Talley
[...] _zend_list_delete while __wrap_delete_Tables is being called and in both cases (no seg fault and seg fault), it is garbage collected because its refcount (--le->refcount) is -1. – Guiltless
[...] __wrap_delete_Tables is called at that specific time in one occasion but not in the other and continue going up. – Talley
[...] refcount field (refcount__gc in 5.3+). – Talley
[...] refcount field of the zval, but the one I'm setting a watch point on doesn't seem to be the same as the one that gets cleaned up further down the road. I'm setting the watch point when the resource is created in zend_list_insert() in zend_list.c. It initializes refcount to 1 there, but that doesn't seem to be the memory I want to watch. Any tips on how to go about setting the watch point correctly? – Guiltless
[...] zval_copy_ctor or, directly, _zend_list_addref), then the breakpoint won't catch it. Your best bet is to put a reading breakpoint on the value of the original zval in the hope that it's read when the shallow copy of the zval is created. – Talley