I was commenting on an answer that thread-local storage is nice and recalled another informative discussion about exceptions where I supposed
The only special thing about the execution environment within the throw block is that the exception object is referenced by rethrow.
Putting two and two together, wouldn't executing an entire thread inside a function-catch-block of its main function imbue it with thread-local storage?
It seems to work fine, albeit slowly. Is this novel or well-characterized? Is there another way of solving the problem? Was my initial premise correct? What kind of overhead does get_thread
incur on your platform? What's the potential for optimization?
#include <iostream>
#include <pthread.h>
using namespace std;
struct thlocal {
string name;
thlocal( string const &n ) : name(n) {}
};
struct thread_exception_base {
thlocal &th;
thread_exception_base( thlocal &in_th ) : th( in_th ) {}
thread_exception_base( thread_exception_base const &in ) : th( in.th ) {}
};
thlocal &get_thread() throw() {
try {
throw;
} catch( thread_exception_base &local ) {
return local.th;
}
}
void print_thread() {
cerr << get_thread().name << endl;
}
void *kid( void *local_v ) try {
thlocal &local = * static_cast< thlocal * >( local_v );
throw thread_exception_base( local );
} catch( thread_exception_base & ) {
print_thread();
return NULL;
}
int main() {
thlocal local( "main" );
try {
throw thread_exception_base( local );
} catch( thread_exception_base & ) {
print_thread();
pthread_t th;
thlocal kid_local( "kid" );
pthread_create( &th, NULL, &kid, &kid_local );
pthread_join( th, NULL );
print_thread();
}
return 0;
}
This does require defining new exception classes derived from thread_exception_base
, initializing the base with get_thread()
, but altogether this doesn't feel like an unproductive insomnia-ridden Sunday morning…
EDIT: Looks like GCC makes three calls to pthread_getspecific
in get_thread
. EDIT: and a lot of nasty introspection into the stack, environment, and executable format to find the catch
block I missed on the first walkthrough. This looks highly platform-dependent, as GCC is calling some libunwind
from the OS. Overhead on the order of 4000 cycles. I suppose it also has to traverse the class hierarchy but that can be kept under control.
thlocal
and that base always needs to be initialized withget_thread()
. Hmm, sounds like I need an intermediate pointer in there to avoid copying data. – Thielen