NSThread is probably implemented on top of the pthreads library. The point is that the lower-level an API is, the more repetitive boilerplate you have to write.
The pthreads library itself isn't hard to learn: my university professor taught it, and even the slowest learners in the class managed to use it. Some of them may just have been lazily copy-pasting code, but they got the job done.
So I definitely suggest you implement a pthread wrapper class; it's easy to do.
This way you eliminate the boilerplate. For example, you may be writing this thousands of times:
pthread_mutex_init(mutex_ptr, NULL);
If that's your case (it's just an example), you may always be passing NULL for the attributes, and the same goes for other functions such as pthread_create.
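To give an idea of what such a wrapper could look like, here is a minimal sketch in C++. All the class and function names (Mutex, Thread, CounterThread, count_with_two_threads) are made up for this example, and it assumes you always want the default attributes, so NULL is passed in exactly one place instead of at every call site. It is not how NSThread does it, just one possible shape.

```cpp
#include <pthread.h>
#include <cassert>

// Hypothetical wrapper: hides the "always NULL" attribute argument.
class Mutex {
public:
    Mutex() {
        int rc = pthread_mutex_init(&mutex_, NULL);  // default attributes
        assert(rc == 0);
        (void)rc;
    }
    ~Mutex() { pthread_mutex_destroy(&mutex_); }
    void lock() { pthread_mutex_lock(&mutex_); }
    void unlock() { pthread_mutex_unlock(&mutex_); }
private:
    Mutex(const Mutex&);             // non-copyable
    Mutex& operator=(const Mutex&);
    pthread_mutex_t mutex_;
};

// Hypothetical thread wrapper: subclass it and override run().
class Thread {
public:
    Thread() : started_(false) {}
    virtual ~Thread() {}
    // Returns true if the thread was created.
    bool start() {
        started_ = (pthread_create(&thread_, NULL, &Thread::entry, this) == 0);
        return started_;
    }
    // Waits for the thread to finish; returns false if it was never started.
    bool join() {
        if (!started_) return false;
        started_ = false;
        return pthread_join(thread_, NULL) == 0;
    }
protected:
    virtual void run() = 0;
private:
    // Trampoline with the signature pthread_create expects.
    static void* entry(void* self) {
        static_cast<Thread*>(self)->run();
        return NULL;
    }
    pthread_t thread_;
    bool started_;
};

// Example subclass: adds 1 to a shared total n times, under the mutex.
class CounterThread : public Thread {
public:
    CounterThread(Mutex* m, long* total, int n) : m_(m), total_(total), n_(n) {}
protected:
    virtual void run() {
        for (int i = 0; i < n_; ++i) {
            m_->lock();
            ++*total_;
            m_->unlock();
        }
    }
private:
    Mutex* m_;
    long* total_;
    int n_;
};

// Runs two counter threads and returns the final shared total.
long count_with_two_threads(int n) {
    Mutex m;
    long total = 0;
    CounterThread a(&m, &total, n);
    CounterThread b(&m, &total, n);
    a.start();
    b.start();
    a.join();
    b.join();
    return total;
}
```

With this in place, the calling code never touches pthread_mutex_init or pthread_create directly; calling count_with_two_threads(1000) starts both threads, joins them, and returns the shared total.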
Once you've implemented the class, there's no guarantee that it will be faster than GCD.
GCD does some optimizations of its own; for example, two blocks may run on the same thread instead of each getting a new one.
So I suggest using your own class only if it turns out to be faster than GCD, which you can check with the Time Profiler.