I'm writing an algorithm in OpenCL in which every work-item needs to keep a fair amount of data, somewhere between a long[70] and a long[200] per work-item.
Recent AMD devices have 32 KiB of __local memory, which (for the given amount of data per work-item) is enough to store the data for 20-58 work-items. However, from what I understand of the architecture (and especially from this drawing), each shader core also has a dedicated amount of private memory. I have not been able to find its size.
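For reference, here is a rough sketch of the arithmetic behind the 20-58 figure. It assumes that an OpenCL long is 8 bytes and that the whole 32 KiB of __local memory is available for this data; the helper name work_items_that_fit is made up purely for illustration.

```c
#include <stddef.h>

/* OpenCL C defines long as a 64-bit type, i.e. 8 bytes per element. */
#define CL_LONG_BYTES 8u

/* Hypothetical helper: how many work-items can each keep a long[longs_per_item]
 * in a local memory of local_mem_bytes bytes (integer division rounds down). */
static size_t work_items_that_fit(size_t local_mem_bytes, size_t longs_per_item)
{
    return local_mem_bytes / (longs_per_item * CL_LONG_BYTES);
}
```

With 32 * 1024 bytes, a long[70] takes 560 bytes and a long[200] takes 1600 bytes, so work_items_that_fit(32 * 1024, 70) gives 58 and work_items_that_fit(32 * 1024, 200) gives 20.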
Can anyone tell me how to find out how much private memory is available to each work-item?
I'm particularly curious about the HD7970, since I plan to buy some of these soon.
Edit: Problem solved, the answer is here in appendix D.