Is it possible to emulate a GPU for CUDA/OpenCL unit testing purposes?
E

2

15

I would like to develop a library with an algorithm that can run on the CPU or the GPU. The GPU can be Nvidia (then the algorithm will use CUDA) or not (then the algorithm will use OpenCL).

I would like to emulate a GPU in this project because:

  • I will use different computers to develop the software, and some of them don't have a GPU.

  • The software will ultimately run on servers that may or may not have a GPU, and the unit tests must run and pass in both cases.

Is there a way to emulate a GPU for unit testing purposes?

In the following link:

GPU Emulator for CUDA programming without the hardware

They show a solution, but only for CUDA, not for OpenCL, and the software they propose, "GPUOcelot", is no longer actively maintained.

Eulogium answered 7/11, 2016 at 9:31 Comment(4)
The performance will obviously be worse on a CPU, but I need to know whether there is a way to test that the algorithm works and is correctly programmed when no GPU is installed. For unit tests there are libraries that emulate a listening database; for example, fakemongo github.com/fakemongo emulates a listening MongoDB database so you can test that your query functions are correct.Eulogium
Possible duplicate of GPU Emulator for CUDA programming without the hardwareHip
In that link they only talk about CUDA (not about OpenCL), and the solution they propose is software that is not actively maintained.Eulogium
Using GPUOcelot or another functional GPU simulator can be a problem for unit testing, as these simulators do not simulate the parallelism of the GPU to the full extent. Because of that, some clearly incorrect code with broken synchronization will execute without errors in the simulator. It is certainly better than no unit tests at all, but also certainly worse than regular unit tests.Undergo
N
10

It depends on what you mean by emulation. You cannot emulate the speed of a GPU.

The GPU is architecturally very different from the CPU: it runs a very large number of threads (thousands or tens of thousands), which is why we use it. A CPU runs only a few threads, even when you parallelize the code. The two also have different instruction sets.

You can, however, emulate the execution using special software, such as NVEmulate for NVIDIA GPUs and OpenCL Emulator-Debugger for AMD.
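To make "emulating the execution" concrete, here is a minimal sketch in Python of what a functional emulator does at its core: it runs the kernel body once per logical thread, serially, on the CPU. The names `ThreadContext`, `launch`, and `vector_add` are hypothetical and are not part of CUDA, OpenCL, or any of the tools mentioned here.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class ThreadContext:
    """Hypothetical stand-in for CUDA's blockIdx.x, blockDim.x and threadIdx.x."""
    block_idx: int
    block_dim: int
    thread_idx: int

    @property
    def global_id(self) -> int:
        # Same arithmetic as blockIdx.x * blockDim.x + threadIdx.x in a 1-D launch.
        return self.block_idx * self.block_dim + self.thread_idx


def launch(kernel: Callable[..., None], grid_dim: int, block_dim: int, *args) -> None:
    """Run `kernel` once per logical thread, one after another, on the CPU."""
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            kernel(ThreadContext(block_idx, block_dim, thread_idx), *args)


def vector_add(ctx: ThreadContext, a: List[float], b: List[float], out: List[float]) -> None:
    """Kernel body: each logical thread handles at most one element."""
    i = ctx.global_id
    if i < len(out):  # guard the threads that fall past the end of the data
        out[i] = a[i] + b[i]


a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [10.0, 20.0, 30.0, 40.0, 50.0]
out = [0.0] * len(a)
launch(vector_add, 2, 4, a, b, out)  # 2 blocks of 4 threads cover 5 elements
```

Because the logical threads run strictly one after another, a sketch like this checks indexing and arithmetic but cannot expose races or missing barriers, which is the limitation raised in the comments on the question.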

A related question: GPU Emulator for CUDA programming without the hardware, where the accepted answer recommends gpuocelot for CUDA emulation.

Nikitanikki answered 7/11, 2016 at 9:45 Comment(3)
Thank you very much, this is what I was looking for. I have a GPU in my computer, but sometimes I prefer to code on my sofa using an old laptop with no GPU, and I need to know if the code works.Eulogium
@Eulogium in that case you may want to give rCUDA a look. It allows you to access a remote GPU from a node without a GPU. See: rcuda.net/index.php/what-s-rcuda.htmlAmesace
gpuocelot and NVEmulate no longer seem to be maintained. I added a new question and answer pair here for the state of the art as of 2020, but it's likely to be closed as off topic.Salesperson
S
5

I don't know the full state of the art but I can provide a very limited set of things to look at which may be useful.

The accepted answer for this question is now out of date. The question of compiling and running GPU code for CUDA or OpenCL on a machine that does not natively support it has come up on here several times (but sadly it's often treated as off-topic). This answer is for those questions too.

Many of the answers refer to software solutions that have not been maintained. There seem to be only two answers that stand the test of time, and both treat this as a mu question.

  • Use a real GPU, i.e. buy a cheap CUDA card if you don't already have one.
  • Rent someone else's GPU in the cloud.

However, emulators do exist.

GPU virtualization is also well covered by its Wikipedia page. There is strong support for getting virtual machines to use the host's hardware.

Docker and VirtualBox, for example, both support GPU passthrough.

Reasons to emulate

  • To learn and keep up to date with changes to CUDA and OpenCL
  • To estimate the effect of the various APIs on performance.
  • To test that your code works on a variety of different platforms.
  • As a proxy for hardware you don't have access to (as per this question)

Kinds of emulation

  • For testing you might accept a slow implementation as long as it is compliant and reliable.

  • For production running on different hardware you would more likely accept similar, but not 100% equivalent, constructs (e.g. different warp size, different high-level libraries for FFT, ...) and much more complicated performance-optimized implementations of primitives. You would probably demand at least 80% of the CUDA speed for comparable hardware.

(Thanks to https://stackoverflow.com/users/13130048/sebastian for those two points)

For the second case you would likely need not just GPU virtualisation but additional optimisation passes.
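For the first (testing) case, a common pattern is to keep a slow CPU reference implementation and check every other backend against it. The Python sketch below illustrates that idea only; the `BACKENDS` registry, `pick_backend`, and `matches_reference` are hypothetical names, and looking for `nvidia-smi` on the PATH is an assumed heuristic for "an Nvidia GPU is probably present", not a reliable detection method.

```python
import math
import shutil
from typing import Callable, Dict, List

Saxpy = Callable[[float, List[float], List[float]], List[float]]


def saxpy_cpu(alpha: float, x: List[float], y: List[float]) -> List[float]:
    """Slow but simple reference: alpha * x + y, element by element."""
    return [alpha * xi + yi for xi, yi in zip(x, y)]


# Hypothetical registry. A real project would add "cuda" or "opencl" entries
# here only when the corresponding runtime could actually be loaded.
BACKENDS: Dict[str, Saxpy] = {"cpu": saxpy_cpu}


def gpu_tool_on_path() -> bool:
    """Heuristic only: assumes an Nvidia driver install puts nvidia-smi on PATH."""
    return shutil.which("nvidia-smi") is not None


def pick_backend(requested: str = "auto") -> str:
    """Resolve "auto" to a registered backend, falling back to the CPU."""
    if requested == "auto":
        return "cuda" if "cuda" in BACKENDS and gpu_tool_on_path() else "cpu"
    if requested not in BACKENDS:
        raise ValueError(f"backend {requested!r} is not registered")
    return requested


def saxpy(alpha: float, x: List[float], y: List[float], backend: str = "auto") -> List[float]:
    return BACKENDS[pick_backend(backend)](alpha, x, y)


def matches_reference(name: str, rel_tol: float = 1e-6) -> bool:
    """Compare one registered backend against the CPU reference on fixed inputs."""
    alpha, x, y = 2.0, [1.0, 2.0, 3.0], [0.5, 0.25, 0.125]
    expected = saxpy_cpu(alpha, x, y)
    actual = BACKENDS[name](alpha, x, y)
    return len(actual) == len(expected) and all(
        math.isclose(a, e, rel_tol=rel_tol) for a, e in zip(actual, expected)
    )
```

A unit test can then loop over whatever is registered in `BACKENDS` on the current machine and assert `matches_reference(name)`, so the same suite passes on a laptop without a GPU and exercises the GPU paths where they exist. The tolerance is there because GPU floating-point results need not match the CPU bit for bit.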

Why are there so few emulators, and why don't they survive the test of time?

  • GPUs are affordable. It is only high performance that costs.
  • GPUs (not to mention TPUs and FPGAs) are developing rapidly.
  • Some hardware tricks are kept secret from competitors so emulating actual hardware is difficult.
  • The CUDA and OpenCL standards are changing too, but less quickly.

There is arguably a need for more programmers who understand them. Compiling your code without running and testing it would simply be unprofessional. There would seem to be an obvious need for emulation where you don't have all the possible or interesting hardware combinations physically available.

That being the case, it's surprising that so many of these emulation projects have not stood the test of time or been endorsed or provided by GPU manufacturers.

There are some active emulation projects, however.

Active GPU emulation projects

There are at least two active emulation projects maintained as of October 2022:

I cannot speak to how good these are and how commonly they are used compared to using real GPUs (either your own or rented).

Honorable mentions

CUDA-to-OpenCL source-to-source transpilers. These appear to be maintained but are not themselves emulators.

Why is this not a solved problem?

There are a number of challenges to overcome. My take on these would be something like:

  1. provide a runtime emulating a particular version of the CUDA or OpenCL standard
  2. provide a compiler targeting this runtime (ideally gcc or clang)
  3. get the backing of a vendor (e.g. Nvidia or the Khronos Group)
  4. get the backing of a community (i.e. a decent userbase and set of contributors)
  5. build support into a popular emulation environment (e.g. VirtualBox)

You could also argue that almost everyone working in this area has access to real GPUs, so this is not necessary at all.

The vendors of point 3 are doing well with points 1, 2 and 4. An emulator has to both build on that and take some mindshare of its own. This is an uphill struggle. I hope and believe there will be success in the future.

Looking at VirtualBox, the last discussion I can find is from 2011.

Seemingly retired projects

These have been mentioned in answers to previous attempts to ask and answer this kind of question.

  • gpuocelot - no longer maintained
  • mcuda - looks unmaintained
  • cuda-waste - hosted on Google Code, which was frozen long ago
  • nvemulate - CUDA emulator from Nvidia, retired a while back

Other seemingly retired projects of interest:

Earlier (out of date) questions:

Salesperson answered 31/10, 2022 at 16:8 Comment(0)
