gpgpu Questions

2

Solved

Background: I'm trying to understand whether a GPU's Last-Level Cache is invalidated or preserved across multiple kernel launches, so that the effective memory bandwidth can be increased. I'm aware ...
Spellbound asked 2/9, 2023 at 8:26

3

Solved

The CUDA documentation does not specify how many CUDA processes can share one GPU. For example, if I launch more than one CUDA program as the same user with only one GPU card installed in the system, what i...
Columbarium asked 27/7, 2015 at 0:55

5

Solved

When a computer has multiple CUDA-capable GPUs, each GPU is assigned a device ID. By default, CUDA kernels execute on device ID 0. You can use cudaSetDevice(int device) to select a different device...
Biddle asked 8/12, 2012 at 20:42
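The mechanism this excerpt describes can be sketched as below; the kernel, sizes, and the choice of device 1 are illustrative, and error checking is omitted:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__global__ void fill(int *out, int value) {
    out[threadIdx.x] = value;
}

int main() {
    int count = 0;
    cudaGetDeviceCount(&count);            // how many CUDA-capable GPUs are visible
    printf("devices: %d\n", count);

    // Select device 1 if present; otherwise stay on the default device 0.
    if (count > 1) cudaSetDevice(1);

    int *d_buf = nullptr;
    cudaMalloc(&d_buf, 32 * sizeof(int));  // allocation lands on the current device
    fill<<<1, 32>>>(d_buf, 42);            // so does the kernel launch
    cudaDeviceSynchronize();
    cudaFree(d_buf);
    return 0;
}
```

Note that cudaSetDevice affects subsequent allocations and kernel launches made from the calling host thread.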

5

Solved

I am working on high-performance code in C++ and have been using both CUDA and OpenCL, and more recently C++ AMP, which I like very much. I am, however, a little worried that it is not being developed ...
Putter asked 23/1, 2016 at 21:48

2

Solved

I am trying to get NVIDIA's CUDA set up and installed on my PC, which has an NVIDIA GEFORCE RTX 2080 SUPER graphics card. After hours of trying different things and lots of research I have gotten CUD...
Cnidoblast asked 23/7, 2020 at 18:45

13

Solved

My CUDA program crashed during execution, before memory was flushed. As a result, device memory remained occupied. I'm running on a GTX 580, for which nvidia-smi --gpu-reset is not supported. Pla...
Sandbox asked 4/3, 2013 at 8:22

1

Solved

In SYCL, there are three types of memory: host memory, device memory, and Unified Shared Memory (USM). For host and device memory, data exchange requires explicit copying. Meanwhile, data movement ...
Frequent asked 16/7, 2023 at 20:36
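A minimal sketch of the USM case the excerpt contrasts with explicit copies, assuming a SYCL 2020 compiler (e.g. DPC++); memory from `malloc_shared` migrates between host and device with no explicit copy:

```cpp
#include <sycl/sycl.hpp>

int main() {
    sycl::queue q;

    // USM shared allocation: accessible from both host and device;
    // the runtime migrates the pages, so no explicit copy is needed.
    int *data = sycl::malloc_shared<int>(16, q);
    for (int i = 0; i < 16; ++i) data[i] = i;

    q.parallel_for(16, [=](sycl::id<1> i) { data[i] *= 2; }).wait();

    // The host can read the results directly, again without a copy.
    int sum = 0;
    for (int i = 0; i < 16; ++i) sum += data[i];

    sycl::free(data, q);
    return sum == 240 ? 0 : 1;
}
```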

3

Solved

What are the key practical differences between GPGPU and regular multicore/multithreaded CPU programming, from the programmer's perspective? Specifically: What types of problems are better suited...

6

Solved

I'm not sure if it's possible. I want to study OpenCL in depth, so I was wondering if there is a tool to disassemble a compiled OpenCL kernel. For a normal x86 executable, I can use objdump to get ...
Prolusion asked 14/7, 2011 at 6:25

3

Solved

I am writing code to compute the dot product of two vectors using the cuBLAS dot-product routine, but it returns the value in host memory. I want to use the dot product for further computation on GPGPU...
Ronnyronsard asked 13/9, 2012 at 6:18
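One common approach is cuBLAS's device pointer mode, which makes the library write the scalar result to a device pointer so it never round-trips through the host; a minimal sketch (inputs uninitialized, error checks omitted):

```cuda
#include <cublas_v2.h>
#include <cuda_runtime.h>

int main() {
    const int n = 1024;
    float *d_x, *d_y, *d_result;
    cudaMalloc(&d_x, n * sizeof(float));
    cudaMalloc(&d_y, n * sizeof(float));
    cudaMalloc(&d_result, sizeof(float));   // the result stays in device memory

    cublasHandle_t handle;
    cublasCreate(&handle);

    // Tell cuBLAS that scalar results should be written to device pointers.
    cublasSetPointerMode(handle, CUBLAS_POINTER_MODE_DEVICE);
    cublasSdot(handle, n, d_x, 1, d_y, 1, d_result);

    // d_result can now be consumed directly by a subsequent kernel.
    cublasDestroy(handle);
    cudaFree(d_x); cudaFree(d_y); cudaFree(d_result);
    return 0;
}
```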

9

Solved

I'd like to extend my skill set into GPU computing. I am familiar with raytracing and realtime graphics (OpenGL), but the next generation of graphics and high performance computing seems to be in GP...
Mischiefmaker asked 10/10, 2012 at 21:2

1

I have a discrete NVIDIA GPU (say, Kepler or Maxwell). I want to clear my L2 cache before some kernel is scheduled, so as not to taint my test results. I could do something like allocate a large s...
Instill asked 15/7, 2015 at 11:39
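The buffer-streaming approach the excerpt alludes to can be sketched as a kernel that reads a buffer several times larger than L2 (tens of MB is a safe overestimate on Kepler/Maxwell), evicting previously cached lines; the `sink` write merely keeps the loads from being optimized away:

```cuda
__global__ void thrash_l2(const int *buf, size_t n, int *sink) {
    int acc = 0;
    size_t stride = (size_t)gridDim.x * blockDim.x;
    for (size_t i = (size_t)blockIdx.x * blockDim.x + threadIdx.x; i < n; i += stride)
        acc += buf[i];                  // every load displaces an L2 line
    if (acc == 123456789) *sink = acc;  // dead-code guard: never taken in practice
}
```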

1

So I'm exploring WebGPU and figured it would be an interesting exercise to implement a basic neural network in it. Having little understanding of both GPU shader programming and neural networks and...
Bowing asked 27/4, 2022 at 21:35

4

Solved

I am writing a CUDA program and trying to print something inside the CUDA kernels using the printf function. But when I compile the program I get an error: error : calling a host f...
Tyrannicide asked 31/12, 2012 at 22:45
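The usual cause of this error with printf is compiling for an architecture below sm_20: device-side printf requires compute capability 2.0 or higher, and below that the compiler only sees the host printf. A sketch:

```cuda
#include <cstdio>

__global__ void hello() {
    // Device-side printf needs compute capability >= 2.0;
    // compile with e.g. `nvcc -arch=sm_20 hello.cu` (or any newer arch).
    printf("Hello from thread %d\n", threadIdx.x);
}

int main() {
    hello<<<1, 4>>>();
    cudaDeviceSynchronize();  // flushes the device printf buffer
    return 0;
}
```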

3

Solved

I am trying to generate random numbers within the CUDA kernel. I wish to generate the random numbers from a uniform distribution and in integer form, starting from 1 up to 8. The ra...
Insulation asked 29/8, 2013 at 1:49
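A sketch with cuRAND's device API: `curand_uniform` returns a float in (0, 1], so scaling by 8 and taking the ceiling maps it onto the integers 1..8. The seed and per-thread sequence setup are illustrative:

```cuda
#include <curand_kernel.h>

__global__ void roll(unsigned long long seed, int *out, int n) {
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid >= n) return;

    curandState state;
    curand_init(seed, tid, 0, &state);  // one independent sequence per thread

    // (0, 1] * 8 -> (0, 8]; ceilf maps this onto {1, ..., 8}.
    float u = curand_uniform(&state);
    out[tid] = (int)ceilf(u * 8.0f);
}
```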

4

Solved

I want to import a PGP public key into my keychain in a script, but I don't want it to write the contents to a file. Right now my script does this: curl http://example.com/pgp-public-key -o /tmp/p...
Classroom asked 9/9, 2016 at 12:35
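Assuming the goal is simply to avoid the temporary file, `gpg --import` reads standard input when given `-` (or no file argument), so curl can pipe into it directly:

```shell
# Pipe the key straight from curl into gpg; nothing is written to disk.
curl -s http://example.com/pgp-public-key | gpg --import -
```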

4

When is calling the cudaDeviceSynchronize function really needed? As far as I understand from the CUDA documentation, CUDA kernels are asynchronous, so it seems that we should call cudaDevice...
Psilomelane asked 9/8, 2012 at 17:25
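The two common cases can be sketched as follows: a blocking cudaMemcpy on the default stream already waits for preceding kernels, while an explicit cudaDeviceSynchronize is needed when the host must observe completion without such a blocking call (e.g. before stopping a host-side timer). Illustrative only, error checks omitted:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__global__ void square(int *x) { x[threadIdx.x] *= x[threadIdx.x]; }

int main() {
    int h[4] = {1, 2, 3, 4}, *d;
    cudaMalloc(&d, sizeof(h));
    cudaMemcpy(d, h, sizeof(h), cudaMemcpyHostToDevice);

    square<<<1, 4>>>(d);  // launch is asynchronous: control returns at once

    // No explicit sync needed here: a blocking cudaMemcpy on the default
    // stream waits for preceding work on that stream before copying.
    cudaMemcpy(h, d, sizeof(h), cudaMemcpyDeviceToHost);
    printf("%d %d %d %d\n", h[0], h[1], h[2], h[3]);

    // The explicit form, needed e.g. before stopping a host-side timer
    // when no blocking copy intervenes:
    cudaDeviceSynchronize();
    cudaFree(d);
    return 0;
}
```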

7

Solved

In CUDA, there is a concept of a warp, which is defined as the maximum number of threads that can execute the same instruction simultaneously within a single processing element. For NVIDIA, this wa...
Coefficient asked 17/8, 2011 at 13:15

2

Solved

Under what circumstances should you use the volatile keyword with a CUDA kernel's shared memory? I understand that volatile tells the compiler never to cache any values, but my question is about th...
Marthena asked 11/3, 2013 at 4:2
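For context, the classic case where `volatile` mattered is the pre-Kepler warp-synchronous reduction: within a warp the threads run in lockstep, so `__syncthreads()` is skipped, and `volatile` forces the compiler to re-read shared memory instead of caching values in registers (on current GPUs, `__shfl_down_sync` is the preferred idiom). A sketch, assuming `sdata` holds at least 64 elements:

```cuda
__device__ void warp_reduce(volatile float *sdata, int tid) {
    sdata[tid] += sdata[tid + 32];
    sdata[tid] += sdata[tid + 16];
    sdata[tid] += sdata[tid + 8];
    sdata[tid] += sdata[tid + 4];
    sdata[tid] += sdata[tid + 2];
    sdata[tid] += sdata[tid + 1];
}
```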

2

Continuous integration services are wonderful for continually testing updates to packages for various languages. These include services like Travis-CI, Jenkins, and Shippable among many others. How...
Site asked 1/5, 2015 at 12:35

2

Solved

I know that nvidia-smi -l 1 will give the GPU usage every one second (similarly to the following). However, I would appreciate an explanation on what Volatile GPU-Util really means. Is that the num...
Gladsome asked 2/12, 2016 at 17:31
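Per the nvidia-smi documentation, GPU-Util is the percentage of time over the past sample period during which one or more kernels was executing, not the fraction of SMs occupied. The same counter can be sampled explicitly:

```shell
# Sample GPU and memory-controller utilization every second, as CSV.
nvidia-smi --query-gpu=utilization.gpu,utilization.memory --format=csv -l 1
```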

11

Solved

What features make OpenCL unique to choose over OpenGL with GLSL for calculations? Despite the graphics-related terminology and impractical datatypes, is there any real caveat to OpenGL? For examp...
Dorsman asked 26/10, 2011 at 18:57

1

With recent NVIDIA micro-architectures, there's a new (?) taxonomy of warp stall reasons / warp scheduler states. Two of the items in this taxonomy are: Short scoreboard - scoreboard dependency on...
Korney asked 9/2, 2021 at 17:14

2

I am a fairly new CUDA user. I'm practicing on my first CUDA application, where I try to accelerate the k-means algorithm using a GPU (GTX 670). Briefly, each thread works on a single point which is co...
Etrem asked 21/3, 2015 at 20:7

4

Solved

I understand there's an openCL C++ API, but I'm having trouble compiling my kernels... do the kernels have to be written in C? And then it's just the host code that's allowed to be written in C++? ...
Sporogenesis asked 7/7, 2016 at 17:29

© 2022 - 2025 — McMap. All rights reserved.