C# Rendering OpenCL-generated image

Problem: I'm trying to render a dynamic Julia fractal in real time. Because the fractal is constantly changing, I need to be able to render at least 20 frames per second, preferably more. What you need to know about a Julia fractal is that every pixel can be calculated independently, so the task is easily parallelizable.
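To illustrate the per-pixel independence mentioned above, here is a minimal C++ sketch of an escape-time Julia computation. The names `julia_iterations` and `render_julia`, the viewing window [-2,2] x [-2,2] and the escape radius 2 are assumptions chosen for illustration, not taken from the question's actual code; the sketch also assumes width and height are greater than 1.

```cpp
#include <complex>
#include <cstddef>
#include <vector>

// Escape-time count for one starting point z0 under f(z) = z*z + c.
// Returns how many iterations ran before |z| exceeded 2
// (or max_iter if it never did).
int julia_iterations(std::complex<double> z0, std::complex<double> c, int max_iter)
{
    std::complex<double> z = z0;
    int i = 0;
    while (i < max_iter && std::norm(z) <= 4.0) // std::norm() is |z|^2
    {
        z = z * z + c;
        ++i;
    }
    return i;
}

// Fill a width x height image covering [-2,2] x [-2,2] (assumed window).
// Each pixel depends only on its own coordinates, so the loop body
// could just as well run once per GPU thread.
std::vector<int> render_julia(int width, int height, std::complex<double> c, int max_iter)
{
    std::vector<int> image(static_cast<std::size_t>(width) * height);
    for (int py = 0; py < height; ++py)
        for (int px = 0; px < width; ++px)
        {
            double re = -2.0 + 4.0 * px / (width - 1);
            double im = -2.0 + 4.0 * py / (height - 1);
            image[static_cast<std::size_t>(py) * width + px] =
                julia_iterations({re, im}, c, max_iter);
        }
    return image;
}
```

Because no pixel reads another pixel's result, the two nested loops can be split across threads or GPU work items in any order.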

First approach: Because I'm already used to Monogame in C#, I tried writing a shader in HLSL that would do the job, but the compiler kept complaining because I used up more than the allowable 64 arithmetic slots (I need at least a thousand).

Second approach: Using the CPU, it took, as could be expected, about two minutes to generate one frame.

Third approach: I started learning the basics of OpenCL using a wrapper called Cloo. I actually got a nice result quickly by calculating the image data using OpenCL, then getting the data from the GPU, storing the data in a Texture2D and drawing the texture to the screen. For a 1000x1000 image I get about 13 frames a second. This is still not quite what I had hoped for, as the image should be 1920x1080 to fill up my screen, and the low frame rate is quite noticeable. I realised that I'm actually generating the image on the GPU, sending the data to the CPU and then sending it back to the GPU, so this seems like an unnecessary step that, if it could be removed, would probably solve my problem. I read on some fora that OpenGL is able to do this, but I haven't been able to find specific information.
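To put a rough number on the round trip described above, the following C++ sketch computes how many bytes per second cross the bus in one direction. The helper name `bytes_per_second` and the 4 bytes per pixel (8-bit RGBA) are assumptions; the question does not state the actual pixel format.

```cpp
#include <cstdint>

// Bytes copied per second in ONE direction when every frame is read back
// from the GPU (or uploaded again) as a tightly packed image.
// bytes_per_pixel = 4 would correspond to 8-bit RGBA (an assumption here).
std::uint64_t bytes_per_second(std::uint64_t width, std::uint64_t height,
                               std::uint64_t bytes_per_pixel, std::uint64_t fps)
{
    return width * height * bytes_per_pixel * fps;
}
```

Under those assumptions, `bytes_per_second(1000, 1000, 4, 13)` gives 52,000,000 bytes per second for the current setup, and `bytes_per_second(1920, 1080, 4, 20)` gives 165,888,000 for the target. Each figure applies once for the download and once more for the upload.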

Questions: Firstly, is there a simple way to draw the data generated by OpenCL directly without involving the CPU (preferably compatible with Monogame)? If not, is it possible to implement it using OpenGL and afterwards combine it with Monogame? Secondly, why isn't this possible with a simple HLSL shader? As HLSL and OpenCL both use the GPU, why is HLSL so much more limited when it comes to doing many arithmetic operations?

Edit

I found this site that does roughly what I want, but using a GLSL shader. This again makes me question my faith in HLSL. Unfortunately, as Monogame doesn't support GLSL (yet), my questions remain unanswered.

Spectroradiometer answered 13/6, 2017 at 18:3 Comment(2)
Hmm, I see your edit. If you cannot use GLSL, why tag OpenGL? If my memory serves me well, HLSL is DirectX... Also, if you can use OpenGL then you can use GLSL, so where is the problem (just use an extension wrangler like GLEW to access newer features)? Doubling
@Doubling I'm currently using Monogame to avoid using OpenGL directly, and before considering a switch to OpenGL I want to make sure there isn't an easier solution that lets me keep using Monogame. I tagged OpenGL in case there isn't one, in which case your answer is probably acceptable. However, I first need to be sure. The irony is that Monogame actually translates HLSL to GLSL before compiling. Spectroradiometer

Sorry, I do not use OpenCL or C#, but you can do this entirely inside shaders using GLSL (although you might run into precision problems, since for Julia-like fractals even 64-bit doubles are sometimes not enough). Anyway, here is a simple example of the Mandelbrot set I did some years back...

CPU-side app code (C++/OpenGL/GLSL/VCL):

//---------------------------------------------------------------------------
#include <vcl.h>
#pragma hdrstop
#include "Unit1.h" // VCL window header
#include "gl/OpenGL3D_double.cpp" // my GL engine
//---------------------------------------------------------------------------
#pragma package(smart_init)
#pragma resource "*.dfm"
TForm1 *Form1;
OpenGLscreen scr;
GLSLprogram shd;
float mx=0.0,my=0.0,mx0=0.0,my0=0.0,mx1=0.0,my1=0.0;
TShiftState sh0,sh1;
int xs=1,ys=1;
int txrmap=-1;
float zoom=1.000;
unsigned int queryID[2];
//---------------------------------------------------------------------------
void gl_draw()
    {
    float x,y,dx,dy;
    scr.cls();
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

    // matrix for old GL rendering
    glMatrixMode(GL_PROJECTION);
    glLoadIdentity();
    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();
    glMatrixMode(GL_TEXTURE);
    glLoadIdentity();


    // GLSL uniforms
    shd.bind();
    shd.set1i("txrmap",0);      // texture unit
    shd.set2f("p0",mx,my);      // pan position
    shd.set1f("zoom",zoom);     // zoom

    // issue the first query
    // Records the time only after all previous
    // commands have been completed
    glQueryCounter(queryID[0], GL_TIMESTAMP);

    // QUAD covering screen
    scr.txrs.bind(txrmap);
    glColor3f(1.0,1.0,1.0);
    glBegin(GL_QUADS);
    glTexCoord2f(0.0,0.0); glVertex2f(-1.0,+1.0);
    glTexCoord2f(0.0,1.0); glVertex2f(-1.0,-1.0);
    glTexCoord2f(1.0,1.0); glVertex2f(+1.0,-1.0);
    glTexCoord2f(1.0,0.0); glVertex2f(+1.0,+1.0);
    glEnd();
    shd.unbind();
    scr.txrs.unbind();

    // issue the second query
    // records the time when the sequence of OpenGL
    // commands has been fully executed
    glQueryCounter(queryID[1], GL_TIMESTAMP);


    // GL driver info and GLSL log
    scr.text_init_pix(1.0);
    glColor4f(0.0,0.2,1.0,0.8);
    scr.text(glGetAnsiString(GL_VENDOR));
    scr.text(glGetAnsiString(GL_RENDERER));
    scr.text("OpenGL ver: "+glGetAnsiString(GL_VERSION));
    glColor4f(0.4,0.7,0.8,0.8);
    for (int i=1;i<=shd.log.Length();) scr.text(str_load_lin(shd.log,i,true));
    scr.text_exit();

    scr.exe();
    scr.rfs();

    // wait until the results are available
    int e;
    unsigned __int64 t0,t1;
    for (e=0;!e;) glGetQueryObjectiv(queryID[0],GL_QUERY_RESULT_AVAILABLE,&e);
    for (e=0;!e;) glGetQueryObjectiv(queryID[1],GL_QUERY_RESULT_AVAILABLE,&e);
    glGetQueryObjectui64v(queryID[0], GL_QUERY_RESULT, &t0);
    glGetQueryObjectui64v(queryID[1], GL_QUERY_RESULT, &t1);
    Form1->Caption=AnsiString().sprintf("Time spent on the GPU: %f ms\n", (t1-t0)/1000000.0);
    }
//---------------------------------------------------------------------------
__fastcall TForm1::TForm1(TComponent* Owner):TForm(Owner)
    {
    scr.init(this);

    OpenGLtexture txr;
    txr.load      ("gradient.jpg");
    txrmap=scr.txrs.add(txr);

    shd.set_source_file("","","","Mandelbrot_set.glsl_vert","Mandelbrot_set.glsl_frag");

    glGenQueries(2, queryID);
    }
//---------------------------------------------------------------------------
void __fastcall TForm1::FormDestroy(TObject *Sender)
    {
    scr.exit();
    }
//---------------------------------------------------------------------------
void __fastcall TForm1::FormResize(TObject *Sender)
    {
    scr.resize();
    xs=ClientWidth;
    ys=ClientHeight;
    gl_draw();
    }
//---------------------------------------------------------------------------
void __fastcall TForm1::FormPaint(TObject *Sender)
    {
    gl_draw();
    }
//---------------------------------------------------------------------------
void __fastcall TForm1::FormMouseMove(TObject *Sender, TShiftState Shift, int X,int Y)
    {
    bool q0,q1;
    mx1=1.0-divide(X+X,xs-1);
    my1=divide(Y+Y,ys-1)-1.0;
    sh1=Shift;
    q0=sh0.Contains(ssLeft);
    q1=sh1.Contains(ssLeft);
    if (q1)
        {
        mx-=(mx1-mx0)*zoom;
        my-=(my1-my0)*zoom;
        }
    mx0=mx1; my0=my1; sh0=sh1;
    gl_draw();
    }
//---------------------------------------------------------------------------
void __fastcall TForm1::FormMouseDown(TObject *Sender, TMouseButton Button,TShiftState Shift, int X, int Y)
    {
    FormMouseMove(Sender,Shift,X,Y);
    }
//---------------------------------------------------------------------------
void __fastcall TForm1::FormMouseUp(TObject *Sender, TMouseButton Button,TShiftState Shift, int X, int Y)
    {
    FormMouseMove(Sender,Shift,X,Y);
    }
//---------------------------------------------------------------------------
void __fastcall TForm1::FormMouseWheelDown(TObject *Sender, TShiftState Shift, TPoint &MousePos, bool &Handled)
    {
    zoom*=1.2;
    gl_draw();
    }
//---------------------------------------------------------------------------
void __fastcall TForm1::FormMouseWheelUp(TObject *Sender, TShiftState Shift, TPoint &MousePos, bool &Handled)
    {
    zoom/=1.2;
    gl_draw();
    }
//---------------------------------------------------------------------------

You can ignore most of the code. The important part is gl_draw(), which renders a single QUAD covering the whole screen and passes the zoom and pan position to the shader. This code uses old-style glBegin/glEnd and the default nVidia attribute locations, so it may not work on other vendors' graphics drivers. The mesh should really be in a VAO/VBO so that the layout locations match; to see how to do that, take a look at the link at the end of this answer, or port the shaders to the compatibility profile.
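As a rough sketch of the VAO/VBO variant mentioned above, this is the vertex data such a buffer might hold. The array name `fullscreen_quad` and the triangle-strip ordering are assumptions for illustration. The OpenGL calls in the comments are standard ones, not part of the engine used in this answer, and they need a live GL 3.3+ context plus a function loader, so they are left as comments.

```cpp
#include <array>

// Two floats (x,y) per vertex, matching "layout(location=0) in vec2 pos".
// Ordered for GL_TRIANGLE_STRIP: the four corners of clip space.
constexpr std::array<float, 8> fullscreen_quad = {
    -1.0f, -1.0f,   // bottom left
    +1.0f, -1.0f,   // bottom right
    -1.0f, +1.0f,   // top left
    +1.0f, +1.0f    // top right
};

// One-time upload, roughly:
//   glGenVertexArrays(1, &vao);  glBindVertexArray(vao);
//   glGenBuffers(1, &vbo);       glBindBuffer(GL_ARRAY_BUFFER, vbo);
//   glBufferData(GL_ARRAY_BUFFER, sizeof(float) * fullscreen_quad.size(),
//                fullscreen_quad.data(), GL_STATIC_DRAW);
//   glEnableVertexAttribArray(0);                        // location=0
//   glVertexAttribPointer(0, 2, GL_FLOAT, GL_FALSE, 0, nullptr);
// Per frame, with the shader bound:
//   glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);
```

Binding the attribute explicitly to location 0 removes the reliance on vendor-specific default locations that glBegin/glEnd has here.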

Vertex:

// Vertex
#version 420 core
layout(location=0) in vec2 pos;     // glVertex2f <-1,+1>
out smooth vec2 p;                  // interpolated quad position <-1,+1>
void main()
    {
    p=pos;
    gl_Position=vec4(pos,0.0,1.0);
    }

Fragment:

// Fragment
#version 420 core
uniform sampler2D txrmap;           // texture unit for the color gradient
uniform vec2 p0=vec2(0.0,0.0);      // mouse position <-1,+1>
uniform float zoom=1.000;           // zoom [-]
in smooth vec2 p;
out vec4 col;
void main()
    {
    int i,n;
    vec2 pp;
    float x,y,q,xx,yy;
    pp=(p*zoom)-p0;         // y (-1, 1)
    pp.x=(1.75*pp.x)-0.75;  // x (-2.5, 1)
    for (x=0.0,y=0.0,xx=0.0,yy=0.0,i=0,n=200;(i<n)&&(xx+yy<4.0);i++)
        {
        q=xx-yy+pp.x;
        y=(2.0*x*y)+pp.y;
        x=q;
        xx=x*x;
        yy=y*y;
        }
    q=float(i)/float(n);
    col=texture(txrmap,vec2(q,0.5));    // texture2D() is not available in the core profile
//  col=vec4(q,q,q,1.0);
    }
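For checking the shader's output on the CPU, the loop above can be ported to C++ more or less line by line. This is only a reference sketch: the function name `mandelbrot_q` and its parameter names are hypothetical, and the arguments mirror the interpolated position `p` and the uniforms `p0` and `zoom`.

```cpp
// CPU port of the fragment shader loop, for checking values only.
// (px,py) is the interpolated quad position in <-1,+1>;
// (p0x,p0y) and zoom mirror the shader uniforms.
float mandelbrot_q(float px, float py, float p0x, float p0y, float zoom, int n = 200)
{
    float cx = (px * zoom) - p0x;
    float cy = (py * zoom) - p0y;
    cx = (1.75f * cx) - 0.75f;          // same x remap as the shader
    float x = 0.0f, y = 0.0f, xx = 0.0f, yy = 0.0f, q = 0.0f;
    int i = 0;
    for (; (i < n) && (xx + yy < 4.0f); i++)
    {
        q = xx - yy + cx;               // real part of z*z + c
        y = (2.0f * x * y) + cy;        // imaginary part of z*z + c
        x = q;
        xx = x * x;
        yy = y * y;
    }
    return float(i) / float(n);         // gradient coordinate in <0,1>
}
```

The returned value corresponds to the `q` used as the horizontal gradient coordinate: points that never escape within `n` iterations yield 1.0, and points that escape quickly yield values near 0.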

I used this texture as the gradient:

gradient

Here is the resulting screenshot:

screenshot
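On the precision caveat mentioned at the start of this answer, a small C++ sketch can show when a floating-point type stops being able to tell neighbouring pixels apart. The helper `pixels_distinct` is hypothetical; `pixel_step` would be roughly the visible width of the view divided by the horizontal resolution.

```cpp
// True if stepping one pixel sideways changes the stored coordinate at all.
// T is the floating-point type used for the fractal math (float or double).
template <typename T>
bool pixels_distinct(T center, T pixel_step)
{
    volatile T moved = center + pixel_step; // volatile: force rounding to T
    return moved != center;
}
```

For example, around x = -0.75 a step of 1e-9 is lost entirely in `float` but still resolved in `double`, while a step of 1e-17 is lost even in `double`. Past that zoom level, adjacent pixels map to the same coordinate and the image turns blocky.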

In case you need to get started with GLSL (to replace my gl engine stuff) see:

but I am sure there are plenty of tutorials for this in C#, so search for them.

If you are interested in color enhancing see:

Doubling answered 6/7, 2017 at 10:6 Comment(0)

To cover the questions: yes, OpenCL can paint, but Monogame apparently doesn't wrap OpenCL, so the answer to Question 1 is no. Question 2 is the right question: maybe, see the suggestions below. Question 3: the limit comes from the pixel shader profile you are compiling for rather than from HLSL as a language. A cap of 64 arithmetic slots matches the old ps_2_0 profile; later shader models raised the limits considerably, so you want a newer shader profile (newer DirectX support) or GLSL/OpenGL.

Since you are close to your performance expectations using Cloo, why not try OpenCL.Net and/or OpenTK to bind the Julia calculations more closely to the Monogame API? If you have to go GPU-CPU-GPU, at least make that pipeline as wide as possible.

Alternately, a slightly sideways solution to your parallelization and frame-rate problem might be integrating a GPGPU wrapper such as QuantAlea's Alea GPU with your Monogame solution. I'd suggest looking at Cudafy, but Alea is more robust and cross-vendor GPU supported.

The build process will decide which portion of the Julia code will calculate on GPU via Alea, and the Monogame portions will receive the pixel-field for rendering. The sticking points will be library "play-nice" compatibility, and ultimately, frame-rate if you get it working.

Bottom line: you're stuck, by choice, in HLSL (read: Microsoft Dx9), and Monogame doesn't support GLSL/Dx12, so you will have to maneuver creatively to get unstuck.

Zolner answered 12/7, 2017 at 0:22 Comment(2)
Reading this makes me even happier that, a decade ago, I stopped using "crappy" third-party libs where I do not need them... Doubling
As OpenCL.Net is just another wrapper around OpenCL, will it not have the same problem as I'm having now with Cloo? I would still have to send the pixel data from the GPU to the CPU and back to the GPU, so the real bottleneck would still be there. For OpenTK I haven't been able to find a Monogame-compatible way of using it, so that also seems like a dead end. Alea GPU seemed promising, but as I understand it, it needs a CUDA-compatible GPU, which I don't have (I have an AMD Radeon). Any thoughts? Spectroradiometer
