While trying to get my head around graphics programming with C++ and OpenGL 3+, I have come across a rather specialized problem of understanding concerning the char type, pointers to it, and potential implicit or explicit conversions to other char pointer types. I think I have been able to find a solution, but I would like to double-check by asking for your take on this.
The current (October 2014) OpenGL 4.5 core profile specification (Table 2.2 in chapter 2.2, Command Syntax) lists the OpenGL data types and explicitly states:
GL types are not C types. Thus, for example, GL type int is referred to as GLint outside this document, and is not necessarily equivalent to the C type int. An implementation must use exactly the number of bits indicated in the table to represent a GL type.
The GLchar type in this table is specified as a type of bit width 8 that is used to represent characters which make up a string.
To further narrow down what GLchar has to provide, we can have a look at the GLSL Specification (OpenGL Shading Language 4.50, July 2014, Chapter 3.1 Character Set and Phases of Compilation):
The source character set used for the OpenGL shading languages is Unicode in the UTF-8 encoding scheme.
Now, in every OpenGL library header I cared to look at, this is implemented as a simple
typedef char GLchar;
which of course flies in the face of the statement "GL types are not C types" I just quoted.
Normally, this wouldn't be a problem, seeing as typedefs are meant for just such a situation where the underlying type might change in the future.
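One way to make the assumptions behind that typedef explicit is to let the compiler check them. The following is only a sketch: the typedef line stands in for whatever the real GL header provides, and glcharIsPlainChar is a hypothetical helper name, not part of OpenGL.

```cpp
#include <climits>      // CHAR_BIT
#include <type_traits>  // std::is_same

// Assumption: stand-in for the typedef that a real GL header would provide.
typedef char GLchar;

// The specification demands exactly 8 bits for GLchar.
static_assert(sizeof(GLchar) * CHAR_BIT == 8,
              "GLchar is expected to be exactly 8 bits wide");

// Hypothetical helper: reports whether GLchar is literally the C++ type char.
// If a header ever changed the typedef, this would return false while the
// size check above could still pass.
constexpr bool glcharIsPlainChar() {
    return std::is_same<GLchar, char>::value;
}
```

With the typedef shown here, glcharIsPlainChar() is true; the point of the helper is that code relying on implicit char* conversions could guard itself with it.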
The problem starts in the user implementation.
Going through a few tutorials on OpenGL, I came across various ways to assign the GLSL source code to a GLchar array needed for processing it. (Please forgive me for not providing all the links. Currently, I do not have the reputation needed to do so.)
The site open.gl likes to do this:
const GLchar* vertexSource =
    "#version 150 core\n"
    "in vec2 position;"
    "void main() {"
    " gl_Position = vec4(position, 0.0, 1.0);"
    "}";
or this:
// Shader macro
#define GLSL(src) "#version 150 core\n" #src
// Vertex shader
const GLchar* vertexShaderSrc = GLSL(
    in vec2 pos;
    void main() {
        gl_Position = vec4(pos, 0.0, 1.0);
    }
);
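A side effect of the macro variant is worth knowing: the preprocessor's # operator collapses every run of whitespace between tokens, including line breaks, into a single space, so the whole shader body ends up on one line after the #version directive. A minimal sketch to check this, where the typedef is a stand-in for the GL header and countNewlines is a hypothetical helper:

```cpp
#include <cstddef>  // std::size_t

// Assumption: stand-in for the typedef that a real GL header would provide.
typedef char GLchar;

// Shader macro as in the tutorial: #src turns the argument into a string literal.
#define GLSL(src) "#version 150 core\n" #src

const GLchar* vertexShaderSrc = GLSL(
    in vec2 pos;
    void main() {
        gl_Position = vec4(pos, 0.0, 1.0);
    }
);

// Hypothetical helper: counts the line breaks in a null-terminated GLchar string.
std::size_t countNewlines(const GLchar* s) {
    std::size_t n = 0;
    for (; *s != '\0'; ++s) {
        if (*s == '\n') ++n;
    }
    return n;
}
```

Only the explicit \n after the version directive survives, so countNewlines(vertexShaderSrc) is 1. That is harmless for ordinary GLSL statements, but GLSL directives that need a line of their own cannot be placed inside the macro argument.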
On lazyfoo.net (Chapter 30 Loading Text File Shaders), the source code is read from a file (my preferred method) into a std::string shaderString variable, which is then used to initialize the GL string:
const GLchar* shaderSource = shaderString.c_str();
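For reference, reading the file into a std::string could look roughly like this. It is a sketch under the assumption that the shader file fits comfortably in memory; readTextFile is a hypothetical helper name and not taken from the tutorial.

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical helper: returns the whole file as a string, or an empty
// string if the file cannot be opened.
std::string readTextFile(const std::string& path) {
    std::ifstream file(path.c_str(), std::ios::in | std::ios::binary);
    if (!file) {
        return std::string();
    }
    std::ostringstream contents;
    contents << file.rdbuf();  // copy the entire stream buffer
    return contents.str();
}
```

The pointer returned by shaderString.c_str() stays valid only as long as shaderString is alive and unmodified, so the string has to outlive every use of that pointer.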
The most adventurous approach I have seen so far is the first hit I get when I google "loading shader file": the ClockworkCoders tutorial on loading, hosted at the OpenGL SDK, which uses an explicit cast - not to GLchar* but to GLubyte* - like this:
GLchar** ShaderSource;
unsigned long len;
ifstream file;
// . . .
len = getFileLength(file);
// . . .
*ShaderSource = (GLubyte*) new char[len+1];
Any decent C++ compiler will give an invalid conversion error here. The g++ compiler will let it go with a warning only if the -fpermissive flag is set. Compiled that way, the code will work because GLubyte is in the end just a typedef alias of the fundamental type unsigned char, which has the same size as char. In this case an implicit pointer conversion may generate a warning but should still do the right thing. This goes against the C++ standard, where char* is not compatible with signed char* or unsigned char*, so doing it this way is bad practice. Which brings me to the problem I had:
My point is, all these tutorials rely on the basic fact that the implementation of the OpenGL specification is currently just window dressing in the form of typedefs for fundamental types. This assumption is in no way covered by the specification. Worse, it is explicitly discouraged to think of GL types as C types.
If at any point in the future the OpenGL implementation should change - for whatever reason - so that GLchar is no longer a simple typedef alias of char, code like this will no longer compile, as there are no implicit conversions between pointers to incompatible types. While it is certainly possible in some cases to tell the compiler to just ignore the invalid pointer conversion, opening the gates to bad programming like that may and will lead to all kinds of other problems in your code.
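That the character types are distinct, and that pointers to them do not convert implicitly, can be verified at compile time. A small sketch follows; asUnsignedBytes is a hypothetical helper that only shows where an explicit cast becomes necessary.

```cpp
#include <type_traits>

// char, signed char and unsigned char are three distinct types in C++ ...
static_assert(!std::is_same<char, signed char>::value, "distinct types");
static_assert(!std::is_same<char, unsigned char>::value, "distinct types");

// ... with the same size ...
static_assert(sizeof(char) == sizeof(unsigned char), "same size");

// ... but there is no implicit conversion between pointers to them.
static_assert(!std::is_convertible<char*, unsigned char*>::value,
              "char* does not implicitly convert to unsigned char*");

// Hypothetical helper: the conversion has to be spelled out explicitly.
inline unsigned char* asUnsignedBytes(char* p) {
    return reinterpret_cast<unsigned char*>(p);
}
```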
I have seen exactly one place that does it right to my understanding: the official opengl.org wiki example on Shader Compilation, i.e.:
std::string vertexSource = //Get source code for vertex shader.
// . . .
const GLchar *source = (const GLchar *)vertexSource.c_str();
The sole difference from the other tutorials is an explicit cast to const GLchar* before the assignment. Ugly, I know, yet, as far as I can see, it makes the code secure against any valid future implementation of the OpenGL specification, which (summed up) asks for a type of bit size 8 representing characters in the UTF-8 encoding scheme.
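If the cast is needed in more than one place, it can be confined to a single helper so that the rest of the code never mentions it. The following is a sketch only: toGLchar is a hypothetical name, and the typedef stands in for the GL header.

```cpp
#include <string>

// Assumption: stand-in for the typedef that a real GL header would provide.
typedef char GLchar;

// The cast below only makes sense if both types have the same size.
static_assert(sizeof(GLchar) == sizeof(char), "GLchar must be one byte");

// Hypothetical helper: views the string's character data as GLchar.
// The returned pointer is valid only while s is alive and unmodified.
inline const GLchar* toGLchar(const std::string& s) {
    return reinterpret_cast<const GLchar*>(s.c_str());
}
```

With this in place, the assignment would read const GLchar* source = toGLchar(vertexSource); and the static_assert documents the one property of GLchar the cast depends on.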
To illustrate my reasoning, I have written a simple class GLchar2 that fulfils this specification but no longer allows implicit pointer conversion to or from any fundamental type:
// GLchar2.h - a char type of 1 byte length
#ifndef GLCHAR2_H
#define GLCHAR2_H
#include <iostream>
#include <locale> // handle whitespaces
class GLchar2 {
    char element; // value of the GLchar2 variable
public:
    // default constructor (initializes the element to the null character)
    GLchar2 () : element('\0') {}
    // user defined conversion from char to GLchar2
    GLchar2 (char element) : element(element) {}
    // copy constructor
    GLchar2 (const GLchar2& c) : element(c.element) {}
    // destructor
    ~GLchar2 () {}
    // assignment operator
    GLchar2& operator= (const GLchar2& c) {element = c.element; return *this;}
    // user defined conversion to integral c++ type char
    operator char () const {return element;}
};
// overloading the output operator to correctly handle GLchar2
// due to implicit conversion of GLchar2 to char, implementation is unnecessary
//std::ostream& operator<< (std::ostream& o, const GLchar2 character) {
//    char out = character;
//    return o << out;
//}
// overloading the output operator to correctly handle GLchar2*
// (inline, because the definition lives in a header)
inline std::ostream& operator<< (std::ostream& o, const GLchar2* output_string) {
    for (const GLchar2* string_it = output_string; *string_it != '\0'; ++string_it) {
        o << *string_it;
    }
    return o;
}
// overloading the input operator to correctly handle GLchar2
inline std::istream& operator>> (std::istream& i, GLchar2& input_char) {
    char in;
    if (i >> in) input_char = in; // this is where the magic happens
    return i;
}
// overloading the input operator to correctly handle GLchar2*
inline std::istream& operator>> (std::istream& i, GLchar2* input_string) {
    GLchar2* string_it;
    int width = i.width();
    std::locale loc;
    while (std::isspace((char)i.peek(),loc)) i.ignore(); // ignore leading whitespaces
    for (string_it = input_string; (((i.width() == 0 || --width > 0) && !std::isspace((char)i.peek(),loc)) && i >> *string_it); ++string_it);
    *string_it = '\0'; // terminate with null character
    i.width(0); // reset width of i
    return i;
}
#endif // GLCHAR2_H
Note that in addition to writing the class, I have implemented overloads of the input and output stream operators to correctly handle reading and writing from the class as well as c-string style null-terminated GLchar2 arrays. This is possible without knowing the internal structure of the class, as long as it provides implicit conversions between the types char and GLchar2 (but not their pointers). No explicit conversions between char and GLchar2 or their pointer types are necessary.
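The conversions that such a class allows or forbids can also be checked at compile time. The sketch below uses a condensed stand-in class, here called MiniGLchar (a hypothetical name), that offers the same two value conversions as GLchar2:

```cpp
#include <type_traits>

// Condensed stand-in for GLchar2: one byte, implicitly convertible
// to and from char.
class MiniGLchar {
    char element;
public:
    MiniGLchar() : element('\0') {}
    MiniGLchar(char c) : element(c) {}          // char -> MiniGLchar
    operator char() const { return element; }   // MiniGLchar -> char
};

// Values convert implicitly in both directions ...
static_assert(std::is_convertible<char, MiniGLchar>::value, "char -> MiniGLchar");
static_assert(std::is_convertible<MiniGLchar, char>::value, "MiniGLchar -> char");

// ... but pointers do not, in either direction.
static_assert(!std::is_convertible<const char*, const MiniGLchar*>::value,
              "no implicit const char* -> const MiniGLchar*");
static_assert(!std::is_convertible<const MiniGLchar*, const char*>::value,
              "no implicit const MiniGLchar* -> const char*");

static_assert(sizeof(MiniGLchar) == 1, "one byte, like char");
```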
I don't claim that this implementation of GLchar is worthwhile or complete, but it should do for the purpose of demonstration. Comparing it to a typedef char GLchar1; I find what I can and cannot do with this type:
// program: test_GLchar.cpp - testing implementation of GLchar
#include <iostream>
#include <fstream>
#include <locale> // handle whitespaces
#include "GLchar2.h"
typedef char GLchar1;
int main () {
    // byte size comparison
    std::cout << "GLchar1 has a size of " << sizeof(GLchar1) << " byte.\n"; // 1
    std::cout << "GLchar2 has a size of " << sizeof(GLchar2) << " byte.\n"; // 1
    // char constructor
    const GLchar1 test_char1 = 'o';
    const GLchar2 test_char2 = 't';
    // default constructor
    GLchar2 test_char3;
    // char conversion
    test_char3 = '3';
    // assignment operator
    GLchar2 test_char4;
    GLchar2 test_char5;
    test_char5 = test_char4 = 65; // ASCII value 'A'
    // copy constructor
    GLchar2 test_char6 = test_char5;
    // pointer conversion
    const GLchar1* test_string1 = "test string one"; // compiles
    //const GLchar1* test_string1 = (const GLchar1*)"test string one"; // compiles
    //const GLchar2* test_string2 = "test string two"; // does *not* compile!
    const GLchar2* test_string2 = (const GLchar2*)"test string two"; // compiles
    std::cout << "A test character of type GLchar1: " << test_char1 << ".\n"; // o
    std::cout << "A test character of type GLchar2: " << test_char2 << ".\n"; // t
    std::cout << "A test character of type GLchar2: " << test_char3 << ".\n"; // 3
    std::cout << "A test character of type GLchar2: " << test_char4 << ".\n"; // A
    std::cout << "A test character of type GLchar2: " << test_char5 << ".\n"; // A
    std::cout << "A test character of type GLchar2: " << test_char6 << ".\n"; // A
    std::cout << "A test string of type GLchar1: " << test_string1 << ".\n";
    // OUT: A test string of type GLchar1: test string one.\n
    std::cout << "A test string of type GLchar2: " << test_string2 << ".\n";
    // OUT: A test string of type GLchar2: test string two.\n
    // input operator comparison
    // test_input_file.vert has the content
    // If you can read this,
    // you can read this.
    // (one whitespace before each line to test implementation)
    GLchar1* test_string3;
    GLchar2* test_string4;
    GLchar1* test_string5;
    GLchar2* test_string6;
    // read character by character
    std::ifstream test_file("test_input_file.vert");
    if (test_file) {
        test_file.seekg(0, test_file.end);
        int length = test_file.tellg();
        test_file.seekg(0, test_file.beg);
        test_string3 = new GLchar1[length+1];
        GLchar1* test_it = test_string3;
        std::locale loc;
        while (test_file >> *test_it) {
            ++test_it;
            while (std::isspace((char)test_file.peek(),loc)) {
                *test_it = test_file.peek(); // add whitespaces
                test_file.ignore();
                ++test_it;
            }
        }
        *test_it = '\0';
        std::cout << test_string3 << "\n";
        // OUT: If you can read this,\n you can read this.\n
        std::cout << length << " " <<test_it - test_string3 << "\n";
        // OUT: 42 41\n
        delete[] test_string3;
        test_file.close();
    }
    std::ifstream test_file2("test_input_file.vert");
    if (test_file2) {
        test_file2.seekg(0, test_file2.end);
        int length = test_file2.tellg();
        test_file2.seekg(0, test_file2.beg);
        test_string4 = new GLchar2[length+1];
        GLchar2* test_it = test_string4;
        std::locale loc;
        while (test_file2 >> *test_it) {
            ++test_it;
            while (std::isspace((char)test_file2.peek(),loc)) {
                *test_it = test_file2.peek(); // add whitespaces
                test_file2.ignore();
                ++test_it;
            }
        }
        *test_it = '\0';
        std::cout << test_string4 << "\n";
        // OUT: If you can read this,\n you can read this.\n
        std::cout << length << " " << test_it - test_string4 << "\n";
        // OUT: 42 41\n
        delete[] test_string4;
        test_file2.close();
    }
    // read a word (until delimiter whitespace)
    test_file.open("test_input_file.vert");
    if (test_file) {
        test_file.seekg(0, test_file.end);
        int length = test_file.tellg();
        test_file.seekg(0, test_file.beg);
        test_string5 = new GLchar1[length+1];
        //test_file.width(2);
        // note: the char* overload of operator>> used here was removed in C++20
        test_file >> test_string5;
        std::cout << test_string5 << "\n";
        // OUT: If\n
        delete[] test_string5;
        test_file.close();
    }
    test_file2.open("test_input_file.vert");
    if (test_file2) {
        test_file2.seekg(0, test_file2.end);
        int length = test_file2.tellg();
        test_file2.seekg(0, test_file2.beg);
        test_string6 = new GLchar2[length+1];
        //test_file2.width(2);
        test_file2 >> test_string6;
        std::cout << test_string6 << "\n";
        // OUT: If\n
        delete[] test_string6;
        test_file2.close();
    }
    // read word by word
    test_file.open("test_input_file.vert");
    if (test_file) {
        test_file.seekg(0, test_file.end);
        int length = test_file.tellg();
        test_file.seekg(0, test_file.beg);
        test_string5 = new GLchar1[length+1];
        GLchar1* test_it = test_string5;
        std::locale loc;
        while (test_file >> test_it) {
            while (*test_it != '\0') ++test_it; // test_it points to null character
            while (std::isspace((char)test_file.peek(),loc)) {
                *test_it = test_file.peek(); // add whitespaces
                test_file.ignore();
                ++test_it;
            }
        }
        *test_it = '\0'; // terminate explicitly: a failed extraction is not guaranteed to store the terminator
        std::cout << test_string5 << "\n";
        // OUT: If you can read this,\n you can read this.\n
        delete[] test_string5;
        test_file.close();
    }
    test_file2.open("test_input_file.vert");
    if (test_file2) {
        test_file2.seekg(0, test_file2.end);
        int length = test_file2.tellg();
        test_file2.seekg(0, test_file2.beg);
        test_string6 = new GLchar2[length+1];
        GLchar2* test_it = test_string6;
        std::locale loc;
        while (test_file2 >> test_it) {
            while (*test_it != '\0') ++test_it; // test_it points to null character
            while (std::isspace((char)test_file2.peek(), loc)) {
                *test_it = test_file2.peek(); // add whitespaces
                test_file2.ignore();
                ++test_it;
            }
        }
        *test_it = '\0'; // terminate explicitly, as above
        std::cout << test_string6 << "\n";
        // OUT: If you can read this,\n you can read this.\n
        delete[] test_string6;
        test_file2.close();
    }
    // read whole file with std::istream::getline
    test_file.open("test_input_file.vert");
    if (test_file) {
        test_file.seekg(0, test_file.end);
        int length = test_file.tellg();
        test_file.seekg(0, test_file.beg);
        test_string5 = new GLchar1[length+1];
        std::locale loc;
        while (std::isspace((char)test_file.peek(),loc)) test_file.ignore(); // ignore leading whitespaces
        test_file.getline(test_string5, length, '\0');
        std::cout << test_string5 << "\n";
        // OUT: If you can read this,\n you can read this.\n
        delete[] test_string5;
        test_file.close();
    }
    // no way to do this for a string of GLchar2 as far as I can see
    // the getline function that returns c-strings rather than std::string is
    // a member of istream and expects to return *this, so overloading is a no go
    // however, this works as above:
    // read whole file with std::getline
    test_file.open("test_input_file.vert");
    if (test_file) {
        std::locale loc;
        while (std::isspace((char)test_file.peek(),loc)) test_file.ignore(); // ignore leading whitespaces
        std::string test_stdstring1;
        std::getline(test_file, test_stdstring1, '\0');
        test_string5 = (GLchar1*) test_stdstring1.c_str();
        std::cout << test_string5 << "\n";
        // OUT: If you can read this,\n you can read this.\n
        test_file.close();
    }
    test_file2.open("test_input_file.vert");
    if (test_file2) {
        std::locale loc;
        while (std::isspace((char)test_file2.peek(),loc)) test_file2.ignore(); // ignore leading whitespaces
        std::string test_stdstring2;
        std::getline(test_file2, test_stdstring2, '\0');
        test_string6 = (GLchar2*) test_stdstring2.c_str();
        std::cout << test_string6 << "\n";
        // OUT: If you can read this,\n you can read this.\n
        test_file2.close(); // was test_file.close(), which left test_file2 open
    }
    return 0;
}
I conclude that there are at least two viable ways to write code that will always handle GLchar strings correctly without violating C++ standards:
1. Use an explicit conversion from a char array to a GLchar array (untidy, but doable).
const GLchar* sourceCode = (const GLchar*)"some code";

std::string sourceString = std::string("some code"); // can be from a file
const GLchar* sourceCode = (const GLchar*) sourceString.c_str();
2. Use the input stream operator to read the string from a file directly into a GLchar array.
The second method has the advantage that no explicit conversion is necessary, but to implement it, space for the string must be allocated dynamically. Another potential downside is that OpenGL won't necessarily provide overloads of the input and output stream operators for its type or the corresponding pointer type. However, as I have shown, writing these overloads yourself is no witchcraft, as long as at least the type conversion to and from char has been implemented.
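The manual new[]/delete[] bookkeeping of the second method can be avoided by letting a std::vector own the buffer. The sketch below swaps operator>> for std::istreambuf_iterator, so whitespace is preserved without extra handling, and it assumes only that a GLchar can be constructed from a char. loadShaderSource is a hypothetical helper, and the typedef stands in for the GL header.

```cpp
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Assumption: stand-in for the typedef that a real GL header would provide.
typedef char GLchar;

// Hypothetical helper: reads the whole file into a null-terminated GLchar
// buffer. Each element is constructed from a char, so no pointer cast occurs.
// If the file cannot be opened, the buffer holds only the terminator.
std::vector<GLchar> loadShaderSource(const std::string& path) {
    std::ifstream file(path.c_str(), std::ios::in | std::ios::binary);
    std::vector<GLchar> buffer;
    if (file) {
        buffer.assign(std::istreambuf_iterator<char>(file),
                      std::istreambuf_iterator<char>());
    }
    buffer.push_back('\0');  // terminate like a C string
    return buffer;
}
```

A const GLchar* for the GL call is then available through buffer.data() and remains valid for as long as the vector lives.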
So far, I have not found any other viable overload for input from files that provides exactly the same syntax as for c-strings.
Now my question is this: Have I thought this through correctly so that my code will remain safe against possible changes made by OpenGL and - no matter whether the answer is yes or no - is there a better (i.e. safer) way to ensure upward compatibility of my code?
Also, I have read this Stack Overflow question and answer, but as far as I am aware, it does not cover strings, since they are not fundamental types.
I am also not asking how to write a class that does provide implicit pointer conversions (though that would be an interesting exercise). The point of this example class is to prohibit implicit pointer assignment, since there is no guarantee that OpenGL would provide such if they decided to change their implementation.
GL/gl.h. Each platform generates those typedefs using the appropriate underlying data type to match GL's specification, so this is a complete non-issue. – Apical