X11 is getting old. Consider Wayland, and read its FAQ. There are also several X-related protocols and conventions to learn: read about the X11 core protocol and about conventions like EWMH.
You'll need to read several thousand pages. Just understanding the protocols would take several months. O'Reilly published (in the previous century) a series of 8 to 10 books related to X11.
You might use a low-level X11 library like XCB (or the older Xlib). These are used by graphical toolkits like Qt or GTK (or FOX, etc.), which are free software toolkits, so you could study their source code (Xlib and XCB are also free software).
Notice that today's GUIs usually do not display fonts using X11 text display requests. They often use Xft (and the font is on the client side, not in the Xorg server). Actually, I heard that most graphics rendering in practice happens on the client side, and that today most X11 requests just send pixmaps to the server (so X11 core protocol requests for drawing lines or circles are barely used today). More generally, the trend in major X11-based toolkits like Qt or GTK is to avoid the server-side drawing abilities of X11 (e.g. Xlib's XDrawLine or XDrawText), because the toolkit draws a pixmap image on the client side and sends it to the server.
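To make the client-side idea concrete, here is a minimal sketch in plain Python (no X11 library involved). The names draw_line, framebuffer, WIDTH and HEIGHT are made up for this illustration. It rasterizes a line into an in-memory buffer with Bresenham's algorithm, which is roughly the kind of work a toolkit does itself instead of issuing a server-side line drawing request; a real toolkit would then upload the finished buffer to the server as an image.

```python
def draw_line(buf, width, x0, y0, x1, y1, pixel=0xFF):
    """Rasterize a line into a row-major, 8-bit-per-pixel buffer (Bresenham)."""
    dx = abs(x1 - x0)
    dy = -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx + dy
    while True:
        buf[y0 * width + x0] = pixel  # the client writes the pixel itself
        if x0 == x1 and y0 == y1:
            break
        e2 = 2 * err
        if e2 >= dy:
            err += dy
            x0 += sx
        if e2 <= dx:
            err += dx
            y0 += sy


# Hypothetical 8x8 grayscale "pixmap" living in client memory.
WIDTH, HEIGHT = 8, 8
framebuffer = bytearray(WIDTH * HEIGHT)
draw_line(framebuffer, WIDTH, 0, 0, 7, 7)

# At this point a real toolkit would ship `framebuffer` to the server
# in one image upload rather than sending a draw-line request.
```

The point of the design is that the server only ever sees finished pixels, so antialiasing, fonts and gradients can all be done by client-side libraries.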
You could also consider using a low-level library like libSDL.
The important thing to understand is that X11 applications are event-driven and based upon an event loop (generally provided by the toolkit) built above some multiplexing syscall like poll(2). X11 expose or damage events ask them to redraw some screen area (and, of course, the keyboard, the mouse, and the Xorg server itself all send events).
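The shape of such an event loop can be sketched in Python with the standard select.poll wrapper around poll(2). This is only a simulation under loud assumptions: a socketpair stands in for the connection to the X server, and the "events" are newline-delimited words of my own invention (real core X11 events are 32-byte binary records). The function run_event_loop and the handlers table are hypothetical names, not part of any toolkit. It needs a Unix-like system, since select.poll is not available everywhere.

```python
import select
import socket


def run_event_loop(conn, handlers, timeout_ms=1000):
    """Block in poll(2) on the connection and dispatch each event to a handler."""
    poller = select.poll()
    poller.register(conn, select.POLLIN)
    pending = b""
    while True:
        if not poller.poll(timeout_ms):
            return  # nothing arrived before the timeout
        data = conn.recv(4096)
        if not data:
            return  # peer closed the connection
        pending += data
        # Made-up wire format: one event name per line.
        while b"\n" in pending:
            line, pending = pending.split(b"\n", 1)
            name = line.decode()
            if name == "quit":
                return
            handlers.get(name, lambda: None)()


# Simulated "server" and "client" ends of the connection.
server, client = socket.socketpair()
log = []
handlers = {
    "expose": lambda: log.append("redraw"),  # server asks for a repaint
    "key": lambda: log.append("key"),        # keyboard input
}
server.sendall(b"expose\nkey\nexpose\nquit\n")
run_event_loop(client, handlers)
server.close()
client.close()
```

A real toolkit loop has the same structure, but it also multiplexes timers and other file descriptors in the same poll call, which is why the loop is usually left to the toolkit.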
See also this answer to a similar question.
DISPLAY environment variable. For how to find the corresponding socket, RTFM. – Riviera
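As a rough illustration of that mapping, here is a sketch of the usual convention as I understand it: a DISPLAY of the form host:number.screen, where an empty host means a local Unix socket under /tmp/.X11-unix and a named host means TCP port 6000 plus the display number. The function x11_endpoint is a hypothetical helper, it ignores abstract sockets and IPv6 literals, and the manual remains the authority.

```python
import re


def x11_endpoint(display):
    """Map a DISPLAY string to ("unix", path) or ("tcp", (host, port))."""
    m = re.fullmatch(r"(?P<host>[^:]*):(?P<num>\d+)(?:\.(?P<screen>\d+))?", display)
    if m is None:
        raise ValueError(f"unrecognized DISPLAY value: {display!r}")
    num = int(m.group("num"))
    host = m.group("host")
    if host in ("", "unix"):
        # Local display: conventional Unix-domain socket path.
        return ("unix", f"/tmp/.X11-unix/X{num}")
    # Remote display: conventional TCP port is 6000 + display number.
    return ("tcp", (host, 6000 + num))
```

For example, x11_endpoint(":0") gives ("unix", "/tmp/.X11-unix/X0"), and in a real program the argument would come from os.environ["DISPLAY"].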