Reflection support in C
O

10

38

I know it is not supported, but I am wondering if there are any tricks around it. Any tips?

Ornament answered 30/8, 2009 at 3:47 Comment(6)
If you want reflection, C and C++ are the wrong languages for you. It is contrary to their philosophy of "you don't pay for what you don't use."Scotism
You can get the effects of reflection by using mechanisms outside of the C/C++ languages. See other answers.Kirstiekirstin
What I am trying to do is find out what the parameters of a function are before calling dlsym(3). Thanks for the answers.Ornament
@adk: then you need a way to extract the parameter (name?) order and type information from the source code. C won't do this. You have to step outside the C language. This is practical. See other answers.Kirstiekirstin
Implementing basic COM interfaces could bring a little reflection into your code.Damage
@Crashworks, "you don't pay for what you don't use." Sure, but isn't the whole point of C/C++ that you can opt in and pay for what you do want to use?Sterner
K
30

Reflection in general is a means for a program to analyze the structure of some code. This analysis is used to change the effective behavior of the code.

Reflection as analysis is generally very weak; usually it can only provide access to function and field names. This weakness comes from the language implementers essentially not wanting to make the full source code available at runtime, along with the appropriate analysis routines to extract what one wants from the source code.

Another approach is to tackle program analysis head on, by using a strong program analysis tool, e.g., one that can parse the source text exactly the way the compiler does. (People often propose abusing the compiler itself to do this, but that usually doesn't work; the compiler machinery wants to be a compiler, and it is darn hard to bend it to other purposes.)

What is needed is a tool that:

  • Parses language source text
  • Builds abstract syntax trees representing every detail of the program. (It is helpful if the ASTs retain comments and other details of the source code layout such as column numbers, literal radix values, etc.)
  • Builds symbol tables showing the scope and meaning of every identifier
  • Can extract control flows from functions
  • Can extract data flow from the code
  • Can construct a call graph for the system
  • Can determine what each pointer points-to
  • Enables the construction of custom analyzers using the above facts
  • Can transform the code according to such custom analyses (usually by revising the ASTs that represent the parsed code)
  • Can regenerate source text (including layout and comments) from the revised ASTs.

Using such machinery, one implements analysis at whatever level of detail is needed, and then transforms the code to achieve the effect that runtime reflection would accomplish. There are several major benefits:

  • The detail level or amount of analysis is a matter of ambition (e.g., it isn't limited to the little that runtime reflection can do)
  • There isn't any runtime overhead to achieve the reflected change in behavior
  • The machinery involved can be general and applied across many languages, rather than be limited to what a specific language implementation provides.
  • This is compatible with the C/C++ idea that you don't pay for what you don't use. If you don't need reflection, you don't need this machinery. And your language doesn't need to have the intellectual baggage of weak reflection built in.

See our DMS Software Reengineering Toolkit for a system that can do all of the above for C, Java, and COBOL, and most of it for C++.

[EDIT August 2017: Now handles C11 and C++2017]

Kirstiekirstin answered 30/8, 2009 at 4:58 Comment(2)
If you don't have source code, you'd better hope that reflection that answers the specific question you have is available in your programming system. The OP's problem was about C, which has no reflection built in at all. If you have a system with reflection, like Java, you often find it won't give you enough detail; does Java provide the names and types of function parameters, as the OP requested? The point is that you need a powerful system to analyze code, and most programming systems/languages aren't going to provide one. So step outside and get a tool that can, with no exceptions.Kirstiekirstin
Here's another way to think about it. Reflection in languages that have it is about how much of the source code the compiler is willing to leave in your object code to enable reflection. Unless it keeps all the source code around, reflection will be limited in its ability to analyze the available facts about the source code.Kirstiekirstin
K
13

Tips and tricks always exist. Take a look at the Metaresc library: https://github.com/alexanderchuranov/Metaresc

It provides an interface for type declarations that also generates metadata for each type. Based on that metadata you can easily serialize/deserialize objects of any complexity. Out of the box you can serialize to and deserialize from XML, JSON, YAML, XDR, Lisp-like notation and C-init notation.

Here is a simple example:

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

#include "metaresc.h"

TYPEDEF_STRUCT (point_t,
                double x,
                double y
                );

int main (int argc, char * argv[])
{
  point_t point = {
    .x = M_PI,
    .y = M_E,
  };
  MR_PRINT ((point_t, &point, XML));
  return (EXIT_SUCCESS);
}

This program will output

$ ./point
<?xml version="1.0"?>
<point_t>
  <x>3.1415926535897931</x>
  <y>2.7182818284590451</y>
</point_t>

The library works fine with the latest gcc and clang on Linux, macOS, FreeBSD and Windows. The custom macro language is only one of the options: you can also declare types as usual and generate the type descriptors from DWARF debug info. This moves complexity to the build process, but makes adoption much easier.
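As a rough illustration of how a macro can produce both a type and its metadata, here is a minimal X-macro sketch. It is not how Metaresc is implemented; POINT_FIELDS, AS_MEMBER, AS_FIELD_INFO, field_info_t and print_point_layout are hypothetical names invented for this example. The idea is that the field list is written once and expanded twice: once as struct members, once as a descriptor table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical field list: one X(type, name) entry per struct member. */
#define POINT_FIELDS(X) \
    X(double, x)        \
    X(double, y)

/* Expansion 1: the ordinary struct declaration. */
#define AS_MEMBER(type, name) type name;
typedef struct { POINT_FIELDS(AS_MEMBER) } point_t;

/* Descriptor kept for each member (illustrative, not a real library type). */
typedef struct {
    const char *name;    /* member name as written in the source */
    const char *type;    /* member type as written in the source */
    size_t      offset;  /* byte offset inside point_t */
    size_t      size;    /* size of the member in bytes */
} field_info_t;

/* Expansion 2: a metadata table built from the very same list. */
#define AS_FIELD_INFO(type, name) \
    { #name, #type, offsetof(point_t, name), sizeof(type) },
static const field_info_t point_fields[] = { POINT_FIELDS(AS_FIELD_INFO) };
enum { POINT_FIELD_COUNT = sizeof point_fields / sizeof point_fields[0] };

/* Walk the table to describe the struct layout at run time. */
void print_point_layout(FILE *out)
{
    assert(out != NULL);
    for (size_t i = 0; i < POINT_FIELD_COUNT; i++)
        fprintf(out, "%s %s at offset %zu, %zu bytes\n",
                point_fields[i].type, point_fields[i].name,
                point_fields[i].offset, point_fields[i].size);
}
```

Calling print_point_layout(stdout) lists each member with its type, offset and size. A serializer can walk the same table instead of printing it, which is the general idea behind metadata-driven serialization.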

Karrah answered 9/8, 2015 at 20:4 Comment(1)
Do you understand how the macro trick works? I'm trying to use this reflection trick but I don't understand the logic behind it. Trying to read the source code but it's very confusing.Morello
M
9

any tricks around it? Any tips?

The compiler can usually be told to generate a 'debug symbol file', which a debugger can use to help debug the code. The linker may also generate a 'map file'.

A trick/tip might be to generate and then read these files.

Mainz answered 30/8, 2009 at 4:27 Comment(0)
M
6

I know of the following options, but all of them come at a cost and with a lot of limitations:

  • Use libdl (#include <dlfcn.h>)
  • Call a tool like objdump or nm
  • Parse the object files yourself (using a corresponding library)
  • Involve a parser and generate the necessary information at compile time.
  • "Abuse" the linker to generate symbol arrays.

I'll use a bit of unit test frameworks as examples further down, because automatic test discovery for unit test frameworks is a typical example where reflection comes in very handy, and it's something that most unit test frameworks for C fall short of.

Using libdl (#include <dlfcn.h>) (POSIX)

If you're on a POSIX environment, a little bit of reflection can be done using libdl. Plugins are developed that way.

Use

#include <dlfcn.h>

in your source code and link with -ldl.

Then you have access to functions dlopen(), dlerror(), dlsym() and dlclose() with which you could load and access / run shared objects at runtime. However, it does not give you easy access to the symbol table.

Another disadvantage of this approach is that you basically restrict reflection to objects loaded as dynamic library (shared object loaded at runtime via dlopen()).
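A minimal sketch of what this looks like in practice, assuming a POSIX system where the C library is a shared object visible through dlopen(NULL, ...). The names call_int_unary and int_unary_fn are made up for this example. Note that dlsym() only returns an address: the caller has to know the signature already, which is why the function-pointer type is hard-coded here.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* The signature must be known in advance; dlsym() cannot tell us. */
typedef int (*int_unary_fn)(int);

/*
 * Look up `name` among the symbols of the running process and its
 * shared libraries, then call it as int f(int).
 * Returns 0 on success, -1 if the symbol cannot be found.
 */
int call_int_unary(const char *name, int arg, int *result)
{
    assert(name != NULL && result != NULL);

    void *handle = dlopen(NULL, RTLD_NOW);  /* NULL: the main program */
    if (handle == NULL)
        return -1;

    dlerror();                              /* clear any stale error */
    void *sym = dlsym(handle, name);
    const char *err = dlerror();
    if (err != NULL || sym == NULL) {
        dlclose(handle);
        return -1;
    }

    int_unary_fn fn;
    *(void **)(&fn) = sym;                  /* object-to-function pointer idiom */
    *result = fn(arg);

    dlclose(handle);
    return 0;
}
```

For instance, call_int_unary("abs", -7, &r) should find the C library's abs() and store 7 in r, while an unknown name yields -1. On older glibc versions the program has to be linked with -ldl.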

Running nm or objdump

You could run nm or objdump to show the symbol table and parse the output. For me, nm -P --defined-only -g xyz.o gives good results, and parsing the output is trivial. You'd be interested in the first word of each line only, which is the symbol name, and maybe the second one, which is the section type.

If you do not know the object name in some static way, i.e. the object is actually a shared object, at least on Linux you then might want to skip symbol names starting with '_'.

objdump, nm or similar tools are also often available outside POSIX environments.
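Parsing that output could look roughly like the sketch below. It assumes the POSIX nm -P line layout (name, type letter, then value and size) and a hypothetical naming convention in which test functions start with test_. The names nm_symbol_t, parse_nm_line and is_test_function are invented for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One parsed line of `nm -P` output; only the first two fields are kept. */
typedef struct {
    char name[256];  /* symbol name (first word) */
    char type;       /* type letter (second word), e.g. 'T' for text */
} nm_symbol_t;

/* Returns 1 and fills *sym if the line has a name and a type letter, else 0. */
int parse_nm_line(const char *line, nm_symbol_t *sym)
{
    assert(line != NULL && sym != NULL);
    /* The space before %c skips the whitespace between the two fields. */
    return sscanf(line, "%255s %c", sym->name, &sym->type) == 2;
}

/* Naming-convention filter: a global function whose name starts with "test_". */
int is_test_function(const nm_symbol_t *sym)
{
    assert(sym != NULL);
    return sym->type == 'T' && strncmp(sym->name, "test_", 5) == 0;
}
```

A test runner could read the nm output line by line (for example through popen()), keep the names for which is_test_function() returns 1, and then resolve them with dlsym(). On platforms that prefix C symbols with an underscore, the prefix would need to be stripped first.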

Parsing the object files yourself

You could parse the object files yourself. You probably don't want to implement that from scratch but use an existing library for that. This is how nm, objdump and even libdl are implemented. You could peek at the source code of nm, objdump and libdl and the libraries they use in order to find out how they do what they do.

Involving a Parser

You could write a parser and code generator which generates the necessary reflective information at compile time and stores it in the object file. Then you have a lot of freedom and could even implement primitive forms of annotations. That's what some unit test frameworks like AceUnit do.

I found that writing a parser which covers straight-forward C syntax is fairly trivial. Writing a parser which really understands C and could deal with all cases is NOT trivial. So, this has limitations which depend on how exotic the C syntax is that you want to reflect upon.

"Abusing" the linker to generate symbol arrays

You could put references to symbols which you want to reflect upon in a special section and use a linker configuration to emit the section boundaries so you can access them in C.

I've described how this works in "N-Dependency injection in C - better way than linker-defined arrays?".

But beware, this depends on a lot of things and is not very portable. I have only tried this with GCC/ld, and I know it doesn't work with all compilers / linkers. Also, it's almost guaranteed that dead code elimination will not detect how you call this stuff, so if you use dead code elimination, you will have to add all the reflected symbols as entry points.
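A minimal sketch of the section trick, assuming GCC or Clang with GNU ld on an ELF platform. For a section whose name is a valid C identifier, GNU ld provides __start_<name> and __stop_<name> symbols, so no custom linker script is needed in this simple case. The section name test_registry, the REGISTER_TEST macro and the helper functions are hypothetical. Pointers (rather than whole structs) are stored in the section so that the compiler's alignment choices cannot leave gaps between entries.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*test_fn)(void);

struct test_entry {
    const char *name;
    test_fn     fn;
};

/* Place a pointer to each entry into the custom section "test_registry". */
#define REGISTER_TEST(func)                                          \
    static const struct test_entry entry_##func = { #func, func };   \
    static const struct test_entry *const ptr_##func                 \
        __attribute__((used, section("test_registry"))) = &entry_##func

/* Section boundaries, provided by GNU ld (assumption: ELF + GNU ld). */
extern const struct test_entry *const __start_test_registry[];
extern const struct test_entry *const __stop_test_registry[];

static int test_addition(void) { return 1 + 1 == 2; }
static int test_strings(void)  { return strlen("abc") == 3; }

REGISTER_TEST(test_addition);
REGISTER_TEST(test_strings);

/* Number of entries the linker collected in the section. */
size_t registered_count(void)
{
    return (size_t)(__stop_test_registry - __start_test_registry);
}

/* Find a registered entry by name, or NULL if there is none. */
const struct test_entry *find_test(const char *name)
{
    assert(name != NULL);
    for (const struct test_entry *const *p = __start_test_registry;
         p < __stop_test_registry; p++) {
        if (strcmp((*p)->name, name) == 0)
            return *p;
    }
    return NULL;
}
```

Each translation unit can register its own functions with REGISTER_TEST, and the linker concatenates the pointers into one array. The order of the entries is decided by the linker and should not be relied upon.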

Pitfalls

For some of the mechanisms, dead code elimination can be a problem, in particular when you "abuse" the linker to generate symbol arrays. It can be worked around by declaring the reflected symbols as entry points to the linker, but depending on the number of symbols this might be neither nice nor convenient.

Conclusion

Combining nm and libdl can actually give quite good results. The combination can be almost as powerful as the level of Reflection used by JUnit 3.x in Java. The level of reflection given is sufficient to implement a JUnit 3.x-style unit test framework for C, including test-case discovery by naming convention.

Involving a parser is more work and is limited to objects that you compile yourself, but it gives you the most power and freedom. The level of reflection given can be sufficient to implement a JUnit 4.x-style unit test framework for C, including test-case discovery by annotations. AceUnit is a unit test framework for C that does exactly this.

Combining parsing and the linker to generate symbol arrays can give very nice results - if your environment is so much under your control that you can ensure that working with the linker that way works for you.

And of course you can combine all approaches to stitch together the bits and pieces until they fit your needs.

Monarski answered 7/3, 2015 at 16:52 Comment(0)
E
5

Based on the responses to How can I add reflection to a C++ application? (Stack Overflow) and the fact that C++ is considered a "superset" of C, I would say you're out of luck.

There's also a nice long answer about why C++ doesn't have reflection (Stack Overflow).

Epidermis answered 30/8, 2009 at 3:51 Comment(1)
You can't get it in the language. You can get the equivalent effect by stepping outside the language; see other answers.Kirstiekirstin
A
5

I needed reflection in a bunch of structs in a C++ project.
I created an XML file with the description of all those structs - fortunately the field types were primitive types.
I used a template (not C++ template) to auto generate a class for each struct along with setter/getter methods.
In each class I used a map to associate string names and class members (pointers to members).

I didn't regret using reflection because it opened new ways to design my core functionality that I couldn't even imagine without reflection.
(BTW, it was an external report generator for a program that uses a raw database)

So, I used code generation, function pointers and maps to simulate reflection.
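The same idea can be sketched in plain C with a table that maps field names to offsets. This is not the original project's code: record_t, field_desc_t, record_set and record_get are hypothetical names, and in the approach described above the table would be emitted by the code generator rather than written by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Example struct with primitive fields only. */
typedef struct {
    int    id;
    double price;
} record_t;

typedef enum { FIELD_INT, FIELD_DOUBLE } field_kind_t;

typedef struct {
    const char  *name;    /* field name used for lookup */
    field_kind_t kind;    /* how to interpret the bytes at `offset` */
    size_t       offset;  /* byte offset inside record_t */
} field_desc_t;

/* Name-to-member map; imagine this table being generated from the XML. */
static const field_desc_t record_fields[] = {
    { "id",    FIELD_INT,    offsetof(record_t, id)    },
    { "price", FIELD_DOUBLE, offsetof(record_t, price) },
};

static const field_desc_t *find_field(const char *name)
{
    size_t count = sizeof record_fields / sizeof record_fields[0];
    for (size_t i = 0; i < count; i++)
        if (strcmp(record_fields[i].name, name) == 0)
            return &record_fields[i];
    return NULL;
}

/* Set a field by name; returns 0 on success, -1 for an unknown field. */
int record_set(record_t *rec, const char *name, double value)
{
    assert(rec != NULL && name != NULL);
    const field_desc_t *f = find_field(name);
    if (f == NULL)
        return -1;
    char *addr = (char *)rec + f->offset;
    switch (f->kind) {
    case FIELD_INT:    *(int *)addr = (int)value; break;
    case FIELD_DOUBLE: *(double *)addr = value;   break;
    }
    return 0;
}

/* Read a field by name as a double; returns 0 on success, -1 if unknown. */
int record_get(const record_t *rec, const char *name, double *out)
{
    assert(rec != NULL && name != NULL && out != NULL);
    const field_desc_t *f = find_field(name);
    if (f == NULL)
        return -1;
    const char *addr = (const char *)rec + f->offset;
    switch (f->kind) {
    case FIELD_INT:    *out = (double)*(const int *)addr; break;
    case FIELD_DOUBLE: *out = *(const double *)addr;      break;
    }
    return 0;
}
```

A report generator can then address fields by the names that appear in its configuration, for example record_set(&r, "price", 9.5), without a hand-written branch per struct member.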

Aryn answered 30/8, 2009 at 4:40 Comment(2)
So you used an ad hoc approach to achieve the desired effect. If that works for you, great. A structured approach to doing what you wanted would have allowed you to skip the hand-coded XML description of those structs, and "reflect" the actual data from the struct declarations themselves. See other answers to this question.Kirstiekirstin
Just what I had in mind as well, going to write a PHP script analyzing my .cpp files which then dumps C++ code, which does the reflection. I need it to use C structs via a scripting engine.Voyeur
S
3

You would need to implement it yourself from the ground up. In straight C, there is no runtime information whatsoever kept on structure and composite types. Metadata simply does not exist in the standard.

Scotism answered 30/8, 2009 at 3:59 Comment(1)
Or, if you find implementing a full reflection system all by yourself a bit daunting, you can use a tool designed to analyze/transform code that has all the necessary machinery. See other answers to this question.Kirstiekirstin
H
3
  1. Implementing reflection for C would be much simpler... because C is a simple language.
  2. There are some basic options for analyzing a program, like detecting whether a function exists by calling dlopen/dlsym; it depends on your needs.
  3. There are tools for creating code that can modify/extend itself, using tcc.
  4. You may use the above tool to create your own code analyzers.
Hanny answered 30/8, 2009 at 4:21 Comment(2)
can you find the parameters of a function, say sin using dlopen/dlsym? and what are some good code analyzers?Ornament
"can you find the parameters of a function" No, just that such a symbol exists.Hanny
M
2

For similar reasons to the author of the question, I have been working on a C-type-reflection-API along with a C reflection graph database format and a clang plug-in that writes reflection metadata.

The intent is to use the C reflection API for writing serialization and deserialization routines, such as mappers for ASN.1, function argument printers, function proxies, fuzzers, etc. Clang and GCC both have plugin APIs that allow access to the AST but there currently is no standard graph format for C reflection metadata.

The proposed C reflection API is called Crefl:

https://github.com/michaeljclark/crefl

The Crefl API provides runtime access to reflection metadata for C structure declarations with support for arbitrarily nested combinations of: intrinsic, set, enum, struct, union, field (member), array, constant, variable.

  • The Crefl reflection graph database format for portable reflection metadata.
  • The Crefl clang plug-in outputs C reflection metadata used by the library.
  • The Crefl API provides task-oriented query access to C reflection metadata

A C reflection API provides access to runtime reflection metadata for C structure declarations with support for arbitrarily nested combinations of: intrinsic, set, enum, struct, union, field, array, constant, variable. The Crefl C reflection data model is essentially a transcription of the C data types in ISO/IEC 9899:9999.

  • C intrinsic data types.
    • integer types.
    • floating-point types.
    • complex number types.
    • boolean type.
  • nested struct, union, field, and bitfield
  • arrays and pointers
  • typedef type aliases
  • enum and enum constants
  • functions and function parameters
  • const, volatile and restrict qualifiers
  • GNU-C style attributes using (__attribute__).

The library is still a work in progress. The hope is to find others who are interested in reflection support in C.

Macaulay answered 10/3, 2021 at 9:7 Comment(0)
S
-1

Parsers and Debug Symbols are great ideas. However, the gotcha is that C does not really have arrays. Just pointers to stuff.

For example, there is no way by reading the source code to know whether a char * points to a character, a string, or a fixed array of bytes based on some "nearby" length field. This is a problem for human readers let alone any automated tool.

Why not use a modern language, like Java or .Net? Can be faster than C as well.

Skyros answered 19/10, 2018 at 4:41 Comment(0)
