Is there a way to specify and use a list of all data members belonging to a C++ class
M

4

5

When developing C++ code, I often find myself trying to do something for all the data members belonging to a class. Classic examples are in the copy constructor and assignment operator. Another place is when implementing serialization functions. In all of these situations, I have spent a lot of time tracking down bugs in large codebases where someone has added a data member to a class but has not added its usage to one of these functions where it is needed.

With C++11, 14, 17, and 20, there are many template programming techniques that are quite sophisticated in what they can do. Unfortunately, I understand only a little of template metaprogramming. I am hoping that someone can point me to a way to specify a list of variables (as a class member and/or type) that I can use to help reduce errors where someone has inadvertently left a member out. I am okay with both a compile-time and run-time penalty, as long as there is an easy way at build time to specify whether or not to use such instrumentation.

A notional usage might look like:

class Widget {
    template <typename Archive> void serialize(Archive ar) {
        auto myvl = vl(); // make a new list from the one defined in the constructor
        ar(a);
        ar(x);
        myvl.pop(a);
        myvl.pop(x);

        // a run-time check that would be violated because s is still in myvl.
        if (!myvl.empty())
            throw std::string{"ill-defined serialize method; an expected variable was not used"};
        // even better would be a compile-time check
    }

private:
    int a;
    double x;
    std::string s;
    VariableList vl(a, x, s);
};

Or perhaps some static analysis, or ...

I am just looking for a way to improve the quality of my code. Thanks for any help.

Municipal answered 5/2, 2021 at 2:11 Comment(4)
I don't know of a way to get the list of members or methods in a standard C++ compiler. But I'm sure there are C++ parsers (maybe the Clang parser) that can give you an AST in which you can find this information. I have one suggestion for your case: if you mark structures as packed (__attribute__((packed)) in gcc/clang), then you can check that the sizeof values of all members that you serialize add up to the sizeof of the whole class. If the class has virtual functions, an extra 8 bytes have to be added (on 64-bit). This is a quick error check for finding members missing from the serialization list.Firedamp
Thanks for the sizeof tip. I would prefer to not alter the native packing of the class content (I often use Eigen or other data structures that care about alignment.), but something is better than nothing. Yes, this seems like a great use of a static analysis tool, but which one, and how? (I am more of an application programmer than a C++ or tool expert, so I need something more well defined than having an AST tree - a good idea, but it is not a practical help to someone of my ability.)Municipal
Regarding attributes for sizeof: you don't really need to modify your structures permanently. Just put a macro in the declaration, like struct ALL_ATTRS S { ... };, and build in two phases, a regular compile and a testing compile. For the regular compile you #define ALL_ATTRS as empty; for the testing compile, which checks all members, you #define ALL_ATTRS __attribute__((packed)). Regarding the AST, I'm writing code for the AST work right now and will post an answer when it's ready!Firedamp
Thanks for the tips. I look forward to hearing about your AST work. I must admit that I am hooked on Visual Studio (2019) - perhaps this will provide the final motivation I need to figure out the Visual Studio integration capabilities with clang.Municipal
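The sizeof check suggested in the comments above could be sketched roughly as follows. This is only an illustration, assuming GCC or Clang (MSVC does not support this attribute syntax) and members of plain built-in types. ALL_ATTRS is the macro name from the comment; CHECK_MEMBERS, Sample, and serialized_size are hypothetical names made up for this sketch.

```cpp
#include <cstddef>

// Hypothetical switch. In practice it would be passed on the command line
// (-DCHECK_MEMBERS) for the testing build only; it is defined here so that
// the check below is active in this sketch.
#define CHECK_MEMBERS 1

#ifdef CHECK_MEMBERS
#  define ALL_ATTRS __attribute__((packed))  // testing build, GCC/Clang only
#else
#  define ALL_ATTRS                          // regular build: native layout
#endif

struct ALL_ATTRS Sample {
    int a;
    double x;
    char c;
};

// Sizes of the members that the (imaginary) serialize function handles.
constexpr std::size_t serialized_size =
    sizeof(Sample::a) + sizeof(Sample::x) + sizeof(Sample::c);

#ifdef CHECK_MEMBERS
// With padding removed, a mismatch means the member list above is out of
// date, for example because a member was added to Sample.
static_assert(sizeof(Sample) == serialized_size,
              "Sample has a member that is missing from the serialization list");
#endif
```

Note that a check like this cannot notice one member being replaced by another of the same size, so it is a tripwire rather than a proof.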
H
3

There is no way to do this without reflection support. The alternative is to transform your struct into a tuple of references to its members and then use std::apply to operate on the elements of the tuple one by one. See the CppCon 2016 talk "C++14 Reflections Without Macros, Markup nor External Tooling" for the details. Here are the concepts:

First, we need to detect your customized struct's fields count:

template <auto I>
struct any_type {
  template <class T> constexpr operator T& () const noexcept;
  template <class T> constexpr operator T&&() const noexcept;
};

template <class T, auto... Is>
constexpr auto detect_fields_count(std::index_sequence<Is...>) noexcept {
  if constexpr (requires { T{any_type<Is>{}...}; }) return sizeof...(Is);
  else 
    return detect_fields_count<T>(std::make_index_sequence<sizeof...(Is) - 1>{});
}

template <class T>
constexpr auto fields_count() noexcept {
  return detect_fields_count<T>(std::make_index_sequence<sizeof(T)>{});
}
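The code above relies on C++20 (the requires expression). For a compiler without that support, the same counting idea could be sketched in C++17 along the lines below. This is not the answerer's code: any_field, can_init_with, count_fields, and Sample are hypothetical names, the sketch counts upward instead of downward, and it assumes a simple aggregate without base classes or array members.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

// Converts (in unevaluated contexts only) to whatever type a field needs.
struct any_field {
    template <class T> operator T() const noexcept;  // declared, never defined
};

// Primary template: T cannot be brace-initialized with this many arguments.
template <class T, class Seq, class = void>
struct can_init_with : std::false_type {};

// Chosen when T{any_field{}, any_field{}, ...} (one per index) is well-formed.
template <class T, std::size_t... Is>
struct can_init_with<T, std::index_sequence<Is...>,
                     std::void_t<decltype(T{(static_cast<void>(Is), any_field{})...})>>
    : std::true_type {};

// Tries 1, 2, 3, ... initializers and stops at the first count that fails.
template <class T, std::size_t N = 0>
constexpr std::size_t count_fields() {
    if constexpr (can_init_with<T, std::make_index_sequence<N + 1>>::value)
        return count_fields<T, N + 1>();
    else
        return N;
}

struct Sample {
    int a;
    double x;
    char c;
};

static_assert(count_fields<Sample>() == 3, "Sample has three data members");
```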

Then we can transform your struct into a tuple according to the fields_count result (for illustration, I only support up to 8 fields):

template <class S>
constexpr auto to_tuple(S& s) noexcept {
  if constexpr (constexpr auto count = fields_count<S>(); count == 8) {
    auto& [f0, f1, f2, f3, f4, f5, f6, f7] = s;
    return std::tie(f0, f1, f2, f3, f4, f5, f6, f7);
  } else if constexpr (count == 7) {
    auto& [f0, f1, f2, f3, f4, f5, f6] = s;
    return std::tie(f0, f1, f2, f3, f4, f5, f6);
  } else if constexpr (count == 6) {
    auto& [f0, f1, f2, f3, f4, f5] = s;
    return std::tie(f0, f1, f2, f3, f4, f5);
  } else if constexpr (count == 5) {
    auto& [f0, f1, f2, f3, f4] = s;
    return std::tie(f0, f1, f2, f3, f4);
  } else if constexpr (count == 4) {
    auto& [f0, f1, f2, f3] = s;
    return std::tie(f0, f1, f2, f3);
  } else if constexpr (count == 3) {
    auto& [f0, f1, f2] = s;
    return std::tie(f0, f1, f2);
  } else if constexpr (count == 2) {
    auto& [f0, f1] = s;
    return std::tie(f0, f1);
  } else if constexpr (count == 1) {
    auto& [f0] = s;
    return std::tie(f0);
  } else if constexpr (count == 0) {
    return std::tie();
  }
}

Then you can use this utility in your own serialize functions:

struct Widget {
  template <typename Archive>
  void serialize(Archive ar) {
    std::apply([ar](auto&... x) { (ar(x), ...); }, to_tuple(*this));
  }
};

See godbolt for the live demo.

Herzel answered 5/2, 2021 at 4:54 Comment(2)
Thank you for the excellent answer, and especially for the excellent example of godbolt! You have given me an automated way to implement serialization for ALL the members without having to list them explicitly in the serialize method. This is nice. But ... what if I have a member of the Widget that I want to leave out of the serialization?Municipal
@Municipal The solution above gives you all members, but without names. So if you want to skip a member, you have to know its position among all members. You then remove the element at that position from the tuple returned by to_tuple(...).Firedamp
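Removing one element by position, as described in the last comment, could look roughly like this in C++17. This is a sketch only; drop_element and drop_impl are hypothetical helper names, not part of the answer or of the standard library.

```cpp
#include <cstddef>
#include <tuple>
#include <type_traits>
#include <utility>

// Builds a tuple from all elements of t except the one at index Skip.
// Indices below Skip are kept as they are; the rest are shifted by one.
template <std::size_t Skip, class Tuple, std::size_t... Is>
constexpr auto drop_impl(Tuple&& t, std::index_sequence<Is...>) {
    using Source = std::decay_t<Tuple>;
    return std::tuple<std::tuple_element_t<(Is < Skip ? Is : Is + 1), Source>...>(
        std::get<(Is < Skip ? Is : Is + 1)>(std::forward<Tuple>(t))...);
}

// Element types are preserved, so a tuple of references (such as the result
// of std::tie) stays a tuple of references to the same objects.
template <std::size_t Skip, class Tuple>
constexpr auto drop_element(Tuple&& t) {
    constexpr std::size_t n = std::tuple_size_v<std::decay_t<Tuple>>;
    static_assert(Skip < n, "index of the element to drop is out of range");
    return drop_impl<Skip>(std::forward<Tuple>(t), std::make_index_sequence<n - 1>{});
}
```

Assuming the to_tuple function from the answer, a serialize method that leaves out the third member might then call std::apply on drop_element<2>(to_tuple(*this)) instead of on to_tuple(*this). The position is still a magic number, so reordering members silently changes which one is skipped.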
M
3

This feature is coming with the (compile time) reflection feature. https://root.cern/blog/the-status-of-reflection/ talks about its status at a technical level last year.

Reflection is a C++23 priority, and is likely to be there.

Before that, one approach I use is to write a single point of failure for all such operations. I call it as_tie:

struct Foo {
  int x,y;
  template<class Self, std::enable_if_t<std::is_same_v<Foo, std::decay_t<Self>>, bool> =true>
  friend auto as_tie(Self&& self){
    static_assert(sizeof(self)==8);
    return std::forward_as_tuple( decltype(self)(self).x, decltype(self)(self).y );
  }
  friend bool operator==(Foo const&lhs, Foo const& rhs){
    return as_tie(lhs)==as_tie(rhs);
  }
};

or somesuch depending on dialect.

Then your serializer/deserializer/etc. can use as_tie, maybe using foreach_tuple_element. Versioning can even be done: as_tie_v2_2_0 for an obsolete tie.

And if someone adds a member, the sizeof static assert probably fires.
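To make the idea more concrete, here is a minimal C++17 sketch of how a serializer-like visitor could be driven by as_tie. Only the name as_tie comes from the answer; Point, for_each_member, SumArchive, and checksum are hypothetical names, and SumArchive is a toy stand-in for a real archive.

```cpp
#include <tuple>

struct Point {
    int x, y;

    // The one place that lists the members. The sizeof check is a tripwire:
    // adding a member usually changes the size and breaks the build here.
    friend constexpr auto as_tie(Point const& p) {
        static_assert(sizeof(Point) == 2 * sizeof(int),
                      "Point changed: update as_tie");
        return std::tie(p.x, p.y);
    }
};

// Calls visit(member) once per tied member, in the order as_tie lists them.
template <class T, class Visitor>
constexpr void for_each_member(T const& obj, Visitor&& visit) {
    std::apply([&visit](auto const&... member) { (visit(member), ...); },
               as_tie(obj));
}

// Toy stand-in for an archive: it only adds up the values it is given.
struct SumArchive {
    long total = 0;
    constexpr void operator()(int value) { total += value; }
};

constexpr long checksum(Point const& p) {
    SumArchive archive;
    for_each_member(p, archive);
    return archive.total;
}

static_assert(checksum(Point{3, 4}) == 7, "both members were visited");
```

Comparison, copying, and serialization can all go through as_tie in the same way, so a new member only has to be added in one function, and the static_assert there is what flags the omission.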

Morale answered 5/2, 2021 at 3:18 Comment(4)
Thanks for the reply, but, unfortunately, I do not understand how as_tie helps. Is there some documentation providing background and more details?Municipal
@phil do you know what a tuple is, and how to use it? How about std::tie? I added a == implemented using it.Morale
Yes, I understand tie and tuple, but I am not familiar with forward_as_tuple or foreach_tuple_element. And I do not see how I would use as_tie in the serialize method to detect that a data member was not included in the serialization implementation.Municipal
@phil "foreach tuple c++" on google returns https://mcmap.net/q/17208/-how-can-you-iterate-over-the-elements-of-an-std-tuple/1774667 - there are many implementations depending on your exact standard version. If you are serializing elementwise, you can just foreach over the tie and serialize each element; components that are tied together you can make into substructures (so as to make missing stuff less likely). If you are doing something fancier that isn't elementwise, you'll have to get creative: store pointers from the tie and eliminate them as you serialize? Do the same with counting? You have basic reflection now, use it?Morale
F
1

PART 1 of 2 (see part 2 below)

I decided to write a small tool that uses Clang's AST dump.

Since you are working on Windows, these instructions are written for Windows.

As far as I can tell, the Clang library (SDK) is very Linux oriented, and it is difficult to use it directly from sources on Windows. That is why I decided to use the binary distribution of Clang to solve your task.

LLVM for Windows can be downloaded from the GitHub releases page; the current release is 11.0.1. To use it on Windows, download LLVM-11.0.1-win64.exe and install it to some folder. In my example I installed it into C:/bin/llvm/.

Visual Studio also ships with its own Clang, which can be used too, but it is somewhat outdated, so the newest C++20 features may not be supported.

Find clang++.exe in your LLVM installation; in my case it is C:/bin/llvm/bin/clang++.exe. This path is assigned to the c_clang variable at the beginning of my script.

I wrote the parsing tool in Python, since it is a well-known and popular scripting language. The script parses the console output of Clang's AST dump. You can install Python by downloading it from here.

The AST can also be parsed and processed at the C++ level using Clang's SDK (an example AST visitor implementation is located here), but this SDK can probably be used comfortably only on Linux. That is why I chose to use the binary Windows distribution and parse its console output. The binary distribution for Linux can also be used with my script.

You can try my script online on a Linux server by clicking the Try it online! link below.

The script is run as python script.py prog.cpp, which produces the output file prog.cpp.json with the parsed tree of namespaces and classes.

Under the hood, the script uses the command clang++ -cc1 -ast-dump prog.cpp to parse the .cpp file into an AST. You can run the command manually to see what it outputs; part of an example output looks like this:

..................
|-CXXRecordDecl 0x25293912570 <line:10:13, line:13:13> line:10:19 class P definition
| |-DefinitionData pass_in_registers standard_layout trivially_copyable trivial literal
| | |-DefaultConstructor exists trivial needs_implicit
| | |-CopyConstructor simple trivial has_const_param needs_implicit implicit_has_const_param
| | |-MoveConstructor exists simple trivial needs_implicit
| | |-CopyAssignment simple trivial has_const_param needs_implicit implicit_has_const_param
| | |-MoveAssignment exists simple trivial needs_implicit
| | `-Destructor simple irrelevant trivial needs_implicit
| |-CXXRecordDecl 0x25293912690 <col:13, col:19> col:19 implicit class P
| |-FieldDecl 0x25293912738 <line:11:17, col:30> col:30 x 'const char *'
| `-FieldDecl 0x252939127a0 <line:12:17, col:22> col:22 y 'bool'
..............

I parse this output to produce a JSON output file. The JSON file looks like this (part of the file):

.............
{
    "node": "NamespaceDecl",
    "name": "ns2",
    "loc": "line:3:5, line:18:5",
    "tree": [
        {
            "node": "CXXRecordDecl",
            "type": "struct",
            "name": "R",
            "loc": "line:4:9, line:6:9",
            "tree": [
                {
                    "node": "FieldDecl",
                    "type": "bool *",
                    "name": "pb",
                    "loc": "line:5:13, col:20"
                }
            ]
        },
.............

As you can see, each JSON node has the following fields. node gives Clang's name for the node: NamespaceDecl for a namespace, CXXRecordDecl for a struct/class/union, and FieldDecl for a field (data member) of a struct. Open-source JSON parsers for C++ are easy to find if you need one, and JSON is one of the simplest formats for storing structured data.

The other fields are name, the name of the namespace/class/field; type, the type of the class or field; loc, the location of the namespace/class/field definition inside the file; and tree, a list of child nodes (for a namespace node the children are other namespaces or classes; for a class node the children are fields or nested classes).

My program also prints a simplified form to the console: a list of classes (with fully qualified names, including namespaces), each followed by its list of fields. For my example input .cpp it prints:

ns1::ns2::R - pb
ns1::ns2::S::P - x y
ns1::ns2::S::Q - r
ns1::ns2::S - i j b

Example input .cpp used:

// Start
namespace ns1 {
    namespace ns2 {
        struct R {
            bool * pb;
        };
        struct S {
            int i, j;
            bool b;
            class P {
                char const * x;
                bool y;
            };
            class Q {
                R r;
            };
        };
    }
}

int main() {
}

I also tested my script on quite complex .cpp having thousands of lines and dozens of classes.

You can use my script as follows: once your C++ project is ready, run the script on your .cpp files. From the script output you can see which classes you have and which fields each class has. You can then check whether this list of fields matches what your serialization code handles; simple macros could automate the check. I think getting the list of fields is the main feature you need. Running the script can be a preprocessing stage before compilation.
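One way such an automatic check might look on the C++ side is sketched below in C++17. It assumes that a build step (not shown, and not part of the script as posted) turns the script output into the generated_fields array; every name in the sketch is hypothetical. The hand-maintained lists say which members the serialization code handles and which ones are left out on purpose.

```cpp
#include <array>
#include <cstddef>
#include <string_view>

// Assumed to be generated from the script output for ns1::ns2::S.
constexpr std::array<std::string_view, 3> generated_fields{"i", "j", "b"};

// Maintained by hand next to the serialization code.
constexpr std::array<std::string_view, 2> serialized_fields{"i", "j"};
constexpr std::array<std::string_view, 1> skipped_fields{"b"};  // left out on purpose

template <std::size_t N>
constexpr bool contains(std::array<std::string_view, N> const& list,
                        std::string_view name) {
    for (std::size_t i = 0; i < N; ++i)
        if (list[i] == name) return true;
    return false;
}

// True when every generated field is either serialized or explicitly skipped.
template <std::size_t G, std::size_t S, std::size_t K>
constexpr bool all_accounted_for(std::array<std::string_view, G> const& generated,
                                 std::array<std::string_view, S> const& serialized,
                                 std::array<std::string_view, K> const& skipped) {
    for (std::size_t i = 0; i < G; ++i)
        if (!contains(serialized, generated[i]) && !contains(skipped, generated[i]))
            return false;
    return true;
}

static_assert(all_accounted_for(generated_fields, serialized_fields, skipped_fields),
              "a data member is neither serialized nor explicitly skipped");
```

With this arrangement, adding a member to the class changes the generated list on the next build, and the static_assert fails until the new member is put into one of the two hand-maintained lists.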

If you don't know Python and want to suggest any improvements to my code, tell me and I'll update it!

Try it online!

import subprocess, re, os, sys, json, copy, tempfile, secrets

c_file = ''
c_clang = 'C:/bin/llvm/bin/clang++.exe'

def get_ast(fname, *, enc = 'utf-8', opts = [], preprocessed = False, ignore_clang_errors = True):
    try:
        if not preprocessed:
            fnameo = fname
            r = subprocess.run([c_clang, '-cc1', '-ast-dump'] + opts + [fnameo], capture_output = True)
            assert r.returncode == 0
        else:
            with tempfile.TemporaryDirectory() as td:
                tds = str(td)
                fnameo = tds + '/' + secrets.token_hex(8).upper()
                r = subprocess.run([c_clang, '-E'] + opts + [f'-o', fnameo, fname], capture_output = True)
                assert r.returncode == 0
                r = subprocess.run([c_clang, '-cc1', '-ast-dump', fnameo], capture_output = True)
                assert r.returncode == 0
    except:
        if not ignore_clang_errors:
            #sys.stdout.write(r.stdout.decode(enc)); sys.stdout.flush()
            sys.stderr.write(r.stderr.decode(enc)); sys.stderr.flush()
            raise
        pass
    return r.stdout.decode(enc), fnameo
    
def proc_file(fpath, fout = None, *, clang_opts = [], preprocessed = False, ignore_clang_errors = True):
    def set_tree(tree, path, **value):
        assert len(path) > 0
        if len(tree) <= path[0][0]:
            tree.extend([{} for i in range(path[0][0] - len(tree) + 1)])
        if 'node' not in tree[path[0][0]]:
            tree[path[0][0]]['node'] = path[0][1]
        if 'tree' not in tree[path[0][0]] and len(path) > 1:
            tree[path[0][0]]['tree'] = []
        if len(path) > 1:
            set_tree(tree[path[0][0]]['tree'], path[1:], **value)
        elif len(path) == 1:
            tree[path[0][0]].update(value)
    def clean_tree(tree):
        if type(tree) is list:
            for i in range(len(tree) - 1, -1, -1):
                if tree[i] == {}:
                    tree[:] = tree[:i] + tree[i+1:]
            for e in tree:
                clean_tree(e)
        elif 'tree' in tree:
            clean_tree(tree['tree'])
    def flat_tree(tree, name = (), fields = ()):
        for e in tree:
            if e['node'] == 'NamespaceDecl':
                if 'tree' in e:
                    flat_tree(e['tree'], name + (e['name'],), ())
            elif e['node'] == 'CXXRecordDecl':
                if 'tree' in e:
                    flat_tree(e['tree'], name + (e['name'],), ())
            elif e['node'] == 'FieldDecl':
                fields = fields + (e['name'],)
                assert 'tree' not in e['node']
            elif 'tree' in e:
                flat_tree(e['tree'], name, ())
        if len(fields) > 0:
            print('::'.join(name), ' - ', ' '.join(fields), sep = '')
    ast, fpath = get_ast(fpath, opts = clang_opts, preprocessed = preprocessed, ignore_clang_errors = ignore_clang_errors)
    fname = os.path.basename(fpath)
    ipath, path, tree = [],(), []
    st = lambda **value: set_tree(tree, path, **value)
    inode, pindent = 0, None
    for line in ast.splitlines():
        debug = (path, line)
        if not line.strip():
            continue
        m = re.fullmatch(r'^([|`\- ]*)(\S+)(?:\s+.*)?$', line)
        assert m, debug
        assert len(m.group(1)) % 2 == 0, debug
        indent = len(m.group(1)) // 2
        node = m.group(2)
        debug = (node,) + debug
        if indent >= len(path) - 1:
            assert indent in [len(path), len(path) - 1], debug
        while len(ipath) <= indent:
            ipath += [-1]
        ipath = ipath[:indent + 1]
        ipath[indent] += 1
        path = path[:indent] + ((ipath[indent], node),)
        line_col, iline = None, None
        m = re.fullmatch(r'^.*\<((?:(?:' + re.escape(fpath) + r'|line|col)\:\d+(?:\:\d+)?(?:\, )?){1,2})\>.*$', line)
        if m: #re.fullmatch(r'^.*\<.*?\>.*$', line) and not 'invalid sloc' in line and '<<<' not in line:
            assert m, debug
            line_col = m.group(1).replace(fpath, 'line')
            if False:
                for e in line_col.split(', '):
                    if 'line' in e:
                        iline = int(e.split(':')[1])
                if 'line' not in line_col:
                    assert iline is not None, debug
                    line_col = f'line:{iline}, ' + line_col
        changed = False
        if node == 'NamespaceDecl':
            m = re.fullmatch(r'^.+?\s+?(\S+)\s*$', line)
            assert m, debug
            st(name = m.group(1))
            changed = True
        elif node == 'CXXRecordDecl' and line.rstrip().endswith(' definition') and ' implicit ' not in line:
            m = re.fullmatch(r'^.+?\s+(union|struct|class)\s+(?:(\S+)\s+)?definition\s*$', line)
            assert m, debug
            st(type = m.group(1), name = m.group(2))
            changed = True
        elif node == 'FieldDecl':
            m = re.fullmatch(r'^.+?\s+(\S+?)\s+\'(.+?)\'\s*$', line)
            assert m, debug
            st(type = m.group(2), name = m.group(1))
            changed = True
        if changed and line_col is not None:
            st(loc = line_col)
    clean_tree(tree)
    if fout is None:
        fout = fpath + '.json'
    assert fout.endswith('.json'), fout
    with open(fout, 'wb') as f:
        f.write(json.dumps(tree, indent = 4).encode('utf-8'))
    flat_tree(tree)
    
if __name__ == '__main__':
    if c_file:
        proc_file(c_file)
    else:
        assert len(sys.argv) > 1
        proc_file(sys.argv[1])

Input:

// Start
namespace ns1 {
    namespace ns2 {
        struct R {
            bool * pb;
        };
        struct S {
            int i, j;
            bool b;
            class P {
                char const * x;
                bool y;
            };
            class Q {
                R r;
            };
        };
    }
}

int main() {
}

Output:

ns1::ns2::R - pb
ns1::ns2::S::P - x y
ns1::ns2::S::Q - r
ns1::ns2::S - i j b

JSON output:

[
    {
        "node": "TranslationUnitDecl",
        "tree": [
            {
                "node": "NamespaceDecl",
                "name": "ns1",
                "loc": "line:2:1, line:19:1",
                "tree": [
                    {
                        "node": "NamespaceDecl",
                        "name": "ns2",
                        "loc": "line:3:5, line:18:5",
                        "tree": [
                            {
                                "node": "CXXRecordDecl",
                                "type": "struct",
                                "name": "R",
                                "loc": "line:4:9, line:6:9",
                                "tree": [
                                    {
                                        "node": "FieldDecl",
                                        "type": "bool *",
                                        "name": "pb",
                                        "loc": "line:5:13, col:20"
                                    }
                                ]
                            },
                            {
                                "node": "CXXRecordDecl",
                                "type": "struct",
                                "name": "S",
                                "loc": "line:7:9, line:17:9",
                                "tree": [
                                    {
                                        "node": "FieldDecl",
                                        "type": "int",
                                        "name": "i",
                                        "loc": "line:8:13, col:17"
                                    },
                                    {
                                        "node": "FieldDecl",
                                        "type": "int",
                                        "name": "j",
                                        "loc": "col:13, col:20"
                                    },
                                    {
                                        "node": "FieldDecl",
                                        "type": "bool",
                                        "name": "b",
                                        "loc": "line:9:13, col:18"
                                    },
                                    {
                                        "node": "CXXRecordDecl",
                                        "type": "class",
                                        "name": "P",
                                        "loc": "line:10:13, line:13:13",
                                        "tree": [
                                            {
                                                "node": "FieldDecl",
                                                "type": "const char *",
                                                "name": "x",
                                                "loc": "line:11:17, col:30"
                                            },
                                            {
                                                "node": "FieldDecl",
                                                "type": "bool",
                                                "name": "y",
                                                "loc": "line:12:17, col:22"
                                            }
                                        ]
                                    },
                                    {
                                        "node": "CXXRecordDecl",
                                        "type": "class",
                                        "name": "Q",
                                        "loc": "line:14:13, line:16:13",
                                        "tree": [
                                            {
                                                "node": "FieldDecl",
                                                "type": "ns1::ns2::R",
                                                "name": "r",
                                                "loc": "line:15:17, col:19"
                                            }
                                        ]
                                    }
                                ]
                            }
                        ]
                    }
                ]
            }
        ]
    }
]

PART 2 of 2

Digging through Clang's sources, I found that Clang can dump the AST to JSON directly if you specify -ast-dump=json (read PART 1 above for background). The PART 1 code is therefore not very useful, and the PART 2 code is a better solution. The full AST dumping command is clang++ -cc1 -ast-dump=json prog.cpp.

I wrote a simple Python script that extracts the same basic information from the JSON dump as in PART 1. Each output line contains the fully qualified struct/class/union name (including namespaces), then a space, then a |-separated list of fields, where each field is written as the field type, then ;, then the field name. The first lines of the script should be modified to point to the location of clang++.exe (read PART 1).

The code below, which collects field names and types for all classes, could easily be implemented in C++ as well if desired. It could even be used at run time to provide useful meta-information, in your case to check that all fields were serialized and in the correct order. The code needs only a JSON parser, which is available for practically every programming language.
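On the C++ side, one line of this script's output could be picked apart with a few small helpers. The sketch below is a C++17 illustration, not part of the answer: all function names are hypothetical, and it assumes the class name contains no spaces and that type names contain neither | nor ;.

```cpp
#include <cstddef>
#include <string_view>

// One line of the script output: "<class name> <type>;<name>|<type>;<name>..."
constexpr std::string_view sample_line = "ns1::ns2::S int;i|int;j|bool;b";

// Text before the first space is the fully qualified class name.
constexpr std::string_view class_name(std::string_view line) {
    return line.substr(0, line.find(' '));
}

// The index-th "type;name" entry, or an empty view if there is no such entry.
constexpr std::string_view field_entry(std::string_view line, std::size_t index) {
    std::size_t space = line.find(' ');
    if (space == std::string_view::npos) return {};
    std::string_view rest = line.substr(space + 1);
    for (;;) {
        std::size_t bar = rest.find('|');
        if (index == 0) return rest.substr(0, bar);
        if (bar == std::string_view::npos) return {};
        rest = rest.substr(bar + 1);
        --index;
    }
}

// Part of the entry before the first ';'.
constexpr std::string_view field_type(std::string_view line, std::size_t index) {
    std::string_view entry = field_entry(line, index);
    return entry.substr(0, entry.find(';'));
}

// Part of the entry after the last ';', or empty if the entry has no ';'.
constexpr std::string_view field_name(std::string_view line, std::size_t index) {
    std::string_view entry = field_entry(line, index);
    std::size_t semi = entry.rfind(';');
    return semi == std::string_view::npos ? std::string_view{} : entry.substr(semi + 1);
}
```

Because everything is constexpr, the same helpers work at run time on lines read from a file and at compile time on lines embedded in a generated header.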

The next script is run the same way as the first one: python script.py prog.cpp.

import subprocess, json, sys

c_file = ''
c_clang = 'C:/bin/llvm/bin/clang++.exe'

r = subprocess.run([c_clang, '-cc1', '-ast-dump=json', c_file or sys.argv[1]], check = False, capture_output = True)
text = r.stdout.decode('utf-8')
data = json.loads(text)

def flat_tree(tree, path = (), fields = ()):
    is_rec = False
    if 'kind' in tree:
        if tree['kind'] == 'NamespaceDecl':
            path = path + (tree['name'],)
        elif tree['kind'] == 'CXXRecordDecl' and 'name' in tree:
            path = path + (tree['name'],)
            is_rec = True
    if 'inner' in tree:
        for e in tree['inner']:
            if e.get('kind', None) == 'FieldDecl':
                assert is_rec
                fields = fields + ((e['name'], e.get('type', {}).get('qualType', '')),)
            else:
                flat_tree(e, path, ())
    if len(fields) > 0:
        print('::'.join(path), '|'.join([f'{e[1]};{e[0]}' for e in fields]))

flat_tree(data)

Output:

ns1::ns2::R bool *;pb
ns1::ns2::S::P const char *;x|bool;y
ns1::ns2::S::Q ns1::ns2::R;r
ns1::ns2::S int;i|int;j|bool;b

For input:

// Start
namespace ns1 {
    namespace ns2 {
        struct R {
            bool * pb;
        };
        struct S {
            int i, j;
            bool b;
            class P {
                char const * x;
                bool y;
            };
            class Q {
                R r;
            };
        };
    }
}

int main() {
}

CLang's AST JSON partial example output:

...............
{
    "id":"0x1600853a388",
    "kind":"CXXRecordDecl",
    "loc":{
        "offset":189,
        "line":10,
        "col":19,
        "tokLen":1
    },
    "range":{
        "begin":{
            "offset":183,
            "col":13,
            "tokLen":5
        },
        "end":{
            "offset":264,
            "line":13,
            "col":13,
            "tokLen":1
        }
    },
    "name":"P",
    "tagUsed":"class",
    "completeDefinition":true,
    "definitionData":{
        "canPassInRegisters":true,
        "copyAssign":{
            "hasConstParam":true,
            "implicitHasConstParam":true,
            "needsImplicit":true,
            "trivial":true
        },
        "copyCtor":{
            "hasConstParam":true,
            "implicitHasConstParam":true,
            "needsImplicit":true,
            "simple":true,
            "trivial":true
        },
        "defaultCtor":{
            "exists":true,
            "needsImplicit":true,
            "trivial":true
        },
        "dtor":{
            "irrelevant":true,
            "needsImplicit":true,
            "simple":true,
            "trivial":true
        },
        "isLiteral":true,
        "isStandardLayout":true,
        "isTrivial":true,
        "isTriviallyCopyable":true,
        "moveAssign":{
            "exists":true,
            "needsImplicit":true,
            "simple":true,
            "trivial":true
        },
        "moveCtor":{
            "exists":true,
            "needsImplicit":true,
            "simple":true,
            "trivial":true
        }
    },
    "inner":[
        {
            "id":"0x1600853a4a8",
            "kind":"CXXRecordDecl",
            "loc":{
                "offset":189,
                "line":10,
                "col":19,
                "tokLen":1
            },
            "range":{
                "begin":{
                    "offset":183,
                    "col":13,
                    "tokLen":5
                },
                "end":{
                    "offset":189,
                    "col":19,
                    "tokLen":1
                }
            },
            "isImplicit":true,
            "name":"P",
            "tagUsed":"class"
        },
        {
            "id":"0x1600853a550",
            "kind":"FieldDecl",
            "loc":{
                "offset":223,
                "line":11,
                "col":30,
                "tokLen":1
            },
            "range":{
                "begin":{
                    "offset":210,
                    "col":17,
                    "tokLen":4
                },
                "end":{
                    "offset":223,
                    "col":30,
                    "tokLen":1
                }
            },
            "name":"x",
            "type":{
                "qualType":"const char *"
            }
        },
        {
            "id":"0x1600853a5b8",
            "kind":"FieldDecl",
            "loc":{
                "offset":248,
                "line":12,
                "col":22,
                "tokLen":1
            },
            "range":{
                "begin":{
                    "offset":243,
                    "col":17,
                    "tokLen":4
                },
                "end":{
                    "offset":248,
                    "col":22,
                    "tokLen":1
                }
            },
            "name":"y",
            "type":{
                "qualType":"bool"
            }
        }
    ]
},
...............
Firedamp answered 5/2, 2021 at 6:14 Comment(0)
L
0

Is there a way to specify and use a list of all data members belonging to a C++ class

Yes, if you use a recent GCC compiler (GCC 10 at the start of 2021): write your own GCC plugin that does this.

See also the DECODER project and the Bismon software and this draft report.

I am just looking for a way to improve the quality of my code

Consider using tools like the Clang static analyzer or Frama-C++.

Lingerie answered 5/2, 2021 at 10:36 Comment(1)
Yes, I know static analysis tools will help me improve the quality of my code. But I do not know how to use a static analyzer to help me address this particular kind of problem - where I have a class method, like serialization or assignment operator, that needs to do something with almost all the class data members. (I say almost because there may be some class members that do not participate in the serialization step.) So when someone adds a data member, I want the tool to warn me that the new member is not being used in that method. Then the method can be adjusted accordingly.Municipal
