How to use compile_commands.json with clang python bindings?

I have the following script that attempts to print out all the AST nodes in a given C++ file. This works fine when using it on a simple file with trivial includes (header file in the same directory, etc).

#!/usr/bin/env python
from argparse import ArgumentParser, FileType
from clang import cindex


def node_info(node):
    return {'kind': node.kind,
            'usr': node.get_usr(),
            'spelling': node.spelling,
            'location': node.location,
            'file': node.location.file.name,
            'extent.start': node.extent.start,
            'extent.end': node.extent.end,
            'is_definition': node.is_definition()
            }


def get_nodes_in_file(node, filename, ls=None):
    ls = ls if ls is not None else []
    for n in node.get_children():
        if n.location.file is not None and n.location.file.name == filename:
            ls.append(n)
            get_nodes_in_file(n, filename, ls)
    return ls


def main():
    arg_parser = ArgumentParser()
    arg_parser.add_argument('source_file', type=FileType('r+'),
                            help='C++ source file to parse.')
    arg_parser.add_argument('compilation_database', type=FileType('r+'),
                            help='The compile_commands.json to use to parse the source file.')
    args = arg_parser.parse_args()
    compilation_database_path = args.compilation_database.name
    source_file_path = args.source_file.name
    clang_args = ['-x', 'c++', '-std=c++11', '-p', compilation_database_path]
    index = cindex.Index.create()
    translation_unit = index.parse(source_file_path, clang_args)
    file_nodes = get_nodes_in_file(translation_unit.cursor, source_file_path)
    print([p.spelling for p in file_nodes])


if __name__ == '__main__':
    main()

However, I get a clang.cindex.TranslationUnitLoadError: Error parsing translation unit. when I run the script and provide a valid C++ file that has a compile_commands.json file in its parent directory. This code runs and builds fine using CMake with clang, but I can't seem to figure out how to pass the argument for pointing to the compile_commands.json correctly.

I also had difficulty finding this option in the clang documentation and could not get -ast-dump to work. However, clang-check works fine by just passing the file path!

Palaestra answered 15/4, 2016 at 16:37 Comment(0)

Your own accepted answer is incorrect. libclang does support compilation databases and so does cindex.py, the libclang python binding.

The main source of confusion might be that the compilation flags that libclang knows/uses are only a subset of all arguments that can be passed to the clang frontend. The compilation database is supported but does not work automatically: it must be loaded and queried manually. Something like this should work:

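#!/usr/bin/env python
import os
from clang import cindex

# `args` and get_nodes_in_file() are defined as in the question's script.
compilation_database_path = args.compilation_database.name
source_file_path = os.path.abspath(args.source_file.name)
index = cindex.Index.create()

# Step 1: load the compilation database. fromDirectory() takes the directory
# that contains compile_commands.json, not the path of the JSON file itself.
try:
    compdb = cindex.CompilationDatabase.fromDirectory(
        os.path.dirname(os.path.abspath(compilation_database_path)))
except cindex.CompilationDatabaseError:
    raise SystemExit('Could not load a compilation database from '
                     + compilation_database_path)

# Step 2: query compilation flags. getCompileCommands() returns CompileCommand
# objects (or None if the file is not listed), not a plain list of strings.
commands = compdb.getCompileCommands(source_file_path)
if not commands:
    raise SystemExit('Could not load compilation flags for ' + source_file_path)

# Drop the compiler executable (first argument) and the source file itself.
# libclang documents that it ignores '-c' and '-o <output file>'.
command = commands[0]
file_args = [a for a in list(command.arguments)[1:] if a != command.filename]
translation_unit = index.parse(source_file_path, file_args)
file_nodes = get_nodes_in_file(translation_unit.cursor, source_file_path)
print([p.spelling for p in file_nodes])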
Preclinical answered 13/7, 2016 at 13:40 Comment(8)
In what version of clang was this first introduced? When I wrote this question I was using version 3.4 of both libclang and cindex.py.Palaestra
No idea, but I'm pretty sure that at the time it was 3.8 already which had it for ages.Transcaucasia
Looks like it's in 3.4 as well, must have missed it.Palaestra
Great answer. Thank you so much, I'm sure that will help many people.Cheston
Thanks @Cheston for the bounty and kind words.Transcaucasia
@TamásSzelei : could you add imports to make your snippet easier to use?Cheston
@Cheston Sure.Transcaucasia
I think this way of creating the arguments has become deprecated. If anyone can update the answer to reflect the changes to the python API that would be awesome. I copy pasted this as is, and the file_args type seems incompatible with the expected type of the args parameter of the parsing method.Require

The accepted answer seems to be deprecated; at minimum, it did not work for me. I had to do this:

import clang.cindex


def main():
    index = clang.cindex.Index.create()

    compdb = clang.cindex.CompilationDatabase.fromDirectory(
        "dir/")

    source_file_path = 'path/to/file.cpp'
    commands = compdb.getCompileCommands(source_file_path)

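    # Flatten the arguments of every compile command for this file into a
    # plain list of strings, which is what index.parse() expects.
    file_args = []
    for command in commands:
        file_args.extend(command.arguments)
    # Drop the leading compiler invocation and the trailing output/source
    # arguments. The slice bounds depend on how the database was generated.
    file_args = file_args[3:-3]
    print(file_args)
    translation_unit = index.parse(source_file_path, args=file_args)

    # GetDoxygenCommentTokens is defined elsewhere (not shown here).
    comment_tokens = GetDoxygenCommentTokens(translation_unit)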


if __name__ == "__main__":
    main()

Basically, I had to iterate over the commands and their arguments to build a flat list of strings, and then slice off some g++-specific arguments at both ends.
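Since the fixed slice bounds depend on how the database was generated, filtering by content may be less fragile. Here is a rough sketch of that idea. The function name libclang_args is made up for illustration and is not part of cindex; it assumes the first argument is the compiler executable and that the output file is given in the separate "-o <file>" form.

```python
def libclang_args(arguments, source_file):
    """Turn one compile command's argument list into flags for index.parse().

    Hypothetical helper (not part of clang.cindex). Heuristic: drops the
    compiler executable, '-c', '-o <output>', a '--' separator and the
    source file itself; everything else is kept. The joined '-ofile.o'
    form is not handled.
    """
    args = list(arguments)[1:]  # argument 0 is assumed to be the compiler
    kept = []
    skip_next = False
    for arg in args:
        if skip_next:
            # This is the output file that followed '-o'.
            skip_next = False
            continue
        if arg == '-o':
            skip_next = True
            continue
        if arg in ('-c', '--', source_file):
            continue
        kept.append(arg)
    return kept


# Made-up example command, similar in shape to what CMake writes for g++.
example = ['/usr/bin/g++', '-Iinclude', '-std=c++11',
           '-o', 'build/file.o', '-c', 'path/to/file.cpp']
print(libclang_args(example, 'path/to/file.cpp'))
```

With cindex, you would pass list(command.arguments) and command.filename for a single CompileCommand, then hand the result to index.parse() as args.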

Require answered 12/7, 2021 at 5:7 Comment(2)
Worked on clang 16.0.1. The accepted answer reported error: 'CompileCommand' object has no attribute 'encode'Merrilee
On OSX, file_args needs a different slice: file_args = file_args[2:-4]Merrilee

From what I can tell, libclang does not support the compilation database, but LibTooling does. To get around this, I took the path to compile_commands.json as an argument and parsed it myself to find the file of interest and its include flags (the -I and -isystem options).
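A minimal sketch of that kind of manual parsing, not my exact code: the function name include_flags is made up for illustration, and it assumes the "file" field of an entry matches the given path exactly (CMake normally writes absolute paths). It relies only on the documented JSON layout, where each entry has a "file" field and either a "command" string or an "arguments" list.

```python
import json
import shlex


def include_flags(database_path, source_file):
    """Return the -I / -isystem flags recorded for source_file, or None.

    Hypothetical helper: matches entries by exact string comparison on the
    'file' field and uses the first matching entry.
    """
    with open(database_path) as f:
        entries = json.load(f)
    for entry in entries:
        if entry['file'] != source_file:
            continue
        # An entry stores either an 'arguments' list or a 'command' string.
        args = entry.get('arguments') or shlex.split(entry['command'])
        flags = []
        it = iter(args)
        for arg in it:
            if arg in ('-I', '-isystem'):
                # Separate form: the directory is the next argument.
                value = next(it, None)
                if value is not None:
                    flags += [arg, value]
            elif arg.startswith('-I') or arg.startswith('-isystem'):
                # Joined form, e.g. -I/some/dir.
                flags.append(arg)
        return flags
    return None
```

The returned list can then be passed to index.parse() together with flags such as '-x', 'c++' and '-std=c++11'.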

Palaestra answered 18/4, 2016 at 15:17 Comment(1)
My answer is not correct, please see the accepted answer.Palaestra
