How do I match a for loop with vector declaring in it's body?
Asked Answered
C

1

8

I hope to match all for-loop with vector declaring in it's body (might extend to while-loop and do-while-loop later):

#include <vector>

int main() {
  for (int i = 0; i < 10; i++) {
    std::vector<int> foo;
  }

  return 0;
}

The AST I got from this program is

`-FunctionDecl 0x55c1d33c8bc8 <main.cpp:3:1, line:9:1> line:3:5 main 'int ()'
  `-CompoundStmt 0x55c1d3402a48 <col:12, line:9:1>
    |-ForStmt 0x55c1d34029e0 <line:4:3, line:6:3>
    | |-DeclStmt 0x55c1d33c8d40 <line:4:8, col:17>
    | | `-VarDecl 0x55c1d33c8cb8 <col:8, col:16> col:12 used i 'int' cinit
    | |   `-IntegerLiteral 0x55c1d33c8d20 <col:16> 'int' 0
    | |-<<<NULL>>>
    | |-BinaryOperator 0x55c1d33c8db0 <col:19, col:23> 'bool' '<'
    | | |-ImplicitCastExpr 0x55c1d33c8d98 <col:19> 'int' <LValueToRValue>
    | | | `-DeclRefExpr 0x55c1d33c8d58 <col:19> 'int' lvalue Var 0x55c1d33c8cb8 'i' 'int'
    | | `-IntegerLiteral 0x55c1d33c8d78 <col:23> 'int' 10
    | |-UnaryOperator 0x55c1d33c8df0 <col:27, col:28> 'int' postfix '++'
    | | `-DeclRefExpr 0x55c1d33c8dd0 <col:27> 'int' lvalue Var 0x55c1d33c8cb8 'i' 'int'
    | `-CompoundStmt 0x55c1d34029c8 <col:32, line:6:3>
    |   `-DeclStmt 0x55c1d34029b0 <line:5:5, col:25>
    |     `-VarDecl 0x55c1d33cd388 <col:5, col:22> col:22 foo 'std::vector<int>' callinit destroyed
    |       `-CXXConstructExpr 0x55c1d3402988 <col:22> 'std::vector<int>' 'void () noexcept'
    `-ReturnStmt 0x55c1d3402a38 <line:8:3, col:10>
      `-IntegerLiteral 0x55c1d3402a18 <col:10> 'int' 0

Now how do I write a matcher (in clang-tidy customized check) to match such pattern?

I read the document for customizing clang-tidy checks and Matching the Clang AST, but they don't seem to provide enough information on how I should actually combine each APIs.


update: with @Scott McPeak's answer, I can match the for loop with clang-query in terminal but I have trouble transplanting this matcher into my clang-tidy check:

void LowPerfLoopCheck::registerMatchers(MatchFinder *Finder) {
  Finder->addMatcher(
      forStmt(hasDescendant(varDecl(hasType(hasDeclaration(cxxRecordDecl(
                                        matchesName("^::std::vector$")))))
                                .bind("vector-in-for"))),
      this);
}

When building clang-tidy, it says call of overloaded 'hasType()' is ambiguous:

llvm-project/clang-tools-extra/clang-tidy/misc/LowPerfLoopCheck.cpp: In member function ‘virtual void clang::tidy::misc::LowPerfLoopCheck::registerMatchers(clang::ast_matchers::MatchFinder*)’:
/home/wentao/Desktop/llvm-project/clang-tools-extra/clang-tidy/misc/LowPerfLoopCheck.cpp:19:44: error: call of overloaded ‘hasType(clang::ast_matchers::internal::PolymorphicMatcher<clang::ast_matchers::internal::HasDeclarationMatcher, void(clang::ast_matchers::internal::TypeList<clang::CallExpr, clang::CXXConstructExpr, clang::CXXNewExpr, clang::DeclRefExpr, clang::EnumType, clang::ElaboratedType, clang::InjectedClassNameType, clang::LabelStmt, clang::AddrLabelExpr, clang::MemberExpr, clang::QualType, clang::RecordType, clang::TagType, clang::TemplateSpecializationType, clang::TemplateTypeParmType, clang::TypedefType, clang::UnresolvedUsingType, clang::ObjCIvarRefExpr>), clang::ast_matchers::internal::Matcher<clang::Decl> >)’ is ambiguous
   19 |       forStmt(hasDescendant(varDecl(hasType(hasDeclaration(cxxRecordDecl(
      |                                     ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   20 |                                         matchesName("^::std::vector$")))))
      |                                         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Why does the same code works for clang-query but not for clang-tidy checks?

Cloy answered 9/1 at 7:34 Comment(0)
B
2

You can find all for loops that contain, somewhere, a declaration of a std::vector variable using this AST matcher:

#!/bin/sh

PATH=$HOME/opt/clang+llvm-16.0.0-x86_64-linux-gnu-ubuntu-18.04/bin:$PATH

# In this query, the comments are ignored because clang-query (not the
# shell) recognizes and discards them.
query='m
  forStmt(                        # Report any "for" statement
    hasDescendant(                # where a descendant
      varDecl(                    # is a variable declaration
        hasType(                  # whose type
          cxxRecordDecl(          # is a struct/class/union (possibly templatized)
            matchesName(          # whose name matches the regex
              "^::std::vector$"   # given here.
            )
          )
        )
      ).bind("vector_var_decl")   # Bind to the declaration AST node.
    )
  )
'

clang-query \
  -c="$query" \
  test.cc -- -w

# EOF

This matcher is expressed as a clang-query shell script for ease of experimentation, but the matcher expression can be copied directly into a C++ clang-tidy checker implementation like this:

void HelloCheck::registerMatchers(MatchFinder *Finder) {
  Finder->addMatcher(
    forStmt(                        // Report any "for" statement
      hasDescendant(                // where a descendant
        varDecl(                    // is a variable declaration
          hasType(                  // whose type
            cxxRecordDecl(          // is a struct/class/union (possibly templatized)
              matchesName(          // whose name matches the regex
                "^::std::vector$"   // given here.
              )
            )
          )
        ).bind("vector_var_decl")   // Bind to the declaration AST node.
      )
    ),
    this
  );
}

To create this matcher, I used the list of matchers at AST Matcher Reference. It takes some trial and error to find the right combination; I wish the documentation was better. One small tip I can give is you can use the anything() matcher as a placeholder that matches anything while developing the matcher incrementally.

In this case, the first hard part (for me, despite some prior experience) was realizing I needed to use the cxxRecordDecl matcher before matchesName could be used. Originally, I tried hasDeclaration, but that matches declarations that don't necessarily have a name, and consequently matchesName will just silently fail (clang-query could be a lot more informative in cases like this). By using cxxRecordDecl, we're guaranteed to be working with a declaration that has a name to match, so it works.

The second problem, when translating my original clang-query matcher into a C++ clang-tidy check, was an error about an ambiguous call to hasType. The original matcher had:

  varDecl(hasType(hasDeclaration(cxxRecordDecl(...))))

Evidently, hasDeclaration in that spot causes problems in the C++ context (although I don't really understand why), and my solution was to just remove it:

  varDecl(hasType(cxxRecordDecl(...)))

To expand to other kinds of loops, the whileStmt and doStmt matchers could be substituted for forStmt.

Example input:

#if 1
  #include <vector>
#else
  // Simplified stand-in for easier AST inspection.
  namespace std {
    template <class T>
    class vector {};
  }
#endif

class SomethingElse {};

int main() {
  for (int i = 0; i < 10; i++) {       // Report.
    std::vector<int> foo;              // Original example.
  }

  for (int i = 0; i < 10; i++) {       // Do not report.
    SomethingElse se;                  // Not a std::vector.
  }

  for (;;) {                           // Report.
    {
      std::vector<int> foo;            // Not an immediate child.
    }
  }

  return 0;
}

Corresponding output:

$ ./cmd.sh

Match #1:

$PWD/test.cc:14:3: note: "root" binds here
  for (int i = 0; i < 10; i++) {       // Report.
  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
$PWD/test.cc:15:5: note: "vector_var_decl" binds here
    std::vector<int> foo;              // Original example.
    ^~~~~~~~~~~~~~~~~~~~

Match #2:

$PWD/test.cc:22:3: note: "root" binds here
  for (;;) {                           // Report.
  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
$PWD/test.cc:24:7: note: "vector_var_decl" binds here
      std::vector<int> foo;            // Not an immediate child.
      ^~~~~~~~~~~~~~~~~~~~
2 matches.

Basically answered 10/1 at 1:10 Comment(2)
Hi @Scott, thanks for your answer. The matcher you provided works for clang-query but not for clang-tidy check. Could you take a minute to read the update in the question?Cloy
I see; that's the first time I've seen it not work when copied over to C++. I don't really understand the error (the AST matchers are implemented in a very complicated way), but with more trial and error I was able to adjust the matcher (removing hasDeclaration) so it works in both clang-query and in a C++ clang-tidy check. The answer has been edited accordingly.Basically

© 2022 - 2024 — McMap. All rights reserved.