During a code review, i found source code like this:
void f_odd(std::string &className, std::string &testName)
{
if (className.find("::") != std::string::npos)
{
testName = className.substr(className.find("::") + 2);
(void)className.erase(className.find("::"), std::string::npos);
}
}
Within this function, std::string::find() is called three times with the same pattern( here "::").
This code can of course be refactored to
void f(std::string &className, std::string &testName)
{
const size_t endOfClassNamePos = className.find("::");
if (endOfClassNamePos != std::string::npos)
{
testName = className.substr(endOfClassNamePos + 2);
(void)className.erase(endOfClassNamePos, std::string::npos);
}
}
where find is called only once.
Question
Does anybody know a strategy to detecting such a pattern like this? I am having a huge code base, where i intend to spot this pattern. I plan to use a Windows or a Linux environment.
Potential Strategies
- Use/adapt a static code analysis tool, like cppcheck to detect these kind of oddities.
- Search within the code base with regular expression.
- Use/adapt clang-tidy for detection of this pattern.
- Write a custom checker in some language (e.g. Python) that detects these issues. In this case, the checking should be performed on pre-processed code.
No Go's
- Manual review
Update 1
I decided to start with potential strategy 1). I plan to adapt cppcheck to catch this issue.
Cppcheck offers a possibility to write customized rules, based on PCRE regular expressions. For this, cppcheck has to be compiled with enabled PCRE support. Since the current test environment is Linux-based, the following commands can be used to download the latest version of cppcheck:
git clone https://github.com/danmar/cppcheck.git && cd cppcheck
After that, compile and install the tool as follows:
sudo make install HAVE_RULES=yes
Now the basic tool setup is done. In order to develop a cppcheck-rule, i prepared a simple test case (file: test.cpp), similar to the sample code in the first section of this article. This file contains three functions and the cppcheck-rule shall emit a warning on f_odd
and f_odd1
about consecutive identical std::string::find
calls.
test.cpp:
#include <string>
void f(std::string &className, std::string &testName)
{
const size_t endOfClassNamePos = className.find("::");
if (endOfClassNamePos != std::string::npos)
{
testName = className.substr(endOfClassNamePos + 2);
(void)className.erase(endOfClassNamePos, std::string::npos);
}
}
void f_odd(std::string &className, std::string &testName)
{
if (className.find("::") != std::string::npos)
{
testName = className.substr(className.find("::") + 2);
(void)className.erase(className.find("::"), std::string::npos);
}
}
#define A "::"
#define B "::"
#define C "::"
void f_odd1(std::string &className, std::string &testName)
{
if (className.find(A) != std::string::npos)
{
testName = className.substr(className.find(B) + 2);
(void)className.erase(className.find(C), std::string::npos);
}
}
So far so good. Now cppcheck has to be tweaked to catch consecutive identical std::string::find
calls. For this i have created a cppcheck_rule-file that contains a regular expression that matches consecutive identical std::string::find
calls:
<?xml version="1.0"?>
<rule>
<tokenlist>normal</tokenlist>
<pattern><![CDATA[([a-zA-Z][a-zA-Z0-9]*)(\s*\.\s*find)(\s*\(\s*\"[ -~]*\"\s*\))[ -\{\n]*(\1\2\3)+[ -z\n]]]></pattern>
<message>
<severity>style</severity>
<summary>Found identical consecutive std::string::find calls.</summary>
</message>
This file can be used to extend cppcheck about a new check. Lets try:
cppcheck --rule-file=rules/rule.xml test/test.cpp
and the output is
Checking test/test.cpp...
[test/test.cpp:14]: (style) Found identical consecutive std::string::find calls.
[test/test.cpp:26]: (style) Found identical consecutive std::string::find calls.
Now, identical consecutive std::string::find
calls can be detected in C/C++ codes. Does anybody know a better/more efficient or more clever solution?
References:
cppcheck
applies your rule, but you might want to create some test cases to make sure you don't get false positives: where thestring
is mutated betweenfind
calls (so the previous result can't be reused), where the calls happen in different functions, where different scopes have same-namedstring
s being searched.... – Sacrilege