Automatically deleting unused local variables from C source code
Y

10

10

I want to delete unused local variables from a C file. Example:

int fun(int a , int b)
{
  int c,sum=0;
  sum=a + b;
    return sum;
}

Here the unused variable is 'c'.

I already have a list of all unused local variables from an external source. Given that list, I need to find those variables in the source code and delete them.
In the example above, "c" is the unused variable. I already know that (I have code for that part); what I need is to find c and delete it.

EDIT

The point is not to find unused local variables with an external tool. The point is to remove them from code given a list of them.

Yarak answered 18/2, 2009 at 12:43 Comment(5)
Of course, you can take it further, and say "int fun (int a, int b) {return a + b;}" and eliminate sum as well. But that wasn't your question.Swansdown
What is the format of unused variable list that you have? Does it have line and column numbers?Patron
Why do you want to do that? What stops you from manually deleting unused variables as soon as they become unused? Or do your colleagues place them in the code on purpose and tell you to remove them given a list?Piece
I think you're wishing for a destructive 'lint', which is a terminally stupid thing to wish for. Compiler warnings exist to tell you something, why not watch them?Macmullin
And what about the times when your compiler is just being an @$$ hole?Macmullin
S
21

Turn up your compiler warning level, and it should tell you.

Putting your source fragment in "f.c":

% gcc -c -Wall f.c
f.c: In function 'fun':
f.c:1: warning: unused variable 'c'
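If the list does come from compiler output like the above, a small parser can turn each warning into a (file, line, name) record. A minimal sketch, assuming ASCII quotes around the name (for example by running gcc with LC_ALL=C) and file names that contain no colon; the struct and function names are made up for illustration:

```c
#include <stdlib.h>
#include <string.h>

/* One entry of the "unused variable" list (hypothetical layout). */
struct unused_var {
    char file[256];
    long line;
    char name[64];
};

/*
 * Parse one diagnostic line such as
 *   f.c:1: warning: unused variable 'c'
 *   f.c:3:7: warning: unused variable 'c' [-Wunused-variable]
 * Returns 1 and fills *out on success, 0 if the line is something else.
 */
int parse_unused_warning(const char *text, struct unused_var *out)
{
    static const char marker[] = "warning: unused variable '";
    const char *colon, *m, *name, *end_quote;
    char *num_end;
    size_t file_len, name_len;

    m = strstr(text, marker);
    if (m == NULL)
        return 0;

    colon = strchr(text, ':');            /* first colon ends the file name */
    if (colon == NULL || colon >= m)
        return 0;
    file_len = (size_t)(colon - text);
    if (file_len == 0 || file_len >= sizeof out->file)
        return 0;

    out->line = strtol(colon + 1, &num_end, 10);
    if (num_end == colon + 1 || *num_end != ':' || out->line <= 0)
        return 0;

    name = m + sizeof marker - 1;         /* text right after the opening quote */
    end_quote = strchr(name, '\'');
    if (end_quote == NULL)
        return 0;
    name_len = (size_t)(end_quote - name);
    if (name_len == 0 || name_len >= sizeof out->name)
        return 0;

    memcpy(out->file, text, file_len);
    out->file[file_len] = '\0';
    memcpy(out->name, name, name_len);
    out->name[name_len] = '\0';
    return 1;
}
```

Feed it the compiler's stderr one line at a time; lines such as "f.c: In function 'fun':" are simply rejected. This only builds the list, it does not touch the source.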
Swansdown answered 18/2, 2009 at 12:46 Comment(6)
No point in this! The question assumes you already have this list! And gcc doesn't come with a -Wjustfixit option.Garman
Yes it does, compare the output from -O0 and -O1 :)Encarnacion
Yeah, a decent compiler will do a static analysis and determine if it can remove the variable declaration from the output.Doughboy
Most good compilers just optimize unused variables away .. but I think the OP wants to remove them physically during the build process. I can't see how doing so would ever be a good idea.Macmullin
Right - it sounds like the beginnings of a tool to clean up code.Swansdown
You could use indent to force a style to the code then take the gcc output to clean it up using say a regex or a shell script.Sicyon
G
10

Tricky - you will have to parse C code for this. How close does the result have to be? Example of what I mean:

int a, /* foo */
    b, /* << the unused one */
    c; /* bar */

Now, it's obvious to humans that the second comment has to go.

Slight variation:

void test(/* in */ int a, /* unused */ int b, /* out */ int* c);

Again, the second comment has to go, the one before b this time.

In general, you want to parse your input, filter it, and emit everything that's not the declaration of an unused variable. Your parser would have to preserve comments and #include statements, but if you don't process the #included headers it may be impossible to recognize declarations (even more so if macros are used to hide the declaration). After all, you need headers to decide if A * B(); is a function declaration (when A is a type) or a multiplication (when A is a variable).
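To make that last point concrete, here is a small sketch in which the same tokens, A * B();, are a declaration in one function and an expression in the other. All names are invented for the illustration, and a compiler may warn about the discarded product in the second function:

```c
static int calls = 0;

/* Helper that records how often it has been called. */
static int count_call(void) { return ++calls; }

/* A names a type here, so the statement declares a function B returning A*. */
int as_declaration(void)
{
    typedef int A;
    int before = calls;
    A * B();                 /* declaration only: nothing is executed */
    return calls - before;   /* number of calls made by the statement */
}

/* A names a variable here, so the same tokens multiply A by the result of B(). */
int as_expression(void)
{
    int A = 2;
    int (*B)(void) = count_call;
    int before = calls;
    A * B();                 /* expression statement: B is called, product discarded */
    return calls - before;   /* number of calls made by the statement */
}
```

A tool that only sees the text of the statement cannot tell these two apart; it has to know what A is, which is exactly the information that lives in headers.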


[edit] Furthermore:

Even if you know that a variable is unused, the proper way to remove it depends a lot on remote context. For instance, assume

int foo(int a, int b, int c) { return a + b; }

Clearly, c is unused. Can you change it to the following?

int foo(int a, int b) { return a + b; }

Perhaps, but not if &foo is stored in an int(*)(int,int,int). And that may happen somewhere else. If (and only if) that happens, you should change it to

int foo(int a, int b, int /*unused*/ ) { return a + b; }
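A note on portability: leaving a parameter unnamed in a function definition is accepted by C++ and by C23, but not by earlier C standards. A sketch of an alternative for older C, assuming a hypothetical callback type ternary_fn that pins down the three-argument signature, is to keep the name and mark it as deliberately unused:

```c
/* Hypothetical callback type that fixes the three-argument signature. */
typedef int (*ternary_fn)(int, int, int);

/* c is unused, but the parameter has to stay so that &foo still fits ternary_fn. */
int foo(int a, int b, int c)
{
    (void)c;   /* states that c is intentionally unused */
    return a + b;
}

/* Somewhere else in the program, foo is only reached through the pointer. */
int apply(ternary_fn fn, int x, int y, int z)
{
    return fn(x, y, z);
}
```

With this, apply(foo, 1, 2, 3) still compiles and returns 3, whereas dropping the third parameter from foo would make the call through ternary_fn a type mismatch.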
Garman answered 18/2, 2009 at 12:53 Comment(3)
Aren't you making the (possibly) flawed assumption that the comments are correct?Swansdown
The question stated that the list of unused variables was externally available. That's quite realistic, compilers can find them. My point here is that you still need to cleanup the associated comments.Garman
Looks like you're the only person trying to answer the actual question asked. Upvoted.Patron
E
5

Why do you want to do this? Assuming you have a decent optimizing compiler (GCC, Visual Studio et al.), the binary output will not be any different whether you remove the 'int c' in your original example or not.

If this is just about code cleanup, any recent IDE will give you quick links to the source code for each warning, just click and delete :)

Encarnacion answered 18/2, 2009 at 14:2 Comment(1)
It leaves quite unreadable code if you have a lot of unused variables lying around! Having a nice way of highlighting unused variables as well as unused imports can be a really useful tool in terms of cleaning up code!Snips
M
5

My answer is more of an elaborate comment to MSalters' very thorough answer. I would go beyond 'tricky' and say that such a tool is both impossible and inadvisable.

If you are looking to simply remove the references to the variable, then you could write a code parser of your own, but it would need to distinguish which function context it is in, such as

int foo(double a, double b)
{
   b = 10.0;
   return (int) b;
}

int bar(double a, double b)
{
   a = 5.00;
   return (int) a;
}

Any simple parser would have trouble with both 'a' and 'b' being unused variables.
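One way around that, sketched here as an assumption about what the external list could look like, is to key each entry by function as well as by name (a file and line number would do equally well), so that 'a' in foo and 'a' in bar are not confused:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical list entry: an unused variable identified by its enclosing function. */
struct unused_entry {
    const char *function;
    const char *name;
};

/* Returns 1 if (function, name) is on the list, 0 otherwise. */
int is_unused(const struct unused_entry *list, size_t count,
              const char *function, const char *name)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (strcmp(list[i].function, function) == 0 &&
            strcmp(list[i].name, name) == 0)
            return 1;
    }
    return 0;
}
```

For the two functions above the list would hold ("foo", "a") and ("bar", "b"). The rewriting tool still has to know which function it is currently inside, which is where the parsing effort goes.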

Secondly, if you consider comments as MSalters has, you'll discover that people do not comment consistently:

double a;
/*a is designed as a dummy variable*/
double b;

/*a is designed as a dummy variable*/
double a;
double b;

double a; /*a is designed as a dummy variable*/
double b;

etc.

So simply removing the unused variables will create orphaned comments, which are arguably more dangerous than not commenting at all.

Ultimately, it is an obscenely difficult task to do elegantly, and you would be mangling code regardless. By automating the process, you would be making the code worse.

Lastly, you should be considering why the variables were in the code in the first place, and if they are deprecated, why they were not deleted when all their references were.

Monophyletic answered 18/2, 2009 at 15:17 Comment(0)
I
1

Use static code analysis tools in addition to raising the warning level, as Paul correctly stated.

Inexperienced answered 18/2, 2009 at 12:48 Comment(0)
J
1

As well as being able to reveal these through warnings, the compiler will normally optimise these away if any optimisations are turned on. Checking if a variable is never referenced is quite trivial in terms of implementation in the compiler.

Joacima answered 18/2, 2009 at 12:48 Comment(0)
P
0

You will need a good parser that preserves original character position of tokens (even in presence of preprocessor!). There are some tools for automated refactoring of C/C++, but they are far from mainstream.

I recommend checking out Taras' Blog. The guy is doing some large automated refactorings of the Mozilla codebase, like replacing out-params with return values. His main tool for code rewriting is Pork:

Pork is a C++ parsing and rewriting tool chain. The core of Pork is a C++ parser that provides exact character positions for the start and end of every AST node, as well as the set of macro expansions that contain any location. This information allows C++ to be automatically rewritten in a precise way.

From the blog:

So far pork has been used for “minor” things like renaming classes&functions, rotating outparameters and correcting prbool bugs. Additionally, Pork proved itself in an experiment which involved rewriting almost every function (ie generating a 3+MB patch) in Mozilla to use garbage collection instead of reference-counting.

It is for C++, but it may suit your needs.

Patron answered 24/2, 2009 at 8:10 Comment(0)
O
0

One of the posters above says "impossible and inadvisable". Another says "tricky", which is the right answer. You need 1) a full C (or whatever language of interest) parser, 2) inference procedures that understand the language identifier references and data flows to determine that a variable is indeed "dead", and 3) the ability to actually modify the source code.

What's hard about all this is the huge energy needed to build 1), 2) and 3). You can't justify it for any individual cleanup task. What one can do is build such infrastructure specifically with the goal of amortizing it across lots of different program analysis and transformation tasks.

My company offers such a tool: The DMS Software Reengineering Toolkit. See http://www.semdesigns.com/Products/DMS/DMSToolkit.html DMS has production quality front ends for many languages, including C, C++, Java and COBOL.

We have in fact built an automated "find useless declarations" tool for Java that does two things: a) lists them all (thus producing the list!) b) makes a copy of the code with the useless declarations removed. You choose which answer you want to keep :-)

To do the same for C would not be difficult. We already have a tool that identifies such dead variables/functions.

One case we did not address is the "useless parameter" case, because to remove a useless parameter, you have to find all the calls from other modules, verify that setting up the argument doesn't have a side effect, and rip out the useless argument. We in fact have full graphs of the entire software system of interest, and so this would also be possible.

So, it's just tricky, and not even very tricky if you have the right infrastructure.

Outrush answered 14/6, 2009 at 6:3 Comment(0)
R
-1

Also: splint.

Splint is a tool for statically checking C programs for security vulnerabilities and coding mistakes. With minimal effort, Splint can be used as a better lint. If additional effort is invested adding annotations to programs, Splint can perform stronger checking than can be done by any standard lint.

Recrystallize answered 18/2, 2009 at 12:48 Comment(0)
F
-1

You can solve the problem as a text processing problem. There must be only a small number of regexp patterns that cover the ways unused local variables are defined in the source code.

Using a list of unused variable names and the line numbers where they are, you can process the C source code line by line. On each line you can iterate over the variable names. For each variable name you can try the patterns one by one. After a successful match you know the syntax of the definition, so you know how to delete the unused variable from it.

For example, if the source line is "int a, unused, b;" and the compiler reported "unused" as an unused variable in that line, then the pattern "/, unused,/" will match and you can replace that substring with a single ",".
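A rough sketch of that idea in plain C, without a regex library. It assumes the line is already known to be the declaration of the named variable, and it only handles plain comma-separated names; anything it does not recognise (pointers, arrays, initialisers on the unused variable, comments in front of the name) is refused rather than guessed at. The function names are hypothetical:

```c
#include <ctype.h>
#include <string.h>

static int is_ident_char(int ch)
{
    return isalnum((unsigned char)ch) || ch == '_';
}

/* Find name as a whole identifier in line; NULL if it does not occur. */
static const char *find_identifier(const char *line, const char *name)
{
    size_t len = strlen(name);
    const char *p = line;

    if (len == 0)
        return NULL;
    while ((p = strstr(p, name)) != NULL) {
        int left_ok = (p == line) || !is_ident_char(p[-1]);
        int right_ok = !is_ident_char(p[len]);
        if (left_ok && right_ok)
            return p;
        p++;
    }
    return NULL;
}

/*
 * Copy line to out with the declarator `name` removed.
 * Returns 1 on success (out is empty if the whole declaration goes away),
 * 0 if the line does not match one of the simple patterns or out is too small.
 */
int remove_declarator(const char *line, const char *name,
                      char *out, size_t out_size)
{
    const char *start = find_identifier(line, name);
    const char *left, *right, *cut_from, *cut_to;
    size_t head, tail;

    if (start == NULL || start == line || out_size == 0)
        return 0;

    right = start + strlen(name);               /* first non-blank after the name */
    while (*right == ' ' || *right == '\t')
        right++;
    left = start - 1;                           /* first non-blank before the name */
    while (left > line && (*left == ' ' || *left == '\t'))
        left--;

    /* Only a comma or the end of the type name may precede the variable. */
    if (*left != ',' && !is_ident_char(*left))
        return 0;

    if (*right == ',') {                        /* first or middle: "name, " goes */
        cut_from = start;
        cut_to = right + 1;
        while (*cut_to == ' ' || *cut_to == '\t')
            cut_to++;
    } else if (*right == ';' && *left == ',') { /* last: ", name" goes */
        cut_from = left;
        cut_to = right;
    } else if (*right == ';') {                 /* only declarator: drop the line */
        out[0] = '\0';
        return 1;
    } else {
        return 0;
    }

    head = (size_t)(cut_from - line);
    tail = strlen(cut_to);
    if (head + tail + 1 > out_size)
        return 0;
    memcpy(out, line, head);
    memcpy(out + head, cut_to, tail + 1);       /* includes the terminating '\0' */
    return 1;
}
```

With this, "int a, unused, b;" becomes "int a, b;" and the line from the question, "  int c,sum=0;", becomes "  int sum=0;". Comments attached to the removed variable and declarations spread over several lines are not handled, which is where the other answers' warnings about needing a real parser apply.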

Fazeli answered 18/9, 2009 at 13:39 Comment(0)
