What are the differences between the three methods of code coverage analysis?

This sonar page basically lists the various methods employed by different code coverage analysis tools:

  1. Source code instrumentation (used by Clover)
  2. Offline byte-code instrumentation (used by Cobertura)
  3. On-the-fly byte-code instrumentation (used by JaCoCo)

What are these three methods, which one is the most efficient, and why? If the answer to the question of efficiency is "it depends", then please explain why.

Caiaphas answered 6/3, 2013 at 19:4 Comment(0)

Source code instrumentation consists of adding instructions to the source code before compiling it. These instructions are used to trace which parts of the code have been executed.
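As a rough illustration (my own sketch, not Clover's actual rewriting; the Probes class and its hit() method are invented for this example), a source instrumenter conceptually rewrites a plain method into one with probe calls before the compiler ever runs:

    // Hypothetical probe recorder, for illustration only; a real tool ships
    // its own runtime and dumps the recorded flags into a report.
    final class Probes {
        private static final boolean[] hits = new boolean[16];
        static void hit(int id) { hits[id] = true; }
    }

    class MaxExample {
        // Original source:
        //     int max(int a, int b) { if (a > b) return a; return b; }
        //
        // Roughly what the rewritten source looks like before compilation:
        int max(int a, int b) {
            Probes.hit(0);                            // method entered
            if (a > b) { Probes.hit(1); return a; }   // then-branch taken
            Probes.hit(2);                            // fall-through path taken
            return b;
        }
    }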

Offline byte-code instrumentation consists of adding those same instructions, but after compilation, directly into the byte-code.

On-the-fly byte-code instrumentation consists of adding those same instructions into the byte-code, but dynamically, at runtime, when the byte-code is loaded by the JVM.
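The standard hook for the on-the-fly approach is a Java agent built on java.lang.instrument (this is the mechanism JaCoCo's agent uses). A minimal do-nothing sketch, assuming the agent jar's manifest declares this class as Premain-Class:

    import java.lang.instrument.ClassFileTransformer;
    import java.lang.instrument.Instrumentation;
    import java.security.ProtectionDomain;

    public class CoverageAgent {
        public static void premain(String agentArgs, Instrumentation inst) {
            inst.addTransformer(new ClassFileTransformer() {
                @Override
                public byte[] transform(ClassLoader loader, String className,
                                        Class<?> classBeingRedefined,
                                        ProtectionDomain protectionDomain,
                                        byte[] classfileBuffer) {
                    // A real coverage agent rewrites classfileBuffer here
                    // (typically with a library like ASM) to insert probes and
                    // returns the modified bytes; returning null keeps the
                    // class unchanged.
                    return null;
                }
            });
        }
    }

The agent is attached to the test JVM with -javaagent:<agent-jar>, so there is no separate instrumentation step and no modified build artifact.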

This page has a comparison between the methods. It might be biased, since it's part of the Clover documentation.

Depending on your definition of "efficient", choose the one you like the most. I don't think you'll get enormous differences. They all do the job, and the big picture will be the same whichever method is used.

Chincapin answered 6/3, 2013 at 19:11 Comment(4)
I think you wanted a link to the Clover documentation? – Tether
Yes. I forgot to add the link. Done now. Thanks for spotting the problem. – Chincapin
As of now, Cobertura supports both Java 7 and Java 8. – Herbarium
But doesn't using instrumentation mean that we have one build which includes the instrumentation probes and another build for operational use (without them)? So someone can always say that the tests were done on a different build, right? – Panel

In general the effect on coverage is the same.

Source code instrumentation can give superior reporting results, simply because byte-code instrumentation cannot distinguish any structure within a source line: block granularity in the class file is recorded only in terms of source lines.

Imagine I have two nested if statements (or equivalently, if (a && b) ...) in a single line. A source code instrumenter can see these and provide coverage information for the multiple arms within the if, within the source line; it can report blocks based on lines and columns. A byte-code instrumenter only sees one line wrapped around the conditions. Does it report the line as "covered" if condition a executes but is false?
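To make that concrete, here is a made-up one-liner of the kind I mean (not taken from any tool's documentation):

    class OneLinerExample {
        static int count;

        static void check(boolean a, boolean b) {
            if (a) { if (b) { count++; } }  // two decisions plus a body on one source line
        }
    }

Calling check(true, false) executes only the outer test, yet a purely line-granular report has a single verdict for the whole line; a column-aware source instrumenter can report the two conditions and the increment separately.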

You may argue this is a rare circumstance (and it probably is) and that the distinction is therefore not very useful. When you get bogus coverage on such a line followed by a field failure, you may change your mind about its utility.

There's a nice example and explanation of how byte-code coverage makes it extremely difficult to get coverage of switch statements right.

A source code instrumenter may also achieve faster test executions, because it has the compiler helping to optimize the instrumented code. In particular, a probe inserted inside a loop by a binary instrumenter may end up compiled inside the loop by the JIT, whereas a good Java compiler will see that the instrumentation produces a loop-invariant result and lift it out of the loop. (A JIT compiler can arguably do this too; the question is whether it actually does.)
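Here is a sketch of that loop-invariance argument, again with invented probe flags (what an optimizer may legally do, not what any particular tool guarantees):

    class LoopProbeSketch {
        static final boolean[] probes = new boolean[16];  // hypothetical coverage map

        // Probe placed in the loop body, as an instrumenter inserts it:
        static long sumInstrumented(int[] xs) {
            long total = 0;
            for (int x : xs) {
                probes[7] = true;   // executed on every iteration
                total += x;
            }
            return total;
        }

        // Setting the same flag repeatedly is idempotent, so the probe is
        // effectively loop-invariant; a compiler working on instrumented
        // source is free to produce the equivalent of:
        static long sumHoisted(int[] xs) {
            long total = 0;
            if (xs.length > 0) {
                probes[7] = true;   // recorded once per call, outside the hot loop
            }
            for (int x : xs) {
                total += x;
            }
            return total;
        }
    }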

Quattrocento answered 6/3, 2013 at 23:17 Comment(8)
Actually, the reason that tools like Cobertura and JaCoCo don't show intra-line coverage information is simply that the developers chose not to implement it. In my own on-the-fly bytecode instrumentation tool (JMockit Coverage), this is implemented and separate line segments (such as in "if (a && b)") are shown as such in the coverage report. – Ashraf
That's the way all tools are: "the developers chose not to implement (some feature)". Good that you have more enthusiasm. How do you distinguish the parts of the line? I didn't think the class files gave information at any finer grain of detail. – Quattrocento
At the bytecode level, jump instructions are the basis for separating multiple executable segments in a line of code. Each jump instruction has a target instruction, which may or may not be in the same line; this information is made available by the ASM library which is used for bytecode manipulation. For each line of code, a list of the jump instructions and their targets is kept, and made available to the HTML report generator at a later time, which then parses each source line of code while matching bytecode branches to individual line segments. – Ashraf
@Rogério: That must have been fun to implement. How can that work in the face of code optimizations/simplifications and code motion? (Another user reported that the condition test for "x==null" is left out by the compiler, if it knows that "x!=null" is true when it evaluates the condition "x==null". How do you match things up reliably?) – Quattrocento
I didn't run into any such difficulties, simply because there are no optimizations/simplifications made at the bytecode level; Java compilers generate standardized and unoptimized bytecode (with a few minor differences between javac and the Eclipse compiler), and that (plus the source code) is all the coverage tool needs to work with. JIT optimizations don't interfere. I think the "other user" is mistaken; if there is an "x==null" condition in the source, it will always be in the bytecode. – Ashraf
But doesn't using instrumentation mean that we have one build which includes the instrumentation probes and another build for operational use (without them)? So someone can always say that the tests were done on a different build, right? – Panel
@ransh: Yes, they can say that. Well-designed instrumentation does not affect the functionality of the program, unless there is something explicit in the program that checks for the presence of that instrumentation to cause some behavioral change, and one can be reasonably certain in a cooperative engineering process that such checks do not exist except in bizarre ways (e.g., inspecting the program size, which will change due to the added code volume from instrumentation). ... – Quattrocento
@ransh: ... The instrumentation also affects performance (for our tools in Java, by adding about 10-15% overhead) and that may also affect functionality. Most applications have a lot of performance headroom, so this tends not to be a big problem in practice. OK, so one can say that one tested something other than the uninstrumented build, but the practical effect is as if you had collected test coverage data on the production build. If you insist, and you don't mind the overhead, you can in fact deploy the instrumented code in production :-) – Quattrocento
