Why Does This Maintainability Index Increase?
M

4

16

I would be appreciative if someone could explain to me the difference between the following two pieces of code in terms of Visual Studio's Code Metrics rules. Why does the Maintainability Index increase slightly if I don't encapsulate everything within using ( )?

Sample 1 (MI score of 71)

public static String Sha1(String plainText)
{
    using (SHA1Managed sha1 = new SHA1Managed())
    {
        Byte[] text = Encoding.Unicode.GetBytes(plainText);
        Byte[] hashBytes = sha1.ComputeHash(text);
        return Convert.ToBase64String(hashBytes);    
    }
}

Sample 2 (MI score of 73)

public static String Sha1(String plainText)
{
    Byte[] text, hashBytes;
    using (SHA1Managed sha1 = new SHA1Managed())
    {
        text = Encoding.Unicode.GetBytes(plainText);
        hashBytes = sha1.ComputeHash(text);
    }
    return Convert.ToBase64String(hashBytes);   
}

I understand metrics are meaningless outside of a broader context and understanding, and programmers should exercise discretion. While I could boost the score up to 76 with return Convert.ToBase64String(sha1.ComputeHash(Encoding.Unicode.GetBytes(plainText))), I shouldn't. I would clearly be just playing with numbers and it isn't truly any more readable or maintainable at that point. I am curious though as to what the logic might be behind the increase in this case. It's obviously not line-count.

Manila answered 1/5, 2010 at 6:13 Comment(1)
The discussion around this question and its various answers suggests that the Maintainability Index is not very intuitive -- see also my post on the maintainability index, discussing various problems with this metric. - Apothecary
N
20

Having your variables all laid out at the top so you know what's in the function is more "maintainable", at least that's what whoever decides the rules for the code metrics thinks.

Whether that's actually true? Totally depends on the team working on the code. It seems you already know this from the tone of the question, but take almost all code metrics with a grain of salt. They're what someone thinks is best, and that may not be true for teams outside of Microsoft. Do what's best for your team, not what some calculator tells you.

I wouldn't make changes that you think are less readable, or that hurt your and your team's coding performance, just to get a few points on the metrics board (unless they're for actual performance, improved error handling, etc.).

All that being said, if it gives you a very low maintainability, there probably is something worth looking at or breaking down into smaller chunks, as a very low score is probably not so acceptable, for pretty much any team.

Noninterference answered 1/5, 2010 at 6:17 Comment(2)
I believe this is correct - in terms of explaining the numbers - but I think that the maintainability calculation is obsolete in this regard. Once upon a time it made sense to declare all your variables at the top - once upon a time (some) languages demanded it! - but that time is long past. Today, minimizing an identifier's lifespan contributes more to maintainability. - Lysimachus
This is incorrect: the metrics don't care about the location of variables (although I think it's pretty well accepted by now that proximity to usage is a plus). As Dan Bryant points out, you're just seeing the effect of having two variables declared in one line (meaning Byte[] only appears once in the second method), making the method "shorter" in terms of Halstead Volume. - Wartow
B
8

This is an old question, but I just thought I'd add that the MI is partially based on Halstead volume, which is based on a count of 'operators' and 'operands'. If declaration of a variable by type is an 'operator', this would mean that Sample 2 has fewer operators, thus changing the score. In general, because the MI is a statistical measurement, it is of limited usefulness when dealing with small sample sizes (like a single short method.)
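
For reference, the formula Microsoft documents for the Visual Studio Maintainability Index can be written as below, together with the textbook definition of Halstead Volume. The symbols are shorthand introduced here: V is Halstead Volume, G is cyclomatic complexity, L is lines of code, N_1 and N_2 are the total operator and operand counts, and n_1 and n_2 are the distinct operator and operand counts. How the analyzer classifies individual tokens is not stated anywhere in this thread, so the counting itself should be treated as an assumption.

```latex
\begin{align*}
\mathrm{MI} &= \max\!\left(0,\ \frac{100}{171}\,\bigl(171 - 5.2\,\ln V - 0.23\,G - 16.2\,\ln L\bigr)\right) \\
V &= (N_1 + N_2)\,\log_2(n_1 + n_2)
\end{align*}
```

Both V and L enter only through a logarithm, so in a method this short, dropping a single repeated token such as the second Byte[] can plausibly shift the rounded score by a point or two.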

Bluegreen answered 25/7, 2012 at 18:7 Comment(2)
Interesting point, I wonder if the score would decrease by splitting Byte[] text, hashBytes into two lines. - Limit
@MarkSowul: That will not affect Halstead, but will affect lines of code, which is also a component of the Maintainability Index. - Apothecary
W
7

Because of the increased distance between the declaration of your variables and where they are used.

The rule is to reduce the variable span as much as possible. The span is the distance between the declaration of the variable and where it is used. As this distance increases, so does the risk that code introduced later affects the variable without the programmer realising the impact further down in the code.

Here is a link to a good book that covers this and many other topics on code quality. http://www.amazon.com/Code-Complete-Practical-Handbook-Construction/dp/0735619670/ref=dp_ob_title_bk

Wnw answered 1/5, 2010 at 6:18 Comment(8)
This would seem to be counter to the question, as the distance between declaration and usage is greater in the higher score. - Noninterference
@Nick true, but odd. I find the version with the lower score much easier to read and @Chris' explanation seems entirely sensible to me. - Megrim
@Damian - The explanation is backwards from the result. It's sensible, I agree, but it doesn't explain the question. I also prefer the one with the lower score, but according to this answer it should have a higher score, and it does not. - Noninterference
@Nick I wondered if the OP had got the values the wrong way around. I tested it in VS2010 and it appears not, although I get different values: 70 and 69. Strange, but it highlights your point that it should be taken with a pinch of salt. - Megrim
I have been thinking about this, and while I guess only someone intimately involved with the analysis engine could give a real explanation for this, I wonder if the analysis engine is measuring the line length and saying that the "complexity" of declaring and initializing the variables by calling functions is impacting the metric. I do not have access to VS now, but it might be interesting to take the first code and move the declaration of the variable into the scope of the using clause while still initializing on separate lines. What does that come back with? - Wnw
Same numbers, @Chris. A lower value when the variables are defined and used within the using() block than when they are defined outside the block. - Manila
@Timothy, thanks for the update. I must admit, I think it is just a quirk of the analysis, but again I doubt anyone outside of the team developing the technology for the analysis could provide a concrete answer. Of course the difference is small enough that it should not concern you, but from an intellectual standpoint it is interesting! - Wnw
The span of text and hashBytes is longer in the second one, but the span of the variable in the using block gets shorter. I would guess that decreasing the number of statements in a using block rates higher than decreasing the span of simpler variables. - Orit
T
0

Myself, I'd rather see return Convert.ToBase64String(sha1.ComputeHash(Encoding.Unicode.GetBytes(plainText))); it's a should rather than a shouldn't. This form has the advantage of concisely expressing the actual data-flow; if you add a bunch of temporary variables and assignments, I now have to read the variable names and match up their occurrences to see what's actually happening.

Tighe answered 2/5, 2010 at 14:2 Comment(3)
I disagree. On one line, you have to first visually scan to find the innermost expression (plainText), then go backward to GetBytes. "Okay, we've got the bytes of the text". Then sha1.ComputeHash ("okay, now we take the SHA1"), and then finally to base 64. If you break this up into a couple of lines, I think it is far easier to "step through" line by line and comprehend what is going on. If one chooses intelligent variable names, you don't have to "match up their occurrences"; it just makes sense. - Monopolize
Actually the one-liner expresses the exact opposite of the data flow because the method calls are basically prefix notation. The actual flow is plainText -> GetBytes -> ComputeHash -> ToBase64String. Laying it out line by line shows this. The one-liner you have to read right-to-left. And it's much harder to debug because you won't see the intermediate steps. Pretend there's a null reference exception. The line-by-line will show you instantly. How about the one-liner? - Limit
@MarkSowul I agree 100%. I always prefer readability over a one-liner with multiple chained function calls. If one of my team members does this, I'll tell him/her to refactor it. - Greensboro
