Is there a standard way to count lines of code? [closed]
G

22

17

I realize there's no definitive "right" answer to this question, but when people talk about lines of code, what do they mean? In C++ for example, do you count blank lines? Comments? Lines with just an open or close brace?

I know some people use lines of code as a productivity measure, and I'm wondering if there is a standard convention here. Also, I think there's a way to get various compilers to count lines of code - is there a standard convention there?

Gaekwar answered 9/12, 2008 at 16:33 Comment(2)
A miserable pile of characters. But enough code! Have at you! - Cobia
The standard way to count LOC is to not count LOC. - Triazine
H
25

No, there is no standard convention, and every tool that counts them will be slightly different.

This may make you ask, "Why then would I ever use LOC as a productivity measure?" and the answer is, because it doesn't really matter how you count a line of code, as long as you count them consistently you can get some idea of the general size of a project in relation to others.

Hermy answered 9/12, 2008 at 16:38 Comment(0)
M
16

Have a look at the Wikipedia Article, especially the "Measuring SLOC" section:

There are two major types of SLOC measures: physical SLOC and logical SLOC. Specific definitions of these two measures vary, but the most common definition of physical SLOC is a count of lines in the text of the program's source code including comment lines. Blank lines are also included unless the lines of code in a section consists of more than 25% blank lines. In this case blank lines in excess of 25% are not counted toward lines of code.

Logical SLOC measures attempt to measure the number of "statements", but their specific definitions are tied to specific computer languages (one simple logical SLOC measure for C-like programming languages is the number of statement-terminating semicolons). It is much easier to create tools that measure physical SLOC, and physical SLOC definitions are easier to explain. However, physical SLOC measures are sensitive to logically irrelevant formatting and style conventions, while logical SLOC is less sensitive to formatting and style conventions. Unfortunately, SLOC measures are often stated without giving their definition, and logical SLOC can often be significantly different from physical SLOC.

Consider this snippet of C code as an example of the ambiguity encountered when determining SLOC:

for (i=0; i<100; ++i) printf("hello");   /* How many lines of code is this? */

In this example we have:

  • 1 physical line of code (LOC)
  • 2 logical lines of code (LLOC): the for statement and the printf statement
  • 1 comment line

[...]
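To make the physical/logical distinction in the quote concrete, here is a minimal Python sketch rather than a standard tool. The function names `physical_sloc` and `semicolon_sloc` are made up for illustration, blank lines are simply skipped (the 25% rule is not modelled), and the stripping of comments and string literals is deliberately naive.

```python
import re

def physical_sloc(source: str) -> int:
    """Count non-blank physical lines; comment lines are included."""
    return sum(1 for line in source.splitlines() if line.strip())

def semicolon_sloc(source: str) -> int:
    """Count semicolons outside parentheses, strings and comments.

    This is the 'statement-terminating semicolons' measure for C-like
    code, implemented naively (assumption: well-formed input).
    """
    code = re.sub(r'"(?:\\.|[^"\\])*"', '""', source)           # blank out string literals
    code = re.sub(r"/\*.*?\*/|//[^\n]*", "", code, flags=re.DOTALL)  # drop comments
    depth = 0
    count = 0
    for ch in code:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == ";" and depth == 0:   # semicolons in a for-header are skipped
            count += 1
    return count

snippet = 'for (i=0; i<100; ++i) printf("hello");   /* How many lines of code is this? */'
print(physical_sloc(snippet), semicolon_sloc(snippet))  # prints "1 1"
```

For the quoted C snippet this gives 1 physical line and 1 terminating semicolon, whereas the quoted count of 2 logical lines also treats the `for` as a statement of its own. The gap between the two numbers is a small illustration of how much the chosen definition matters.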

Microscopy answered 9/12, 2008 at 16:40 Comment(0)
S
9

I'd say

  • comments count
  • blank lines count, because they're important for readability, but not more than one contiguously
  • lines with braces count too, but apply the same rule as for blank lines - i.e. 5 nested braces with no code between them counts as one line.
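The three rules above can be sketched in a few lines of Python. This is one possible reading of them, not an established counter: the names `classify` and `count_lines` are hypothetical, and a "brace line" is assumed to be a line containing nothing but `{` or `}` characters.

```python
def classify(line: str) -> str:
    """Label a line as 'blank', 'brace' (only braces) or 'code'."""
    stripped = line.strip()
    if not stripped:
        return "blank"
    if set(stripped) <= set("{}"):
        return "brace"
    return "code"  # comments fall in here, so they count like any other line

def count_lines(source: str) -> int:
    """Count lines, collapsing each run of blank or brace-only lines to one."""
    total = 0
    previous = None
    for line in source.splitlines():
        kind = classify(line)
        # Code and comment lines always count; a blank or brace line
        # counts only when it starts a new run of its kind.
        if kind == "code" or kind != previous:
            total += 1
        previous = kind
    return total

sample = "\n".join([
    "int f() {",
    "",
    "",
    "    if (x) {",
    "        g();",
    "    }",
    "}",
    "// done",
])
print(count_lines(sample))  # prints 6
```

The sample has 8 physical lines; the two blank lines collapse to one and the two closing braces collapse to one, which leaves 6.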

I'd also humbly suggest that any productivity measure which actually relies on a LoC value is bunk :)

Stertor answered 9/12, 2008 at 16:37 Comment(0)
W
5

Any day that I can end with fewer lines of code, but as much or more working functionality... is a good day. Being able to remove hundreds of lines of code and wind up with something that's just as functional, and more maintainable, is a wonderful thing.

That being said, unless you have very strict coding guidelines in your team, physical lines of code is a useless statistic. Logical lines of code is still useless, but at least it's not dangerously misleading.

Waac answered 9/12, 2008 at 20:54 Comment(0)
T
4

Whatever "wc -l" returns is my number.

Therefrom answered 9/12, 2008 at 16:36 Comment(0)
L
3

If you use LOC as a measure of productivity, you will suddenly find your programmers writing much more verbosely to "game the system". It's a stupid measure, and only stupid people use it for anything more than bragging rights.

Laconic answered 9/12, 2008 at 16:33 Comment(4)
I particularly like the 25% allowance for blank lines in the Wikipedia defn quote elsewhere. A simple checkin hook will ensure that you always get paid for your full allowance of blank lines ;-) - Corvette
Better to use it for planning and estimating than to use it as a basis for computing programmer pay. - Justiciary
@onebyone - and if they count comments as "better" than blank lines, a checkin hook to change all your blank lines to empty comments! - Laconic
Bragging is the whole point, imo. ;) Personally I would only use this to see how much I've written for personal interest, and thus "cheating the system" doesn't apply. - Fielding
Z
3

1 line = 4 seconds of reading. If it takes more than that to figure out what I'm saying on that line, the line's too long.

Zo answered 9/12, 2008 at 19:19 Comment(0)
J
2

"Lines of code" should include anything you have to maintain. That includes comments, but excludes whitespace.

If you're using this as a productivity metric, make sure you're making reasonable comparisons. A line of C++ isn't the same as a line of Ruby.

Jdavie answered 9/12, 2008 at 16:42 Comment(3)
Egads, I just looked that up on Wikipedia. It's horrible. :) - Jdavie
@Bill, you've never done APL? You haven't lived until you've done it on an IBM Selectric converted to a terminal. Many of the operators required backspacing and overstriking. - Laconic
I agree about comments. Comments (in my code at least) often contain commented-out lines that I haven't decided to necessarily erase yet and therefore are still part of the work, if not the actual program. Of course it depends what your purpose in measuring it is. - Fielding
M
1

LOC is a notoriously ambiguous metric. For a detailed comparison, it's only valid when comparing code that's been written in the same language, with the same style, by the same team.

However, it does give a rough notion of complexity at the order-of-magnitude level: a 10,000-line program is much more complex than a 100-line program.

The advantage of LOC is that wc -l returns it, and there's no real fanciness involved in understanding or calculating it, unlike many other software metrics.

Mahaliamahan answered 9/12, 2008 at 16:52 Comment(0)
J
1

There's no right answer.

For informal estimates, I use wc -l.

If I needed to measure something rigorously, I would measure executable statements. Pretty much, anything with a statement terminator (usually semicolon), or ending with a block. For compound statements, I'd count each substatement.

So:

int i = 7;             # one statement terminator; one (1) statement
if (r == 9)            # count the if as one (1) statement
  output("Yes");       # one statement terminator; one (1) statement; total (2) for the if
while (n <= 14) {      # count the while as one (1) statement
  output("n = ", n);   # one statement terminator; one (1) statement
  do_something();      # one statement terminator; one (1) statement
  n++                  # count this one, one statement (1), even though it doesn't need a statement terminator in some languages
}                      # brace doesn't count; total (4) for the while
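A rough Python approximation of this counting rule for C-like code might look like the sketch below. It is an assumption-laden illustration, not a real parser: `count_statements` is a made-up name, only `if`, `while`, `for` and `switch` are treated as compound statements, and a final statement without a terminator (like the bare `n++` above) would be missed, so the sample writes `n++;`.

```python
import re

# Keywords assumed to introduce a compound statement (hypothetical choice).
CONTROL_KEYWORDS = re.compile(r"\b(?:if|while|for|switch)\b")

def count_statements(source: str) -> int:
    """Count statement terminators outside parentheses plus control keywords."""
    code = re.sub(r'"(?:\\.|[^"\\])*"', '""', source)           # blank out string literals
    code = re.sub(r"/\*.*?\*/|//[^\n]*", "", code, flags=re.DOTALL)  # drop comments
    depth = 0
    terminators = 0
    for ch in code:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == ";" and depth == 0:   # ignore semicolons inside a for-header
            terminators += 1
    return terminators + len(CONTROL_KEYWORDS.findall(code))

sample = """
int i = 7;
if (r == 9)
  output("Yes");
while (n <= 14) {
  output("n = ", n);
  do_something();
  n++;
}
"""
print(count_statements(sample))  # prints 7
```

The result, 7, matches the hand count in the example: 1 for the declaration, 2 for the `if`, and 4 for the `while`.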

If I were doing it in Scheme or Lisp, I'd count expressions.

As others have said, what matters most is that your count is consistent. It also matters what you're using this for. If you just want to let a potential new hire know how big your project is, use wc -l. If you want to do planning and estimating, then you might want to get more formal. Under no circumstances should you base programmer compensation on LOC.

Justiciary answered 9/12, 2008 at 17:17 Comment(0)
A
1

You should be thinking of "lines of code spent", not "lines of code produced".

Things should be as simple as possible, so creating a positive benchmark based on quantity of lines is encouraging bad code.

Furthermore, some things that are very difficult end up being solved with very little code, and some things that are very easy (boilerplate code like getters and setters for example) can add a lot of lines in very little time.

As for the original question, if I was going to count lines, I'd include every line other than consecutive blank lines. I'd include comments as well, since they are (hopefully) useful documentation.

Armagh answered 9/12, 2008 at 17:25 Comment(0)
S
1

The notion of LOC is an attempt to quantify a volume of code. As pointed out in other answers, it doesn't matter what you specifically call a line of code as long as you are consistent. Intuitively, it seems that a 10-line program is smaller than a 100-line program, which is smaller than a 1000-line program, and so on. You would expect that it takes less time to create, debug, and maintain a 100-line program than a 1000-line program. Informally at least, you can use LOC to give a rough feel for the amount of work required to create, debug, and maintain a program of a certain size.

Of course, there are places where this doesn't hold up. For example, a complex algorithm rendered in 1000 lines may be much harder to develop than, say, a simple database program that consumes 2500 lines.

So, LOC is a coarse-grained measure of code volume that enables managers to get a reasonable understanding of the size of a problem.

Superload answered 9/12, 2008 at 20:12 Comment(0)
C
1

I use wc -l for a quick estimate of the complexity of a workspace. However, as a productivity metric LOC is THE WORST. I generally consider it a very productive day if my LOC count goes DOWN.

Cite answered 21/12, 2008 at 23:44 Comment(0)
F
1

In the .NET world there seems to be broad agreement that a line of code (LoC) is a debugging sequence point. A sequence point is a unit of debugging: it is the portion of code highlighted in dark red when you set a breakpoint. With sequence points we can talk about logical LoC, and this metric can be compared across the various .NET languages. The logical LoC metric is supported by most .NET tools, including Visual Studio code metrics, NDepend and NCover.

An 8-LoC method (the sequence points for the opening and closing braces are not taken into account):

[image: the 8-LoC example method]

Contrary to physical LoC (meaning just counting the number of lines in a source file), logical LoC has the immense advantage of not depending on coding style. Coding style, as we all agree, can make the physical LoC count vary by an order of magnitude from one developer to another. I wrote a more detailed blog post on the topic: How do you count your number of Lines Of Code (LOC) ?

Fusionism answered 12/12, 2010 at 18:12 Comment(0)
V
0
  1. LOCphy: physical lines
  2. LOCbl: blank lines (comment blocks are counted as comment lines)
  3. LOCpro: programming lines (declarations, definitions, directives & code)
  4. LOCcom: lines of comments
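As a rough illustration of how the four counts listed above could be computed for C-style source, here is a short Python sketch. The function name `loc_metrics` is hypothetical, and several simplifications are assumptions on my part: a line is classified by how it starts, a line holding both code and a trailing comment counts only as a programming line, and every line inside a `/* ... */` block (even an empty one) counts as a comment line.

```python
def loc_metrics(source: str) -> dict:
    """Classify each physical line as blank, comment or programming line."""
    phy = bl = pro = com = 0
    in_block_comment = False
    for line in source.splitlines():
        phy += 1
        stripped = line.strip()
        if in_block_comment:
            com += 1                          # every line of a block comment counts
            if "*/" in stripped:
                in_block_comment = False
        elif not stripped:
            bl += 1
        elif stripped.startswith("//"):
            com += 1
        elif stripped.startswith("/*"):
            com += 1
            if "*/" not in stripped[2:]:      # block comment continues on later lines
                in_block_comment = True
        else:
            pro += 1                          # declarations, directives and code
    return {"LOCphy": phy, "LOCbl": bl, "LOCpro": pro, "LOCcom": com}

sample = "\n".join([
    "/* header",
    "   comment */",
    "#include <stdio.h>",
    "",
    "int main(void) {",
    "    // greet",
    "    return 0;",
    "}",
])
print(loc_metrics(sample))
# prints {'LOCphy': 8, 'LOCbl': 1, 'LOCpro': 4, 'LOCcom': 3}
```

With this classification the three partial counts always add up to LOCphy, which makes the percentages that tools report easy to derive.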

Many available tools report figures such as the percentage of filled lines.

Look at these numbers, but don't rely on them alone.

LOC grows massively at the start of a project, and it often decreases after reviews ;)

Viglione answered 9/12, 2008 at 16:38 Comment(0)
C
0

I think of it as a single processable statement. For example

(1 line)

Dim obj as Object

(5 lines)

If _amount > 0 Then
  _amount += 5
Else
  _amount -= 5
End If
Contingent answered 9/12, 2008 at 16:39 Comment(0)
F
0

I agree with the posts that say it is reported many ways and isn't an important metric. See this related question: "Ever hear of developers getting paid per line of code?"

Faires answered 9/12, 2008 at 19:9 Comment(0)
N
0

I agree w/ the accepted answer by Craig H, however I'd like to add that in school I was taught that white space, comments and declarations shouldn't be counted as "lines of code" in terms of measuring the lines of code produced by a programmer for productivity purposes - i.e. Ol’ “15-lines-per-day” rule.

Newton answered 9/12, 2008 at 19:59 Comment(0)
A
0

I know some people use LoC as a productivity measure

Could you please tell me who they are so I don't accidentally work with (or even worse, for) them?

If I can implement in 1400 lines using Haskell what I could also implement in 2800 lines using C, am I more productive in C or Haskell? Which is going to take longer? Which is going to have more bugs (hint: it's linear in the LOC count)?

A programmer's worth is how much his code changes (including from or to the empty string) increases the number on your bottom line. I know of no good way to measure or approximate that. But I know that any reasonably measurable metric can be gamed and doesn't reflect what you really want. So don't use it.

That being said, how do you count LOCs? Simple, use wc -l. Why is that the right tool? Well, you probably don't care about any particular number, but about general total trends (going up or down, and by how much), about individual trends (going up or down, changing direction how fast, ...) and about pretty much anything except just the number 82,763.

The differences between what the tools measure are probably not interesting. Unless you have evidence that the number spit out by your tool (and only that tool) correlates with something interesting, use it as a rough ballpark figure; anything other than monotonicity should be taken with not only a grain but a bucketful of salt.

Count how many times '\n' occurs. Other interesting characters to count might be ';', '{' and '/'.
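Counting those characters takes only a few lines of Python. This is a minimal sketch; `char_counts` is a made-up helper name, and the newline count is the same number `wc -l` reports, since `wc -l` counts newline characters.

```python
from collections import Counter

def char_counts(source: str, chars: str = "\n;{/") -> dict:
    """Return how often each character of interest occurs in the source text."""
    counts = Counter(source)
    return {ch: counts[ch] for ch in chars}

sample = "int main() {\n  return 0; // ok\n}\n"
print(char_counts(sample))
# prints {'\n': 3, ';': 1, '{': 1, '/': 2}
```

Tracking how these numbers move over time, rather than their absolute values, fits the point made above about trends.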

Anuran answered 19/3, 2009 at 22:15 Comment(0)
V
0

Using LOC to measure a programmer's performance is like judging the quality of a painting by its size. LOC's only "value" as far as I'm concerned is to impress your customers and scare your competition.

That said, I would think that the number of compiled instructions would be the least ambiguous. Still, bad programmers have the advantage in that they tend to write unnecessarily verbose code. I recall once replacing 800+ lines of really bad code with 28 lines. Does that make me a slacker?

Any project manager who uses LOC as a primary performance metric is an idiot who deserves bad programmers.

Vaginal answered 2/9, 2016 at 0:40 Comment(0)
H
0

I strongly recommend the cloc tool for this work. It counts lines for many languages.

https://github.com/AlDanial/cloc#quick-start-

I used it at our company and liked it. Here is a screenshot of its output:

output from cloc

Homogony answered 21/12, 2020 at 12:38 Comment(0)
M
0

A useful one-liner for Windows PowerShell, which counts the non-empty lines in all .c files under the current directory:

 (GCI -include *.c -recurse | select-string .).Count
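For readers without PowerShell, roughly the same count can be sketched in Python. The function name `count_nonempty_lines` and its parameters are hypothetical. Like `select-string .`, it counts every line that contains at least one character, so whitespace-only lines are included.

```python
from pathlib import Path

def count_nonempty_lines(root: str, pattern: str = "*.c") -> int:
    """Count lines with at least one character in files matching pattern under root."""
    total = 0
    for path in Path(root).rglob(pattern):     # recurse into subdirectories
        if path.is_file():
            with path.open(encoding="utf-8", errors="replace") as handle:
                total += sum(1 for line in handle if line.rstrip("\r\n"))
    return total

if __name__ == "__main__":
    print(count_nonempty_lines("."))
```

Run from the root of a source tree, this prints a single total for all matching files.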
Muffler answered 13/8, 2022 at 15:47 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.