If functional languages are really concise, why don't they have a better rank in the language shootout game?
Asked Answered
E

8

24

I compared the languages at the language shootout game by their code size only. Here is a summary of what I got (shortest first, grouped by similar score).

  1. Python, Ruby, JavaScript, Perl, Lua, PHP, Mozart/OZ
  2. OCaml, Erlang, Racket, Go, Scala, F#, Smalltalk
  3. Pascal, Clean, Haskell, Common Lisp, C#, Java, C
  4. C++, Ada, ATS

I wonder why. The winners seem to be plain old dynamic languages. Erlang, Racket (née PLT Scheme), and F# are doing OK. Haskell and Common Lisp don't look more concise than claimed-to-be-verbose Java.

UPDATE:

I've found an insightful post on this topic with charts. I also found a similar comparison of languages for a larger program (a simple ray tracer). Altogether, I wouldn't say I got "the" answer, but I got some food for thought.

Eiderdown answered 26/8, 2010 at 21:39 Comment(22)
Could you give more explanation of what the shootout game is doing? The website you linked to is rather puzzling.Felicafelicdad
What makes you think that each example was written by an expert in that particular language? Looking at the F# code, I see a straight port to F# from C#, which was itself ported from Java. Hardly idomatic functional code. Since the code seems to largely consist of P/Invoke calls into a C++ library, I'm not even sure what this benchmark is supposed to be measuring - I guess the P/Invoke speed of Mono, since the benchmark was apparently not even run on the MS runtime...Kendyl
I'd think that haskell is doing so poorly because haskell code inflates quite a bit once you start optimizing it for speed.Deposal
I'm not sure why anyone would even care about the length of the code? That's not really high on my metrics of how good a programming language is.Antimonous
@David Thornley - Did you read the Help page?Circular
@Joel Mueller - You haven't said which F# code you looked at - the program contributed by some one named Don Syme? shootout.alioth.debian.org/u32q/…Circular
PLT Scheme isn't a woman and didn't get married - there are other ways to change your name.Circular
@Circular - I just kept clicking on the first F# link I saw until I reached this source code. It's not by Don Syme, it's not idiomatic F#, and since the page has the same background color as the OP's link, I think there's a better chance it was the source the OP was referring to.Kendyl
@Joel Mueller >> the same background color << Here's Don Syme's code with the same background color as the OP's link shootout.alioth.debian.org/u32/…Circular
@Joel Mueller >> it's not idiomatic F# << Please show the idiomatic F# way of using the GMP library - or perhaps you'd like to fix the F# program that doesn't use the GMP library - shootout.alioth.debian.org/u32/…Circular
@Circular - The question started as a discussion about how concise functional languages are. Please tell me what a benchmark that measures how well F# interoperates with the GMP library tells me about the conciseness of the F# language. My point was only that the first source code I found for a functional language I was familiar with wasn't particularly functional in nature - which calls into question the idea of using this particular benchmark as a measure of how concise functional languages are.Kendyl
@Circular - As far as fixing the non-GMP implementation you linked, I changed all the N integer suffixes to I and fixed up the OCaml-ish Sys.argv call on the last line to get this version which actually compiles and produces output. I haven't verified if the output is correct or not, but there you go.Kendyl
@Joel Mueller >> how well F# interoperates with the GMP library tells me about the conciseness of the F# language << Without wishing to be trite, it tells you about the conciseness of the F# language when using external libraries - the faster programs in the other languages are also likely to be using GMP. (Should you have been generalizing about "the F# code" from the first source code you found?)Circular
@Joel Mueller >> I haven't verified if the output is correct or not << It is, thanks.Circular
@Circular - Fine, it measures how concise interop with unmanaged code is in F#. Thank you for reinforcing my larger point, which is that a benchmark in which many of the contestants consist largely of calls to an external library is probably not the best starting point for a survey of language verbosity.Kendyl
@Joel Mueller >> many of the contestants consist largely of calls to an external library << Do you actually know which of the tasks do use external libraries or is this conclusion still based on looking at a single task?Circular
@Circular - I've never seen this site before today, and I don't feel in the least obligated to pore over every task to see which ones are fully implemented in the language ostensibly being tested. That should be the job of the guy doing studies of code size and posting the results on StackOverflow, no?Kendyl
@Joel Mueller - Of course you have no obligation to even glance at that website, until you start telling us what it largely consists of ;-)Circular
@Circular - Then it's a good thing I was just deferring to your obvious familiarity with the site when I based my "largely consist" on your "the faster programs in the other languages are also likely to be using GMP." But do go on and make another condescending remark - you seem to enjoy them so much, and I'd hate to spoil your fun.Kendyl
@Joel Mueller - Guessing what is or isn't on a website based on your understanding of someone else's comments will probably generate more heat than light. Thanks for the F# fix.Circular
>> I've found an insightful post on this topic with charts << fyi " Code-used Time-used Shapes" shootout.alioth.debian.org/u32/code-used-time-used-shapes.phpCircular
@Joel Mueller: You're wasting your time. Isaac Gouy's work is no more enlightening than random noise. In many respects, it is worse...Marva
D
19
  1. No language is always superior to another (well, there are a few exceptions... ;) ), so same applies for a group of broadly categorized languages. The benchmarks cover a broad array of topics, and X may be less suited for one than Y.
  2. The source code is gzipped, we don't really know how many lines of what length the programs were (yeah, it's an indicator)
  3. A considerable number of functional languages still do much better than the widespread imperative, static languages - it's not that functional programming languages aren't succinct, but the dynamic languages allow even more succinct programs
  4. At least in Haskell, much potential for conciseness comes from abstractions you can build yourself - but you do have to build them yourself and include them in your solution. A clever Haskell hacker may implement a monad in 20 lines that allows solving a small problem in 20 lines instead of 30 - the abstraction doesn't pay off for a small program, but could save many lines in a larger (eg. 200 lines instead of 300) program. I guess the same applies for Lisps (only macro instead of monad)
  5. Don't take the fanboys too seriously. FP is awesome and worth looking into, but it doesn't cure cancer and doesn't magically shorten any code by 25%
  6. They can still beat dynamic languages for some areas: For example, tree-like data structures and their processing is expressed extremely naturally in many functional languages, thanks to algebraic data types and pattern matching.
Duckbill answered 26/8, 2010 at 22:8 Comment(11)
>> what length the programs where (yeah, it's an indicator) << Is an indicator of formating style - for example, Ada programs seem to be long and thin, and Haskell programs seem to be shorter but wider.Circular
@djv - that's why the source is gzipped - but delnan seems to think LoC would be better.Circular
I think LoC and number of characters and possibly more metrics would be better.Duckbill
@delnan - Now find some way to demonstrate that, rather than just asserting an opinion.Circular
@igouy: Simple: With more data supplied, you can focus on e.g. the gzipped size in bytes while those who care can look at e.g. the character count. Or do you ask why I think LoC matters at all? Again simple, I consider 50 lines à 50 chars slightly more readable (all other things being equal) than 100 lines à 25 chars because more code fits on one screen.Duckbill
@delnan >> while those who care can look at << All you've said is that with different kinds of data people can look at different kinds of data - you haven't even demonstrated that those different kinds of data would change which programs we thought were smaller/bigger.Circular
@delnan >> 50 lines à 50 chars slightly more readable (all other things being equal) than 100 lines à 25 chars << Do you understand why newspapers and magazines printed in narrow columns?Circular
@igouy: That's not really the same. A much better example would be mathematical equations, where one line = one statement. We are not as prepared to read code that flows across multiple lines (which, with a 25 character width, will most definitely happen.)Antimonous
@David Liu - I'd like to see our opinions backed by measurements showing how effective our reading and code-editing is in each circumstance ;-)Circular
yes and nobody has mentioned J. for scientific computing, code size virtually unbeatable!Dugan
Compressed code will drastically suppress repetitions, such as verbose keyword-heavy idioms. Zipped code is by no means an even remotely accurate indicator of source size.Rialto
C
24

If functional languages are really concise...

1 - Programming in the large is different than programming in the small.

Nothing about those tiny benchmarks game programs should be taken as an example of how the abstractions and modularisation provided by each language would apply to programming in the large.

2 - Most of what you see in the benchmarks game summary pages only refers to the fastest programs contributed for each language implementation (slower programs are usually removed from the website after a while - when and which slower programs are removed is mostly arbitrary).

{edit: Adam, as you don't wish to take my word for it that the summary pages only refer to the fastest programs - look at the script that filters data rows for the "Which programming language is best?" page. Look at line 80 and line 82 in function ValidRowsAndMins in lib_scorecard.php - Alioth issue their own security certificate so your browser will complain.}

So to take Haskell as an example, you're looking at code size of the fastest Haskell programs that have been contributed.

3 - None of the meteor-contest programs have been removed, and meteor-contest is a programming contest without restriction - the smallest meteor-contest Haskell program is the slowest meteor-contest Haskell program.

Circular answered 27/8, 2010 at 3:6 Comment(3)
+1 because I can't do +10. Shootouts don't test concision in any useful sense. 10 lines vs 20 is not important, the important thing is 10000 vs 20000 or 10000000 vs 20000000. Oh, and understanding those 10M lines.Hoseahoseia
1 - That wikipedia link doesn't confirm your claim. If a language has good ways of abstractions, it should have a better API which is simpler to call. 2 - I couln't find any reference on the site about their bias towards faster programs.Eiderdown
@Adam Schmideg 1 Wikipedia explains the basic difference between tiny programs and large programs. These tiny programs don't provide the opportunities to create or use abstractions, that large programs provide. 2 There isn't a bias, it's all about the faster programs! Help page "Where can I see previous programs?" shootout.alioth.debian.org/help.php#previousCircular
D
19
  1. No language is always superior to another (well, there are a few exceptions... ;) ), so same applies for a group of broadly categorized languages. The benchmarks cover a broad array of topics, and X may be less suited for one than Y.
  2. The source code is gzipped, we don't really know how many lines of what length the programs were (yeah, it's an indicator)
  3. A considerable number of functional languages still do much better than the widespread imperative, static languages - it's not that functional programming languages aren't succinct, but the dynamic languages allow even more succinct programs
  4. At least in Haskell, much potential for conciseness comes from abstractions you can build yourself - but you do have to build them yourself and include them in your solution. A clever Haskell hacker may implement a monad in 20 lines that allows solving a small problem in 20 lines instead of 30 - the abstraction doesn't pay off for a small program, but could save many lines in a larger (eg. 200 lines instead of 300) program. I guess the same applies for Lisps (only macro instead of monad)
  5. Don't take the fanboys too seriously. FP is awesome and worth looking into, but it doesn't cure cancer and doesn't magically shorten any code by 25%
  6. They can still beat dynamic languages for some areas: For example, tree-like data structures and their processing is expressed extremely naturally in many functional languages, thanks to algebraic data types and pattern matching.
Duckbill answered 26/8, 2010 at 22:8 Comment(11)
>> what length the programs where (yeah, it's an indicator) << Is an indicator of formating style - for example, Ada programs seem to be long and thin, and Haskell programs seem to be shorter but wider.Circular
@djv - that's why the source is gzipped - but delnan seems to think LoC would be better.Circular
I think LoC and number of characters and possibly more metrics would be better.Duckbill
@delnan - Now find some way to demonstrate that, rather than just asserting an opinion.Circular
@igouy: Simple: With more data supplied, you can focus on e.g. the gzipped size in bytes while those who care can look at e.g. the character count. Or do you ask why I think LoC matters at all? Again simple, I consider 50 lines à 50 chars slightly more readable (all other things being equal) than 100 lines à 25 chars because more code fits on one screen.Duckbill
@delnan >> while those who care can look at << All you've said is that with different kinds of data people can look at different kinds of data - you haven't even demonstrated that those different kinds of data would change which programs we thought were smaller/bigger.Circular
@delnan >> 50 lines à 50 chars slightly more readable (all other things being equal) than 100 lines à 25 chars << Do you understand why newspapers and magazines printed in narrow columns?Circular
@igouy: That's not really the same. A much better example would be mathematical equations, where one line = one statement. We are not as prepared to read code that flows across multiple lines (which, with a 25 character width, will most definitely happen.)Antimonous
@David Liu - I'd like to see our opinions backed by measurements showing how effective our reading and code-editing is in each circumstance ;-)Circular
yes and nobody has mentioned J. for scientific computing, code size virtually unbeatable!Dugan
Compressed code will drastically suppress repetitions, such as verbose keyword-heavy idioms. Zipped code is by no means an even remotely accurate indicator of source size.Rialto
L
10

This seems like an opportunity to whip out:

Are there statistical studies that indicates that Python is "more productive"?

The point being, the original question is trying to use some (paltry, inappropriate) data to make a generalization about comparisons among programming languages. But in fact, it's nigh impossible to use any data to make any kind of reasonable general quantitative comparisons about programming languages.

Here is some food for thought, though:

  • All things being equal, dynamically-typed languages are likely to be more concise, since they don't need to spend time describing data types
  • All things being equal, among statically-typed languages, type-inferred languages are likely to me more concise, since they don't need to declare types all over the place
  • All things being equal, among statically-typed languages, those with generics/templates are more likely to be concise, since languages without them require repeated code or casts and indirection
  • All things being equal, languages with a concise lambda syntax are likely to be more concise, since lambda is perhaps the most important abstraction in programming for avoiding repetition and boilerplate

That said, all things are not equal, not by a long shot.

Lavinia answered 28/8, 2010 at 6:9 Comment(0)
P
2

The programs in the Alioth game are not really representative of programs in those languages in general. For one, the implementations there are highly optimized toward the Shootout's specific infrastructure, which can lead to less idiomatic and more bloated code in a functional language. It's similar to how some Ruby libraries will write performance-critical code in C — to look at that C code and declare Ruby to be bloated and low-level would really not be giving the language a fair shake.

For another, a big part of why functional languages are touted as being so terse is that they're good at making abstractions. This tends to help more in big programs than in one-function wonders, so languages specifically engineered for easy brevity win big there. Python, Perl and Ruby are specifically designed to make short programs short, whereas most functional languages have other goals (though that's not to say they just ignore code size either).

Payroll answered 28/8, 2010 at 5:35 Comment(1)
What program could ever be "really representative of programs in those languages in general"? - Your application is the ultimate benchmark shootout.alioth.debian.org/flawed-benchmarks.php#appCircular
R
2

I have recently ported a relatively short Java program to OCaml. I've dabbled in SML and Haskell in the past, plus extensive experience with C.

In this answer, I address a comparison of imperative to purely functional code (i.e.; no mutations). If you allow imperative code to sneak into otherwise functional programs, what are you comparing?

In my experience, purely functional programming (PFP) is elegant, but not more concise than imperative. There is less "declarations" boilerplate in PFP but more "conversion" boilerplate; e.g., unpacking from tuples, tail recursive helper functions, etc. So, neither paradigm allows the unencumbered expression of the pure "meat" of a program.

PFP has lower costs to get something running, and writing a program to demonstrate that a given algorithm works in principle goes over well in PFP. But, augmenting it to make it "real-world", having it deal with error conditions and illegal input and print diagnostics adds a lot of bloat that would have gone over more easily in an imperative language.

Ruisdael answered 10/3, 2015 at 9:9 Comment(1)
Indeed. That's why most people prefer impure functional languages. Graph algorithms are a great case study of a domain where pure solutions tend to look and perform really badly compared to their imperative counterparts.Marva
V
1

Must have something to do with expansive OOP libraries available to most of the languages in your level 1 and just plain old hacks like backticks for shell calls and perl regex syntax. Going off of python

pw = file("/etc/passwd")
for l in pw:
    print l.split(':')[0]

Printing all the usernames on a system, would take a lot more code if it weren't for the abstractions that OO languages are littered with. I'm not saying that it can't be done in other paradigms, but the trend is that every type has a lot of member functions that make tedious tasks simple. Personally I find purely functional languages to be only useful for academic purposes (but then again what do I know).

Valuate answered 26/8, 2010 at 21:54 Comment(6)
That could explain why dynamic languages score better. But it doesn't answer the question why functional languages score so poorly.Eiderdown
this doesn't explain anything. if functional language has rich library we could write something like file("/etc/passwd") map l.split(':')[0]Leroy
Funcional languages are niched towards academia, and they don't have an extensive library community which means you have to do more yourself. And, being niched towards academia, functional languages solve completely different problems anyway (and does so far more concisely than e.g. dynamic languages).Codpiece
In haskell it's readFile "/etc/passwd" >>= mapM (putStrLn . takeWhile (/=':')) . lines, which is 7 characters longer (3 if you minimize the spaces in both the python and the haskell sample). Four of that are because it's called readFile in haskell and file in python.Deposal
the task you describe is a system scripting task, which is what languages like Python were primarily designed for. So it's not surprising that it's good at it. Functional languages will do better at other tasks (such as manipulating data with complex structure).Hoseahoseia
In F#: for l in System.IO.File.ReadAllLines "/etc/passwd" do l.Split(':').[0] |> printfn "%s"Marva
L
1

You should consider the fact that your group 1 languages (scripting) are 30 to 100 times slower than C/C++, for the functional languages the same is between 2 and 7 times. The programs in the list are optimized for speed and measuring anything else is a secondary issue which isn't really a good indicatior of the real state of the language.
It is more interesting to look at the table where code size and running time each have weight 1. This way you get a comparison of the speed/maintainability ratio which seems a better metric than just code size.

Longship answered 27/8, 2010 at 10:55 Comment(0)
M
-1

The winners seem to be plain old dynamic languages.

Lisp is an obvious counter example, being a plain old dynamic language that is extremely verbose. On the other hand, APL/J/K would probably be much more concise than any of the other languages and they are dynamic. Also Mathematica...

Haskell and Common Lisp don't look more concise than claimed-to-be-verbose Java.

Your data are for tiny programs that have been optimized for performance and the measure is code size after compression using the GZIP algorithm on a particular setting so you cannot possibly draw general conclusions from them alone. Perhaps a more valid conclusion would be you are observing the bloat that results from performance optimizations so the most concise languages from your data are those that cannot be optimized because they are fundamentally inefficient (Python, Ruby, Javascript, Perl, Lua, PHP). Conversely, Haskell can be optimized with enough effort to create fast but verbose programs. Is that really a disadvantage of Haskell vs Python? Another equally valid conclusion is that Python, Ruby, Perl, Lua and PHP compress better using the GZIP algorithm on that setting. Perhaps if you repeat the experiment using run-length encoding or arithmetic coding or LZ77/8, maybe with BWT preconditioning, or another algorithm you would get completely different results?

There is also a huge amount of worthless cruft in the code on that site. Look at this snippet of OCaml code that is only necessary if your OCaml install is two generations out of date:

(* This module is a workaround for a bug in the Str library from the Ocaml
 * distribution used in the Computer Language Benchmarks Game. It can be removed
 * altogether when using OCaml 3.11 *)
module Str =
struct
  include Str

  let substitute_first expr repl_fun text =
    try
      let pos = Str.search_forward expr text 0 in
      String.concat "" [Str.string_before text pos;
                        repl_fun text;
                        Str.string_after text (Str.match_end())]
    with Not_found ->
      text

  let opt_search_forward re s pos =
    try Some(Str.search_forward re s pos) with Not_found -> None

  let global_substitute expr repl_fun text =
    let rec replace accu start last_was_empty =
      let startpos = if last_was_empty then start + 1 else start in
      if startpos > String.length text then
        Str.string_after text start :: accu
      else
        match opt_search_forward expr text startpos with
        | None ->
            Str.string_after text start :: accu
        | Some pos ->
            let end_pos = Str.match_end() in
            let repl_text = repl_fun text in
            replace (repl_text :: String.sub text start (pos-start) :: accu)
                    end_pos (end_pos = pos)
    in
      String.concat "" (List.rev (replace [] 0 false))

  let global_replace expr repl text =
    global_substitute expr (Str.replace_matched repl) text
  and replace_first expr repl text =
    substitute_first expr (Str.replace_matched repl) text
end

The single core versions often contain lots of code for parallelism, e.g. regex-dna in OCaml. Look at the monstrosity that is fasta in OCaml: the entire program is duplicated twice and it switches on the word size! I have an old OCaml version of fasta on disk here that is less that a fifth the size of that one...

Finally, I should note that I have contributed code to this site only to have it rejected because it was too good. Politics aside, the OCaml binary-trees used to contain the statement "de-optimized by Isaac Gouy" (although the comment has been removed, the deoptimization is still there making the OCaml code longer and slower) so you can assume that all of the results have been subjectively doctored specifically to introduce bias.

Basically, with such poor quality data you cannot hope to draw any insightful conclusions. You'd be much better off trying to find more significant programs that have been ported between languages but, even then, your results will be domain specific. I recommend forgetting about the shootout entirely...

Marva answered 5/9, 2010 at 12:38 Comment(7)
>> It can be removed altogether when using OCaml 3.11 << Pity no OCaml partisan has contributed up-to-date code - The Objective Caml native-code compiler, version 3.11.1Circular
>> only to have it rejected because << it didn't do what was required.Circular
@iguoy: "Pity no OCaml partisan has contributed up-to-date code". You need an OCaml partisan to remove that block of code from your website for you?Marva
When an OCaml expert notices a simple improvement they can make to an OCaml program, would you guess that they go ahead and contribute a tested program with that change, or that they spend that 5 minutes in public forums complaining about the code? Go figure.Circular
When the author of a website is given a simple instruction explaining how they can improve the quality of their content, they refuse to do anything until an expert comes along and does it for them. Go figure.Marva
I think it's better that people who consider themselves OCaml programmers contribute the OCaml programs, rather than people who make no claim to be OCaml programmers.Circular
We're not talking about contributing a program. We're talking about deleting the block of code clearly marked for deletion by the OCaml programmer who already contributed it. So you demand that an expert delete clearly marked code for you but you are happy to subjectively "de-optimize" that same code without any expert help at all and when people like me point this out your response it to try to hide the fact that you had ever "de-optimized" the code yourself?Marva

© 2022 - 2024 — McMap. All rights reserved.