The winners seem to be plain old dynamic languages.
Lisp is an obvious counterexample: it is a plain old dynamic language that is extremely verbose. On the other hand, APL, J and K would probably be much more concise than any of the other languages, and they are dynamic too. Also Mathematica...
Haskell and Common Lisp don't look more concise than claimed-to-be-verbose Java.
Your data are for tiny programs that have been optimized for performance, and the measure is code size after compression with the GZIP algorithm at one particular setting, so you cannot possibly draw general conclusions from them alone. Perhaps a more valid conclusion would be that you are observing the bloat that results from performance optimization, so the most concise languages in your data are those that cannot be optimized because they are fundamentally inefficient (Python, Ruby, Javascript, Perl, Lua, PHP). Conversely, Haskell can be optimized, with enough effort, to create fast but verbose programs. Is that really a disadvantage of Haskell vs Python? Another equally valid conclusion is that Python, Ruby, Perl, Lua and PHP simply compress better with GZIP at that setting. If you repeated the experiment using run-length encoding, arithmetic coding or LZ77/LZ78, maybe with BWT preconditioning, or some other algorithm, you might get completely different results.
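To make the point about the choice of coder concrete, here is a toy sketch in plain OCaml. It does not use GZIP (the standard library has no bindings for it); it only compares raw byte count, a naive run-length encoding and an order-0 entropy bound, the last being roughly what an idealised order-0 arithmetic coder could reach. The function names and the two sample strings are made up for illustration.

```ocaml
(* Toy comparison of three "size" measures for the same text.
   None of these is gzip; they only illustrate that a ranking by
   "compressed size" depends on which coder you pick. *)

(* Raw size in bytes. *)
let raw_size (s : string) : int = String.length s

(* Size in bytes of a naive run-length encoding that stores every run
   as a (count, byte) pair, with runs capped at 255 bytes. *)
let rle_size (s : string) : int =
  let n = String.length s in
  let rec count_runs i runs =
    if i >= n then runs
    else begin
      let c = s.[i] in
      let j = ref (i + 1) in
      while !j < n && s.[!j] = c && !j - i < 255 do incr j done;
      count_runs !j (runs + 1)
    end
  in
  2 * count_runs 0 0

(* Order-0 entropy bound in bits: sum over bytes of -log2 p(byte),
   where p is the byte's relative frequency in the text. *)
let entropy_bits (s : string) : float =
  let counts = Array.make 256 0 in
  String.iter (fun c -> counts.(Char.code c) <- counts.(Char.code c) + 1) s;
  let n = float_of_int (String.length s) in
  Array.fold_left
    (fun acc k ->
       if k = 0 then acc
       else
         let p = float_of_int k /. n in
         acc -. float_of_int k *. (log p /. log 2.0))
    0.0 counts

(* Two hypothetical "programs": one long but repetitive, one short but varied. *)
let sample_a = "aaaaaaaabbbbbbbb"
let sample_b = "abcabcabcabc"

let () =
  List.iter
    (fun (name, s) ->
       Printf.printf "%s: raw=%d bytes, rle=%d bytes, order-0 bound=%.1f bits\n"
         name (raw_size s) (rle_size s) (entropy_bits s))
    [ ("sample_a", sample_a); ("sample_b", sample_b) ]
```

Here sample_b is the smaller of the two in raw bytes but the larger under run-length encoding, so even in this tiny case the ranking flips with the measure.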
There is also a huge amount of worthless cruft in the code on that site. Look at this snippet of OCaml code that is only necessary if your OCaml install is two generations out of date:
    (* This module is a workaround for a bug in the Str library from the Ocaml
     * distribution used in the Computer Language Benchmarks Game. It can be removed
     * altogether when using OCaml 3.11 *)
    module Str =
    struct
      include Str

      let substitute_first expr repl_fun text =
        try
          let pos = Str.search_forward expr text 0 in
          String.concat "" [Str.string_before text pos;
                            repl_fun text;
                            Str.string_after text (Str.match_end())]
        with Not_found ->
          text

      let opt_search_forward re s pos =
        try Some(Str.search_forward re s pos) with Not_found -> None

      let global_substitute expr repl_fun text =
        let rec replace accu start last_was_empty =
          let startpos = if last_was_empty then start + 1 else start in
          if startpos > String.length text then
            Str.string_after text start :: accu
          else
            match opt_search_forward expr text startpos with
            | None ->
              Str.string_after text start :: accu
            | Some pos ->
              let end_pos = Str.match_end() in
              let repl_text = repl_fun text in
              replace (repl_text :: String.sub text start (pos-start) :: accu)
                end_pos (end_pos = pos)
        in
        String.concat "" (List.rev (replace [] 0 false))

      let global_replace expr repl text =
        global_substitute expr (Str.replace_matched repl) text
      and replace_first expr repl text =
        substitute_first expr (Str.replace_matched repl) text
    end
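For comparison, on an installation where the stock Str library behaves correctly, the program can call the library directly and the whole wrapper module disappears. This is a minimal sketch; the pattern, the replacement and the `expand` helper are hypothetical, not taken from the benchmark source, and the program has to be linked against the str library (for example with `ocamlfind ocamlopt -package str -linkpkg`).

```ocaml
(* Replace every occurrence of the literal "B" using the stock
   Str.global_replace, with no shadowing wrapper module. *)
let expand (text : string) : string =
  Str.global_replace (Str.regexp "B") "(c|g|t)" text

let () = print_endline (expand "aBcB")
```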
The single-core versions often contain lots of code for parallelism, e.g. regex-dna in OCaml. Look at the monstrosity that is fasta in OCaml: the entire program is duplicated and it switches on the word size! I have an old OCaml version of fasta on disk here that is less than a fifth the size of that one...
Finally, I should note that I have contributed code to this site only to have it rejected because it was too good. Politics aside, the OCaml binary-trees used to contain the statement "de-optimized by Isaac Gouy". Although the comment has since been removed, the deoptimization is still there, making the OCaml code longer and slower, so you can assume that all of the results have been subjectively doctored specifically to introduce bias.
Basically, with such poor quality data you cannot hope to draw any insightful conclusions. You'd be much better off trying to find more significant programs that have been ported between languages but, even then, your results will be domain specific. I recommend forgetting about the shootout entirely...
N integer suffixes to I and fixed up the OCaml-ish Sys.argv call on the last line to get this version, which actually compiles and produces output. I haven't verified whether the output is correct, but there you go. – Kendyl