Why is the highest-rated answer 3.70x slower than this?
% echo; ( time (nice echo 33785139853861968123689586196851968365819658395186596815968159826259681256852169852986 \
| mawk2 'gsub(//,($_)($_)($_))+gsub(//,($_))+1' | pvE0 \
| mawk2 '
function __(_,___,____,_____) {
____=gsub("[^1-9]+","",_)~""
___=10
while((+____<--___) && _) {
_____+=___*gsub(___,"",_)
}
return _____+length(_) }
BEGIN { FS=OFS=ORS
RS="^$"
} END {
print __($!_) }' )| pvE9 ) | gcat -n | lgp3 ;
in0: 173MiB 0:00:00 [1.69GiB/s] [1.69GiB/s] [<=> ]
out9: 11.0 B 0:00:09 [1.15 B/s] [1.15 B/s] [<=> ]
in0: 484MiB 0:00:00 [2.29GiB/s] [2.29GiB/s] [ <=> ]
( nice echo | mawk2 'gsub(//,($_)($_)($_))+gsub(//,($_))+1' | pvE 0.1 in0 | )
8.52s user 1.10s system 100% cpu 9.576 total
1 2822068024
% echo; ( time ( nice echo 33785139853861968123689586196851968365819658395186596815968159826259681256852169852986 \
\
| mawk2 'gsub(//,($_)($_)($_))+gsub(//,($_))+1' | pvE0 \
| gtr -d '\n' \
\
| python3 -c 'import math, os, sys;
[ print(sum(int(digit) for digit in str(ln)), \
end="\n") \
\
for ln in sys.stdin ]' )| pvE9 ) | gcat -n | lgp3 ;
in0: 484MiB 0:00:00 [ 958MiB/s] [ 958MiB/s] [ <=> ]
out9: 11.0 B 0:00:35 [ 317miB/s] [ 317miB/s] [<=> ]
( nice echo | mawk2 'gsub(//,($_)($_)($_))+gsub(//,($_))+1' | pvE 0.1 in0 | )
35.22s user 0.62s system 101% cpu 35.447 total
1 2822068024
And that's already being a bit generous. On this large, synthetically created test case of 2.82 GiB, it's 19.2x slower.
% echo; ( time ( pvE0 < testcases_more108.txt | mawk2 'function __(_,___,____,_____) { ____=gsub("[^1-9]+","",_)~"";___=10; while((+____<--___) && _) { _____+=___*gsub(___,"",_) }; return _____+length(_) } BEGIN { FS=RS="^$"; CONVFMT=OFMT="%.20g" } END { print __($_) }' ) | pvE9 ) |gcat -n | ggXy3 | lgp3;
in0: 284MiB 0:00:00 [2.77GiB/s] [2.77GiB/s] [=> ] 9% ETA 0:00:00
out9: 11.0 B 0:00:11 [1016miB/s] [1016miB/s] [<=> ]
in0: 2.82GiB 0:00:00 [2.93GiB/s] [2.93GiB/s] [=============================>] 100%
( pvE 0.1 in0 < testcases_more108.txt | mawk2 ; )
8.75s user 2.36s system 100% cpu 11.100 total
1 3031397722
% echo; ( time ( pvE0 < testcases_more108.txt | gtr -d '\n' | python3 -c 'import sys; [ print(sum(int(_) for _ in str(__))) for __ in sys.stdin ]' ) | pvE9 ) |gcat -n | ggXy3 | lgp3;
in0: 2.82GiB 0:00:02 [1.03GiB/s] [1.03GiB/s] [=============================>] 100%
out9: 11.0 B 0:03:32 [53.0miB/s] [53.0miB/s] [<=> ]
( pvE 0.1 in0 < testcases_more108.txt | gtr -d '\n' | python3 -c ; )
211.47s user 3.02s system 100% cpu 3:32.69 total
1 3031397722
—————————————————————
UPDATE: native python3 code for the same concept. Even with my horrific Python skills, I'm seeing a 4x speedup over the python3 one-liner above (3:32.69 down to 51.278 s total):
% echo; ( time ( pvE0 < testcases_more108.txt \
\
|python3 -c 'import re, sys;
print(sum([ sum(int(_)*re.subn(_,"",__)[1]
for _ in [r"1",r"2", r"3",r"4",
r"5",r"6",r"7",r"8",r"9"])
for __ in sys.stdin ]))' |pvE9))|gcat -n| ggXy3|lgp3
in0: 1.88MiB 0:00:00 [18.4MiB/s] [18.4MiB/s] [> ] 0% ETA 0:00:00
out9: 0.00 B 0:00:51 [0.00 B/s] [0.00 B/s] [<=> ]
in0: 2.82GiB 0:00:51 [56.6MiB/s] [56.6MiB/s] [=============================>] 100%
out9: 11.0 B 0:00:51 [ 219miB/s] [ 219miB/s] [<=> ]
( pvE 0.1 in0 < testcases_more108.txt | python3 -c | pvE 0.1 out9; )
48.07s user 3.57s system 100% cpu 51.278 total
1 3031397722
Even the smaller test case managed a 1.42x speedup:
echo; ( time (nice echo 33785139853861968123689586196851968365819658395186596815968159826259681256852169852986 \
| mawk2 'gsub(//,($_)($_)$_)+gsub(//,$_)+1' ORS='' | pvE0 | python3 -c 'import re, sys; print(sum([ sum(int(_)*re.subn(_,"",__)[1] for _ in [r"1",r"2", r"3",r"4",r"5",r"6",r"7",r"8",r"9"]) for __ in sys.stdin ]))' | pvE9 )) |gcat -n | ggXy3 | lgp3
in0: 484MiB 0:00:00 [2.02GiB/s] [2.02GiB/s] [ <=> ]
out9: 11.0 B 0:00:24 [ 451miB/s] [ 451miB/s] [<=> ]
( nice echo | mawk2 'gsub(//,($_)($_)$_)+gsub(//,$_)+1' ORS='' | pvE 0.1 in0)
20.04s user 5.10s system 100% cpu 24.988 total
1 2822068024
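For readers who find the one-liners hard to follow, here is a minimal sketch of the same counting idea as a standalone script: instead of converting every character to an integer, count how often each digit 1 to 9 occurs and multiply the count by the digit. Note that this sketch swaps `re.subn` for `str.count`, which is not what was timed above, and the names `digit_sum`, `digit_sum_stream` and `chunk_size` are made up for illustration. It has not been benchmarked, so no speed claim is attached to it.

```python
import io


def digit_sum(text: str) -> int:
    """Sum the decimal digits in text by counting each digit 1-9."""
    # Zeros contribute nothing, and non-digit characters (newlines, spaces)
    # never match any of the nine patterns, so they are ignored.
    return sum(d * text.count(str(d)) for d in range(1, 10))


def digit_sum_stream(stream, chunk_size: int = 1 << 20) -> int:
    """Same idea, reading a text stream in chunks (chunk_size is an assumed default)."""
    # Digit counts are additive, so chunk boundaries cannot change the result.
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        total += digit_sum(chunk)
    return total


# Small demonstration on an in-memory stream: 1+2+3+4+5 = 15
print(digit_sum_stream(io.StringIO("123\n45\n")))
```

To try it on real input, pass `sys.stdin` to `digit_sum_stream` in place of the `io.StringIO` object. Reading in chunks means the script never has to hold a multi-gigabyte line in memory at once.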