Which scripting languages support long (64 bit) integers well?

Perl has long been my scripting language of choice, but I've run into a horrible problem: by default there is no support for long (64-bit) integers. Most of the time an integer is just a string, and that works for seeking in huge files, but there are plenty of places where it doesn't work, such as binary &, printf, pack, unpack, <<, and >>.

Now these do work in newer versions of Perl but only if it is built with 64-bit integer support, which does not help if I want to make portable code to run on Perls built without this option. And you don't always get control over the Perl on a system your code runs on.

My question is: are Python, PHP, and Ruby free of this problem, or do they also depend on version and build options?

Secundine answered 15/12, 2010 at 16:22 Comment(6)
Out of curiosity, any reason why use bigint; isn't enough? - Beady
@Hugmeir: Just that it's slow. I'm processing MediaWiki dump files which can be multiple terabytes in size! - Secundine
@hippietrail, are you sure the 64 bit numbers are to blame? - Salomie
@Winston Ewert, I'll do some timing tests. I also want to test some methods using floating point that appeared on answers to similar questions. The bigint solution makes the cleanest code but if the speed penalty is too high I might go with another solution. - Secundine
OK I finally got some testing done on a box with a 64-bit perl. I'm processing an XML file with over 2 million entries. With 64-bit ints: 1m11.916s - With "use bigint" 36m37.102s - over 30 times slower! - Secundine
With "use Math::Int64" on a Perl with 64-bit support the time is: real 1m14.215s. Still need to test Math::Int64 on a Perl without 64-bit support. - Secundine
H
14

The size of high speed hardware integers (assuming the language has them) will always be dependent on whatever size integers are available to the compiler that compiled the language interpreter (usually C).

If you need cross-platform / cross-version big integer support, the Perl pragma use bigint; will do the trick. If you need more control, bigint is a wrapper around the module Math::BigInt.

In the scope where use bigint; is loaded, all of the integers in that scope will be transparently upgraded to Math::BigInt numbers. Lastly, when using any sort of big number library, be sure to not use tricks like 9**9**9 to get infinity, because you might be waiting a while :)

Hightension answered 15/12, 2010 at 16:32 Comment(1)
I've accepted that use Math::BigInt / bigint is the best solution in Perl for now but I'm still a bit disappointed, especially that pack / unpack with "Q" only works with a 64-bit build. - Secundine
S
4

In Python, you never get overflows. Instead, Python automatically switches the number implementation it uses. The basic implementation uses the platform's native ints, but long integers use an arbitrary-precision implementation. As a result, you never have to worry about your numbers becoming too large; Python just handles it naturally.
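
As a rough illustration of the point above, here is a minimal sketch assuming Python 3, where there is a single int type that grows as needed. It applies the operations listed in the question to a value wider than 32 bits. The variable names and the MASK64 helper are made up for this example.

```python
# Assumes Python 3: one int type with arbitrary precision.
offset = 5 * 2**40  # a byte offset of 5 TiB, well past 32 bits

# The operations from the question, applied to a value wider than 32 bits.
masked = offset & 0xFFFFFFFF00000000            # bitwise AND
shifted_left = offset << 8                      # left shift
shifted_right = offset >> 40                    # right shift
formatted = "%d / 0x%016x" % (offset, offset)   # printf-style formatting

# Results grow past 64 bits instead of wrapping around.
beyond_64 = 2**64 + 1

# To emulate fixed 64-bit unsigned wraparound, mask explicitly.
MASK64 = (1 << 64) - 1  # hypothetical helper constant
wrapped = (beyond_64 * 3) & MASK64

print(formatted)
```

The explicit mask only matters when the result has to fit a fixed-width field; otherwise the value can simply be left to grow.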

Salomie answered 15/12, 2010 at 16:33 Comment(3)
So you never know why your script is dog slow on some machine vs another one. Clever :) - Oxidate
@wazoox, better than giving incorrect results on some machines rather than others. - Salomie
It seems that the standard distribution of Python 3 on Windows supports 64-bit integers, which the standard distribution of Perl on Windows does not, even though it is possible to build Perl with 64-bit integer support on a 32-bit platform with extra build parameters. - Secundine
S
2

Tcl 8.5's long integer support is pretty good from a user perspective. Internally, it represents integers as whatever type is necessary to hold them (up to and including bigints) and things that consume integers will take any of them (though might impose their own limits; you don't really want to use a number that will only fit in a bigint as a Unix file mode...)

The only time you really need to think about it at all is when you're going to/from some fixed-width binary format. That's reasonably obvious though (it's fixed width after all).

Subtlety answered 15/12, 2010 at 16:30 Comment(2)
Indeed I am using them with a fixed width binary format. I'm making a binary index of byte offsets in extremely large text files. The indexes must be fixed width to enable quick binary searching. - Secundine
@hippietrail: Well, in that case use 64-bit values. I've never heard of anyone having a data file that doesn't fit in 8 exabytes, but if you do, have two files. And $env(DEITY) bless you. :-) - Subtlety
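
The fixed-width index described in these comments could look something like the following in Python. This is only a sketch under assumptions: the index is held in memory as bytes rather than read from a file, records are big-endian unsigned 64-bit offsets, and the function names are invented for the example. With an explicit byte-order prefix, the struct module uses the standard 8-byte size for "Q" regardless of the platform's native integer width.

```python
import struct

# One big-endian unsigned 64-bit byte offset per record (assumed layout).
RECORD = struct.Struct(">Q")

def build_index(offsets):
    """Pack byte offsets, in sorted order, into a fixed-width binary index."""
    return b"".join(RECORD.pack(o) for o in sorted(offsets))

def read_offset(index, i):
    """Decode only the i-th record; possible because every record is 8 bytes."""
    return RECORD.unpack_from(index, i * RECORD.size)[0]

def find_entry(index, position):
    """Binary-search for the last record whose offset is <= position.

    Returns the record number, or -1 if position precedes every record.
    """
    lo, hi = 0, len(index) // RECORD.size
    while lo < hi:
        mid = (lo + hi) // 2
        if read_offset(index, mid) <= position:
            lo = mid + 1
        else:
            hi = mid
    return lo - 1

# Offsets beyond 32 bits, as in a multi-terabyte dump file.
index = build_index([0, 3 * 2**32, 2**40, 7 * 2**40])
print(find_entry(index, 2**40 + 5))
```

For an index stored on disk, the same lookup would seek to i * 8 and read 8 bytes instead of slicing an in-memory buffer.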
O
1

Excuse me sir, bigint and Math::BigInt are core modules. Just friggin' use one of them, it will work on any platform.

Oxidate answered 15/12, 2010 at 16:29 Comment(6)
bigint is the best solution I've found so far but it's slowed my script down a lot. Math::Int64 seems much better but it's not a core module. - Secundine
There is no magic. Big number crunching on non-64-bit platforms is slow. You can't have your cake and eat it. - Oxidate
32-bit platforms have long had support for 64-bit integers using methods simpler than "big numbers", just as 16-bit platforms had 32-bit types and 8-bit platforms had 16-bit types. Of course these are slower than native types but they are faster than bigints. Magic is not required. In my case I know I will never even need the full 64 bits. - Secundine
@hippietrail: I usually build Perl with 64-bit int support, but if you can't control the compilation environment, then you should use bigint. It was I who first gave Perl 64-bit int support more than 20 years ago. - Kinard
I just ran into a second problem with "use bigint". It also turns all of my floats into ints )-: - Secundine
Then use Math::BigInt. Read the documentation, the differences are pretty clear (and the fact that 'bigint' turns every number into int). - Oxidate
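
The comment above about 32-bit platforms handling 64-bit values without a big-number library comes down to treating the value as two 32-bit words. A minimal sketch of that idea, written in Python for illustration only (pack_u64 and unpack_u64 are hypothetical names, not part of any library):

```python
import struct

def pack_u64(value):
    """Encode an unsigned 64-bit value as two big-endian 32-bit words."""
    if not 0 <= value < 2**64:
        raise ValueError("value does not fit in 64 bits")
    high, low = divmod(value, 2**32)  # split into upper and lower halves
    return struct.pack(">II", high, low)

def unpack_u64(data):
    """Rebuild the value from its two 32-bit halves."""
    high, low = struct.unpack(">II", data)
    return high * 2**32 + low

value = 5 * 2**40 + 7
encoded = pack_u64(value)
print(encoded.hex())
```

The resulting bytes are identical to a native 64-bit big-endian encoding, so a file written this way stays readable by code that does have 64-bit integers.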
