Since I'm the author of the y-cruncher program that you mention, I'll add my 2 cents.
For such a large task, the two biggest barriers that must be tackled are as follows:
- Memory
- Run-time Complexity
Memory
2 trillion digits is extreme - to say the least. That's double the current record set by Shigeru Kondo and myself back in 2010. (It took us more than 9 days to compute 1 trillion digits using y-cruncher.)
In plain text, that's about 1.8 TiB in decimal. In packed binary representation, that's 773 GiB.
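To sanity-check those figures (my own back-of-the-envelope arithmetic, at one byte per digit in text and log2(10) ≈ 3.32 bits per digit when packed):

$$2 \times 10^{12}\ \text{bytes} \approx 1.82\ \text{TiB}, \qquad \frac{2 \times 10^{12} \cdot \log_2 10}{8}\ \text{bytes} \approx 8.3 \times 10^{11}\ \text{bytes} \approx 773\ \text{GiB}.$$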
If you're going to be doing arithmetic on numbers of this size, you're gonna need 773 GiB for each operand, not counting scratch memory.
Feasibly speaking, y-cruncher actually needs 8.76 TiB of memory to do this computation all in RAM. So you can expect other implementations to need the same, give or take a factor of 2 at most.
That said, I doubt you're gonna have enough RAM. And even if you did, it'd be heavily NUMA. So the alternative is to use disk. But this is not trivial: to be efficient, you need to treat memory as a cache and micromanage all the data that is transferred between memory and disk.
Run-time Complexity
Here we have the other problem. For 2 trillion digits, you're gonna need a very fast algorithm. Not just any fast algorithm, but a quasi-linear run-time algorithm.
Your current attempt runs in about O(N^2), so even if you had enough memory, it won't finish in your lifetime.
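To put a number on "your lifetime" (a rough estimate of my own, assuming a machine doing on the order of 10^9 digit-operations per second):

$$N^2 = \left(2 \times 10^{12}\right)^2 = 4 \times 10^{24}\ \text{operations} \;\approx\; 4 \times 10^{15}\ \text{seconds} \;\approx\; 10^{8}\ \text{years}.$$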
The standard approach to computing e to high precision runs in O(N log(N)^2) and combines the following two algorithms:
- Binary splitting of the Taylor series: e = 1 + 1/1! + 1/2! + 1/3! + ...
- FFT-based large multiplication.
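To make the binary-splitting half concrete, here is a minimal sketch of the whole pipeline, assuming GMP's C++ interface (gmpxx.h). This is my illustration, not y-cruncher's actual code: it keeps everything in memory, so it works for a few hundred thousand digits but will never scale to trillions.

```cpp
#include <gmpxx.h>
#include <cmath>
#include <iostream>
#include <string>

// Binary splitting on e = 1 + 1/1! + 1/2! + ...
// After sum(a, b, T, Q):  T/Q = sum over k in (a, b] of 1/(k!/a!),
// so T/Q for (0, n] is 1/1! + 1/2! + ... + 1/n!.
static void sum(unsigned long a, unsigned long b, mpz_class& T, mpz_class& Q) {
    if (b - a == 1) { T = 1; Q = b; return; }   // single term
    unsigned long m = a + (b - a) / 2;
    mpz_class T1, Q1, T2, Q2;
    sum(a, m, T1, Q1);
    sum(m, b, T2, Q2);
    T = T1 * Q2 + T2;   // T/Q = T1/Q1 + (1/Q1)*(T2/Q2)
    Q = Q1 * Q2;        // GMP's FFT multiplication kicks in for big operands
}

int main() {
    const unsigned long digits = 100000;    // decimal digits wanted

    // Find n with n! > 10^digits; the tail of the series is then negligible.
    unsigned long n = 1;
    for (double lg = 0; lg < digits * 3.3220; ++n)
        lg += std::log2(static_cast<double>(n));

    mpz_class T, Q;
    sum(0, n, T, Q);

    // e ~ 1 + T/Q; scale by 10^digits and do one big (truncating) division.
    // A real implementation would use Newton's method here instead.
    mpz_class scale;
    mpz_ui_pow_ui(scale.get_mpz_t(), 10, digits);
    mpz_class e_scaled = (Q + T) * scale / Q;   // last digit or two may be off

    std::string s = e_scaled.get_str();
    std::cout << s[0] << "." << s.substr(1, 50) << "...\n";
}
// Build with: g++ -O2 e.cpp -lgmpxx -lgmp
```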
Fortunately, GMP already uses FFT-based large multiplication. But it lacks two crucial features:
- Out-of-core (swap) computation to use disk when there isn't enough memory.
- It isn't parallelized.
The second point isn't as important since you can just wait longer. But for all practical purposes, you're probably gonna need to roll your own. And that's what I did when I wrote y-cruncher.
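If you just want to see what "rolling your own" parallelism looks like at the very top of the recursion, a crude option (my sketch, building on the sum() helper above) is to run the two halves of each split concurrently down to a fixed depth:

```cpp
#include <future>

// Parallel wrapper around the serial sum() from the previous sketch.
// Each thread writes to distinct mpz_class objects, so no shared state.
static void sum_parallel(unsigned long a, unsigned long b,
                         mpz_class& T, mpz_class& Q, int depth = 3) {
    if (depth == 0 || b - a < 1024) { sum(a, b, T, Q); return; }
    unsigned long m = a + (b - a) / 2;
    mpz_class T1, Q1, T2, Q2;
    auto left = std::async(std::launch::async,
                           [&] { sum_parallel(a, m, T1, Q1, depth - 1); });
    sum_parallel(m, b, T2, Q2, depth - 1);
    left.wait();
    T = T1 * Q2 + T2;
    Q = Q1 * Q2;
}
```

Note the catch: the huge multiplications near the top of the tree (and the final division) still run on one core, which is exactly why a serious implementation needs parallelism inside the multiplication itself.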
That said, there are many other loose-ends that also need to be taken care of:
- The final division will require a fast algorithm like Newton's Method (see the iteration sketched after this list).
- If you're gonna compute in binary, you're gonna need to do a radix conversion.
- If the computation is gonna take a lot of time and a lot of resources, you may need to implement fault-tolerance to handle hardware failures.
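For reference, the textbook Newton iteration for a reciprocal (I'm not claiming this exact formulation is what y-cruncher uses) is

$$x_{k+1} = x_k \left( 2 - d\,x_k \right), \qquad x_k \to \frac{1}{d},$$

which roughly doubles the number of correct digits per step, so the full-precision division ends up costing only a small constant factor more than one full-precision multiplication.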
Comments:
- Note that GMP uses an `int` to count the limbs, so with the usual 32-bit two's complement `int`s, you couldn't get more than 2^37 - 64 bits of precision. That can be less than your RAM. – Korey
- 1 + 1/2 + 1/6 + ... is not the same as (1 + 1 + 1 + ...) / (1 + 2 + 6 + ...). – Redbud
- As for `mpq_add`, I am quite sure that `q` stands for the field of rationals (and of course `add` stands for add). I wouldn't be too worried. – Miquelon