I'm currently working on a template-metaprogramming-based implementation of floating-point arithmetic. The template which represents compile-time float values is as follows:
template<bool S , std::int16_t E , std::uint64_t M>
struct number{};
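For instance, assuming the mantissa is kept normalized with its highest set bit at position 31 (the convention used by the attempts below) and that true stands for a non-negative sign, even a trivial value has to be spelled out by hand:

using one_half = number<true , -32 , 0x80000000>; // 2^31 * 2^-32 = 0.5

Getting the shift and the exponent right by hand for every constant is exactly the error-prone part.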
Since initializing such values using hard-coded mantissas, exponents, etc. is a cumbersome and bug-prone process, I have written a template for converting decimal values to floating-point ones:
template<std::int64_t INT , std::uint64_t DECS>
struct decimal{};
The first parameter represents the integral part and the second the fractional digits. I think this is a common and well-known approach.
However, this pattern suffers from some issues (how can I enter negative numbers between -1 and 0?), and one of the most annoying for me is that there is no way to enter zero digits right after the decimal point, i.e., numbers like 0.00032.
I'm aware of C++11, and I was thinking about a user-defined-literal + decltype() approach (even with a macro #define FLOAT(x) decltype(x##_MY_LITERAL)), but I'm not sure that approach works in all contexts; in particular, whether the literal + decltype combination is usable as a template argument.
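For what it's worth, here is a minimal sketch of that idea (the suffix, the carrier type, and the parser placeholder are hypothetical, not part of my library; a lowercase suffix is used since identifiers beginning with an underscore and an uppercase letter are reserved). Since the literal is a constant expression, decltype of it is usable as a template argument:

// Hypothetical carrier type: stores the characters of the literal as
// non-type template parameters; a parsing metafunction would turn them
// into a floating-point number<S,E,M>.
template<char... Chars>
struct literal_chars {};

// Raw literal operator template: receives the literal character by character.
template<char... Chars>
constexpr literal_chars<Chars...> operator"" _my_literal() { return {}; }

#define FLOAT(x) decltype(x##_my_literal)

// Any metafunction taking the encoded literal as a type parameter:
template<typename Literal>
struct parse_float { /* digit parsing would go here */ };

using small = parse_float<FLOAT(0.00032)>; // OK in template-argument context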
Even if that could work, I want to know if there are other possible approaches to this problem. So, what alternatives are there for floating-point-like initialization at compile time via TMP?
Attempted alternatives:
Just for completeness' sake, I will describe the alternatives I have implemented, how they work, and their pros and cons. The question itself remains open, to allow anybody to add more alternatives.
Some background
First I will describe the features I have used, just to make sure everybody understands the code.
My library, The Turbo Metaprogramming Library, is based on three principles:
Type template parameters only: Being completely generic while mixing type parameters, value parameters, and template-template parameters is really hard (near impossible), so this library uses type parameters only. Whenever it is necessary to use values or templates, the library provides wrappers to pass such parameters through boxing.
Uniform expression evaluation: One of the first needs when working in a programming language is a way to evaluate expressions and obtain their value. Turbo provides the tml::eval metafunction, which takes any kind of expression and returns (evaluates) its value.
Generic algorithms and metafunctions customized via template specialization: Whenever I can, I use C++11 template aliases to avoid the cumbersome typename ::type construction. My convention is to define implementation templates (the metafunctions which really do the work) in a nested impl namespace, and a C++11 template alias to the result in the current namespace. Since such aliases return the result directly, they are not evaluable inside a complex expression (consider a metafunction instantiation add<X,Y>, where X and Y are variables of a lambda: if add were an alias to its result, that wouldn't work, because there is nothing left to evaluate). If we need the expression (metafunction) instead of its result directly, my convention is to put an alias to the metafunction in a func nested namespace.
Here are some examples:
using bits = tml::util::sizeof_bits<int>; // bits is a size_t integral constant with the size in bits of an int
//A metafunction which returns the size on bits of a type doubled
using double_size = tml::lambda<_1 , tml::mul<tml::util::func::sizeof_bits<_1>,tml::Int<2>> >;
using int_double_size = tml::eval<double_size,int>; //Read as "double_size(int)"
tml is the main namespace of the library, and the floating-point features are exposed in the tml::floating namespace.
TL;DR
tml::eval takes any expression and evaluates it, returning its value. It's a C++11 template alias, so typename ::type is not needed. tml::integral_constant (just an alias of std::integral_constant) is the de facto value wrapper for passing value parameters as type parameters through boxing. The library has the convention of using type parameters only (there are wrappers for template-template parameters too, see tml::lazy and tml::bind).
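A minimal illustration of these conventions (assuming tml::mul accepts boxed integral values and is directly evaluable, like tml::add in the attempts below):

using two  = tml::integral_constant<int,2>; // a value boxed as a type
using four = tml::eval<tml::mul<two,two>>;  // evaluate the expression; no typename ::type needed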
Attempt 1: From integer
Here we define a metafunction integer which returns a floating-point value from an integer one:
template<std::int64_t mantissa , sign_t S = (sign_t)(mantissa >= 0)>
struct integer
{
    using m   = tml::floating::number<S,0,static_cast<mantissa_t>((mantissa >= 0) ? mantissa : -mantissa)>;
    using hsb = tml::floating::highest_set_bit<m>;

    static constexpr const exponent_t exp = hsb::value - 31;

    using result = tml::floating::number<S,exp,(m::mantissa << (31 - hsb::value))>; // Note the number is normalized
};
What it does is take the integral value directly, use it as the mantissa, and normalize the number by explicitly computing the highest (most significant) set bit and shifting the mantissa accordingly.
An example of its usage could be:
using ten = tml::floating::integer<10>;
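To see what the metafunction computes, here is a standalone constexpr sketch of the same normalization (the function is illustrative, not part of the library):

#include <cstdint>

// Index of the most significant set bit (m must be non-zero).
constexpr int highest_set_bit(std::uint64_t m , int i = 0)
{
    return (m >> 1) ? highest_set_bit(m >> 1 , i + 1) : i;
}

// For 10 (binary 1010): hsb = 3, so exp = 3 - 31 = -28 and the mantissa is
// shifted to 10 << 28 = 0xA0000000; indeed 0xA0000000 * 2^-28 == 10.
static_assert(highest_set_bit(10) == 3 , "");
static_assert((10ull << (31 - highest_set_bit(10))) == 0xA0000000ull , "");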
Advantages:
- Efficiency: no extra complex computations are required to obtain the equivalent floating-point number; the only relevant operation is the call to highest_set_bit.
- The number is normalized by default (which is also good for efficiency), and there are no precision issues (at least not for small values).
Disadvantages:
- Only works with integral values.
Attempt 2: Decimal initialization
This alternative uses a pair of integral values to represent the integral and fractional parts of the number respectively:
template<std::int64_t INTEGRAL , std::uint64_t FRACTIONAL>
struct decimal{ ... };
using pi = decimal<3,141592654>;
What it does is compute the value of the integral part (just a call to integer, the previous attempt) and the value of the fractional part.
The value of the fractional part is the value of the integer adjusted until the radix point is at the beginning of the number. In other words:
fractional_value = integer<fractional_part> / 10^number_of_digits
Then the value of the number is just the sum of both values:
result = integer_part_value + fractional_value
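For example, for pi = decimal<3,141592654> the fractional part has 9 digits, so fractional_value = 141592654 / 10^9 = 0.141592654 and result = 3 + 0.141592654 = 3.141592654.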
The number of digits of an integral number is log10(number) + 1. I have ended up with a log10 metafunction for integral values that doesn't require recursion:
template<typename N>
struct log10
{
    using result = tml::Int<(0  <= N::value && N::value < 10 ) ? 0 :
                            (10 <= N::value && N::value < 100) ? 1 :
                            ...
                           >;
};
So it has O(1) complexity (Measuring template instantiation depth, of course).
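For reference, a standalone constexpr sketch of the same branching idea (only the first few checks are spelled out here; the metafunction continues them up to 10^19):

#include <cstdint>

constexpr int log10_sketch(std::uint64_t n)
{
    return n < 10u     ? 0 :
           n < 100u    ? 1 :
           n < 1000u   ? 2 :
           n < 10000u  ? 3 :
           n < 100000u ? 4 :
                         5; // ...and so on up to 10^19 in the real code
}

static_assert(log10_sketch(999)  == 2 , "");
static_assert(log10_sketch(1000) == 3 , "");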
With this metafunction, the formula above becomes:
// First some aliases, to make the code more handy:
using fractional_i = tml::integral_constant<std::uint64_t,FRACTIONAL>;
using integral_f   = tml::floating::integer<INTEGRAL>;
using fractional_f = tml::floating::integer<FRACTIONAL>;
using ten          = tml::floating::integer<10>;
using one          = tml::Int<1>;

// Note the digit count is taken from the fractional part:
using fractional_value = tml::eval<tml::div<fractional_f,
                                            tml::pow<ten,
                                                     tml::add<tml::log10<fractional_i>,
                                                              one
                                                             >
                                                    >
                                           >
                                  >;
And then the result is:
using result = tml::eval<tml::add<integral_f,fractional_value>>;
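To make the arithmetic concrete, here is a plain constexpr sketch of the same decomposition, using double just to check the numbers (the library, of course, does all of this with its own number type):

#include <cstdint>

constexpr int digits(std::uint64_t n)    // number of decimal digits of n
{ return n < 10 ? 1 : 1 + digits(n / 10); }

constexpr double pow_of_10(int e)        // 10^e for e >= 0
{ return e == 0 ? 1.0 : 10.0 * pow_of_10(e - 1); }

constexpr double decimal_value(std::int64_t i , std::uint64_t f)
{ return static_cast<double>(i) + static_cast<double>(f) / pow_of_10(digits(f)); }

static_assert(decimal_value(3 , 141592654) > 3.1415 &&
              decimal_value(3 , 141592654) < 3.1416 , "");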
Advantages:
- Allows instantiating non-integral values like 12.123.
Disadvantages:
- Performance: tml::pow is recursive, with O(n) complexity. tml::div for floating-point values is implemented as a multiplication of the numerator by the reciprocal of the denominator, and that reciprocal is computed by a Newton-Raphson approximation (five iterations by default).
- Precision issues: the sequential multiplications done to compute the power can accumulate small precision errors, and the same goes for the Newton-Raphson approximation used to compute the division.
- The notation is limited: there is no way to specify fractional parts with leading zeros, say 13.0004, since those zeros cannot be encoded in an integer argument (0004 is just an octal literal for 4).
Attempt 3 (3.1 and 3.2): Decimal scientific notation
Instead of writing the number using hard-coded digits, we use decimal (power of 10) scientific notation to initialize floating-point numbers:
using pi = tml::floating::decimal_sci<3141592654,-9>; //3141592654 x 10^-9
To compute the number you only have to take the value of the significand and multiply it by the corresponding power of 10:
template<std::int64_t S , std::int64_t E>
struct decimal_sci
{
    using significant = tml::floating::integer<S>;
    using power       = tml::eval<tml::pow<tml::floating::integer<10>,tml::Int<E>>>;
    using result      = tml::eval<tml::mul<significant,power>>;
};
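Again, a plain constexpr check (double arithmetic, outside the library) of what decimal_sci computes:

constexpr double pow_of_10s(int e)       // 10^e, e may be negative
{ return e == 0 ? 1.0 : e > 0 ? 10.0 * pow_of_10s(e - 1) : pow_of_10s(e + 1) / 10.0; }

constexpr double sci_value(long long s , int e)
{ return static_cast<double>(s) * pow_of_10s(e); }

static_assert(sci_value(3141592654 , -9) > 3.1415 &&
              sci_value(3141592654 , -9) < 3.1416 , "");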
There is an improvement on this attempt, which treats the given significand as if it were normalized to a single integer digit. So a value 0.0034565432 can be written as (34565432 , -3) instead of (34565432 , -10).
I call it tml::floating::decimal_scinorm:
template<std::int64_t S , std::int64_t E = 0>
struct decimal_scinorm
{
    using significant_i = tml::integral_constant<std::int64_t,S>;
    using exponent_i    = tml::integral_constant<std::int64_t,E>;

    using adjust  = tml::eval<tml::log10<significant_i>>;
    using new_exp = tml::eval<tml::sub<exponent_i,adjust>>;

    using result = typename decimal_sci<S,new_exp::value>::result;
};
using pi = tml::floating::decimal_scinorm<3141592654>; //3.141592654
using i = tml::floating::decimal_scinorm<999999,-4>; //0.000999999
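Tracing the second example: adjust = log10(999999) = 5, new_exp = -4 - 5 = -9, so the result is decimal_sci<999999,-9>, i.e. 999999 x 10^-9 = 0.000999999.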
Advantages:
- Deals with wide numbers, leading zeros included, in a simple way.
- Uses a well-known notation; no syntactic tricks involved.
Disadvantages:
- Poor precision with very large/small numbers (well, that's expected, since that's how scientific notation works). Note that the internal floating-point computations can accumulate precision errors, proportional to the length of the mantissa and the exponent of the number. These are the same precision errors as in the attempts above (from the usage of tml::pow, tml::div, etc.).