How can I define my own floating-point format (type) with a specific precision and a specific number of exponent and significand bits? For example, a 128-bit floating-point number with a 20-bit exponent and a 107-bit significand (instead of the standard 15/112-bit split), or a 256-bit one with a 19/236-bit exponent/significand.
How to define a custom floating-point format (type) in C++?
There are 2 ways to do this. You can create your own class where you have a member for the exponent and a member for the mantissa, and you can write code for the operators you need, and then implement all of the functions you'd need that normally exist in the standard math library. (Things like atan(), sin(), exp() and pow().)
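As a rough illustration of the first approach, here is a minimal sketch of such a class. Everything in it is hypothetical (the names SoftFloat, make, from_double, to_double and the alias Float20x24 are mine, not from any library). To stay short it assumes the significand has at most 32 bits so that a product fits in a 64-bit integer, it rounds by truncation, and it treats overflow, underflow, NaN and infinity as out of scope.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <utility>

// Hypothetical sketch: value = (-1)^neg * sig * 2^exp, with the three parts
// kept in separate, unpacked members. sig is 0 or has bit P-1 set.
template <int ExpBits, int P>  // P = significand precision in bits
class SoftFloat {
    static_assert(P >= 2 && P <= 32, "product of two significands must fit in 64 bits");
    static_assert(ExpBits >= 2 && ExpBits <= 24, "exponent is held in an int32_t");

    bool neg_ = false;
    std::int32_t exp_ = 0;
    std::uint64_t sig_ = 0;

    // Bring sig into [2^(P-1), 2^P), truncating any bits shifted out.
    static SoftFloat make(bool neg, std::int64_t exp, std::uint64_t sig) {
        SoftFloat r;
        if (sig == 0) return r;  // canonical +0
        while (sig >> P) { sig >>= 1; ++exp; }
        while (!(sig >> (P - 1))) { sig <<= 1; --exp; }
        // The exponent of the leading bit must fit the chosen exponent width.
        const std::int64_t e = exp + P - 1;
        const std::int64_t emax = (std::int64_t{1} << (ExpBits - 1)) - 1;
        assert(e >= 1 - emax && e <= emax && "overflow/underflow not handled");
        r.neg_ = neg;
        r.exp_ = static_cast<std::int32_t>(exp);
        r.sig_ = sig;
        return r;
    }

public:
    SoftFloat() = default;

    static SoftFloat from_double(double d) {
        assert(std::isfinite(d));
        int e = 0;
        const double m = std::frexp(std::fabs(d), &e);  // |d| = m * 2^e, m in [0.5, 1)
        const auto sig = static_cast<std::uint64_t>(std::ldexp(m, P));  // truncates
        return make(std::signbit(d), std::int64_t{e} - P, sig);
    }

    double to_double() const {
        const double mag = std::ldexp(static_cast<double>(sig_), exp_);
        return neg_ ? -mag : mag;
    }

    friend SoftFloat operator*(const SoftFloat& a, const SoftFloat& b) {
        // Both significands are below 2^32, so the exact product fits in 64 bits.
        return make(a.neg_ != b.neg_, std::int64_t{a.exp_} + b.exp_, a.sig_ * b.sig_);
    }

    friend SoftFloat operator+(SoftFloat a, SoftFloat b) {
        if (a.sig_ == 0) return b;
        if (b.sig_ == 0) return a;
        if (a.exp_ < b.exp_) std::swap(a, b);  // a now has the larger exponent
        const std::int64_t d = std::int64_t{a.exp_} - b.exp_;
        if (d > 31) return a;  // b is below one unit in the last place of a: dropped
        const std::uint64_t big = a.sig_ << d;  // a's significand aligned to b's exponent
        if (a.neg_ == b.neg_) return make(a.neg_, b.exp_, big + b.sig_);
        if (big >= b.sig_) return make(a.neg_, b.exp_, big - b.sig_);
        return make(b.neg_, b.exp_, b.sig_ - big);
    }
};

// Example instantiation: 20-bit exponent range, 24-bit significand.
using Float20x24 = SoftFloat<20, 24>;
```

With this, an expression such as (Float20x24::from_double(1.5) * Float20x24::from_double(-2.25)).to_double() evaluates to -3.375. A 107-bit significand would not fit this scheme: the significand would have to become an array of machine words with multi-word add, shift and multiply routines, and that is before correct rounding or any of the math functions, which is where most of the work mentioned above comes from.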
Or you can find an existing arbitrary precision library and use it instead. While implementing it yourself would be interesting and fun, it is likely to have a lot of errors in it and to be an extremely large amount of work, unless your use-case is extremely constrained.
Wikipedia has a list of arbitrary precision math libraries that you can look into for yourself.
Indeed, trying to pack the components as bitfields makes absolutely no sense once you depart from the particular formats that match the hardware circuits. – Cosenza
long double on x87) – Garrek
float and double math can be implemented in the CPU, but this is not true for all CPUs, and therefore not for all C compilers. Many old compilers (including some for the i386) implement floating-point arithmetic in software. In other words, there is a full implementation in C or assembly rather than in CPU hardware. – Ignorance