ieee-754 Questions
2
Solved
Given a 64-bit floating point type (ieee-754), and a closed range [R0,R1] assuming both R0 and R1 are within the representable range and are not NaN or +/-inf etc.
How does one calculate the number...
Stellite asked 7/9, 2024 at 22:27
2
Solved
I just read this topic (especially the last comments).
Then I was wondering, why we actually need this was of giving the remainder. But it seems, that not many people "on google" were interested ...
Mayday asked 31/10, 2014 at 10:4
1
Solved
Intro:
With Java floats, I noticed that when you add 1.0 to a certain range of tiny negative numbers, it equals 1.0. I decided to investigate this and learned a lot about how floats work in my ques...
8
Anyone have a recommendation on a good compression algorithm that works well with double precision floating point values? We have found that the binary representation of floating point values resul...
Halo asked 10/2, 2010 at 17:5
3
Solved
When creating a nan which contains a char-sequence, how do I get the char-sequence back? [duplicate]
When doing
const double d = std::nan ("Hello");
you get a NAN containing the string "Hello". How can one back out this string from the variable d? Is there simply no standard conforming wa...
Enlargement asked 16/8, 2018 at 7:41
9
Solved
Is there a reasonably fast way to extract the exponent and mantissa from a Number in Javascript?
AFAIK there's no way to get at the bits behind a Number in Javascript, which makes it seem to me th...
Datha asked 21/2, 2012 at 19:9
4
Solved
JavaScript uses IEEE 754 for storing numbers, for both integer and floating point values, where 53 bits are used for representing the mantissa and 11 bits are used for representing the exponent.
T...
Hoopla asked 14/3, 2014 at 10:48
9
Solved
In my C++ program, I need to pull a 64 bit float from an external byte sequence. Is there some way to ensure, at compile-time, that doubles are 64 bits? Is there some other type I should use to sto...
7
Solved
Hello I have this code in C#:
float n = 2.99499989f;
MessageBox.Show("n = " + n.ToString("f2", CultureInfo.InvariantCulture));
And this code in C++:
float n = 2.99499989f;
printf("n = %.2f", n)...
Paunch asked 18/1, 2010 at 10:58
3
Solved
Whenever I search for the term 'denormal numbers' or 'denormals', I only find ways how to detect them and round them to zero. Apparently, nobody really likes them, because dealing with them incurs ...
Tarrance asked 14/5, 2012 at 12:27
3
Solved
How do I subtract IEEE 754 numbers?
For example: 0,546875 - 32.875...
-> 0,546875 is 0 01111110 10001100000000000000000 in IEEE-754
-> -32.875 is 1 10000111 01000101111000000000000 in IEEE-754
...
Stigmatize asked 7/1, 2012 at 0:29
5
Solved
This is not a duplicate of this, I'll explain here.
Consider x = 1.2. I'd like to separate it out into 1 and 0.2. I've tried all these methods as outlined in the linked question:
In [370]: x = 1....
Savarin asked 19/7, 2017 at 15:10
0
Note: this question is about CPU instructions, not high-level languages (where you are at the mercy of the compiler)
From a popular answer:
The same floating-point operations, run on the same har...
Dunkirk asked 13/11, 2023 at 19:25
14
Solved
I want to take a floating point number in C++, like 2.25125, and a int array filled with the binary value that is used to store the float in memory (IEEE 754).
So I could take a number, and end up...
Arias asked 23/1, 2009 at 18:45
4
Solved
I am writing a video game, Humm and Strumm, which requires a network component in its game engine. I can deal with differences in endianness easily, but I have hit a wall in attempting to deal with...
Placard asked 27/4, 2010 at 19:27
2
Solved
What is the most accurate way to compute mean of two numbers in floating-point arithmetic? Let's consider the most common double precision 64-bit numbers.
(a + b) / 2
a / 2 + b / 2
a + (b - a) /...
Cardona asked 25/9, 2023 at 13:33
10
I've been trying to find out the reason, but I couldn't.
Can anybody help me?
Look at the following example.
float f = 125.32f;
System.out.println("value of f = " + f);
double d = (double) 125.32...
Dhaulagiri asked 6/7, 2013 at 16:29
3
Solved
I know in C and Java, float's underlying representation is IEEE754-32, double is IEEE754-64.
In expressions, float will be auto-promoted to double. So how?
Take 3.7f for example. Is the process ...
Markowitz asked 25/8, 2012 at 2:47
3
I want to test if a number is positive or negative, especially also in the case of zero. IEEE-754 allows for -0.0, and it is implemented in Python.
The only workarounds I could find were:
def te...
3
Solved
In order to compare two floats (float64) for equality in Go, my superficial understanding of IEEE 754 and binary representation of floats makes me think that this is a good solution:
func Equal(a,...
Casta asked 25/12, 2017 at 14:11
11
Solved
What is the biggest "no-floating" integer that can be stored in an IEEE 754 double type without losing precision?
In other words, at would the follow code fragment return:
UInt64 i = 0;
D...
Variolite asked 4/12, 2009 at 18:12
2
Solved
I'm reading a book about OPENGL ES development. I came across to this sentence.
A float in Java has 32 bits of precision, while a byte has 8 bits of precision. This might seem like an obvious poin...
Pucker asked 12/2, 2014 at 11:40
8
Solved
Is there any faster alternative to the following expression:
Math.pow(2,Math.floor(Math.log(x)/Math.log(2)))
That is, taking the closest (smaller) integer power of 2 of a double? I have such exp...
Retrograde asked 17/11, 2014 at 3:43
4
Solved
In regards to IEEE-754 single precision floating point, how do you perform round to nearest, where ties round to the nearest even digit in the required position (the default and by far the most com...
Flux asked 24/1, 2012 at 4:12
1
numpy allows one to handle IEEE-754 exceptions originating from floating-point by using np.seterr appropriately. However seterr only supports the following keywords each corresponding to a IEEE-754...
Callaghan asked 6/10, 2020 at 10:42
1 Next >
© 2022 - 2025 — McMap. All rights reserved.