When is a Decimal considered 'canonical'?

Every source I can find online says that the decimal.Decimal.is_canonical() method returns True if the decimal is canonical.

But what does that even mean? Is it just some term that I do not know?

Chambertin answered 24/1, 2021 at 14:22 Comment(1)
According to the docs, that method will always return True, so the question seems moot. - Rancher
A
4

Contrary to what the other answer claims, canonical is not related to normalize. "Canonical" as used in the normalize docs is not the same as in the canonical or is_canonical methods - if it were the same, then normalize would always return its argument unchanged, as Decimal instances are always canonical as far as is_canonical is concerned.

The decimal module is an implementation of the IBM General Decimal Arithmetic Specification, and Decimal.is_canonical exists for the sole purpose of implementing that specification's is-canonical operation. The spec has this to say about is-canonical:

is-canonical takes one operand. The result is 1 if the operand is canonical; otherwise it is 0. The definition of canonical is implementation-defined; if more than one internal encoding for a given NaN, Infinity, or finite number is possible then one ‘preferred’ encoding is deemed canonical. This operation then tests whether the internal encoding is that preferred encoding.

If all possible operands have just one internal encoding each, then is-canonical always returns 1. This operation is unaffected by context and is quiet – no flags are changed in the context.

If an implementation of the spec has multiple internal encodings of a value, then one encoding is deemed canonical, and is_canonical tests whether a value is encoded that way. If an implementation (like Python's decimal module) does not have multiple encodings of the same value, then the operation always reports that the operand is canonical.
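A quick way to see this behaviour in Python is the sketch below. The sample operands are arbitrary picks for illustration (finite numbers, a signed zero, infinities, NaNs), not an exhaustive test.

```python
from decimal import Decimal

# Arbitrary sample operands chosen for illustration only.
samples = [
    Decimal("2.5"),
    Decimal("2.50"),
    Decimal("-0"),
    Decimal("Infinity"),
    Decimal("-Infinity"),
    Decimal("NaN"),
    Decimal("sNaN"),
]

# Ask each operand whether its encoding is canonical.
results = [d.is_canonical() for d in samples]
print(results)
```

Every entry should come back True, which matches the "just one internal encoding each" case in the spec text quoted above.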

Note that the way the spec defines "finite number", Decimal('2.50') and Decimal('2.5') are different finite numbers, not different internal encodings of the same number. Finite numbers are defined by a sign, coefficient, and exponent, not by their numerical value. 2.50 has coefficient 250 and exponent -2, while 2.5 has coefficient 25 and exponent -1. Thus, there is no problem with is_canonical reporting True for both of these instances.
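The sign/coefficient/exponent triple can be inspected with as_tuple(). This is only a small sketch to make the point concrete; the variable names a and b are mine.

```python
from decimal import Decimal

a = Decimal("2.50")
b = Decimal("2.5")

# as_tuple() exposes the sign, the coefficient digits and the exponent.
print(a.as_tuple())  # digits (2, 5, 0), exponent -2
print(b.as_tuple())  # digits (2, 5), exponent -1

# The two compare equal numerically, yet they are distinct finite numbers
# in the spec's sense, and each one is canonical on its own.
print(a == b)
print(a.is_canonical(), b.is_canonical())
```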

Apatetic answered 19/4 at 21:56 Comment(1)
If by "others" you mean me, I never stated that the canonical method is related to the normalize method. What I said was that by looking at the normalize method documentation one can see how it is using the adjective "canonical". - Corneous
C
5

As @snakecharmerb pointed out, the method will always return True, but I don't believe that makes the question moot. As an aside, the documentation for the canonical() method shows why is_canonical() always returns True:

Return the canonical encoding of the argument. Currently, the encoding of a Decimal instance is always canonical, so this operation returns its argument unchanged.
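The "returns its argument unchanged" part can be checked with a short sketch like the one below. The input strings are arbitrary examples, and comparing str() and as_tuple() is just one way I chose to test for "unchanged".

```python
from decimal import Decimal

# Arbitrary inputs for illustration.
texts = ["2.50", "2E2", "20E1", "-0.000", "Infinity"]

checks = []
for text in texts:
    d = Decimal(text)
    c = d.canonical()
    # "Unchanged" here means same string form and same sign/digits/exponent.
    unchanged = str(c) == str(d) and c.as_tuple() == d.as_tuple()
    checks.append(unchanged)
    print(text, repr(c), unchanged)
```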

Of course, that description does not shed much light on the subject. The documentation for the normalize() method, however, offers some insight:

Normalize the number by stripping the rightmost trailing zeros and converting any result equal to Decimal('0') to Decimal('0e0'). Used for producing canonical values for attributes of an equivalence class. For example, Decimal('32.100') and Decimal('0.321000e+2') both normalize to the equivalent value Decimal('32.1').
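The documentation's example can be reproduced with a sketch like this one, assuming the default context (where the precision is large enough that no rounding happens). The variable names are mine.

```python
from decimal import Decimal

x = Decimal("32.100")
y = Decimal("0.321000e+2")

# Before normalizing, the two have different digits and exponents.
print(x.as_tuple())
print(y.as_tuple())

# After normalizing, both collapse to the same representative.
nx, ny = x.normalize(), y.normalize()
print(nx, ny)

# A zero with trailing zeros normalizes to a plain zero.
zero = Decimal("0.00").normalize()
print(zero)
```

Note that normalize() returns new values here, while x and y themselves stay as they were.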

The normalize() description explains, more or less, what a canonical value is. The decimal FAQ adds:

Q. There are many ways to express the same value. The numbers 200, 200.000, 2E2, and .02E+4 all have the same value at various precisions. Is there a way to transform them to a single recognizable canonical value?

A. The normalize() method maps all equivalent values to a single representative:

>>> from decimal import Decimal
>>> values = map(Decimal, '200 200.000 2E2 .02E+4'.split())
>>> [v.normalize() for v in values]
[Decimal('2E+2'), Decimal('2E+2'), Decimal('2E+2'), Decimal('2E+2')]

Demo of the canonical method

The first three Decimal values, but not the fourth, share the same representation because they have the same value and the same number of significant digits. (canonical() itself returns each argument unchanged, as the documentation quoted above says.)

>>> from decimal import Decimal
>>> values = map(Decimal, '2E2 .2E+3 .02E+4 20E1'.split())
>>> [v.canonical() for v in values]
[Decimal('2E+2'), Decimal('2E+2'), Decimal('2E+2'), Decimal('2.0E+2')]
Corneous answered 24/1, 2021 at 14:40 Comment(7)
"Canonical" as used in the normalize docs is not the same as in the canonical method - if it were, normalize would also return its argument unchanged. - Apatetic
Note that list(values) produces the exact same output as the list comprehension in the canonical demo. Decimal('2E2'), Decimal('.2E+3'), and Decimal('.02E+4') are already equivalent before calling canonical. canonical isn't actually doing anything there. - Apatetic
@Apatetic I never said that normalize and canonical are the same methods. I said that by looking at the docs for the normalize method ("Used for producing canonical values for attributes of an equivalence class") one can deduce what is meant by the adjective "canonical", regardless of what the canonical method actually returns. Thus, if we have d = Decimal('32.100') and display str(d) and str(d.normalize()), we see '32.100' and '32.1' respectively. This implies that '32.1' is the canonical representation of Decimal('32.100'). - Corneous
But '32.1' is not the canonical representation of Decimal('32.100') as far as canonical or is_canonical are concerned. - Apatetic
@Apatetic I don't disagree and I have never said otherwise. Re-read what I said about the canonical method: it sheds no light on what "canonical" means, but the normalize method documentation suggests a definition of what a "canonical" representation might be. - Corneous
But it suggests the wrong definition for thinking about the is_canonical method. - Apatetic
The OP asked, "Everywhere on the Internet it always says the decimal.Decimal.is_canonical() method will return True if the decimal is canonical. But what does that even mean? Is it just some term that I do not know?" I interpreted that as asking what the term "canonical" means, since we all agree that the canonical and is_canonical methods are not useful in explaining it. So I suggested that the documentation on normalize gives an explanation. But yours is now the accepted answer, so you obviously convinced the OP otherwise. - Corneous
