Why does C++ output negative numbers when using modulo?
Asked Answered
M

4

81

Math:

If you have an equation like this:

x = 3 mod 7

x could be ... -4, 3, 10, 17, ..., or more generally:

x = 3 + k * 7

where k can be any integer. I don't know of a modulo operation is defined for math, but the factor ring certainly is.

Python:

In Python, you will always get non-negative values when you use % with a positive m:

#!/usr/bin/python
# -*- coding: utf-8 -*-

m = 7

for i in xrange(-8, 10 + 1):
    print(i % 7)

Results in:

6    0    1    2    3    4    5    6    0    1    2    3    4    5    6    0    1    2    3

C++:

#include <iostream>

using namespace std;

int main(){
    int m = 7;

    for(int i=-8; i <= 10; i++) {
        cout << (i % m) << endl;
    }

    return 0;
}

Will output:

-1    0    -6    -5    -4    -3    -2    -1    0    1    2    3    4    5    6    0    1    2    3    

ISO/IEC 14882:2003(E) - 5.6 Multiplicative operators:

The binary / operator yields the quotient, and the binary % operator yields the remainder from the division of the first expression by the second. If the second operand of / or % is zero the behavior is undefined; otherwise (a/b)*b + a%b is equal to a. If both operands are nonnegative then the remainder is nonnegative; if not, the sign of the remainder is implementation-defined 74).

and

74) According to work underway toward the revision of ISO C, the preferred algorithm for integer division follows the rules defined in the ISO Fortran standard, ISO/IEC 1539:1991, in which the quotient is always rounded toward zero.

Source: ISO/IEC 14882:2003(E)

(I couldn't find a free version of ISO/IEC 1539:1991. Does anybody know where to get it from?)

The operation seems to be defined like this:

enter image description here

Question:

Does it make sense to define it like that?

What are arguments for this specification? Is there a place where the people who create such standards discuss about it? Where I can read something about the reasons why they decided to make it this way?

Most of the time when I use modulo, I want to access elements of a datastructure. In this case, I have to make sure that mod returns a non-negative value. So, for this case, it would be good of mod always returned a non-negative value. (Another usage is the Euclidean algorithm. As you could make both numbers positive before using this algorithm, the sign of modulo would matter.)

Additional material:

See Wikipedia for a long list of what modulo does in different languages.

Mcgary answered 24/7, 2012 at 11:54 Comment(6)
The usual reason for C (and therefore C++) is that existing hardware does math in a certain way. The language standard just documents what is happening (and what is not).Lynellelynett
A useful addition to this question might be "and what is a good alternative in C++ code to get the behaviour shown by Python?"Guidebook
A good solution to get a positive value for the mod is explained here: [https://mcmap.net/q/260614/-modulus-with-negative-numbers-in-c-duplicate]Marlo
Maybe the problem is naming the function as “modulo”. If “modulo” is used it MUST be the math way. If it is not the math way, then it can be named as something else. All the answers in this post are frustrating. We are lucky that, for example, the multiplication works the math way. They could have invented a faster function and name it multiplication.Ethel
@Marlo that requires 2 slow modulos which is worse than int ret = a%b; return ret>=0? ret: ret+b; which can be done with a conditional moveChristensen
@MehmetKaplan Exactly, but note that the quoted standard doesn’t use the word ‘module’. It’s says ‘remainder’.Pisarik
C
45

On x86 (and other processor architectures), integer division and modulo are carried out by a single operation, idiv (div for unsigned values), which produces both quotient and remainder (for word-sized arguments, in AX and DX respectively). This is used in the C library function divmod, which can be optimised by the compiler to a single instruction!

Integer division respects two rules:

  • Non-integer quotients are rounded towards zero; and
  • the equation dividend = quotient*divisor + remainder is satisfied by the results.

Accordingly, when dividing a negative number by a positive number, the quotient will be negative (or zero).

So this behaviour can be seen as the result of a chain of local decisions:

  • Processor instruction set design optimises for the common case (division) over the less common case (modulo);
  • Consistency (rounding towards zero, and respecting the division equation) is preferred over mathematical correctness;
  • C prefers efficiency and simplicitly (especially given the tendency to view C as a "high level assembler"); and
  • C++ prefers compatibility with C.
Choleric answered 24/7, 2012 at 12:43 Comment(2)
I wonder how often the truncated division is faster than floored, given that power-of-two divisors are very common, as are divisors which are amenable to scaled multiplication.Slipway
For further information, en.cppreference.com/w/cpp/language/operator_arithmetic, under "Built-in multiplicative operators", discusses the two rules about integer division and remainder operators with references to the standard.Davit
R
26

Back in the day, someone designing the x86 instruction set decided it was right and good to round integer division toward zero rather than round down. (May the fleas of a thousand camels nest in his mother's beard.) To keep some semblance of math-correctness, operator REM, which is pronounced "remainder", had to behave accordingly. DO NOT read this: IBM Documentation: Remainder (REM)

I've warned you. Later, someone doing the C spec decided it would be conforming for a compiler to do it either the right way or the x86 way. Then, a committee doing the C++ spec decided to do it the C way. Then later yet, after this question was posted, a C++ committee decided to standardize on the wrong way. Now we are stuck with it. Many programmers have written the following function or something like it. I have probably done it at least a dozen times:

inline int mod(int a, int b) {
    int ret = a % b;
    return ret >= 0 ? ret : ret + b;
}

There goes your efficiency.

These days, I use essentially the following, with some type_traits stuff thrown in. (Thanks to Clearer for a comment that gave me an idea for an improvement using latter day C++. See below.)

template<class T>
inline T mod(T a, T b) {
    assert(b > 0);
    T ret = a % b;
    return ret >= 0 ? ret : ret + b;
}

template<>
inline unsigned mod(unsigned a, unsigned b) {
    assert(b > 0);
    return a % b;
}

True fact: I lobbied the Pascal standards committee to do mod the right way until they relented. To my horror, they did integer division the wrong way. So they do not even match.

EDIT: Clearer gave me an idea. I am working on a new one.

#include <type_traits>

template<class T1, class T2>
inline T1 mod(T1 a, T2 b) {
    assert(b > 0);
    T1 ret = a % b;
    if constexpr (std::is_unsigned_v<T1>) {
        return ret;
    } else {
        return ret >= 0 ? ret : ret + b;
    }
}
Rufinaruford answered 24/1, 2018 at 9:51 Comment(7)
Wouldn't an assert that a % b >= 0 be the right thing? Some platform may define a % b, to do the right thing (by accident, extension of willful rebellion) and assert that a % b >= 0.Roybn
@Roybn - No, but you gave me a good idea. See the edited answer.Rufinaruford
I got your inline function to work, but my compiler gave an error: expected primary-expression before ‘constexpr’. I'm using GCC with -std=c++14.Pylon
@tyebillion "if constexpr" is a C++17 thing.Rufinaruford
@Roybn - the assert says "this function isn't designed to handle negative b". If you want to allow b to be negative, you need a more complex solution. [Usually b is known to be positive, it is just a that has the pesky tendency to be negative sometimes]Donkey
"... Pascal standards committee to do mod the right way until they relented. To my horror, they did integer division the wrong way. So they do not even match." FreePascal 3.0.4 has div and mod matching C behavior. Did they "fix"?Horseshit
Why not simply write return (a + b) % b and be done with it?Secondary
A
11

What are arguments for this specification?

One of the design goals of C++ is to map efficiently to hardware. If the underlying hardware implements division in a way that produces negative remainders, then that's what you'll get if you use % in C++. That's all there is to it really.

Is there a place where the people who create such standards discuss about it?

You will find interesting discussions on comp.lang.c++.moderated and, to a lesser extent, comp.lang.c++

Aesthesia answered 24/7, 2012 at 12:8 Comment(4)
And this goes very well with the C++ goal of "you dont pay for what you dont use". Performance is never sacrificed by default for the sake of convenience. If you need to check/abs your modulo results, you can easily wrap it wherever you need that behavior.Valance
Wouldn't the goal of "mapping efficiently to hardware" be better specified by saying that if x or y is negative and the modulus is non-zero, the compiler could arbitrarily return either the positive or negative result? The fastest implementation of (x%123456789) which works correctly with positive numbers might yield negative results with negative numbers, but the fastest implementation of (x%8) would yield positive numbers. The fastest way to compute (x mod y) if y is positive and x may be negative is probably: m=x%y; if (m<0) m+=y;, and that would work even if the compiler...Slipway
...randomly returned positive or negative results for negative x values that weren't divisible by y. The only thing I can see that is accomplished by specifying truncate-to-zero on /, and corresponding behavior on %, is to make operations like x/=4; or y%=4; three times as slow as they would otherwise need to be. Have you ever seen any code that actually benefits from -5%2=-1?Slipway
@Preet - I always want modulo to work the right way, so I always pay for it.Rufinaruford
M
1

Others have described the why well enough and unfortunately the question which asks for a solution is marked a duplicate of this one and a comprehensive answer on that aspect seems to be missing. There seem to be 2 commonly used general solutions and one special-case I would like to include:

// 724ms
inline int mod1(int a, int b)
{
  const int r = a % b;
  return r < 0 ? r + b : r;
}

// 759ms
inline int mod2(int a, int b)
{
  return (a % b + b) % b;
}

// 671ms (see NOTE1!)
inline int mod3(int a, int b)
{
  return (a + b) % b;
}

int main(int argc, char** argv)
{
  volatile int x;
  for (int i = 0; i < 10000000; ++i) {
    for (int j = -argc + 1; j < argc; ++j) {
      x = modX(j, argc);
      if (x < 0) return -1;  // Sanity check
    }
  }
}

NOTE1: This is not generally correct (i.e. if a < -b). The reason I included it is because almost every time I find myself taking the modulus of a negative number is when doing math with numbers that are already modded, for example (i1 - i2) % n where the 0 <= iX < n (e.g. indices of a circular buffer).

As always, YMMV with regards to timing.

Modernism answered 23/11, 2022 at 19:46 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.