Why do most programming languages only have binary equality comparison operators?

Asked 8/7, 2010 at 15:11 Answered 11/8, 2015 at 14:40

Solved language-design boolean-operations

In natural languages, we would say "some color is a primary color if the color is red, blue, or yellow."

In every programming language I've seen, that translates into something like:

isPrimaryColor = someColor == "Red" or someColor == "Blue" or someColor == "Yellow"

Why isn't there a syntax that more closely matches the English sentence? After all, you wouldn't say "some color is a primary color if that color is red, or that color is blue, or that color is yellow."

I realize that simply writing isPrimaryColor = someColor == ("Red" or "Blue" or "Yellow") won't work, because Red, Blue and Yellow could themselves be boolean expressions, in which case ordinary boolean logic applies. But what about something like:

isPrimaryColor = someColor ( == "Red" or == "Blue" or == "Yellow")

As an added bonus that syntax would allow for more flexibility, say you wanted to see if a number is between 1 and 100 or 1000 and 2000, you could say:

someNumber ((>= 1 and <=100) or (>=1000 and <=2000))

Edit:

Very interesting answers, and point taken that I should learn more languages. After reading through the answers, I agree that for strict equality comparison something similar to set membership is a clear and concise way of expressing the same thing (for languages that support concise inline lists or sets and membership tests).

One issue that came up is that if the value to compare is the result of an expensive calculation, a temporary variable would need to be (well, should be) created. The other issue is that there may be different evaluations that need to be checked, such as "the result of some expensive calculation should be prime and between 200 and 300".

These scenarios are also covered by more functional languages (though depending on the language may not be more concise), or really any language that can take a function as a parameter. For instance the previous example could be

MeetsRequirements(GetCalculatedValue(), f(x):x > 200, f(x):x < 300, IsPrime)
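For illustration, here is a minimal Python sketch of that idea. The names meets_requirements, get_calculated_value and is_prime are hypothetical stand-ins for the pseudocode above, and the "expensive calculation" is just a placeholder that returns a fixed number. The value is computed once, passed in as an argument, and then checked against every predicate.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def meets_requirements(value: T, *predicates: Callable[[T], bool]) -> bool:
    """Return True only if every predicate accepts the value."""
    return all(predicate(value) for predicate in predicates)


def is_prime(n: int) -> bool:
    """Trial-division primality test, good enough for small numbers."""
    if n < 2:
        return False
    divisor = 2
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 1
    return True


def get_calculated_value() -> int:
    # Placeholder for the expensive calculation (assumption for this sketch).
    return 211


# The calculation runs once; no temporary variable is needed at the call site.
result = meets_requirements(
    get_calculated_value(),
    lambda x: x > 200,
    lambda x: x < 300,
    is_prime,
)
```

Whether this is more concise than the temporary-variable version is debatable, but it does keep the value and the conditions in one expression.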

Fiducial answered 8/7, 2010 at 15:11 Comment(5)

I think that you are confusing at least 2 separate issues: whether two items are equal (in some sense) and whether one item (call it an element) is or is not a member of another item (call it a set). – Optimum 8/7, 2010 at 16:31

You need to learn more languages. – Rotz 8/7, 2010 at 16:36

BTW, Turbo Pascal allowed what seemed like a rather nice construct "if ch in ['A'..'Z','a'..'z']" until one actually looked at the generated code. At least in older versions of the compiler, it was pretty nasty (IIRC, it would allocate 32 bytes on the stack to create a 256-bit table of allowable characters, and call a routine which would use ch to dereference a bit in that table). – Yarvis 8/7, 2010 at 22:0

I would've preferred 1 <= someNumber <= 100 much more. – Gales 8/7, 2010 at 22:32

@Dave: You can write that in Python. – Pulpboard 8/10, 2010 at 13:44

In Haskell, it is easy to define a function to do this:

matches x ps = foldl (||) False $  map (\ p -> p x) ps

This function takes a value and a list of predicates (of type a -> Bool) and returns True if any of the predicates match the value.

This allows you to write something like this:

isMammal m = m `matches` [(=="Dog"), (=="Cat"), (=="Human")]

The nice thing is that it doesn't have to just be equality, you can use anything with the correct type:

isAnimal a = a `matches` [isMammal, (=="Fish"), (=="Bird")]

Crisp answered 8/7, 2010 at 17:59 Comment(5)

This is almost exactly what the human sentence structure is. +1. – Entomb 9/7, 2010 at 0:34

I need to play with Haskell more. Played with it for about a week or so. – Fiducial 9/7, 2010 at 2:43

Or: matches x = any ($x). – Jehias 9/7, 2010 at 14:3

Accepting this answer because it most closely (almost exactly) matches my pseudocode and has a very elegant syntax. Guess I should go back to learning me a haskell. – Fiducial 9/7, 2010 at 23:38

Also because it is very flexible as you've shown and I think it should've been voted much higher than it was. – Fiducial 9/7, 2010 at 23:41

I think that most people consider something like

isPrimaryColor = ["Red", "Blue", "Yellow"].contains(someColor)

to be sufficiently clear that they don't need extra syntax for this.

Poulin answered 8/7, 2010 at 15:14 Comment(10)

Good point. I wish more languages had inline syntax for creating lists/arrays. I'm looking at you C# and Java (I think... haven't used Java in years). (new string[] {"Red", "Blue", "Yellow"}).contains(someColor) just doesn't look as nice. Also see my edit with checking more than just equality such as numeric ranges (I know, some languages support that too, so I guess it's still a moot point.) – Fiducial 8/7, 2010 at 15:22

The inability to create a short list concisely is a real sore point in C# for me. I hate it. – Poulin 8/7, 2010 at 15:23

But surely you'd have declared string[] primaryColours = new string[] { "Red", "Blue", "Yellow" }; elsewhere in your code, then it would just be bool isPrimaryColour = primaryColours.Contains(someColour); – Galven 8/7, 2010 at 15:25

If it's something used more than once, yes it's probably declared elsewhere, possibly as a static-readonly variable or something, but if it's a one-off check then it's a (minor) burden to have to declare it that way and not to mention ugly too. – Fiducial 8/7, 2010 at 15:30

Sometimes in C# I write this function: T[] Array<T>(params T[] a) { return a; }... so I have a concise syntax for making a list. – Careycarfare 8/7, 2010 at 15:46

Not to mention that when using something other than natural language, we can introduce different terms for different concepts ('is' vs. 'is element of' / 'contains') , which is generally a very good idea. – Highline 8/7, 2010 at 15:54

In C# you don't need the type name if it can be inferred, e.g. new string[] { "Red", "Green", "Blue" } can be written as new [] { "Red", "Green", "Blue" }. You also don't need the parenthesis around it. Seems pretty concise? – Jed 8/7, 2010 at 15:56

Not really, JonoW, compared to just { "Red", "Green", "Blue" }. The new[] definitely adds some visual overhead. Perhaps I am just spoiled. – Poulin 8/7, 2010 at 17:36

in Java: boolean isPrimary = Arrays.asList("red", "blue", "yellow").contains(someColor); – Osteoclasis 8/7, 2010 at 21:10

I don't think creating a list and testing for membership would be as good as having a language construct, unless the compiler could recognize and optimize the code from the former. Otherwise I would consider the usage as syntactic aspartame. – Yarvis 8/7, 2010 at 23:8

In python you can do something like this:

color = "green"

if color in ["red", "green", "blue"]:
    print('Yay')

This is the in operator, which tests for membership.

Spiniferous answered 8/7, 2010 at 15:15 Comment(4)

You can also say something like 1 <= somenumber < 300. – Menace 8/7, 2010 at 15:23

@Menace yes I love that about Python and I wish every other language bothered to implement that as well. – Fiducial 8/7, 2010 at 15:31

and of course that equates to 1 <= somenumber and somenumber < 300, which means you can write something silly like 1 > somenum < 3 and that's a perfectly valid expression... – Nardoo 8/7, 2010 at 17:20

You can do this in SQL too, like "select ... where color in ('red', 'green', 'blue')". – Pulpboard 8/10, 2010 at 13:46

In perl 6 you could do this with junctions:

if $color eq 'Red'|'Blue'|'Green' {
    doit()
}

Alternately you could do it with the smart match operator (~~). The following is roughly equivalent to python's if value in list: syntax, except that ~~ does a lot more in other contexts.

if ($color ~~ qw/Red Blue Green/) {
    doit()
}

The parens also make it valid perl 5 (>=5.10); in perl 6 they're optional.

Crowl answered 8/7, 2010 at 15:31 Comment(1)

perl is a gift from his noodly appendage – Thump 8/7, 2010 at 17:23

Ruby

Contained in list:

irb(main):023:0> %w{red green blue}.include? "red"
=> true
irb(main):024:0> %w{red green blue}.include? "black"
=> false

Numeric Range:

irb(main):008:0> def is_valid_num(x)
irb(main):009:1>   case x
irb(main):010:2>     when 1..100, 1000..2000 then true
irb(main):011:2>     else false
irb(main):012:2>   end
irb(main):013:1> end
=> nil
irb(main):014:0> is_valid_num(1)
=> true
irb(main):015:0> is_valid_num(100)
=> true
irb(main):016:0> is_valid_num(101)
=> false
irb(main):017:0> is_valid_num(1050)
=> true

Photic answered 8/7, 2010 at 16:11 Comment(0)

So far, nobody has mentioned SQL. It has what you are suggesting:

SELECT
    employee_id
FROM 
    employee
WHERE
    hire_date BETWEEN '2009-01-01' AND '2010-01-01' -- range of values
    AND employment_type IN ('C', 'S', 'H', 'T')     -- list of values

Trompe answered 9/7, 2010 at 13:58 Comment(0)

COBOL uses 88 levels to implement named values, named groups of values and named ranges of values.

For example:

01 COLOUR         PIC X(10).
   88 IS-PRIMARY-COLOUR VALUE 'Red', 'Blue', 'Yellow'.
...
MOVE 'Blue' TO COLOUR
IF IS-PRIMARY-COLOUR
   DISPLAY 'This is a primary colour'
END-IF

Range tests are covered as follows:

01 SOME-NUMBER    PIC S9(4) BINARY.
   88 IS-LESS-THAN-ZERO    VALUE -9999 THRU -1.
   88 IS-ZERO              VALUE ZERO.
   88 IS-GREATER-THAN-ZERO VALUE 1 THRU 9999.
...
MOVE +358 TO SOME-NUMBER
EVALUATE TRUE
    WHEN IS-LESS-THAN-ZERO
         DISPLAY 'Negative Number'
    WHEN IS-ZERO
         DISPLAY 'Zero'
    WHEN IS-GREATER-THAN-ZERO
         DISPLAY 'Positive Number'
    WHEN OTHER
         DISPLAY 'How the heck did this happen!'
END-EVALUATE

I guess this all happened because COBOL was supposed to emulate English to some extent.

Oldwife answered 8/7, 2010 at 15:51 Comment(3)

What the heck is with the "PIC", "S9", "01" and "88"s? I thought COBOL was supposed to be uber-readable, otherwise why bother with the excess verbiage? – Careycarfare 8/7, 2010 at 16:2

I'm just going to throw down +1 because, well, COBOL. Yeah. – Barling 8/7, 2010 at 16:6

There you had to go, reminding me of when I had to code in COBOL. Now the nightmares will be starting all over. – Pulpboard 8/7, 2010 at 17:25

You'll love Perl 6 because it has:

chaining comparison operators:

(1 <= $someNumber <= 100) || (1000 <= $someNumber <= 2000)

junctive operators:

$isPrimaryColor = $someColor ~~ "Red" | "Blue" | "Yellow"

And you can combine both with ranges:

$someNumber ~~ (1..100) | (1000..2000)
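For readers without Perl 6 at hand, the equality case can be roughly imitated in Python. This is only a sketch: AnyOf is a hypothetical class invented here, it handles == only, and it does not autothread over arbitrary operators the way real junctions do.

```python
class AnyOf:
    """Hypothetical stand-in for an 'any' junction, limited to equality."""

    def __init__(self, *options):
        self.options = options

    def __eq__(self, other):
        # True when at least one of the options equals the other operand.
        return any(option == other for option in self.options)


some_color = "Blue"

# str does not know how to compare itself with AnyOf, so Python falls back
# to AnyOf.__eq__ with the operands swapped.
is_primary_color = some_color == AnyOf("Red", "Blue", "Yellow")
```

This reads close to the someColor == ("Red" or "Blue" or "Yellow") form from the question, at the cost of wrapping the alternatives in a helper object.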

Minos answered 8/7, 2010 at 15:57 Comment(0)

Python actually gives you the ability to do the last thing quite well:

>>> x=5
>>> (1<x<1000 or 2000<x<3000)
True

Starveling answered 8/7, 2010 at 15:31 Comment(0)

In Python you can say ...

isPrimaryColor = someColor in ('Red', 'Blue', 'Yellow')

... which I find more readable than your (== "Red" or == "Blue") syntax. There are a few reasons to add syntax support for a language feature:

Efficiency: Not a reason here, since there's no speed improvement.
Functionality: Also not a concern; there's nothing you can do in the new syntax that you can't do in the old.
Legibility: Most languages handle the case where you're checking the equality of multiple values just fine. In other cases (e.g., someNumber (> 1 and < 10)) it might be more useful, but even then it doesn't buy you much (and Python allows you to say 1 < someNumber < 10, which is even clearer).

So it's not clear the proposed change is particularly helpful.

Alisun answered 8/7, 2010 at 15:25 Comment(1)

If the value to be checked is something like a function return, checking its value against multiple possibilities using normal operators would generally require using a temp variable or goofy code, even if the value is needed for no purpose other than the check. A compare-against-multiple-things operator would eliminate the need for a temp variable. – Yarvis 8/7, 2010 at 20:9

My guess would be that languages are designed by force of habit. Early languages would have had only binary comparison operators because they are simpler to implement. Everyone got used to saying (x > 0 and x < y), so language designers never bothered to support the common form in mathematics, (0 < x < y).

In most languages a comparison operator returns a boolean type. In the case of 0 < x < y, if this is interpreted as (0 < x) < y it would be meaningless, since < does not make sense for comparing booleans. Therefore, a new compiler could interpret 0 < x < y as tmp:=x, 0 < tmp && tmp < y without breaking backward compatibility. In the case of x == y == z, however, if the variables are already booleans, it is ambiguous whether this means x == y && y == z or (x == y) == z.
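Python happens to take the chained reading for every comparison operator, so it can illustrate both points. The sketch below uses a made-up helper, middle(), that counts its own calls; it shows that the middle operand is evaluated once, and that the two readings of x == y == z really do differ when the operands are booleans.

```python
calls = 0


def middle():
    """Hypothetical operand with a side effect, so evaluations can be counted."""
    global calls
    calls += 1
    return 5


# Chained form: behaves like tmp = middle(); 0 < tmp and tmp < 10
in_range = 0 < middle() < 10

x, y, z = False, False, True

# Python's reading: (x == y) and (y == z)
chained = x == y == z

# The other reading: compare the result of x == y with z
grouped = (x == y) == z
```

After this runs, calls is 1, chained is False and grouped is True, which is exactly the ambiguity a language would have to resolve before adopting the syntax.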

In C# I use the following extension method so that you can write someColor.IsOneOf("Red", "Blue", "Yellow"). It is less efficient than direct comparison (what with the array, loop, Equals() calls and boxing if T is a value type), but it sure is convenient.

public static bool IsOneOf<T>(this T value, params T[] set) 
{
    object value2 = value;
    for (int i = 0; i < set.Length; i++)
        if (set[i].Equals(value2))
            return true;
    return false;
}

Careycarfare answered 8/7, 2010 at 15:25 Comment(6)

I hope that what you meant was that everyone got used to saying (0 < x and x < y). – Poulin 8/7, 2010 at 15:32

No. Almost everyone I see uses x > 0 && x < y. What difference does it make anyway? – Careycarfare 8/7, 2010 at 15:43

mquander, that seems terribly pedantic. What does it matter if one reverses the 0 and x if one also reverses the comparator? I'd hope most developers would realize that (0 < x) is the same as (x > 0). Insisting it should be (0 < x) seems even worse than the infamous (null == variable) vs (variable == null), as saying (null == variable) at least has the benefit of helping to prevent typos. – Triple 8/7, 2010 at 15:45

Check out my COBOL example - COBOL was (is) an early language and it has always had this feature. – Oldwife 8/7, 2010 at 15:58

I like 0 < x and x < y because it's visually isomorphic to 0 < x < y, so it's immediately obvious where x is (between 0 and y) without having to parse the greater-than-this and less-than-that combination in my head. For me, it's the difference between immediate comprehension of the expression and comprehension after three or four seconds. – Poulin 8/7, 2010 at 16:0

On a technicality, it isn't necessarily meaningless to compare booleans with < and >. For example, I might well want to perform a sort that groups all the false's together and all the true's together. This would be no problem at all in C, where false is 0 and true is usually 1, so false<true. – Pulpboard 8/10, 2010 at 13:51

Icon has the facility you describe.

if y < (x | 5) then write("y=", y)

I rather like that aspect of Icon.

Barling answered 8/7, 2010 at 16:3 Comment(0)

In C#:

if ("A".IsIn("A", "B", "C"))
{
}

if (myColor.IsIn(colors))
{
}

Using these extensions:

public static class ObjectExtenstions
{
    public static bool IsIn(this object obj, params object [] list)
    {
        foreach (var item in list)
        {
            if (object.Equals(obj, item)) // value equality; == on object only compares references
            {
                return true;
            }
        }

        return false;
    }

    public static bool IsIn<T>(this T obj, ICollection<T> list)
    {
        return list.Contains(obj);
    }

    public static bool IsIn<T>(this T obj, IEnumerable<T> list)
    {
        foreach (var item in list)
        {
            if (EqualityComparer<T>.Default.Equals(obj, item)) // == is not defined for an unconstrained T
            {
                return true;
            }
        }

        return false;
    }
}

Inclinable answered 7/10, 2010 at 17:58 Comment(0)

You'll have to go a bit down the abstraction layer to find out the reason why. x86's comparison/jump instructions are binary (since they can be easily computed in a few clock cycles), and that's the way things have been.

If you want, many languages offer an abstraction for that. In PHP, for example you could use:

$isPrimaryColor = in_array($someColor, array('Red', 'White', 'Blue'));

Teferi answered 8/7, 2010 at 15:19 Comment(1)

Good point about the low-level reason, although most high-level languages can abstract and optimize that away at compile/interpretation time. – Fiducial 8/7, 2010 at 15:26

I don't see an Objective-C answer yet. Here is one:

BOOL isPRimaryColour = [[NSSet setWithObjects: @"red", @"green", @"blue", nil] containsObject: someColour];

Confession answered 8/7, 2010 at 16:33 Comment(0)

The question is reasonable, and I wouldn't regard the change as syntactic sugar. If the value being compared is the result of computation, it would be nicer to say:

  if (someComplicatedExpression ?== 1 : 2 : 3 : 5)

than to say

  int temp;
  temp = someComplicatedExpression;
  if (temp == 1 || temp == 2 || temp == 3 || temp == 5)

particularly if there was no other need for the temp variable in question. A modern compiler could probably recognize the short useful lifetime of 'temp' and optimize it to a register, and could probably recognize the "see if variable is one of certain constants" pattern, but there'd be no harm in allowing a programmer to save the compiler the trouble. The indicated syntax wouldn't compile on any existing compiler, but I don't think it would be any more ambiguous than (a+b >> c+d) whose behavior is defined in the language spec.

As to why nobody's done that, I don't know.

Yarvis answered 8/7, 2010 at 16:34 Comment(0)

As a mathematician, I would say that the colour is primary if and only if it is a member of the set {red, green, blue} of primary colours.

And this is exactly what you can say in Delphi:

isPrimary := Colour in [clRed, clGreen, clBlue]

In fact, I employ this technique very often. Last time was three days ago. Implementing my own scripting language's interpreter, I wrote

const
  LOOPS = [pntRepeat, pntDoWhile, pntFor];

and then, in a few places,

if Nodes[x].Type in LOOPS then

The Philosophical Part of the Question

@supercat, etc. ("As to why nobody's done that, I don't know."):

Probably because the designers of programming languages are mathematicians (or, at least, mathematically inclined). If a mathematician needs to state the equality of two objects, she would say

X = Y,

naturally. But if X can be one of a number of things A, B, C, ..., then she would define a set S = {A, B, C, ...} of these things and write

X ∈ S.

Indeed, it is extremely common for mathematicians to write X ∈ S, where S is the set

S = {x ∈ D; P(x)}

of objects in some universe D that have the property P, instead of writing P(X). For instance, instead of saying "x is a positive real number", or "PositiveReal(x)", one would say x ∈ ℝ⁺.

Linson answered 8/7, 2010 at 17:12 Comment(1)

FYI, the "@" within a post doesn't trigger a notification. I'll agree that conceptually the set is cleaner, and perhaps syntax could take that into consideration. From an execution standpoint, though, there should be a way of saying "See if X is in the set {1,2,3}" without having to have the code instantiate an object holding the set "1,2,3" and see if X is within it. – Yarvis 6/9, 2013 at 23:10

I'm reminded of when I first started to learn programming, in Basic, and at one point I wrote

if X=3 OR 4

I intended this like you are describing, if X is either 3 or 4. The compiler interpreted it as:

if (X=3) OR (4)

That is, if X=3 is true, or if 4 is true. As it defined anything non-zero as true, 4 is true, anything OR TRUE is true, and so the expression was always true. I spent a long time figuring that one out.
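The same trap still exists in Python, for what it's worth. A small sketch (both function names are made up for illustration):

```python
def is_three_or_four_buggy(x):
    # Parsed as (x == 3) or 4. The literal 4 is truthy, so the result
    # is never falsy, whatever x is.
    return x == 3 or 4


def is_three_or_four(x):
    # What was actually meant: a membership test.
    return x in (3, 4)
```

Here is_three_or_four_buggy(7) returns 4, which counts as true in an if statement, while is_three_or_four(7) returns False.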

I don't claim this adds anything to the discussion. I just thought it might be a mildly amusing anecdote.

Pulpboard answered 8/7, 2010 at 17:31 Comment(0)

It's because programming languages are influenced by mathematics, logic and set theory in particular. Boolean algebra defines ∧, ∨ operators in a way that they do not work like spoken natural language. Your example would be written as:

Let p(x) be unary relation which holds if and only if x is a primary color
p(x) ⇔ r(x) ∨ g(x) ∨ b(x)
or
p(x) ⇔ (x=red) ∨ (x=green) ∨ (x=blue)

As you see, it's pretty similar to the notation that would be used in a programming language. As mathematics provides strong theoretical foundations, programming languages are based on mathematics rather than natural language, which always leaves a lot of room for interpretation.

EDIT: Above statement could be simplified by using set notation:

p(x) ⇔ x ∈ {red, green, blue}

and indeed, some programming languages, most notably Pascal, included sets, so you could type:

type
    color = (red, green, blue, yellow, cyan, magenta, black, white);

function is_primary (x : color) : boolean;
begin
    is_primary := x in [red, green, blue]
end

But sets as a language feature didn't catch on.

PS. Sorry for my imperfect English.

Plumcot answered 8/7, 2010 at 20:41 Comment(0)

The latter examples you give are effectively syntactic sugar. They'd have to evaluate to the same code as the longer form, since at some point the executed code has to compare your value with each of the conditions in turn.

The array comparison syntax, given in several forms here, comes closer, and I suspect there are other languages which get even closer.

The main problem with making syntax closer to natural language is that the latter is not just ambiguous, it's hideously ambiguous. Even keeping ambiguity to a minimum we still manage to introduce bugs into our apps. Can you imagine what it would be like if you programmed in natural English?!

Galven answered 8/7, 2010 at 15:23 Comment(3)

Of course it's syntactic sugar. So? All languages have some amount of syntactic sugar. – Careycarfare 8/7, 2010 at 16:4

What's the point of your comment? Syntactic sugar that increases ambiguity is always a bad idea, that's what was being proposed here. – Galven 13/7, 2010 at 11:15

It does not necessarily introduce any ambiguity at all - having shorter code which reads more naturally is certainly a boon. – Commercialize 7/10, 2010 at 18:8

Just to add to language examples

Scheme

(define (isPrimaryColor color)
  (cond ((member color '(red blue yellow)) #t)
        (else #f)))

(define (someNumberTest x)
  (cond ((or (and (>= x 1) (<= x 100)) (and (>= x 1000) (<= x 2000))) #t)
        (else #f)))

Polyhistor answered 8/7, 2010 at 21:3 Comment(0)

Two possibilities

Java

boolean isPrimary = Arrays.asList("red", "blue", "yellow").contains(someColor);

Python

a = 1500
if  1 < a < 10 or  1000 < a < 2000:
     print("In range")

Osteoclasis answered 8/7, 2010 at 21:17 Comment(0)

This can be replicated in Lua with some metatable magic :D

local function operator(func)
    return setmetatable({},
        {__sub = function(a, _)
            return setmetatable({a},
                {__sub = function(self, b)
                    return func(self[1], b)
                end}
            )
        end}
    )
end


local smartOr = operator(function(a, b)
    for i = 1, #b do
        if a == b[i] then
            return true
        end
    end
    return false
end)


local isPrimaryColor = someColor -smartOr- {"Red", "Blue", "Either"}

Note: You can change the name of -smartOr- to something like -isEither- to make it even MORE readable.

Witmer answered 11/8, 2015 at 14:40 Comment(0)

-1

Languages on computers compare as binary because they are all for a machine that uses binary to represent information. They were designed using similar logic and with broadly similar goals. The English language wasn't designed logically, nor was it designed to describe algorithms, and human brains (the hardware it runs on) aren't based on binary. They're tools designed for different tasks.

Pulpboard answered 8/7, 2010 at 16:55 Comment(1)

The term "binary operator" refers to an operator which takes two operands. In C, the operators "!" and "~" are unary operators; "+" and "-" may be used as unary and binary operators; "? :" is a ternary operator. – Yarvis 8/7, 2010 at 21:53
