How can "x & y" be false when both x and y are true?
Asked Answered
Z

4

20

Context:

I'm learning C# and have been messing about on the Pex for fun site. The site challenges you to re-implement a secret algorithm, by typing code into the site and examining how the inputs and outputs differ between your implementation and the secret implementation.

Problem:

Anyway, I got stuck on a basic code duel called XAndY.

From the name it seemed obvious that the answer was just:

public static bool Puzzle(bool x, bool y) 
{
    return x && y;
}

However, this was incorrect and Pex informed me that the following inputs produced a different result than the secret implementation:

Input:

x:true y:true (0x02)

Output:

my implementation: true (0x02)

secret implementation: false

Mismatch Your puzzle method produced the wrong result.

Code: Puzzle(true, PexSafeHelpers.ByteToBoolean((byte)2));

After a lot of confusion trying to compare different types of true, I realised that the implementation that Pex was looking for was actually just using a bitwise AND:

return x & y;

Questions:

I thought that for both semantic and short-circuiting reasons you should use logical && for comparing boolean values, but regardless:

  1. Does this mean that x & y and x && y definitively do not have the same outputs for all possible bool arguments? (or could it be something buggy in Pex?)
  2. Does this mean you can differentiate between different values of bool true in C#? If so, how?
Zacynthus answered 10/7, 2014 at 17:46 Comment(15)
According to what you've just printed, with an input of true and true it should output false. & doesn't do that, so it would seem that that is not the actual implementation.Bouldin
In a language like C++ any non-zero value is "true", so doing a bitwise and between two different "true" values could easily return 0 (false). However, I don't think this is true in C#. Interesting question!Morrie
@Morrie The same is true in c# except the language and compilers go to great lengths to ensure that only 0 is false and 1 is true.Effulgent
@Shoe - you don't know how true/false are encoded. The next C# compiler (or CLR) could do it differently. The only cross-over is in the System.Convert class.Corrugate
@HenkHolterman Well 0 will always be false no matter the compiler. I guess "true" could be whatever the compiler has it be. I am assuming CLI compliance.Effulgent
True=0 and False=-1 (0xff) used to be popular.Corrugate
When I tell people that indeed a boolean can be other than 0 or 1 it takes a back and forth of many comments to convince them. This is an unfathomable fact to many. Finally a question to point them to.Waldemar
@Waldemar - booleans are still only true or false.Corrugate
@HenkHolterman I guess the C# compiler is not compliant with the C# language spec, then. (I'm glad this is the case because I have used the & operator on bools to reduce branches in tight loops.)Waldemar
@Waldemar - the & is a perfectly legal boolean operator in C#. The CLR implementation is a little suspect but practical.Corrugate
@HenkHolterman the compiler does not provide what the language promises. The compiler needs to force each boolean to 0 or 1, or just implement & just like &&. Right now it deviates from the spec.Waldemar
Where do you find the representation of boolean as int in the C# specs?Corrugate
@HenkHolterman nowhere. But the language promises that & is a logical operator and it is easy to show that it is not. The compiler shouldn't let CLR implementation details leak through. The fact that a "corrupt" bool can come in from external code does not allow the spec to be violated. There is no "special" bool type that does not have to obey the laws.; Here's another such case where the compiler would have a really hard time ensuring compliance: #11977469Waldemar
And of course the correct answer to the puzzle should be XOR x ^ yInlet
There is also a nice answer given here @ codegulf. What is done here with emit statements is done there with unsafe code and pointers..Purulence
S
24

The puzzle is exploiting what, in my opinion, is a bug in the C# compiler. (The bug affects VB.NET as well.)

In the C# 5.0 specification, §4.1.8 says that "The possible values of type bool are true and false", and §7.11.3 says that operator &(bool x, bool y) is a logical operator:

The result of x & y is true if both x and y are true. Otherwise, the result is false.

It's obviously a violation of the specification for true & true to yield false. What's going on?

At run time, a bool is represented by a 1-byte integer. The C# compiler uses 0 to represent false and 1 to represent true. To implement the & operator, the C# compiler emits a bitwise AND instruction in the generated IL. At first glance, this seems to be okay: bitwise AND operations involving 0 and 1 correspond exactly with logical AND operations involving false and true.

However, §III.1.1.2 of the CLI specification explicitly allows a bool to be represented by an integer other than 0 or 1:

A CLI Boolean type occupies 1 byte in memory. A bit pattern of all zeroes denotes a value of false. A bit pattern with any one or more bits set (analogous to a non-zero integer) denotes a value of true.

By going beyond the scope of C#, it is indeed possible—and perfectly legal—to create a bool whose value is, say, 2, thus causing & to behave unexpectedly. This is what the Pex site is doing.

Here's a demonstration:

using System;
using System.Reflection.Emit;

class Program
{
    static void Main()
    {
        DynamicMethod method =
            new DynamicMethod("ByteToBoolean", typeof(bool), new[] { typeof(byte) });
        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0); // Load the byte argument...
        il.Emit(OpCodes.Ret);     // and "cast" it directly to bool.
        var byteToBoolean =
            (Func<byte, bool>)method.CreateDelegate(typeof(Func<byte, bool>));

        bool x = true;
        bool y = byteToBoolean(2);
        Console.WriteLine(x);               // True
        Console.WriteLine(y);               // True
        Console.WriteLine(x && y);          // True
        Console.WriteLine(x & y);           // False (!) because 1 & 2 == 0
        Console.WriteLine(y.Equals(false)); // False
        Console.WriteLine(y.Equals(true));  // False (!) because 2 != 1
    }
}

So the answers to your questions are:

  1. Currently, it's possible for x & y and x && y to have different values. However, this behavior violates the C# specification.
  2. Currently, you can use Boolean.Equals (as shown above) to differentiate between true values. However, this behavior violates the CLI specification of Boolean.Equals.
Sextant answered 10/7, 2014 at 21:10 Comment(8)
This is correct. In IL it is even possible to write low-trust code to emit (bool)2. (The code shown here requires full trust due to the explicit layout struct.) This is not a "corrupt" bool. The CLR has not problem dealing with this.Waldemar
@usr: Thanks for the info. I guess it's "corrupt" from the standpoint of the C# spec, but not the CLR.Sextant
In particular, PEX is passing 0xFF for true in this case. PEX is all about discovering possible failure cases, so it explores the limits of what is acceptable for input. It also has a tendency to make methods blow up by passing int.MaxValue, NaN and other fun inputs.Rosenstein
Excellent comprehensive answer, thank you. I guess the PEX for fun site shouldn't be using & instead of && on boolean values in a basic C# teaching course, but on the other hand, I wouldn't have learnt this if they hadn't.Zacynthus
Maybe you should note in this excellent answer that the C# compiler seems to be not compliant with the C# language spec. I hope this answer can become the canonical page to send people to for information about this issue.Waldemar
Very interesting answer. Using IL to force the compiler using boolean values that would otherwise not be possible (you usually can't cast from int to bool), can be regarded as an exploit - like a buffer overflow, something the designers of the language didn't think of.Purulence
When I run this code above I get x=true y=true x&&y=false x&y=false y.Equals(false)=false y.Equals(true)=false. Why? I realise this is quite an old question now so maybe something in .NET has been patched or changed? I am using .NET Framework 4.6.1Incautious
@JackPRead: Interesting. At some point after I wrote this answer, the C# compiler was changed to use & instead of conditional jumps to implement && for simple Boolean variables, so now both x & y and x && y incorrectly evaluate to false. :-(Sextant
L
3

I noticed that this issue has been raised to the Roslyn compiler group. Further discussion.

It had the following resolution:

This is effectively by design and you can find the document detailing that here: https://github.com/dotnet/roslyn/blob/master/docs/compilers/Boolean%20Representation.md

You can see more details and past conversations about this here: #24652

The noted document states:

Representation of Boolean Values

The C# and VB compilers represent true (True) and false (False) bool (Boolean) values with the single byte values 1 and 0, respectively, and assume that any boolean values that they are working with are restricted to being represented by these two underlying values. The ECMA 335 CLI specification permits a "true" boolean value to be represented by any nonzero value. If you use boolean values that have an underlying representation other than 0 or 1, you can get unexpected results. This can occur in unsafe code in C#, or by interoperating with a language that permits other values. To avoid these unexpected results, it is the programmer's responsibility to normalize such incoming values.

(All italics are my own emphasis)

Luci answered 18/2, 2019 at 16:8 Comment(0)
B
1

& in C# is not a bitwise operator, assuming that the input values are Boolean values. It is overloaded. There are two entirely separate implementations of the operator. A non-short circuiting logical boolean operator if the inputs are booleans, and a bitwise AND if the values are non-boolean values.

In the code that you have shown, then input is a boolean variable. It's not a numeric value, it's not an expression that resolves to a boolean value (which may have side effects), or anything else.

When the input is two boolean variables there is never going to be any different in the output between & and &&. The only way to have any observable difference between these two is to have a boolean expression that is more complex than just resolving a variable to its value, or some non-boolean input.

If the operands could be of some type other than bool then its pretty trivial to provide a type that has different results for either operator, such as a super mean type that overrides the true operator in a manor inconsistent with it's implicit conversion to bool:

Bouldin answered 10/7, 2014 at 17:55 Comment(3)
& is emitted as a bitwise operation and an i1 at the IL level can very well contain values other than 0 and 1. You can easily write an IL function returning (bool)2.Waldemar
@MichaelLiu It's correct for C# code. It's not true of entirely different languages, such as IL. The question is about C#, and for C# the answer is 100% correct.Bouldin
I'd call this a bug in the C# compiler. After all, C# doesn't exist in a vacuum, and because it's completely legitimate for any BCL method to return these strange Boolean values, the compiler ought to handle them properly. The current behavior is so unexpected I wouldn't be surprised if it's the cause of a security vulnerability.Sextant
B
0

Since the implementation to satisfy the puzzle is entirely up to you, why dont you try:

public static bool Puzzle(bool x, bool y)
{
    return x & !y;
}
Bugaboo answered 10/10, 2021 at 5:57 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.