Haskell and laziness
Asked Answered
T

3

6

I've just started to learn Haskell and I was told that Haskell is lazy, i.e. it does as little work as possible in evaluating expressions, but I don't think that's true.

Consider this:

und :: Bool -> Bool -> Bool
und False y = False
und y False = False

non_term x = non_term (x+1)

The evaluation of und (non_term 1) False never terminates, but it's clear that the result if False.

Is there a way to implement und (i.e. and in German) correctly (not just partially as above) so that both

und (non_term 1) False

and

und False (non_term 1)

return False?

Tonsil answered 20/6, 2015 at 23:10 Comment(16)
Your function is lazy, it's not short-circuited.Indestructible
"lazy" doesn't mean "does as little work as possible". It means "values are not computed until they're needed".Braithwaite
... and they're needed to be sure that the first case doesn't match.Synovia
@Synovia But you don't need to be sure that the first case doesn't match. That's the point.Tonsil
Think of the two arguments as "mystery boxes". The only way Haskell can find out if one of them contains False is by opening the box, or evaluating the arguments. It can't possibly "guess" which order to do this in!Petulia
@Memphy I don't think the adjective "short-circuited" makes sense with pure functions.Tonsil
@Mauris. Yes, it can. All adjacent definitions with the same rhs can be examined in any order. That's how I know that that expression evaluates to False.Tonsil
Kiuhnm: operationally, Haskell is defined as needing to check definitions in order.Synovia
@Synovia I know, but that's not very smart (and lazy). Imagine that instead of non_term we use long_computation. Haskell would compute the correct result but would do a lot of superfluous computations easily avoidable.Tonsil
@Tonsil There are plenty of cases where you want the patterns to be matched in a certain order and it would be ambiguous otherwise. Probably the majority of the time, in fact.Scientism
You should look into unamb, a parallel computation operator that says "run these two computations and use the result of the first one to terminate without an error." But basically, lazy evaluation is a specific evaluation strategy-there are others that may be better for your purposes.Govea
@DavidYoung The extension I proposed doesn't introduce any ambiguity.Tonsil
When you say "it is clear that the result is False" you have a birds-eye view that the compiler doesn't have. You can see that the expression is a constant, however the compiler must treat it like a black box expression. It has no idea which of the two (if any) is an infinite loop. So the best you can do is pick one (left or right) to evaluate first, and if you are lucky you picked the one which isn't an infinite loop.Triable
@MartinCapodici As I said again and again, "all adjacent definitions with the same rhs can be examined in any order". It's easy to extend Haskell this way. It's clear to me that Haskell doesn't do this, but the point is that it could and maybe it should.Tonsil
@MartinCapodici A better way would be to tell Haskell explicitly that some definitions must be considered in a round-robin way. Haskell would try to match the first def and give up after just one substitution. Then Haskell would move to the next def. and so on. Then the first def would be considered again. This can be seen as a sort of cooperative multi-tasking. Actually, it's not so different from unamb...Tonsil
Round robin would work. However there is probably a performance trade off there. Question to ask is whether things like TCO would still work. Also will have different behaviour for bottom values. E.g. one of the functions eventually throws an error. How quickly or slowly it does that affects the behaviour of the resulting function.Triable
S
6

You can write a complete definition for und that will work on non-terminating expressions... sort of

To make this work, you need your own definition of Bool that makes explicit the delay in any computation:

import Prelude hiding (Bool(..))
data Bool = True | False | Delay Bool
  deriving (Show, Eq)

Then whenever you define a value of type Bool, you have to constrain yourself to co-recursion, where the delays are made explicit with the Delay constructor, rather than via recursion, where you have to evaluate a sub-expression to find the constructor for the top-level return value.

In this world, a non-terminating value could look like:

nonTerm :: Bool
nonTerm = Delay nonTerm

Then und becomes:

und :: Bool -> Bool -> Bool
und False y = False
und x False = False
und True y  = y
und x True  = x
und (Delay x) (Delay y) = Delay $ und x y

which works just fine:

λ und True False
False
λ und False nonTerm
False
λ und nonTerm False
False
λ case und nonTerm nonTerm of Delay _ -> "delayed" ; _ -> "not delayed"
"delayed"

Following up on dfeuer's comment, it looks like what you're looking for can be done with unamb

λ :m +Data.Unamb 
λ let undL False _ = False ; undL _ a = a
λ let undR _ False = False ; undR a _ = a
λ let und a b = undL a b `unamb` undR a b
λ und True False
False
λ und False False
False
λ und False True
False
λ und True True
True
λ und undefined False
False
λ und False undefined
False
Synovia answered 20/6, 2015 at 23:34 Comment(2)
If I understand this correctly, that works only with very specific nonTerm expressions. Therefore, the answer to my question is a resounding NO.Tonsil
With the unamb code, und False (non_term 1) still does not terminate for me in ghci. I suspect because non_term 1 is running in a thread that's locking out the thread that tries to kill it: downloads.haskell.org/~ghc/latest/docs/html/libraries/base/…Byssinosis
G
9

Is there a way to implement und (i.e. and in German) correctly (not just partially as above) so that both

und (non_term 1) False

and

und False (non_term 1)

return False?

If you're interested in theory, there's a classic theoretical result that states that the function above is impossible in the lazy lambda calculus with recursion (which is called PCF). This was due to Plotkin in 1977. You can find a discussion in the Winskel's notes on denotational demantics in Chapter 8 "Full Abstraction".

Even if the proof is more involved, the key idea here is that the lambda calculus is a sequential, deterministic language. As such, once a lazy binary function is fed two boolean values (possibly bottom ones), it needs to decide which one to evaluate before the other, hence fixing an evaluation order. This will break the symmetry of or and and, since if the chosen argument diverges then the or/and will also diverge.

As others mentioned, in Haskell, there's a library defining unamb through non sequential means, i.e. exploiting some concurrency underneath, hence going outside the power of PCF. With that you can define your parallel or or and.

Guadalcanal answered 21/6, 2015 at 8:27 Comment(1)
I upvoted your answer because I appreciate the theoretical explanation but I chose the other answer because he provides an implementation.Tonsil
S
6

You can write a complete definition for und that will work on non-terminating expressions... sort of

To make this work, you need your own definition of Bool that makes explicit the delay in any computation:

import Prelude hiding (Bool(..))
data Bool = True | False | Delay Bool
  deriving (Show, Eq)

Then whenever you define a value of type Bool, you have to constrain yourself to co-recursion, where the delays are made explicit with the Delay constructor, rather than via recursion, where you have to evaluate a sub-expression to find the constructor for the top-level return value.

In this world, a non-terminating value could look like:

nonTerm :: Bool
nonTerm = Delay nonTerm

Then und becomes:

und :: Bool -> Bool -> Bool
und False y = False
und x False = False
und True y  = y
und x True  = x
und (Delay x) (Delay y) = Delay $ und x y

which works just fine:

λ und True False
False
λ und False nonTerm
False
λ und nonTerm False
False
λ case und nonTerm nonTerm of Delay _ -> "delayed" ; _ -> "not delayed"
"delayed"

Following up on dfeuer's comment, it looks like what you're looking for can be done with unamb

λ :m +Data.Unamb 
λ let undL False _ = False ; undL _ a = a
λ let undR _ False = False ; undR a _ = a
λ let und a b = undL a b `unamb` undR a b
λ und True False
False
λ und False False
False
λ und False True
False
λ und True True
True
λ und undefined False
False
λ und False undefined
False
Synovia answered 20/6, 2015 at 23:34 Comment(2)
If I understand this correctly, that works only with very specific nonTerm expressions. Therefore, the answer to my question is a resounding NO.Tonsil
With the unamb code, und False (non_term 1) still does not terminate for me in ghci. I suspect because non_term 1 is running in a thread that's locking out the thread that tries to kill it: downloads.haskell.org/~ghc/latest/docs/html/libraries/base/…Byssinosis
C
4

Haskell is indeed lazy. Laziness means that an expression is not evaluated unless required. However, laziness doesn't mean that two expressions can be evaluated in an arbitrary order. The order of evaluation of expressions in Haskell matters. For example, consider your und function:

und :: Bool -> Bool -> Bool
und False y = False
und y False = False

First, I would like to note that this function is incomplete. The complete function is:

und :: Bool -> Bool -> Bool
und False y = False
und y False = False
und y True  = True          -- you forgot this case

In fact, the und function can be written more succinctly (and more lazily) as follows:

-- if the first argument is False then the result is False
-- otherwise the result is the second argument
-- note that the second argument is never inspected

und :: Bool -> Bool -> Bool
und False _ = False
und _     x = x

Anyway, the pattern matching syntax in Haskell is just syntactic sugar for case expressions. For example, your original (incomplete) function would be desugared to (up to alpha equivalence):

und :: Bool -> Bool -> Bool
und x y = case x of False -> False
                    True  -> case y of False -> False
                                       True  -> undefined

From this we can see:

  1. Your function is incomplete because the last case is undefined.
  2. Your function evaluates the second argument if the first argument is True even though it doesn't need to. Remember that case expressions always force the evaluation of the expression being inspected.
  3. Your function first inspects x and then inspects y if x evaluates to True. Hence, there is indeed an explicit order of evaluation here. Note that if x evaluates to False then y is never evaluated (proof that und is indeed lazy).

It is because of this ordering of evaluation that your expression und (non_term 1) False diverges:

  und (non_term 1) False
= case non_term 1 of False -> False
                     True  -> case False of False -> False
                                            True  -> undefined
= case non_term 2 of False -> False
                     True  -> case False of False -> False
                                            True  -> undefined
= case non_term 3 of False -> False
                     True  -> case False of False -> False
                                            True  -> undefined
                .
                .
                .
                .

If you want, you can create a function which has a different order of evaluation:

und :: Bool -> Bool -> Bool
und x y = case y of False -> False
                    True  -> x     -- note that x is never inspected

Now, the expression und (non_term 1) False evaluates to False. However, the expression und False (non_term 1) still diverges. So, your main question is:

Is there a way to implement und (i.e. and in German) correctly (not just partially as above) so that both

und (non_term 1) False

and

und False (non_term 1)

return False?

The short answer is no. You always need a specific order of evaluation; and depending upon the order of evaluation either und (non_term 1) False or und False (non_term 1) will diverge.

Does this mean that Haskell is wrong/faulty? No. Haskell does the right thing and simply doesn't produce any answer. To a human (who can evaluate both expressions in parallel) it would seem apparent that the result of und (non_term 1) False must be False. However, computers must always have an order of evaluation.

So, what is the actual problem? In my humble opinion the actual problem is either/or:

  1. Parallel evaluation. Haskell should evaluate the expression both ways in parallel and choose the one that terminates first:

    import Data.Unamb (unamb)
    
    type Endo a = a -> a
    
    bidirectional :: Endo (a -> a -> b)
    bidirectional = unamb <*> flip
    
    und :: Bool -> Bool -> Bool
    und = bidirectional (&&)
    
  2. General recursion. In my humble opinion, general recursion is too powerful for most use cases: it allows you to write absurd functions like non_term x = non_term (x + 1). Such functions are quite useless. If we don't consider such useless functions as inputs then your original und function is a perfectly good function to use (just implement the last case or use &&).

Hope that helps.

Canescent answered 21/6, 2015 at 5:37 Comment(3)
"However, computers must always have an order of evaluation." If I can do it, a computer can too. As I already said, Haskell could examine all adjacent definitions with the same RHS for an easy match and then, if necessary, proceed as usual.Tonsil
@Kiuhnm, define "easy". If I replace False in the RHS of the second equation with not True, is it still an "easy match"? If not, then I can no longer replace equals by equals without changing the meaning of the program. I think you will find it difficult to define a language with well-defined, deterministic semantics including general recursion which accepts functions like the one you want to write. In any case, Haskell is not that language and a compiler which performs the evaluations you suggest is not a correct Haskell compiler.France
@ReidBarton I meant an easy match on the LHS. Anyway, we must define what it means for the RHS to be equal. We may explicitly tell Haskell that some definitions can be considered in any order. Haskell will try to match the definitions in a round-robin way by giving up each time after just one substitution. This should work.Tonsil

© 2022 - 2024 — McMap. All rights reserved.