Does the existence rseq/seq break referential transparency? Are there some alternative approaches that don't?
Asked Answered
P

4

6

I always thought that replacing an expression x :: () with () :: () would be one of the most basic optimizations during compiling Haskell programs. Since () has a single inhabitant, no matter what x is, it's result is (). This optimization seemed to me as a significant consequence of referential transparency. And we could do such an optimization for any type with just a single inhabitant.

(Update: My reasoning in this matter comes from natural deduction rules. There the unit type corresponds to truth (⊤) and we have an expansion rule "if x : ⊤ then () : ⊤". For example see this text p. 20. I assumed that it's safe to replace an expression with its expansion or contractum.)

One consequence of this optimization would be that undefined :: () would be replaced by () :: (), but I don't see this as a problem - it would simply make programs a bit lazier (and relying on undefined :: () is certainly a bad programming practice).

However today I realized that such optimization would break Control.Seq completely. Strategy is defined as

type Strategy a = a -> ()

and we have

-- | 'rseq' evaluates its argument to weak head normal form.
rseq :: Strategy a
rseq x = x `seq` ()

But rseq x :: () so the optimization would simply discard the required evaluation of x to WHNF.

So where is the problem?

  • Does the existence of rseq and seq break referential transparency (even if we consider only terminating expressions)?
  • Or is this a flaw of the design of Strategy and we could devise a better way how to force expressions to WHNF compatible with such optimizations?
Properly answered 2/12, 2012 at 19:48 Comment(5)
x may not terminate, then replacing it by () can change the meaning of the programme. main = case last (repeat ()) of { () -> launchMissiles }.Epithet
@DanielFischer I agree on that, the optimization would change the meaning of the program. But it's just an artificial example - could you give a meaningful, real program whose meaning would change? Nobody (except perhaps Strategy) relies on _|_ :: () and I doubt it's reasonable to do so. Also the problem with rseq + the optimization occurs even if we consider only terminating expressions.Properly
If x terminates, replacing rseq x with () doesn't change the meaning of the programme. That replacement changes the meaning if and only if x doesn't terminate. I can't give a meaningful example programme relying on () being inhabited by ⊥ directly, though. But it's the point of the magical seq that such a substitution is not made. seq is somewhat at odds with the rest of the language, but it is too useful to turn space-and-time-leaks into working programmes to live without it. That's why it is in the language despite its magic.Epithet
Petr: I don't know what you could mean by _|_ :: () not being a meaningful, real program. (I'm guessing you don't mean that it lacks meaning/semantics and that it doesn't exist.)Rural
@Rural No, I didn't mean that. I meant that I'd like to see a real program or library (except one using Strategy or seq of course) whose meaning would change if we allowed the rule x :: () --> () :: (). Or, stated in a broader context: What would we lose if we made all pattern matching on single-constructor data types automatically lazy? Would this semantic change cause some programs to be harder to write? (This change would imply the optimization rule - like Daniel's example would become case last (repeat ()) of { ~() -> launchMissiles } which would always launch missiles.)Properly
L
11

Referential transparency is about equality statements and variable references. When you say x = y and your language is referentially transparent, then you can replace every occurence of x by y (modulo scope).

If you haven't specified x = (), then you can't replace x by () safely, such as in your case. This is because you are wrong about the inhabitants of (), because in Haskell there are two of them: One is the only constructor of (), namely (). The other is the value that is never calculated. You may call it bottom or undefined:

x :: ()
x = x

You certainly can't replace any occurrence of x by () here, because that would chance semantics. The existence of bottom in the language semantics allows some awkward edge cases, especially when you have a seq combinator, where you can even prove every monad wrong. That's why for many formal discussions we disregard the existence of bottom.

However, referential transparency does not suffer from that. Haskell is still referentially transparent and as such a purely functional language.

Levey answered 2/12, 2012 at 20:31 Comment(0)
K
7

Your definition of referential transparency is incorrect. Referential transparency does not mean you can replace x :: () with () :: () and everything stays the same; it means you can replace all occurrences of a variable with its definition and everything stays the same. seq and rseq are not in conflict with referential transparency if you use this definition.

Kusin answered 2/12, 2012 at 19:52 Comment(5)
And before anyone asks, the "everything" that stays the same does not include running time and memory consumption.Epithet
The problem is, I've seen many different definitions of referential transparency and none was rigorous and based in the theory. The definition I started with was: "You can replace a term with its expansion or contractum." Then in natural deduction we have the following expansion rule for the unit type (truth): x :: ⊤ expands to () :: ⊤ where "⊤" is truth - the unit type. (For example , see this text, p. 20).Properly
@PetrPudlák: You could do this replacement provided () is the only inhabitant of . But since Haskell is not consistent under CH - that is, every type is inhabitated - has extra inhabitant, namely undefined.Thankyou
@PetrPudlák Your problem is that "You can replace a term with its expansion or contractum" is not sufficient to infer from x :: () to x = (). You're making an intuitive leap based on the idea that the type () has only one value, so any x :: () must be the value (). That's not rigorously true in Haskell; we cannot say that () is a valid expansion or contractum of x solely by knowing that x is of type (). So even the definition of referential transparency you're using doesn't allow the optimization you're talking about.Benedetta
@Benedetta it's not intuitive, it comes rigorously from natural deduction rules. But now I understand that expansions derived from natural deduction rules are not admissible in the presence of bottom. I've added my own answer with some views I came up with.Properly
O
4

seq is a red herring here.

unitseq :: () -> a -> a
unitseq x y = case x of () -> y

This has the same semantics as seq on unit, and requires no magic. In fact, seq only has something "magic" when working on things you can't pattern match on -- i.e., functions.

Replacing undefined :: () with () has the same ill effects with unitseq as with the more magical seq.

In general, we think of values not only as "what" they are, but how defined they are. From this perspective, it should be clear why we can't go implementing definedness-changing transformations willy-nilly.

Oppenheimer answered 3/12, 2012 at 1:38 Comment(0)
P
2

Thanks all for the inspiring answers. After thinking about it more, I come to the following views:

View 1: Considering terminating expressions only

If we restrict ourselves to a normalizing system, then seq (or rseq etc.) has no influence on the result of a program. It can change the amount of memory the program uses and increase the CPU time (by evaluating expressions we won't actually need), but the result is the same. So in this case, the rule x :: () --> () :: () is admissible - nothing is lost, CPU time and memory can be saved.

View 2: Curry-Howard isomorphism

Since Haskell is Turing-complete and so any type is inhabited by an infinite loop, the C-H corresponding logic system is inconsistent (as pointed out by Vitus) - anything can be proved. So actually any rule is admissible in such a system. This is probably the biggest problem in my original thinking.

View 3: Expansion rules

Natural deduction rules give us reductions and expansions for different type operators. For example for -> it's β-reduction and η-expansion, for (,) it is

fst (x, y) --reduce--> x (and similarly for snd)

x : (a,b) --expand--> (fst x, snd x)

etc. With the presence of bottom, the reducing rules are still admissible (I believe), but the expansion rules not! In particular, for ->:

rseq (undefined :: Int -> Int) = undefined

but it's η-expansion

rseq (\x -> (undefined :: Int -> Int) x) = ()

or for (,):

case undefined of (a,b) -> ()

but

case (fst undefined, snd undefined) of (a,b) -> ()

etc. So in the same way, the expansion rule for () is not admissible.

View 4: Lazy pattern matching

Allowing x :: () to expand to () :: () could be viewed as a special case of forcing all pattern matching on single-constructor data types to be lazy. I've never seen lazy pattern matching to be used on a multi-constructor data type, so I believe this change would allow us to get rid of the ~ lazy pattern matching symbol completely. However, it would change the semantic of the language to be more lazy (see also Daniel Fischer's comment to the question). And moreover (as described in the question) break Strategy.

Properly answered 2/12, 2012 at 22:19 Comment(2)
You may find the paper Free Theorems in the Presence of seq (Johann and Voigtländer, POPL 2004) interesting. They show that the presence of seq parametricity does not hold. So it is more than just η, equational reasoning in the presence of seq is suspect.Composition
For future readers, I think the paper is now located at: janis-voigtlaender.eu/papers/FreeTheoremsInThePresenceOfSeq.pdfWatson

© 2022 - 2024 — McMap. All rights reserved.