There are two aspects to this tweet: first, the fact that from a technical point of view, laziness generally mandates purity; second, the fact that from a practical point of view, strictness could still allow purity, but in practice it usually doesn't (i.e., with strictness, purity "goes out the window").
Simon Peyton Jones explains both of these aspects in the paper "A History of Haskell: Being Lazy With Class". With respect to the technical aspect, in section 3.2 ("Haskell is pure"), he writes (with my bold emphasis):
An immediate consequence of laziness is that evaluation order is demand-driven. As a result, it becomes more or less impossible to reliably perform input/output or other side effects as the result of a function call. Haskell is, therefore, a pure language.
If you can't see why laziness makes impure effects unreliable, I'm sure it's because you're over-thinking it. Here's a simple example illustrating the problem. Consider a hypothetical impure function that reads some information from a configuration file, namely some "basic" configuration and some "extended" configuration whose format depends on the configuration file version information in the header:
getConfig :: Handle -> Config
getConfig h =
  let header   = readHeader h
      basic    = readBasicConfig h
      extended = readExtendedConfig (headerVersion header) h
  in  Config basic extended
where `readHeader`, `readBasicConfig`, and `readExtendedConfig` are all impure functions that sequentially read bytes from the file (i.e., using typical, file-pointer-based sequential reads) and parse them into the appropriate data structures.
In a lazy language, this function probably can't work as intended. If the `header`, `basic`, and `extended` variable values are all lazily evaluated, then if the caller forces `basic` first followed by `extended`, the effects will run in the order `readBasicConfig`, `readHeader`, `readExtendedConfig`; while if the caller forces `extended` first followed by `basic`, the effects will run in the order `readHeader`, `readExtendedConfig`, `readBasicConfig`. In either case, bytes intended to be parsed by one function will be parsed by another.
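To see demand-driven effect ordering concretely, here is a small, self-contained sketch (a toy of my own, not the config example; `impure` and `effectLog` are hypothetical names) that uses `unsafePerformIO` to attach an observable effect to pure-looking values:

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)
import System.IO.Unsafe (unsafePerformIO)

-- Hypothetical effect log, just for illustration: unsafePerformIO smuggles
-- an impure effect into a pure-looking value, which is exactly the kind of
-- thing laziness makes unreliable.
effectLog :: IORef [String]
effectLog = unsafePerformIO (newIORef [])
{-# NOINLINE effectLog #-}

-- An "impure function": returns x, but records its name as a side effect
-- when (and only when) its result is forced.
impure :: String -> Int -> Int
impure name x = unsafePerformIO (modifyIORef' effectLog (++ [name]) >> return x)
{-# NOINLINE impure #-}

main :: IO ()
main = do
  let basic    = impure "basic" 1
      extended = impure "extended" 2
  -- The caller happens to force extended first, so its effect runs first,
  -- regardless of the order in which the bindings were written.
  print extended
  print basic
  readIORef effectLog >>= print  -- ["extended","basic"]
```

The log shows the effects ran in demand order, not definition order; swap the two `print`s and the log is reversed.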
And these evaluation orders are gross oversimplifications, which assume that the effects of the sub-functions are "atomic" and that `readExtendedConfig` reliably forces its version argument on any access to `extended`. If not, then depending on which parts of `basic` and `extended` are forced, the (sub)effects of `readBasicConfig`, `readExtendedConfig`, and `readHeader` may be reordered and/or intermingled.
You can work around this specific limitation by disallowing sequential file access (though that comes with some significant costs!), but similar unpredictable out-of-order effect execution causes problems for other I/O operations (how do we ensure that a file-updating function reads the old contents before truncating the file for the update?), mutable variables (when exactly does that lock variable get incremented?), and so on.
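For contrast, this ordering problem is part of what the `IO` monad solves: each read is explicitly sequenced by `>>=` (here via do-notation), so the effect order no longer depends on demand. A minimal runnable sketch, using a hypothetical `IORef`-backed "stream" of my own in place of a real `Handle`:

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)

-- Hypothetical stand-in: an IORef over a list of records plays the role
-- of a sequentially read Handle.
type Stream = IORef [String]

-- Read the next "record" from the stream (sketch; minimal error handling).
next :: Stream -> IO String
next s = do
  ls <- readIORef s
  case ls of
    (l : rest) -> writeIORef s rest >> return l
    []         -> error "unexpected end of stream"

data Config = Config { basic :: String, extended :: String }
  deriving Show

-- A monadic analogue of getConfig: do-notation fixes the effect order as
-- header, then basic, then extended, no matter how the caller later uses
-- the result.
getConfig :: Stream -> IO Config
getConfig s = do
  version <- next s   -- plays the role of readHeader
  b       <- next s   -- plays the role of readBasicConfig
  e       <- next s   -- plays the role of readExtendedConfig, using version
  return (Config b (version ++ ":" ++ e))

main :: IO ()
main = do
  s <- newIORef ["v1", "basic-bytes", "extended-bytes"]
  cfg <- getConfig s
  print cfg
```

The three reads consume the stream in the written order on every run, which is exactly the guarantee the lazy, impure version above cannot give.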
With respect to the practical aspect (again with my bold emphasis), SPJ writes:
Once we were committed to a lazy language, a pure one was inescapable. The converse is not true, but it is notable that in practice most pure programming languages are also lazy. Why? Because in a call-by-value language, whether functional or not, the temptation to allow unrestricted side effects inside a "function" is almost irresistible.
...
In retrospect, therefore, perhaps the biggest single benefit of laziness is not laziness per se, but rather that laziness kept us pure, and thereby motivated a great deal of productive work on monads and encapsulated state.
In his tweet, I believe Hutton is referring not to the technical consequence of laziness leading to purity, but rather the practical consequence of strictness tempting the language designers to relax purity "just in this one, special case", after which purity quickly goes out the window.