I'm not sure if this will help (I apologise in advance if it only confuses more), but the way IO is made referentially transparent in Mercury is to explicitly pass a value of type io to all IO-performing code, which must also return a new value of type io.
The input io represents "the state of the world" just before the code is called. That means the entire world outside the program: disk contents, what's printed on the screen, what the user is about to type, what's about to be received from the network, everything.
The output io represents the state of the world just after the code is called. The difference between the input io and the output io contains the changes to the world that were made by that code (plus everything else that happened externally, in theory).
Mercury's mode system ensures that values of type io are unique; there's only ever one of them, so you can't pass the same io value to two different IO-performing procedures. You pass an io into a procedure, rendering it useless to you, then receive a new one back.
Of course the real state of the actual world isn't encoded into values of type io; in fact under the hood io is completely empty! There's no information being passed at all! But the io values represent the state of the world.
You can think of functions in the IO monad as doing the same. They take an extra implicit state-of-the-world argument, and return an extra implicit state-of-the-world value. The IO monad implementation handles passing this extra output on to the next function. This makes the IO monad very like the State monad; it's easy to see that get in that monad is pure, even though it appears to take no arguments.
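To make the State comparison concrete, here is a minimal sketch of a hand-rolled State type. It is written from scratch so it has no dependencies and is not the definition from any library; counterDemo is a made-up name for illustration.

```haskell
-- A minimal State type, defined by hand for illustration only.
newtype State s a = State { runState :: s -> (a, s) }

instance Functor (State s) where
  fmap f (State g) = State $ \s -> let (a, s') = g s in (f a, s')

instance Applicative (State s) where
  pure a = State $ \s -> (a, s)
  State f <*> State g = State $ \s ->
    let (h, s')  = f s
        (a, s'') = g s'
    in (h a, s'')

instance Monad (State s) where
  State g >>= f = State $ \s ->
    let (a, s') = g s
    in runState (f a) s'

-- 'get' appears to take no arguments, but it is really a function of the
-- hidden state: it returns that state and passes it on unchanged.
get :: State s s
get = State $ \s -> (s, s)

-- Replace the hidden state with a new value.
put :: s -> State s ()
put s = State $ \_ -> ((), s)

-- Hypothetical example: read the state, increment it, read it again.
-- The two uses of 'get' yield different values only because each one
-- receives a different hidden state.
counterDemo :: State Int (Int, Int)
counterDemo = do
  before <- get
  put (before + 1)
  after <- get
  pure (before, after)
```

Evaluating runState counterDemo 41 in ghci gives ((41,42),42): the same get, handed two different states, returns two different results, and nothing impure is involved.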
In that understanding, main receives the initial state of the world before your program runs and transforms it into the state of the world after the program is run, by threading it through all the IO code in your program.
And because you can't get a state-of-the-world value yourself, you have no way of starting your own little IO chain in the middle of other code. This is what ensures purity, since in actual fact we can't have a brand new world with its own state spring out of nowhere.
So getLine :: IO String can be thought of as something like getLine :: World -> (World, String). It's pure because, on each of those occasions when it is called and returns a different string, it received a different World.
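As a rough sketch of that mental model (not how GHC actually implements IO), here is a toy World you can thread by hand. The World record, getLineW, putStrLnW and echoTwice are all invented for illustration; the real world is obviously not a pair of lists.

```haskell
-- Hypothetical toy world: the lines still waiting on stdin, and
-- everything printed so far.
data World = World
  { pendingInput :: [String]
  , printed      :: [String]
  } deriving (Eq, Show)

-- Toy model of getLine: consume one pending line and return the new world.
getLineW :: World -> (World, String)
getLineW w = case pendingInput w of
  []       -> (w, "")                        -- assumption: end of input reads as ""
  (l : ls) -> (w { pendingInput = ls }, l)

-- Toy model of putStrLn: append the string to what has been printed.
putStrLnW :: String -> World -> (World, ())
putStrLnW s w = (w { printed = printed w ++ [s] }, ())

-- Thread the world by hand, which is what the IO monad does for you.
echoTwice :: World -> World
echoTwice w0 =
  let (w1, a)  = getLineW w0
      (w2, b)  = getLineW w1                 -- different World, different String
      (w3, ()) = putStrLnW (a ++ " " ++ b) w2
  in w3
```

With w0 = World ["hello", "world"] [], getLineW w0 always returns the same pair, while the second call inside echoTwice gets w1 and so returns "world". The one thing this toy version cannot stop you doing is reusing w0 after it has been consumed; that is what Mercury's uniqueness modes, or hiding the World inside IO, rule out.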
Whether you think about values that are IO actions, or the state-of-the-world being passed around between functions, or any other mechanism, all these constructs are representational. Under the hood, all IO is implemented with impure code, because that's how the world works; when you write to a file, you have changed the state of the disk. But we can represent this at a higher level of abstraction, allowing you to think about it differently.
An analogy is that you can implement a map with search trees or hash tables or a zillion other ways. But having implemented it, when you're reasoning about code that uses the map, you don't think about left and right sub-trees or buckets and hashes, you think about the abstraction that is a map.
If we can represent IO in a way that maintains purity and referential transparency, then we can apply any reasoning that requires referential transparency to code using this representation. This allows all the mathematics that applies to such code to work (much of which is used in the implementation of advanced compilers for purity-enforced languages), even for programs that perform IO.
And a quick addendum about your second question. GHC could theoretically reduce that input program to just the output. I don't believe it tries terribly hard to do so though, because this is undecidable in general. Imagine a program that took no input but generated an infinite list and then printed its last 3 elements. Theoretically any program that is not dependent on its input can be reduced to its output, but in order to do that the compiler has to do something equivalent to executing the program at compile time. So to do this fully generally you'd have to be happy for your programs to sometimes go into infinite loops at compile time. And almost every program is dependent on its input, so there's not a lot to be gained by even trying to do this.
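Here is a small sketch of that infinite-list example. The names lastN, finiteResult and neverFinishes are made up for illustration, and lastN is a deliberately naive definition.

```haskell
-- Last n elements of a list (naive: walks the whole list to find its length).
lastN :: Int -> [a] -> [a]
lastN n xs = drop (length xs - n) xs

-- Depends on no input and terminates, so a compiler could in principle
-- replace it with its value.
finiteResult :: [Integer]
finiteResult = lastN 3 [1 .. 10]

-- Also depends on no input, but evaluating it never terminates, so a
-- compiler that tried to precompute it would hang at compile time.
-- Do not evaluate this in ghci unless you are ready to interrupt it.
neverFinishes :: [Integer]
neverFinishes = lastN 3 [1 ..]
```

The two definitions look almost identical, and telling them apart in general is the halting problem, which is why reducing input-free programs to their output can't be done fully generally.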
There is something to be gained by identifying parts of programs that aren't dependent on any input and replacing them with their result. This is called partial evaluation, and it's an active research topic, but it's also very hard and there's no one-size-fits-all solution. To do it, you have to be able to identify areas of the program that won't send the compiler into an infinite loop trying to figure out what they return, and you have to make decisions about whether removing some code that takes a few seconds at run time is a good enough benefit if it means embedding the multi-hundred-megabyte data structure it returns in the program binary. And you have to do all this analysis without taking hours to compile moderately complex programs.
getLine isn't really even a function (as it doesn't take any parameters), so "referentially transparent" as a term doesn't really apply. getLine is just a constant that contains the IO action that, when executed, reads from stdin. But note that executing the action is a separate thing from evaluating the constant, which is why you can e.g. say let action = getLine in ghci, and it won't read anything from stdin yet at that point. – Jot
print = putStrLn . show. – Catercornered