Pure Functions: Does "No Side Effects" Imply "Always Same Output, Given Same Input"?
Asked Answered
H

6

92

The two conditions that define a function as pure are as follows:

  1. No side effects (i.e. only changes to local scope are allowed)
  2. Always return the same output, given the same input

If the first condition is always true, are there any times the second condition is not true?

I.e. is it really only necessary with the first condition?

Haggerty answered 4/3, 2019 at 22:7 Comment(6)
Your premises are ill-specified. "Input" is too broad. Functions can be thought two have kinds of input. Their arguments, and "environmental"/"contextual". A function that returns the system time could be thought to be pure (even though it's obv not) if you don't distinguish between these two kinds of input.Heterogenous
@Alexander: In the context of "pure function", "input" is commonly understood to mean the parameters / arguments that are passed explicitly (by whatever mechanism the programming language uses). That's part of the definition of "pure function". But you are right, it's important to mind the definition.Waligore
Trivial counter-example: return the value of a global variable. No side effects (the global is only ever read!), but still potentially different results every time. (If you don't like globals, return the address of a local variable which depends on the call stack at run time).Malnutrition
You need to expand your definition of "side effects"; you say that a pure method does not produce side effects, but you need to also note that a pure method does not consume side effects produced elsewhere.Grandma
@Waligore Perhaps commonly understood, but the lack of that distinction is the exact cause of OP's confusion.Heterogenous
@EricLippert OP is simply quoting e.g. wikipedia's criteria. (I'm tempted to call a function communicating through side channels a "leaky gut" function.) It's precisely criterion 2 which implies no consumption of any (mutable) external state. I would think the two statements are equivalent.Malnutrition
P
123

Here are a few counterexamples that do not change the outer scope but are still considered impure:

  • function a() { return Date.now(); }
  • function b() { return window.globalMutableVar; }
  • function c() { return document.getElementById("myInput").value; }
  • function d() { return Math.random(); } (which admittedly does change the PRNG, but is not considered observable)

Accessing non-constant non-local variables is enough to be able to violate the second condition.

I always think of the two conditions for purity as complementary:

  • the result evaluation must not have effects on side state
  • the evaluation result must not be affected by side state

The term side effect only refers to the first, the function modifying the non-local state. However, sometimes read operations are considered as side effects as well: when they are operations and involve writing as well, even if their primary purpose is to access a value. Examples for that are generating a pseudo-random number that modifies the generator's internal state, reading from an input stream that advances the read position, or reading from an external sensor that involves a "take measurement" command.

Pittel answered 4/3, 2019 at 22:10 Comment(20)
Thanks Bergi. For some reason I thought side effects included reading variables outside of local scope, but I guess it is only a side effect if it writes such external variables.Haggerty
Note that the two conditions are actually logically independent: This answer shows that 1. does not imply 2., but there are also many counterexamples that 2. does not imply 1. - for example, a function setDate(newDate) that sets the computer's current date to newDate has property 2. (the return value can be newDate or nothing), but it still violates 1. (it sets the current date).Waligore
If prompt("you choose") does not have side effects, we should take a step back and clarify the meaning of side effects.Aground
@Aground OK, it's a bit of a stretch, probably even more so than Math.random(), but given a usual model of JavaScript computation, it does not change the observable environment - it does not violate OPs definition of "side effects (i.e. only changes to local scope are allowed)". But you're right, I will change it to something less controversial.Pittel
@Pittel Apologies, I forgot a question mark at the end of my above comment. Is it correct that side effects only includes writing to external variables, and not reading said variables?Haggerty
@Haggerty Yes, exactly that's what effect means. I'll try to clarify in my answer as well, I didn't expect such large attention and want to make answer worthy of dozens of votes :-)Pittel
Or reading from the hard drive, or looking at a website, or reading data from a sensor ...Leung
Not sure I agree with the terminology about "side effect". I always thought from CS classes that the deliberately broad term "side effects" was used specifically to include cases like in the second bullet. Write to a global variable is explicit, so it is not a side effect (but a direct one). Affecting something like a stream's read position seems to perfectly fit the intention of side effect to me. It is not the specific intended consequence but occurs nevertheless.Wigeon
@DaveInCaz I'm sure the term has a pretty specific meaning when discussing referential transparency in functional programming. But there's a more generic usage as well, with the meaning of unintentional or hidden effects, that you seem to have learned about.Pittel
Well for all you know Math.random() returns a thermal diode. It's not actually specified to use a bad RNG.Starla
@Starla Even that would probably be wrapped in enough OS abstraction on which a read changes some internal state :-)Pittel
@Bergi: then consider the purity of a function written in x86 assembly that uses the rdrand machine instruction to get hardware randomness. The internal implementation goes through some conditioning, but there's no way for x86 instructions to access that internal state. At an architectural level, the instruction produces true randomness every time it executes, in user-space, with no OS intervention. You can't usefully say that you took the value some other function or program would have gotten. But you must not optimize away multiple calls.Sawfly
@PeterCordes OK, I was thinking about some higher-level language than assembly - I don't know what model to use for purity analysis of that. You're probably right there. (On the other hand, the description "The Carry Flag indicates whether a random value is available at the time the instruction is executed." implies that there is some state involved even in the hardware generator).Pittel
@Bergi: Consider the implementation in IvyBridge, where rdrand never fails unless the HW detects that the true-RNG is broken. The API leaves the door open for implementations that return an error instead of stalling, in case they can't keep up with the possible number of cores pulling randomness, but IvB can keep up, according to an SO answer from the Intel architect who designed it: What are the exhaustion characteristics of RDRAND on Ivy Bridge?. Or consider a hypothetical true RNG (with hidden conditioning, if any).Sawfly
I guess from a purity perspective, it's equivalent to input on any external sensor measuring any physical quantity. (Different from reading an input stream like a keyboard or file, because there's an infinite amount of data available, limited only by how often you sample it.)Sawfly
and if random generator is not very random then you will have impure function which returns the same output for the same input ()Swindell
@Pittel I find this thread so puzzling. 1) this question has been asked countless times on SO, 2) it's off-topic, 3) it's brand-new but it has over 70 upvotes. What's so hot about the question this time around?Rhyolite
@user633183 I guess it was posted in the JS tag, which lead to many viewers and a few upvotes, which caused it to be featured on the hot network questions from where it gained even more viewers and votes. Btw, I don't think it is off-topic. And if you can find previous instances, please comment on the question itself - if you found an exact duplicate, I will vote to close.Pittel
Of the two conditions, I've heard the former called "effects" while the latter is called "coeffects". Both are "side effects" and impure. f(coeffects, input) -> effects, output Coeffects are input that comes from the changes in the wider environment , effects are output that change the wider environment. Elm and Clojurescrips re-frame work with this model, for example.Storer
@Dan Thanks for mentioning that term. I don't think coeffects are side effects under most definitions, though.Pittel
E
31

The "normal" way of phrasing what a pure function is, is in terms of referential transparency. A function is pure if it is referentially transparent.

Referential Transparency, roughly, means that you can replace the call to the function with its return value or vice versa at any point in the program, without changing the meaning of the program.

So, for example, if C's printf were referentially transparent, these two programs should have the same meaning:

printf("Hello");

and

5;

and all of the following programs should have the same meaning:

5 + 5;

printf("Hello") + 5;

printf("Hello") + printf("Hello");

Because printf returns the number of characters written, in this case 5.

It gets even more obvious with void functions. If I have a function void foo, then

foo(bar, baz, quux);

should be the same as

;

I.e. since foo returns nothing, I should be able to replace it with nothing without changing the meaning of the program.

It is clear, then, that neither printf nor foo are referentially transparent, and thus neither of them are pure. In fact, a void function can never be referentially transparent, unless it is a no-op.

I find this definition much easier to handle as the one you gave. It also allows you to apply it at any granularity you want: you can apply it to individual expressions, to functions, to entire programs. It allows you, for example, to talk about a function like this:

func fib(n):
    return memo[n] if memo.has_key?(n)
    return 1 if n <= 1
    return memo[n] = fib(n-1) + fib(n-2)

We can analyze the expressions that make up the function and easily conclude that they are not referentially transparent and thus not pure, since they use a mutable data structure, namely the memo array. However, we can also look at the function and can see that it is referentially transparent and thus pure. This is sometimes called external purity, i.e. a function that appears pure to the outside world, but is implemented impure internally.

Such functions are still useful, because while impurity infects everything around it, the external pure interface builds a kind of "purity barrier", where the impurity only infects the three lines of the function, but does not leak out into the rest of the program. These three lines are much easier to analyze for correctness than the entire program.

Exciseman answered 5/3, 2019 at 6:32 Comment(10)
That impurity affects the whole program once you have concurrency.Calcicole
@R.. Can you think of a way that concurrency could make the described Fibonacci function externally impure? I can't. Writing to memo[n] is idempotent, and failing to read from it merely wastes CPU cycles.Succursal
I agree with both of you. Impurity can lead to concurrency problems, but it doesn't in this specific case.Ahoy
@R.. It's not hard to imagine a concurrency-aware version.Rheumatism
@Succursal For example, memo[n] = ... may first create a dictionary entry, and then store the value into it. This leaves a window during which another thread could see an uninitialized entry.Rheumatism
@immibis And then what? The other thread does the exact same calculation again, and gets the exact same result. It makes no actual difference. No external impurity. As I said before: "failing to read from it merely wastes CPU cycles."Succursal
@Succursal No, the other thread sees the entry exists and reads the value from it which is some random number.Rheumatism
@immibis Ah, I understand. Yes, that could work, depending on the dictionary library.Succursal
OK, you have function which read current year (from system clock/calendar) and return 1 if it > 0 otherwise 0. Obviously this function has referentially transparency. But it's impure and will live under IO monad.Swindell
@Paul-AG: No, this function is not referentially transparent. Referential transparency means that you can replace a call to the function with the result of that call without changing the meaning of the program. But, whichever result you replace the call with, 0 or 1, you will change the meaning of the program for at least runs, namely, when you replace it with 1 you will change the meaning for runs of the program with the clock set to BC.Ahoy
E
13

It seems to me that the second condition you have described is a weaker constraint than the first.

Let me give you an example, suppose you have a function to add one that also logs to the console:

function addOneAndLog(x) {
  console.log(x);
  return x + 1;
}

The second condition you supplied is satisfied: this function always returns the same output when given the same input. It is, however, not a pure function because it includes the side effect of logging to the console.

A pure function is, strictly speaking, a function that satisfies the property of referential transparency. That is the property that we can replace a function application with the value it produces without changing the behaviour of the program.

Suppose we have a function that simply adds:

function addOne(x) {
  return x + 1;
}

We can replace addOne(5) with 6 anywhere in our program and nothing will change.

By contrast, we cannot replace addOneAndLog(x) with the value 6 anywhere in our program without changing behaviour because the first expression results in something being written to the console whereas the second one does not.

We consider any of this extra behaviour that addOneAndLog(x) performs besides returning output as a side-effect.

Ervin answered 4/3, 2019 at 23:3 Comment(6)
"It seems to me that the second condition you have described is a weaker constraint than the first." No, the two conditions are logically independent.Waligore
@Waligore you are mistaken. I have provided clear definitions for the terms pure and side-effect. Within these constraints, there is nothing that a function with no side effects besides return the same output for a given input. I have however provided examples where the second condition can be satisfied without the first. The fundamnetal concept to understand the notion of purity is referential transparency.Ervin
Small typo: There is nothing that a function with no side effects can do besides returning the same output for a given input.Ervin
What about something like returning the current time? That does not have side effects, but it does return a different output for the same input. Or more generally, any function where the result depends not only on the input parameters, but also on a (changeable) global variable.Waligore
@Waligore Returning the current time is absolutely a side effect. You cannot substitute 'Date.now()' with its result anywhere in your program without changing the behaviour, therefore it is not referentially transparent and consequently not pure.Ervin
It seems you are using a different definition of "side effect" than what is commonly used. A side effect is commonly defined as "an observable effect besides returning a value" or an "observable change in state" - see e.g. Wikipedia , this post on softwareengineering.SE. You are totally right that Date.now() is not pure/referentially transparent, but not because it has side-effects, but because its result depends on more than just its input.Waligore
A
7

There could be a source of randomness from outside the system. Suppose that part of your calculation includes the room temperature. Then executing the function will yield different results each time depending on the random external element of room temperature. The state is not changed by executing the program.

All I can think of, anyway.

Arioso answered 4/3, 2019 at 22:11 Comment(1)
According to me, these "randomness from outside the system" are a form of side effect. Functions with these behaviors are not "pures".Hackneyed
F
2

If the first condition is always true, are there any times the second condition is not true?

Yes

Consider simple code snippet below

public int Sum(int a, int b) {
    Random rnd = new Random();
    return rnd.Next(1, 10);
}

This code will return random output for same given set of inputs - however it does not have any side effect.

The overall effect of both the points #1 & #2 you mentioned when combined together means : At any point of time if function Sum with same i/p is replaced with its result in a program, overall meaning of program does not change. This is nothing but Referential transparency.

Faught answered 5/3, 2019 at 7:27 Comment(7)
But in this case, the first condition is not verified: writing to the console is considered a side effect, since it changes the state of the machine itself.Soursop
@Rightleg thx for pointing it out. Somehow I misunderstood OP totally other way. corrected answer.Faught
Doesn't it change the state of the random generator?Bikol
@EricDuminil is correct. rnd.Next() triggers a side effect in the rnd object. learn.microsoft.com/en-us/dotnet/api/…. Therefore the example is not pureJenninejennings
Generating a random number is itself a side effect, unless the state of the random number generator is supplied explicitly which would make the function satisfy condition 2.Ervin
rnd doesn't escape the function, so the fact that its state changes doesn't matter to the purity of the function, but the fact that the Random constructor uses the current time as a seed value means that there are "inputs" other than a and b.Lucianaluciano
I assumed that we are taking about observable side effects. If not, then there is long list of side effectsFaught
S
2

Problem with FP definitions is that they are very artificial. Each evaluation/calculation has side-effects on the evaluator. It's theoretically true. A deny of this shows only that FP apologists ignore philosophy and logic: an "evaluation" means changing of the state of some intelligent environment (machine, brain, etc). This is the nature of the evaluation process. No changes - no "calculi". The effect can be very visible: heating the CPU or its failure, shutting down the motherboard in case of overheating, and so on.

When you talk about referential transparency, you should understand that information about such transparency is available for human as a creator of the whole system and holder of semantical information and may be not available for the compiler. For example, a function can read some external resource and it will have IO monad in its signature but it will return the same value all the time (for example, the result of current_year > 0). The compiler does not know that function will return always the same result, so the function is impure but has referentially transparent property and can be substituted with True constant.

So, to avoid such inaccuracy, we should distinguish mathematical functions and "functions" in programming languages. Functions in Haskell are always impure and definition of purity related to them is always very conditionally: they are running on real hardware with real side-effects and physical properties, which is wrong for mathematical functions. This means that the example with "printf" function is totally incorrect.

But not all mathematical functions are pure too: each function which has t (time) as a parameter may be impure: t holds all effects and stochastic nature of the function: in common case you have input signal and have not idea about actual values, it can be even a noise.

Swindell answered 8/3, 2019 at 8:53 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.