What is the difference between an Idempotent and a Deterministic function?
Asked Answered
G

3

53

Are idempotent and deterministic functions both just functions that return the same result given the same inputs?

Or is there a distinction that I'm missing? (And if there is a distinction, could you please help me understand what it is)

Galvez answered 28/10, 2016 at 0:29 Comment(0)
F
69

In more simple terms:

  • Pure deterministic function: The output is based entirely, and only, on the input values and nothing else: there is no other (hidden) input or state that it relies on to generate its output. There are no side-effects or other output.
  • Impure deterministic function: As with a deterministic function that is a pure function: the output is based entirely, and only, on the input values and nothing else: there is no other (hidden) input or state that it relies on to generate its output - however there is other output (side-effects).
  • Idempotency: The practical definition is that you can safely call the same function multiple times without fear of negative side-effects. More formally: there are no changes of state between subsequent identical calls.

Idempotency does not imply determinacy (as a function can alter state on the first call while being idempotent on subsequent calls), but all pure deterministic functions are inherently idempotent (as there is no internal state to persist between calls). Impure deterministic functions are not necessarily idempotent.

Pure deterministic Impure deterministic Pure Nondeterministic Impure Nondeterministic Idempotent
Input Only parameter arguments (incl. this) Only parameter arguments (incl. this) Parameter arguments and hidden state Parameter arguments and hidden state Any
Output Only return value Return value or side-effects Only return value Return value or side-effects Any
Side-effects None Yes None Yes After 1st call: Maybe.
After 2nd call: None
SQL Example UCASE CREATE TABLE GETDATE DROP TABLE
C# Example String.IndexOf DateTime.Now Directory.Create(String)Footnote1
  • Footnote1 - Directory.Create(String) is idempotent because if the directory already exists it doesn't raise an error, instead it returns a new DirectoryInfo instance pointing to the specified extant filesystem directory (instead of creating the filesystem directory first and then returning a new DirectoryInfo instance pointing to it) - this is just like how Win32's CreateFile can be used to open an existing file.

A temporary note on non-scalar parameters, this, and mutating input arguments:

(I'm currently unsure how instance methods in OOP languages (with their hidden this parameter) can be categorized as pure/impure or deterministic or not - especially when it comes to mutating the the target of this - so I've asked the experts in CS.SE to help me come to an answer - once I've got a satisfactory answer there I'll update this answer).

A note on Exceptions

Many (most?) programming languages today treat thrown exceptions as either a separate "kind" of return (i.e. "return to nearest catch") or as an explicit side-effect (often due to how that language's runtime works). However, as far as this answer is concerned, a given function's ability to throw an exception does not alter its pure/impure/deterministic/non-deterministic label - ditto idempotency (in fact: throwing is often how idempotency is implemented in the first place e.g. a function can avoid causing any side-effects simply by throwing right-before it makes those state changes - but alternatively it could simply return too.).

So, for our CS-theoretical purposes, if a given function can throw an exception then you can consider the exception as simply part of that function's output. What does matter is if the exception is thrown deterministically or not, and if (e.g. List<T>.get(int index) deterministically throws if index < 0).

Note that things are very different for functions that catch exceptions, however.

Determinacy of Pure Functions

For example, in SQL UCASE(val), or in C#/.NET String.IndexOf are both deterministic because the output depends only on the input. Note that in instance methods (such as IndexOf) the instance object (i.e. the hidden this parameter) counts as input, even though it's "hidden":

"foo".IndexOf("o") == 1 // first cal
"foo".IndexOf("o") == 1 // second call
// the third call will also be == 1

Whereas in SQL NOW() or in C#/.NET DateTime.UtcNow is not deterministic because the output changes even though the input remains the same (note that property getters in .NET are equivalent to a method that accepts no parameters besides the implicit this parameter):

 DateTime.UtcNow == 2016-10-27 18:10:01 // first call
 DateTime.UtcNow == 2016-10-27 18:10:02 // second call

Idempotency

A good example in .NET is the Dispose() method: See Should IDisposable.Dispose() implementations be idempotent?

a Dispose method should be callable multiple times without throwing an exception.

So if a parent component X makes an initial call to foo.Dispose() then it will invoke the disposal operation and X can now consider foo to be disposed. Execution/control then passes to another component Y which also then tries to dispose of foo, after Y calls foo.Dispose() it too can expect foo to be disposed (which it is), even though X already disposed it. This means Y does not need to check to see if foo is already disposed, saving the developer time - and also eliminating bugs where calling Dispose a second time might throw an exception, for example.

Another (general) example is in REST: the RFC for HTTP1.1 states that GET, HEAD, PUT, and DELETE are idempotent, but POST is not ( https://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html )

Methods can also have the property of "idempotence" in that (aside from error or expiration issues) the side-effects of N > 0 identical requests is the same as for a single request. The methods GET, HEAD, PUT and DELETE share this property. Also, the methods OPTIONS and TRACE SHOULD NOT have side effects, and so are inherently idempotent.

So if you use DELETE then:

Client->Server: DELETE /foo/bar
// `foo/bar` is now deleted
Server->Client: 200 OK
Client->Server DELETE /foo/bar
// foo/bar` is already deleted, so there's nothing to do, but inform the client that foo/bar doesn't exist
Server->Client: 404 Not Found
// the client asks again:
Client->Server: DELETE /foo/bar
// foo/bar` is already deleted, so there's nothing to do, but inform the client that foo/bar doesn't exist
Server->Client: 404 Not Found

So you see in the above example that DELETE is idempotent in that the state of the server did not change between the last two DELETE requests, but it is not deterministic because the server returned 200 for the first request but 404 for the second request.

Frolic answered 28/10, 2016 at 1:19 Comment(6)
A deterministic function you defined is not always idempotent if you consider only the output, not the algorithm. For example, when there is global variables, a function f in a program defined by int f(x) { i = i+1; return 42; } (i.e., f always returns 42 but increments a global variable a) is deterministic, but not idempotent. Maybe "deterministic" and "pure" are mixed up in this post.Mild
@Frolic how would you call a function that no matter what will always make sure the result is the same. - for example, no matter what the status of a machine on ec2, it will make sure your server is healthy at the end.. ?Incendiary
@Frolic I would also agree with the fact that deterministic does not inherently mean idempotent. What if you have a function that based on the same input generates the same output but has a side-effect? Idk like for instance it does another function call inside it that alters some state somewhere else, or modifies a mutable data type, etc. (not that it would be a good design, in fact it's an awful design and it should be avoided at all times) but it demonstrates that it's not idempotent. I would say that your definitions are correct, but that the latter statement is not.Romeu
@George You're right - I should have differentiated between pure and impure deterministic functions. I've amended my answer.Frolic
Great examples, I saw create table and I wonder what category you would put the related create table if not exists and also create or replace tableItinerary
@Itinerary CREATE TABLE IF NOT EXISTS is impure nondeterministic and idempotent. I'm unfamiliar with the semantics of CREATE OR REPLACE TABLE but I assume it will always drop/override the table even if the schema is identical? If so, then CREATE OR REPLACE TABLE would be impure nondeterministic but idempotent only in an unrealistic academically pure database system - in practice this operation is not idempotent because it alters state every time it's run elsewhere in the database engine (e.g. storage pages on disk, statistics, counters/metrics, audit logs, etc.Frolic
L
9

A deterministic function will return the same result for the same inputs, regardless of how many times you call it.

An idempotent function may NOT return the same result (it will return the result in the same form but the value could be different, see http example below). It only guarantees that it will have no side effects. In other words it will not change anything.

For example, the GET verb is meant to be idempotent in HTTP protocol. If you call "~/employees/1" it will return the info for employee with ID of 1 in a specific format. It should never change anything but simply return the employee information. If you call it 10, 100 or so times, the returned format will always be the same. However, by no means can it be deterministic. Maybe if you call it the second time, the employee info has changed or perhaps the employee no longer even exists. But never should it have side effects or return the result in a different format.

My Opinion

Idempotent is a weird word but knowing the origin can be very helpful, idem meaning same and potent meaning power. In other words it means having the same power which clearly doesn't mean no side effects so not sure where that comes from. A classic example of There are only two hard things in computer science, cache invalidation and naming things. Why couldn't they just use read-only? Oh wait, they wanted to sound extra smart, perhaps? Perhaps like cyclomatic complexity?

Lisette answered 17/10, 2020 at 19:42 Comment(3)
There are 4 hard things: 0. cache invalidation 1. naming things 2. exactly once delivery 3. exactly once delivery 4. off-by-one errorsItinerary
@Itinerary "There are 10 types of people in the world: those who know binary and those who don't."Frolic
This explanation is wrong in my mind. Idempotent function may have side effects, but the side-effects must always end up in the same same state with identical inputs. So, the world state might change when the function is called for the first time, but duplicating calls doesn't further change the world state.Flyer
C
8

A deterministic function is just a function in the mathematical sense. Given the same input, you always get the same output. On the other hand, an idempotent function is a function which satisfies the identity

f(f(x)) = f(x) 

As a simple example. If UCase() is a function that converts a string to an upper case string, then clearly UCase(Ucase(s)) = UCase(s).

Idempotent functions are a subset of all functions.

Cenis answered 28/10, 2016 at 0:59 Comment(1)
What about in the computer science sense? Consider a function that updates state such as a database table. It might be appropriate to represent input data and the state of the table as separate inputs f(x, y) so in the case where the table already exists, f(x, f(x, y)) = f(x, y) ... oh... is that different or did I just take a roundabout way to get to the same f(f(x)) = f(x) ?Itinerary

© 2022 - 2024 — McMap. All rights reserved.