Background: In traditional Reverse Polish Notation, every operator must have a fixed length (a fixed number of operands). This makes RPN easy to evaluate and manipulate in code, because every token, expression, and subexpression is self-contained: one can blindly replace the `y` in `x y *` with `y 1 +` to get `x y 1 + *`, which is another valid expression that does exactly what you want it to do. Here is an interactive demo of a simple RPN calculator with named variable support. Note that the demos try to present the gist of an algorithm; they are not meant to resemble production code.
var rpn = prompt("Please enter RPN string, where each token is " +
"separated by a space", "x 1 x + * 2 /").trim().split(/\s+/);
var stack = [], variables = [], values = [];
for (let i = 0, len = rpn.length|0; i < len; i=i+1|0) {
if (/^\d*(\.\d*)?$/.test(rpn[i]) && rpn[i] !== "") {
stack.push( rpn[i] );
} else if (/^[a-z]$/i.test(rpn[i])) {
stack.push( rpn[i] );
if (!~variables.indexOf(rpn[i])) variables.push( rpn[i] );
} else {
if(stack.length<2)throw Error("No operand for " + rpn[i]);
const firstPop = stack.pop(); // right-hand operand (length checked above)
stack.push( "(" + stack.pop() + rpn[i] + firstPop + ")" );
}
}
if (stack.length !== 1) throw Error("Invalid RPN got: " + stack);
for (let i = 0, len = variables.sort().length|0; i < len; i=i+1|0)
values[i] = +prompt(variables[i] + " = ", Math.random()*10|0);
variables.push("'use strict';return(" + stack.pop() + ")");
alert("Result: " + Function.apply(0, variables).apply(0, values));
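The substitution property described above can also be checked without the interactive prompts. The following is only a minimal sketch: `evalRpn`, `substitute`, and the `env` object that maps variable names to numbers are hypothetical helpers introduced for illustration, and only the four basic binary operators are supported.

```javascript
// Binary operators only: every operator pops exactly two operands.
const BINARY_OPS = new Map([
  ["+", (a, b) => a + b],
  ["-", (a, b) => a - b],
  ["*", (a, b) => a * b],
  ["/", (a, b) => a / b],
]);

// Evaluate an array of RPN tokens; `env` maps variable names to numbers.
function evalRpn(tokens, env = {}) {
  const stack = [];
  for (const tok of tokens) {
    if (BINARY_OPS.has(tok)) {
      if (stack.length < 2) throw new Error("No operand for " + tok);
      const right = stack.pop(); // popped first, so it is the right operand
      const left = stack.pop();
      stack.push(BINARY_OPS.get(tok)(left, right));
    } else if (/^\d+(\.\d+)?$/.test(tok)) {
      stack.push(Number(tok));
    } else if (Object.prototype.hasOwnProperty.call(env, tok)) {
      stack.push(env[tok]);
    } else {
      throw new Error("Unknown token: " + tok);
    }
  }
  if (stack.length !== 1) throw new Error("Invalid RPN, stack: " + stack);
  return stack[0];
}

// Replace every occurrence of the variable `name` with the tokens of
// `replacement`. No knowledge of the surrounding expression is needed.
function substitute(tokens, name, replacement) {
  return tokens.flatMap((tok) => (tok === name ? replacement : [tok]));
}

const original = "x y *".split(" ");
const expanded = substitute(original, "y", "y 1 +".split(" "));
console.log(expanded.join(" "));                // x y 1 + *
console.log(evalRpn(expanded, { x: 3, y: 2 })); // 9
```

The point of the sketch is that `substitute` never inspects neighbouring tokens; this is the property the proposals below try to preserve.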
Question: How can RPN be modified or adapted to accommodate variable-length "operators" (think functions)?
Research and proposed solutions: I am using RPN as an intermediate representation of code before it is finalized into a specific target language. I want to preserve as much of the usefulness and ease of RPN as possible while still representing variable-length operators. I devised the solutions below and implemented the first three as rather simplistic demos.
- A special ARGUMENTS_BEGIN prefix operator (we'll use `#` for the purposes of this question). This solution flies in the face of traditional RPN in that it adds a prefix operator to mark where the arguments begin. The arguments list then expands automatically as tokens are added, and debugging is easier because a malformed token substitution cannot disrupt the arguments list, which makes the error easier to localize. On the other hand, manipulating arguments could become more complex, since more code is needed to handle cases such as nested function calls, though I am not entirely sure which complications could arise. My guess is that I will encounter obstacles parsing syntax that mixes prefix and postfix operators. It also makes direct evaluation more difficult, because backtracking or a separate stack is needed to locate the start of the arguments.
var rpn = prompt("Please enter a RPN string, where each token is " +
"separated by a space", "# # x 210 gcd x 6 * 126 gcd").trim()
.split(/\s+/);
var stack = [], variables = [], values = [];
for (let i = 0, len = rpn.length|0; i < len; i=i+1|0) {
if (/^\d*(\.\d*)?$/.test(rpn[i]) && rpn[i] !== "") {
stack.push( rpn[i] );
} else if (/^[a-z]$/i.test(rpn[i])) {
stack.push( rpn[i] );
if (!~variables.indexOf(rpn[i])) variables.push( rpn[i] );
} else if (/^[a-z]\w*$/i.test(rpn[i])) {
const s = stack.lastIndexOf("#");
if(s<0) throw Error("No start of arguments to " + rpn[i]);
stack.push( rpn[i]+"(" + stack.splice(s).slice(1) + ")" );
} else if (rpn[i] === '#') {
stack.push( '#' ); // sparks a syntax error if misused
} else {
if(stack.length<2)throw Error("No operand for " + rpn[i]);
const firstPop = stack.pop();
stack.push( "(" + stack.pop() + rpn[i] + firstPop + ")" );
}
}
if (stack.length !== 1) throw Error("Invalid RPN got: " + stack);
for (let i = 0, len = variables.sort().length|0; i < len; i=i+1|0)
values[i] = +prompt(variables[i] + " = ", Math.random()*10|0);
variables.push( "gcd" );
values.push( function gcd(a, b) {return b ? gcd(b, a % b) : a;} );
variables.push("'use strict';return(" + stack.pop() + ")");
alert("Result: " + Function.apply(0, variables).apply(0, values));
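The "separate stack" mentioned for direct evaluation can be sketched roughly as follows. This is an illustration under assumptions, not a reference implementation: `evalMarked`, `markFunctions`, and `markBinaryOps` are hypothetical names, `gcd` is assumed to be variadic so that it can take three arguments, and rejecting binary operators that reach below the innermost open `#` is one possible design choice.

```javascript
// Assumption: gcd is made variadic so it accepts any number of arguments.
const markGcd = (a, b) => (b ? markGcd(b, a % b) : a);
const markFunctions = new Map([
  ["gcd", (...args) => args.reduce(markGcd, 0)],
]);
const markBinaryOps = new Map([
  ["+", (a, b) => a + b],
  ["-", (a, b) => a - b],
  ["*", (a, b) => a * b],
  ["/", (a, b) => a / b],
]);

function evalMarked(tokens, env = {}) {
  const stack = [];
  const marks = []; // stack depth recorded at each still-open "#"
  for (const tok of tokens) {
    if (tok === "#") {
      marks.push(stack.length);
    } else if (markFunctions.has(tok)) {
      if (marks.length === 0) throw new Error("No start of arguments to " + tok);
      const args = stack.splice(marks.pop()); // everything above the mark
      stack.push(markFunctions.get(tok)(...args));
    } else if (markBinaryOps.has(tok)) {
      // Operands must lie above the innermost open "#"; this keeps a
      // malformed expression from consuming values outside its own list.
      const floor = marks.length ? marks[marks.length - 1] : 0;
      if (stack.length - floor < 2) throw new Error("No operand for " + tok);
      const right = stack.pop();
      const left = stack.pop();
      stack.push(markBinaryOps.get(tok)(left, right));
    } else if (/^\d+(\.\d+)?$/.test(tok)) {
      stack.push(Number(tok));
    } else if (Object.prototype.hasOwnProperty.call(env, tok)) {
      stack.push(env[tok]);
    } else {
      throw new Error("Unknown token: " + tok);
    }
  }
  if (marks.length) throw new Error("Unclosed #");
  if (stack.length !== 1) throw new Error("Invalid RPN, stack: " + stack);
  return stack[0];
}

console.log(evalMarked("# # x 210 gcd x 6 * 126 gcd".split(" "), { x: 14 })); // 14
```

Because each `#` only records a depth, nested calls need no backtracking; the cost is the extra `marks` array carried alongside the value stack.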
- Comma operators to group arguments together (we'll use `,` for grouping the last two items and `~` to denote a zero-length group for the purposes of this question). This solution is traditional RPN except with slightly special handling of the comma and zero-group operators. Every variable-length operator is treated as having a length of one (zero arguments are represented with `~`). Commas build arguments lists out of two items, each of which can be an ordinary token, an arguments list, or a zero-group operator. Advantages include easy manipulation and parsing of the code, compliance with the simplicity of RPN, and preservation of the token independence of RPN. Disadvantages include the RPN being harder to debug, because a single malformed token can upset an entire arguments list and snowball out of control, with no way to detect whether it is deliberate or accidental.
var rpn = prompt("Please enter RPN string, where each token is " +
"separated by a space", "x 6 * 126 , 210 , gcd ~ PI %")
.trim().split(/\s+/);
var stack = [], variables = [], values = [];
for (let i = 0, len = rpn.length|0; i < len; i=i+1|0) {
if (/^\d*(\.\d*)?$/.test(rpn[i]) && rpn[i] !== "") {
stack.push( rpn[i] );
} else if (/^[a-z]$/i.test(rpn[i])) {
stack.push( rpn[i] );
if (!~variables.indexOf(rpn[i])) variables.push( rpn[i] );
} else if (/^[a-z]\w*$/i.test(rpn[i])) {
if(stack.length<1)throw Error("No operand for " + rpn[i]);
stack.push( rpn[i] + "(" + stack.pop() + ")" );
} else if (rpn[i] === ',') {
if(stack.length<2)throw Error("No operand for " + rpn[i]);
const p2 = "" + stack.pop(), p1 = "" + stack.pop();
stack.push( p1 && p2 ? p1 + "," + p2 : p1 || p2 );
} else if (rpn[i] === '~') {
stack.push( "" ); // zero-length group
} else {
if(stack.length<2)throw Error("No operand for " + rpn[i]);
const firstPop = stack.pop(); // right-hand operand (length checked above)
stack.push( "(" + stack.pop() + rpn[i] + firstPop + ")" );
}
}
if (stack.length !== 1) throw Error("Invalid RPN got: " + stack);
for (let i = 0, len = variables.sort().length|0; i < len; i=i+1|0)
values[i] = +prompt(variables[i] + " = ", Math.random()*10|0);
variables.push( "gcd", "PI" );
values.push( function gcd(a, b) {return b ? gcd(b, a % b) : a;} );
values.push( function PI() {return Math.PI;} );
variables.push("'use strict';return(" + stack.pop() + ")");
alert("Result: " + Function.apply(0, variables).apply(0, values));
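For direct evaluation (rather than string building), the comma approach needs some way to tell an arguments list apart from an ordinary value. One possible sketch is below; the `ArgList` wrapper, `evalGrouped`, and the variadic `gcd` are assumptions made for illustration. Nested groups are flattened, which matches what the string concatenation in the demo above does.

```javascript
// Wrapper that distinguishes an arguments list from an ordinary value.
class ArgList {
  constructor(items) {
    this.items = items;
  }
}
const toItems = (v) => (v instanceof ArgList ? v.items : [v]);

const groupGcd = (a, b) => (b ? groupGcd(b, a % b) : a);
const groupFunctions = new Map([
  ["gcd", (...args) => args.reduce(groupGcd, 0)], // assumption: variadic gcd
  ["PI", () => Math.PI],
]);
const groupBinaryOps = new Map([
  ["+", (a, b) => a + b],
  ["-", (a, b) => a - b],
  ["*", (a, b) => a * b],
  ["/", (a, b) => a / b],
  ["%", (a, b) => a % b],
]);

function evalGrouped(tokens, env = {}) {
  const stack = [];
  const pop = (tok) => {
    if (stack.length === 0) throw new Error("No operand for " + tok);
    return stack.pop();
  };
  for (const tok of tokens) {
    if (tok === "~") {
      stack.push(new ArgList([])); // zero-length group
    } else if (tok === ",") {
      const right = pop(tok);
      const left = pop(tok);
      stack.push(new ArgList([...toItems(left), ...toItems(right)]));
    } else if (groupFunctions.has(tok)) {
      // Every function takes exactly one stack item: a value or a group.
      stack.push(groupFunctions.get(tok)(...toItems(pop(tok))));
    } else if (groupBinaryOps.has(tok)) {
      const right = pop(tok);
      const left = pop(tok);
      if (left instanceof ArgList || right instanceof ArgList) {
        throw new Error("Arguments list used as an operand of " + tok);
      }
      stack.push(groupBinaryOps.get(tok)(left, right));
    } else if (/^\d+(\.\d+)?$/.test(tok)) {
      stack.push(Number(tok));
    } else if (Object.prototype.hasOwnProperty.call(env, tok)) {
      stack.push(env[tok]);
    } else {
      throw new Error("Unknown token: " + tok);
    }
  }
  if (stack.length !== 1 || stack[0] instanceof ArgList) {
    throw new Error("Invalid RPN");
  }
  return stack[0];
}

console.log(evalGrouped("x 6 * 126 , 210 , gcd".split(" "), { x: 7 })); // 42
console.log(evalGrouped("~ PI".split(" ")) === Math.PI);                // true
```

The type check on binary operands catches some malformed input, but a stray `,` that happens to sit in front of a function call still silently becomes an extra argument, which is the debugging weakness described above.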
- The operator intrinsically stores its length (we'll append a number onto the function name for the purposes of this question). This solution inherits all of the advantages of traditional RPN. It also keeps the reading side of the parser simple, and debugging is easier because new arguments cannot be inserted by accident. However, it makes manipulating and generating RPN code more complex. Updating and generating arguments lists is difficult because this solution deviates from the token independence of RPN: adding an argument (and thereby changing the arity) requires two actions and one lookup (versus the traditional one action and zero lookups): (1) insert the argument, (2) look up the position of the variable-length operator, and (3) update the length of the operator.
var rpn = prompt("Please enter RPN string, where each token is " +
"separated by a space", "x 210 gcd2 x 6 * 126 gcd3").trim()
.split(/\s+/);
var stack = [], variables = [], values = [];
for (let i = 0, len = rpn.length|0, m; i < len; i=i+1|0) {
if (/^\d*(\.\d*)?$/.test(rpn[i]) && rpn[i] !== "") {
stack.push( rpn[i] );
} else if (/^[a-z]$/i.test(rpn[i])) {
stack.push( rpn[i] );
if (!~variables.indexOf(rpn[i])) variables.push( rpn[i] );
} else if (m = rpn[i].match(/^([a-z]+)(\d+)$/i)) {
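const n = +m[2]; // arity taken from the numeric suffix
if(stack.length<n)throw Error("No operand for "+rpn[i]);
// splice(-0) would remove the whole stack, so zero arguments are handled separately
stack.push( m[1] + "(" + (n ? stack.splice(-n) : "") + ")" );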
} else {
if(stack.length<2)throw Error("No operand for " + rpn[i]);
const firstPop = stack.pop(); // right-hand operand (length checked above)
stack.push( "(" + stack.pop() + rpn[i] + firstPop + ")" );
}
}
if (stack.length !== 1) throw Error("Invalid RPN got: " + stack);
for (let i = 0, len = variables.sort().length|0; i < len; i=i+1|0)
values[i] = +prompt(variables[i] + " = ", Math.random()*10|0);
variables.push( "gcd" );
values.push( function gcd(a, b) {return b ? gcd(b, a % b) : a;} );
variables.push("'use strict';return(" + stack.pop() + ")");
alert("Result: " + Function.apply(0, variables).apply(0, values));
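The insert, lookup, and update steps for this option can be sketched as a small helper. `appendArgument` is a hypothetical function written for illustration; it assumes that `argTokens` is itself a complete RPN subexpression and that the operator's index is already known (here it is found with `lastIndexOf`, which stands in for the lookup step).

```javascript
// Append one argument (given as RPN tokens) to the variable-length operator
// at `opIndex`, and bump the arity stored in the operator's numeric suffix.
function appendArgument(tokens, opIndex, argTokens) {
  const m = /^([a-z]+)(\d+)$/i.exec(tokens[opIndex]);
  if (!m) throw new Error("Not a variable-length operator: " + tokens[opIndex]);
  const bumped = m[1] + (Number(m[2]) + 1); // e.g. gcd2 -> gcd3
  return [
    ...tokens.slice(0, opIndex), // everything before the operator
    ...argTokens,                // the new argument goes last
    bumped,                      // operator with the updated length
    ...tokens.slice(opIndex + 1),
  ];
}

const before = "x 210 gcd2".split(" ");
const opIndex = before.lastIndexOf("gcd2"); // the lookup step
const after = appendArgument(before, opIndex, ["x", "6", "*"]);
console.log(after.join(" ")); // x 210 x 6 * gcd3
```

If the operator token is not updated together with the insertion, the expression stays syntactically valid but leaves an extra value on the stack, so the two edits have to be treated as one unit.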
- Nested arrays on the stack (no demo possible). This solution involves storing the arguments in a list before the operator on the stack, which makes direct execution of the code very easy. However, this violates the entire precept and advantage of RPN, which is to have a flat list of items. Perhaps, if lists were only one level deep, there would not be too much of a problem; however, for my use case, I would end up with deeply nested lists. Thus, manipulating and generating the RPN become very difficult.
Extrapolation of the single question: Are there any other possible solutions to this problem? What is the standard (most-used) solution to this problem? Are there fundamental problems with my solutions (please provide counterexamples)? Did I overlook some pros/cons of my solutions? Could my solutions' algorithms be improved?
[...] `arg1 arg2 arg3 ... argN N op`. That's what Forth's `pick` and `roll` words do. – Doggerel
[...] `fn` that separates the parameters from the rest of the stack. See `help fn`. – Frutescent
[...] `fn` serves a similar purpose to `#` in option 1. `fn` appears to begin a list of arguments ended by the function to call them with, allowing multiple levels of `fn`s to be nested. This is the same behavior as the `#` in option one. Please correct me if I am wrong. – Occupy
[...] `fn` is a placeholder on the stack. However, it is mainly intended for calling Java functions. For 7th words `fn>cnt` is necessary to remove `fn` and place the count of the following parameters onto the stack. Like in Forth, as Chen stated. – Frutescent
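The `arg1 arg2 arg3 ... argN N op` layout from the first comment, where the argument count is an ordinary value pushed just before the operator, could be evaluated along these lines. This is a hedged sketch: `evalCounted`, `countFunctions`, the variadic `gcd`, and the `pi` function are assumptions for illustration and are not taken from Forth or 7th.

```javascript
const countGcd = (a, b) => (b ? countGcd(b, a % b) : a);
const countFunctions = new Map([
  ["gcd", (...args) => args.reduce(countGcd, 0)], // assumption: variadic gcd
  ["pi", () => Math.PI],
]);
const countBinaryOps = new Map([
  ["+", (a, b) => a + b],
  ["-", (a, b) => a - b],
  ["*", (a, b) => a * b],
  ["/", (a, b) => a / b],
]);

function evalCounted(tokens, env = {}) {
  const stack = [];
  for (const tok of tokens) {
    if (countFunctions.has(tok)) {
      const n = stack.pop(); // the count sits directly below the operator
      if (!Number.isInteger(n) || n < 0 || n > stack.length) {
        throw new Error("Bad argument count for " + tok);
      }
      // Using stack.length - n (not -n) keeps n = 0 from removing everything.
      const args = stack.splice(stack.length - n);
      stack.push(countFunctions.get(tok)(...args));
    } else if (countBinaryOps.has(tok)) {
      if (stack.length < 2) throw new Error("No operand for " + tok);
      const right = stack.pop();
      const left = stack.pop();
      stack.push(countBinaryOps.get(tok)(left, right));
    } else if (/^\d+(\.\d+)?$/.test(tok)) {
      stack.push(Number(tok));
    } else if (Object.prototype.hasOwnProperty.call(env, tok)) {
      stack.push(env[tok]);
    } else {
      throw new Error("Unknown token: " + tok);
    }
  }
  if (stack.length !== 1) throw new Error("Invalid RPN, stack: " + stack);
  return stack[0];
}

console.log(evalCounted("x 210 2 gcd x 6 * 126 3 gcd".split(" "), { x: 14 })); // 14
```

Compared with option 3, the count is a separate token, so changing the arity means editing a number next to the operator rather than renaming the operator itself; the lookup step is still required.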