How would I extend the JavaScript language to support a new operator?

The answer to the question "Is it possible to create custom operators in JavaScript?" is "not yet", but @Benjamin suggested that it would be possible to add a new operator using third-party tools:

It is possible to use third party tools like sweet.js to add custom operators though that'd require an extra compilation step.

I will take the same example, like in the previous question:

(ℝ, ∘), x ∘ y = x + 2y

For any two real numbers x and y, x ∘ y is x + 2y, which is also a real number. How can I add this operator to my extended JavaScript language?

After the following code runs:

var x = 2
  , y = 3
  , z = x ∘ y;

console.log(z);

The output will contain

8

(because 8 is 2 + 2 * 3)


How would I extend the JavaScript language to support a new operator?

Progenitive answered 24/12, 2013 at 14:7 Comment(11)
How is this question different to your previous one? - Circuity
@OliCharlesworth This is more detailed. In the previous one I asked whether it is possible; now I want to know how I can extend the language by adding a new operator. - Alienage
It seems to me that the answers to your last question already covered this. Voting to close as a duplicate. - Distichous
I bet there's a good reason why you don't want to use functions instead :D (I'm curious) - Erinerina
@MikeW No, they didn't. I don't think it's a duplicate. A real solution is not really related to the previous question. To this question, it is. The closest answer is Benjamin's answer, which says it's possible using a third-party tool. - Alienage
@Erinerina Wouldn't it be interesting to be able to define new operators in JavaScript? :-) - Alienage
Voting to close. If you want an answer more targeted to using sweet.js, you should probably ask it that way. - Revert
Sweet.js doesn't support infix operators yet: github.com/mozilla/sweet.js/issues/34 - Antimony
@FlorianMargaine If sweet.js doesn't support that yet, I am sure that it's possible with another tool... - Alienage
Not really. Sweet.js is the first tool I know of that tries to implement macros in JS. There might be another, but I'm not sure it's as well supported as sweet.js. Anyway, the story is: if you want it, build a preprocessor yourself. The moral of this story: don't try to implement custom infix operators in JS. - Antimony
You can't really extend the JavaScript language, but in Node you can write modules in C++, and there you could add pretty much anything, even custom operators, to V8, the interpreter. - Footton

Yes, it's possible and not even very hard :)


We'll need to discuss a few things:

  1. What are syntax and semantics.
  2. How are programming languages parsed? What is a syntax tree?
  3. Extending the language syntax.
  4. Extending the language semantics.
  5. How do I add an operator to the JavaScript language.

If you're lazy and just want to see it in action - I put the working code on GitHub

1. What are syntax and semantics?

Very generally - a language is composed of two things.

  • Syntax - these are the symbols of the language, such as the unary operator ++, as well as expressions such as a FunctionExpression, which represents an "inline" function. The syntax covers only the symbols used, not their meaning. In short, the syntax is just the drawings of letters and symbols - it holds no inherent meaning.

  • Semantics ties meaning to these symbols. Semantics is what says ++ means "increment by one"; in fact, here is the exact definition. It ties meaning to our syntax, and without it the syntax is just an ordered list of symbols.

2. How are programming languages parsed? What is a syntax tree?

At some point, when something executes your code in JavaScript or any other programming language, it needs to understand that code. One part of this, called lexing (or tokenizing; let's not go into the subtle differences here), means breaking up code like:

function foo(){ return 5;}

into its meaningful parts - that is, saying that there is a function keyword here, followed by an identifier, an empty argument list, then a block opening { containing a return keyword with the literal 5, then a semicolon, then a block end }.

This part is entirely syntactic; all it does is break the code up into parts like function foo ( ) { return 5 ; } . It still has no understanding of the code.
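
To make the lexing step concrete, here is a minimal sketch of a toy lexer. It is not how Esprima's scanner works; the tokenize name and the regular expression are made up for illustration, and it only knows identifiers/keywords, integer literals and a handful of single-character punctuators.

```javascript
// Toy lexer: splits source text into tokens without understanding them.
// Assumption: only identifiers/keywords, integers and ( ) { } ; exist;
// anything else is silently skipped, which a real lexer would not do.
function tokenize(source) {
    return source.match(/[A-Za-z_$][\w$]*|\d+|[(){};]/g) || [];
}

var tokens = tokenize("function foo(){ return 5;}");
console.log(tokens.join(" ")); // function foo ( ) { return 5 ; }
```

Note that the lexer happily produces tokens for nonsense like "return return (" as well; deciding whether the tokens form a valid program is the parser's job.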

After that - a Syntax Tree is built. A syntax tree is more aware of the grammar but is still entirely syntactic. For example, a syntax tree would see the tokens of:

function foo(){ return 5;}

And figure out "Hey! There is a function declaration here!".

It's called a tree because it's just that - trees allow nesting.

For example, the code above can produce something like:

                                        Program
                                  FunctionDeclaration (identifier = 'foo')
                                     BlockStatement
                                     ReturnStatement
                                     Literal (5)

This is rather simple. Just to show you it isn't always so linear, let's check 5 + 5:

                                        Program
                                  ExpressionStatement
                               BinaryExpression (operator +)
                            Literal (5)       Literal(5)   // notice the split here

Such splits can occur.

Basically, a syntax tree allows us to express the syntax.
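
In code, such a tree is just nested objects. The sketch below writes the tree for 5 + 5 by hand, in the general shape Esprima-style parsers produce (real parser output carries more properties, such as raw), and prints one line per node. The describe helper is hypothetical and only knows the node types used here.

```javascript
// Hand-written tree for `5 + 5`; a real parser would build this for us.
var tree = {
    type: "Program",
    body: [{
        type: "ExpressionStatement",
        expression: {
            type: "BinaryExpression",
            operator: "+",
            left:  { type: "Literal", value: 5 },
            right: { type: "Literal", value: 5 }
        }
    }]
};

// Collect one line per node, indented by depth, to make the nesting visible.
function describe(node, depth, lines) {
    var label = node.type;
    if (node.type === "BinaryExpression") label += " (operator " + node.operator + ")";
    if (node.type === "Literal") label += " (" + node.value + ")";
    lines.push(new Array(depth + 1).join("  ") + label);
    // Children live in `body` for Program, otherwise in expression/left/right.
    var children = node.body || [node.expression, node.left, node.right];
    children.forEach(function (child) {
        if (child) describe(child, depth + 1, lines);
    });
    return lines;
}

console.log(describe(tree, 0, []).join("\n"));
// Program
//   ExpressionStatement
//     BinaryExpression (operator +)
//       Literal (5)
//       Literal (5)
```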

This is where x ∘ y fails - the parser sees ∘ and doesn't understand the syntax.

3. Extending the language syntax.

This just requires a project that parses the syntax. What we'll do here is read the syntax of "our" language, which is not the same as JavaScript (and does not comply with the specification), and replace our operator with something the JavaScript syntax is OK with.

What we'll be making is not JavaScript. It does not follow the JavaScript specification, and a standards-compliant JS parser will throw an exception on it.

4. Extending the language semantics

This we do all the time anyway :) All we'll do here is define a function to call when the operator is used.

5. How do I add an operator to the JavaScript language.

Let me just start by saying, after this preface, that we won't be adding an operator to JS here; rather, we're defining our own language - let's call it "CakeLanguage" or something - and adding the operator to it. This is because ∘ is not a part of the JS grammar, and the JS grammar does not allow arbitrary operators the way some other languages do.

We'll use two open source projects for this:

  • esprima which takes JS code and generates the syntax tree for it.
  • escodegen which does the other direction, generating JS code from the syntax tree esprima spits out.

If you paid close attention, you'd know we can't use esprima directly, since we'll be giving it grammar it does not understand.

We'll add a # operator that does x # y === 2x + y, for fun. We'll give it the precedence of multiplication (because operators have operator precedence).

So, after you get your copy of Esprima.js, we'll need to change the following:

To FnExprTokens (the tokens after which a function is treated as an expression) we'll need to add # so it is recognized. Afterwards, it looks like this:

FnExprTokens = ['(', '{', '[', 'in', 'typeof', 'instanceof', 'new',
                    'return', 'case', 'delete', 'throw', 'void',
                    // assignment operators
                    '=', '+=', '-=', '*=', '/=', '%=', '<<=', '>>=', '>>>=',
                    '&=', '|=', '^=', ',',
                    // binary/unary operators
                    '+', '-', '*', '/', '%','#', '++', '--', '<<', '>>', '>>>', '&',
                    '|', '^', '!', '~', '&&', '||', '?', ':', '===', '==', '>=',
                    '<=', '<', '>', '!=', '!=='];

To scanPunctuator we'll add it and its char code as a possible case: case 0x23: // #

And then to the test so it looks like:

 if ('<>=!+-*#%&|^/'.indexOf(ch1) >= 0) {

Instead of:

    if ('<>=!+-*%&|^/'.indexOf(ch1) >= 0) {

And then in binaryPrecedence let's give it the same precedence as multiplication:

case '*':
case '/':
case '#': // put it elsewhere if you want to give it another precedence
case '%':
   prec = 11;
   break;

That's it! We've just extended our language syntax to support the # operator.
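
To see what that precedence value buys us, here is a small self-contained sketch of a precedence-climbing parser. This is a different, simpler technique than Esprima's own binary-expression parsing, and the parse and show helpers are hypothetical. Only the relative order in the table matters: # shares 11 with *, and + sits lower (the 9 is an assumption for illustration).

```javascript
// Precedence table: '#' binds as tightly as '*', and tighter than '+'.
var PRECEDENCE = { "+": 9, "-": 9, "*": 11, "/": 11, "%": 11, "#": 11 };

// Parses a flat token array of numbers and binary operators into a tree.
function parse(tokens) {
    var pos = 0;

    function parsePrimary() {
        return { type: "Literal", value: Number(tokens[pos++]) };
    }

    function parseBinary(minPrec) {
        var left = parsePrimary();
        while (pos < tokens.length && PRECEDENCE[tokens[pos]] >= minPrec) {
            var op = tokens[pos++];
            // Left-associative: the right operand only absorbs tighter operators.
            var right = parseBinary(PRECEDENCE[op] + 1);
            left = { type: "BinaryExpression", operator: op, left: left, right: right };
        }
        return left;
    }

    return parseBinary(0);
}

// Renders the tree with explicit parentheses so the grouping is visible.
function show(node) {
    if (node.type === "Literal") return String(node.value);
    return "(" + show(node.left) + " " + node.operator + " " + show(node.right) + ")";
}

console.log(show(parse(["1", "+", "2", "#", "3"]))); // (1 + (2 # 3))
console.log(show(parse(["1", "#", "2", "+", "3"]))); // ((1 # 2) + 3)
```

Moving '#' to the same row as '+' in the table would make the first expression group as ((1 + 2) # 3) instead, which is what the "put it elsewhere" comment in the case list is about.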

We're not done yet, we need to convert it back to JS.

Let's first define a short visitor function for our tree that recursively visits all its nodes.

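function visitor(tree,visit){
    for(var i in tree){
        var child = tree[i];
        // only objects (nodes and arrays of nodes) are visited; primitives and null are skipped
        if(typeof child === "object" && child !== null){
            visit(child);          // run the callback on this node
            visitor(child,visit);  // then recurse into its children
        }
    }
}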

This just goes through the Esprima generated tree and visits it. We pass it a function and it runs that on every node.

Now, let's treat our special new operator:

visitor(syntax,function(el){ // for every node in the syntax
    if(el.type === "BinaryExpression"){ // if it's a binary expression
        if(el.operator === "#"){ // with the operator #
            el.type = "CallExpression"; // it is now a call expression
            el.callee = {name:"operator_sharp",type:"Identifier"}; // to the function operator_sharp
            el.arguments = [el.left, el.right]; // with the left and right side as arguments
            delete el.operator; // remove BinaryExpression properties
            delete el.left;
            delete el.right;
        }
    }
});

So in short:

var syntax = esprima.parse("5 # 5");

visitor(syntax,function(el){ // for every node in the syntax
    if(el.type === "BinaryExpression"){ // if it's a binary expression
        if(el.operator === "#"){ // with the operator #
            el.type = "CallExpression"; // it is now a call expression
            el.callee = {name:"operator_sharp",type:"Identifier"}; // to the function operator_sharp
            el.arguments = [el.left, el.right]; // with the left and right side as arguments
            delete el.operator; // remove BinaryExpression properties
            delete el.left;
            delete el.right;
        }
    }
});

var asJS = escodegen.generate(syntax); // produces operator_sharp(5,5);

The last thing we need to do is define the function itself:

function operator_sharp(x,y){
    return 2*x + y;
}

And include that above our code.
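
If you want to try the rewrite step without patching Esprima first, the sketch below runs it on a hand-built tree standing in for what the modified parser would return for 5 # 5. The generate function is a hypothetical mini code generator that only covers the node types used here (escodegen handles the full language), and the visitor skips primitives and null so the callback only ever sees objects.

```javascript
// Hand-built stand-in for the modified parser's output for `5 # 5`.
var syntax = {
    type: "Program",
    body: [{
        type: "ExpressionStatement",
        expression: {
            type: "BinaryExpression",
            operator: "#",
            left:  { type: "Literal", value: 5 },
            right: { type: "Literal", value: 5 }
        }
    }]
};

// Walk every object in the tree, skipping primitives and null.
function visitor(tree, visit) {
    for (var key in tree) {
        var child = tree[key];
        if (typeof child === "object" && child !== null) {
            visit(child);
            visitor(child, visit);
        }
    }
}

// Rewrite `a # b` into `operator_sharp(a, b)`.
visitor(syntax, function (el) {
    if (el.type === "BinaryExpression" && el.operator === "#") {
        el.type = "CallExpression";
        el.callee = { type: "Identifier", name: "operator_sharp" };
        el.arguments = [el.left, el.right];
        delete el.operator;
        delete el.left;
        delete el.right;
    }
});

// Hypothetical mini code generator, limited to the node types above.
function generate(node) {
    switch (node.type) {
    case "Program": return node.body.map(generate).join("\n");
    case "ExpressionStatement": return generate(node.expression) + ";";
    case "CallExpression":
        return generate(node.callee) + "(" + node.arguments.map(generate).join(", ") + ")";
    case "Identifier": return node.name;
    case "Literal": return String(node.value);
    default: throw new Error("Unsupported node type: " + node.type);
    }
}

function operator_sharp(x, y) {
    return 2 * x + y;
}

var asJS = generate(syntax);
console.log(asJS); // operator_sharp(5, 5);

// Run the generated code with operator_sharp in scope.
var result = new Function("operator_sharp", "return " + asJS)(operator_sharp);
console.log(result); // 15
```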

That's all there is to it! If you've read this far - you deserve a cookie :)

Here is the code on GitHub so you can play with it.

Un answered 24/12, 2013 at 16:42 Comment(11)
Much better than a simple "no, it's not possible" answer. :-) - Alienage
I'd be tempted to use ∘ instead of # in the second half of the answer to keep things consistent… - Valet
@DonalFellows the rationale was that readers might want to do this themselves and don't have a nice straightforward way to write ∘ with their keyboards. Thanks for the feedback though, I'll consider it. - Un
I don't have it on my keyboard either, but I can cut-n-paste with the best of them… - Valet
@BenjaminGruenbaum Something for you, again! :-) - Alienage
What version of esprima are you using and could you provide an example for v4.0? My esprima doesn't have FnExprTokens and this answer seems to be 5 years old. - Commix
@FireCubez I'm also having this problem... Also Benjamin, how can I add a new keyword, so that "using MyNamespace" would be the same as calling a function called using: "using(MyNamespace)"? - Fusco
@bluejayke If you will make a lot of modifications to the language, you might want to build a JS grammar based on an existing grammar. - Commix
The whole stack got very different; today I'd make this as a Babel transform (or a TypeScript transform). Here's an example with the pipeline operator: github.com/babel/babel/tree/master/packages/… . The code part in this answer is pretty outdated. - Un
Hmm, it doesn't look like you can actually plug into the parser very well - you'd need to actually manipulate it like in my answer above: github.com/babel/babel/blob/… - Un
I believe this is the best answer I have ever seen to such a simple question. You deserve a cake! - Whim

As I said in the comments of your question, sweet.js doesn't support infix operators yet. You're free to fork sweet.js and add it yourself, or you're simply SOL.

Honestly, it's not worth it to implement custom infix operators yet. Sweet.js is a well supported tool, and it's the only one I know of that tries to implement macros in JS. Adding custom infix operators with a custom preprocessor is probably not worth the gain you might have.

That said, if you're working on this alone for non-professional work, do whatever you want...

EDIT

sweet.js does now support infix operators.

Antimony answered 24/12, 2013 at 14:28 Comment(1)
Just want to point out that support for infix operators has been introduced; check out the examples on sweet.js' homepage, as well as the if, range, and unless macros in my candyshop repository. - Douse