How math operators are identified

How does a simple 2 ++ 2 work behind the scenes in the Python language?

If we type this in Python interpreter:

>>> 2+++--2
4
>>> 2+++*2
  File "<stdin>", line 1
    2+++*2
        ^
SyntaxError: invalid syntax

From these syntax errors, I gather that this behavior is simply how the Python designers chose to design and implement the language.

Python is open source, so I started exploring it further, and I have read many articles on the CPython implementation.

So here the Python compiler easily identifies that ++*%- are operators, because it is written in C, and C is compiled to assembly code, which is then converted to machine code.

Question 1: How is the Python compiler designed to identify the operators (with regard to lexing and parsing)?

Question 2: How can I modify this behavior of the Python interpreter so that it throws a syntax error for repeated operators, the same way it does for multiplication?

>>> 2**2
4
>>> 2***2
  File "<stdin>", line 1
    2***2
       ^
SyntaxError: invalid syntax

I have read these CPython files: compile.c, parser.c, readline.c

But I didn't come across any file dealing with the exception-handling mechanism for syntax errors.

Update:

I am still searching and waiting for answers to Question 2.

Cabal answered 2/2, 2017 at 8:13 Comment(8)
There are many ways to do this; have a look at the wiki.Bivins
@Bivins Thanks for the link. I know about lexical analyzers and parsers, but I don't know how they work in detail for Python.Cabal
This isn't a bad question, just a bit too broad. I think it would be perfectly on-topic if you could only narrow it down to one specific programming language, as Python and C are very different in this regard.Minna
@ShivkumarKondi I took the liberty to edit the question in an attempt to narrow it down to Python specifically. Please check the edit and see if you think it is ok. Re-open vote cast.Minna
Thanks @Minna for the edit.Cabal
Note related C version of this question: Why doesn't a+++++b work in C?Strobila
I've updated my answer in an attempt to make my response to your second question clearer. If I've misunderstood what you're actually asking, then please let me know and I'll continue refining.Barringer
Your question applies to Java and C# too @ShivkumarkondiReganregard

You've tripped over the difference between binary and unary operators. In the briefest of terms, -2 is literally the number "negative two". --2 is "negative (negative two)", or more conventionally "positive two". 2+++--2 is parsed as "two plus positive positive negative negative two", so it boils down to 2+2 and gives you 4. Both +2 and -2 are numbers, but *2 isn't, so that's why your syntax error happens.
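One way to check this reading for yourself is the standard library's ast module, which exposes the tree the parser builds. The sketch below is only an illustration: describe is a hypothetical helper name, and it assumes the source is a single binary operation whose right operand is a chain of unary operators around a literal.

```python
import ast

def describe(source):
    """Return (binary operator, unary operators on the right operand, innermost value).

    Assumes `source` is one binary operation, e.g. "2+++--2".
    """
    node = ast.parse(source, mode="eval").body   # the top-level BinOp node
    unary_ops = []
    operand = node.right
    # Peel off the nested unary operators one at a time, outermost first.
    while isinstance(operand, ast.UnaryOp):
        unary_ops.append(type(operand.op).__name__)
        operand = operand.operand
    return type(node.op).__name__, unary_ops, operand.value

print(describe("2+++--2"))
# ('Add', ['UAdd', 'UAdd', 'USub', 'USub'], 2)
```

Only the first + is the binary Add; the remaining four signs are unary operators wrapped around the final 2.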

Read on if you want horrendous detail, but the first paragraph most directly answers your question.

You asked for detail, so here it comes. Programming languages are (usually...) defined by things called context-free grammars. The grammar of Python is described using Backus-Naur Form. From https://docs.python.org/2/reference/expressions.html#unary-arithmetic-and-bitwise-operations, we have the following definitions:

u_expr ::=  power | "-" u_expr | "+" u_expr | "~" u_expr

m_expr ::=  u_expr | m_expr "*" u_expr | m_expr "//" u_expr | m_expr "/" u_expr
            | m_expr "%" u_expr

a_expr ::=  m_expr | a_expr "+" m_expr | a_expr "-" m_expr

This defines unary expressions, multiplicative expressions and arithmetic expressions in the Python language. I'm going to trim all three down to the bits that are directly relevant to our question before I attempt to explain it:

u_expr ::=  "2" | "-" u_expr | "+" u_expr

m_expr ::=  u_expr | m_expr "*" u_expr

a_expr ::=  m_expr | a_expr "+" m_expr | a_expr "-" m_expr

So, in this grammar, a u_expr is either 2, or it's the literal string + or - followed by any other u_expr, so the following all fit the definition of a u_expr: '2', '-2', '+2', '+-2', '++++---++2'.

An m_expr is either a u_expr, or it's an m_expr followed by a * followed by a u_expr. 2, 2*2, 2*+2, 2*++-+2 all fit this definition.

An a_expr is either an m_expr, or it's an a_expr followed by a plus or minus followed by an m_expr. 2, 2*2, 2+2, 2+2*2, 2++2*-2, and so on.
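The three trimmed rules can be sketched as a tiny recursive-descent recognizer. This is not how CPython's parser is implemented; it is a minimal illustration of the trimmed grammar only, in which the function name accepts is hypothetical, whitespace is not handled, and the left-recursive rules are rewritten as loops.

```python
def accepts(text):
    """Return True if `text` is an a_expr in the trimmed grammar."""
    pos = 0

    def peek():
        # Next unread character, or None at the end of the input.
        return text[pos] if pos < len(text) else None

    def u_expr():
        # u_expr ::= "2" | "-" u_expr | "+" u_expr
        nonlocal pos
        while peek() in ("+", "-"):
            pos += 1
        if peek() != "2":
            return False
        pos += 1
        return True

    def m_expr():
        # m_expr ::= u_expr | m_expr "*" u_expr   (left recursion as a loop)
        nonlocal pos
        if not u_expr():
            return False
        while peek() == "*":
            pos += 1
            if not u_expr():
                return False
        return True

    def a_expr():
        # a_expr ::= m_expr | a_expr "+" m_expr | a_expr "-" m_expr
        nonlocal pos
        if not m_expr():
            return False
        while peek() in ("+", "-"):
            pos += 1
            if not m_expr():
                return False
        return True

    # The whole input must be consumed for the text to count as an a_expr.
    return a_expr() and pos == len(text)

for example in ["2", "2*++-+2", "2++2*-2", "2+++--2", "2+++*2"]:
    print(example, accepts(example))
```

Every example in the loop is accepted except the last one, 2+++*2.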

Now let's start looking at your first syntax error, 2+++*2. We're trying to turn this into an a_expr. It starts with 2+, so we must be looking for something of the form a_expr "+" m_expr. 2 is an a_expr and we've got our literal +, so to avoid a syntax error we have to somehow turn ++*2 into an m_expr. An m_expr must begin with a u_expr, and the two leading + signs can be consumed as unary operators, but whatever follows them must again be a u_expr. Every u_expr must start with "2", "+" or "-", and *2 starts with none of these, so parsing fails.

2+++--2, however, can be parsed as an a_expr. Specifically, 2 is an a_expr, followed by a literal +, followed by ++--2, which is an m_expr.
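Both conclusions can be checked against the real parser with the built-in compile function, which raises SyntaxError for text that is not a valid expression. The helper name is_valid_expression and the "<example>" filename are made up for this sketch.

```python
def is_valid_expression(source):
    """Return True if Python's own parser accepts `source` as an expression."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True

print(is_valid_expression("2+++--2"))  # True
print(is_valid_expression("2+++*2"))   # False
print(eval("2+++--2"))                 # 4
```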

With regards to your second question about making 2***2 meaningful, I'm afraid that in Python you'd have to redefine what it actually means for a program to be valid Python. Looking at the docs I linked, you can see that every operator is explicitly defined, and for ** we have:

power ::=  primary ["**" u_expr]
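To see what the parser is actually handed in each case, the standard tokenize module can list the operator tokens the lexer produces. This is a small sketch and operator_tokens is a hypothetical helper name. The lexer takes the longest operator it recognizes, so 2***2 becomes ** followed by *, and *2 is not a u_expr, which is why the power rule above cannot match.

```python
import io
import tokenize

def operator_tokens(source):
    """Return the operator tokens the lexer finds in `source`, in order."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [tok.string for tok in tokens if tok.type == tokenize.OP]

print(operator_tokens("2+++--2"))  # ['+', '+', '+', '-', '-']
print(operator_tokens("2***2"))    # ['**', '*']
```

Python has no ++ or -- token, so each sign in 2+++--2 arrives as its own operator.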

Some languages like Haskell have a different idea of what something like 2+2 fundamentally means, and will let you define your own arbitrary operators. In such a language you could define a *** operator, but Python has no such facility without raising a PEP and fundamentally rewriting parts of Python.

If you want more details, then you'll be straying into computer science rather than programming (yes, they are different). Get yourself started by looking up topics like Regular Languages, Finite-State Automata, Context-free Languages and the Chomsky Hierarchy.

Barringer answered 2/2, 2017 at 8:43 Comment(3)
Plus one. This is an awesome effort.Parian
Thank you. It's really a very nice answer.Cabal
One should perhaps add that in Python you can define what the operators do at the object level so (for example) you can write code to make sense of displ=Vector(3.0,4.5,1.2)+Vector(2.7,-4.3,-9.1). This is probably the justification for a unary + operator since it is a no-op when applied to standard numbers.Prehistoric
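A minimal sketch of the object-level operator definitions mentioned in the comment above. The Vector class here is hypothetical and written only for illustration; __add__, __neg__ and __pos__ are the standard special methods Python calls for binary +, unary - and unary +.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Vector:
    x: float
    y: float
    z: float

    def __add__(self, other):
        # Called for `self + other`.
        if not isinstance(other, Vector):
            return NotImplemented
        return Vector(self.x + other.x, self.y + other.y, self.z + other.z)

    def __neg__(self):
        # Called for unary minus: `-self`.
        return Vector(-self.x, -self.y, -self.z)

    def __pos__(self):
        # Called for unary plus: `+self`. A no-op here, as for plain numbers.
        return self

displ = Vector(3.0, 4.5, 1.2) + Vector(2.7, -4.3, -9.1)
print(displ)
print(Vector(1, 2, 3) + +-Vector(1, 2, 3))  # Vector(x=0, y=0, z=0)
```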
