Python Match Case (Switch) Performance
Asked Answered
C

4

12

I was expecting the Python match/case to have equal time access to each case, but seems like I was wrong. Any good explanation why?

Lets use the following example:

def match_case(decimal):
    match decimal:
        case '0':
            return "000"
        case '1':
            return "001"
        case '2':
            return "010"
        case '3':
            return "011"
        case '4':
            return "100"
        case '5':
            return "101"
        case '6':
            return "110"
        case '7':
            return "111"
        case _:
            return "NA"

And define a quick tool to measure the time:

import time
def measure_time(funcion):
    def measured_function(*args, **kwargs):
        init = time.time()
        c = funcion(*args, **kwargs)
        print(f"Input: {args[1]} Time: {time.time() - init}")
        return c
    return measured_function

@measure_time
def repeat(function, input):
    return [function(input) for i in range(10000000)]

If we run each 10000000 times each case, the times are the following:

for i in range(8):
    repeat(match_case, str(i))

# Input: 0 Time: 2.458001136779785
# Input: 1 Time: 2.36093807220459
# Input: 2 Time: 2.6832823753356934
# Input: 3 Time: 2.9995620250701904
# Input: 4 Time: 3.5054492950439453
# Input: 5 Time: 3.815168857574463
# Input: 6 Time: 4.164452791213989
# Input: 7 Time: 4.857251167297363

Just wondering why the access times are different. Isn't this optimised with perhaps a lookup table?. Note that I'm not interested in other ways of having equals access times (i.e. with dictionaries).

Cornelia answered 21/7, 2021 at 21:19 Comment(4)
@Jab It's Python 3.10.Alewife
Your expectation is wrong. This isn't a switch case on primitive data types. It's for pattern matching. It is *not a switchTaxonomy
There is room for optimization (with all literal patterns, I suspect a lookup table could be built), but at this time it does not appear to be optimized.Kuhn
Optimization of this using hash/indexing would require that the compiler know in advance the data types of every case value (i.e would only work with constants). That would be a very narrow special case. I doubt that it would warrant the extra effort given that the objective is pattern matching, not statement dispatch.Bresee
A
8

The match/case statement introduced in PEP 622 offers a more elegant and readable approach to pattern matching compared to traditional if-elif-else chains. Its primary advantage lies in its ability to streamline complex conditional logic.

Traditional approach

def is_tuple(node):
    if isinstance(node, Node) and node.children == [LParen(), RParen()]:
        return True
    return (isinstance(node, Node)
            and len(node.children) == 3
            and isinstance(node.children[0], Leaf)
            and isinstance(node.children[1], Node)
            and isinstance(node.children[2], Leaf)
            and node.children[0].value == "("
            and node.children[2].value == ")")

Using match/case

def is_tuple(node: Node) -> bool:
    match node:
        case Node(children=[LParen(), RParen()]):
            return True
        case Node(children=[Leaf(value="("), Node(), Leaf(value=")")]):
            return True
        case _:
            return False

While it may be equivalent to a dict lookup in the most primitive cases, in general it is not so. Case patterns are designed to look like normal python code but actually they conceal isinstance and len calls and don't execute what you'd expect to be executed when you see code like Node().

Essentially this is equivalent to a chain of if ... elif ... else statements. Note that unlike for the previously proposed switch statement, the pre-computed dispatch dictionary semantics does not apply here.

Anticyclone answered 21/7, 2021 at 23:25 Comment(0)
C
13

I come late to the party, but I feel like I can add something useful to this thread.
A while back, I published an article on Medium about Python's match-case performance, analyzing the generated byte code and performing benchmarks and comparisons. If you want, you can read the article here. I think it's worth your time.
However, I'm going to summarize it here.

Many programming languages implement their switch-case statements as if they were an if-else sequence. Sometimes the switch-case can be optimized into an efficient lookup table, but that's not always the case.

In Python, this optimization is never performed, thus always resulting in a series of condition checks.

From the article, a speed comparison between if-else and match-case:

Average time for match_case:  0.00424 seconds
Average time for if_else:     0.00413 seconds

As you can see, they are almost equal.
Plus, if you disassembled the two statements, you would find that they generate nearly identical byte code.

Just a difference to point out, if-else checks call the objects' __eq__() method while match-case does the comparisons internally.

Lastly, here's a benchmark that also includes a hash table (dictionary):

Average time for match_case:  0.00438 seconds
Average time for if_else:     0.00433 seconds
Average time for dict_lookup: 0.00083 seconds

So, if you're keen on performance, you should use hash tables. Although match-case is similar to a C-style switch-case, it's not: match-case is meant to be used for structural pattern matching, and not for replacing performant hash and lookup tables.

Complexioned answered 24/8, 2022 at 5:38 Comment(0)
A
8

The match/case statement introduced in PEP 622 offers a more elegant and readable approach to pattern matching compared to traditional if-elif-else chains. Its primary advantage lies in its ability to streamline complex conditional logic.

Traditional approach

def is_tuple(node):
    if isinstance(node, Node) and node.children == [LParen(), RParen()]:
        return True
    return (isinstance(node, Node)
            and len(node.children) == 3
            and isinstance(node.children[0], Leaf)
            and isinstance(node.children[1], Node)
            and isinstance(node.children[2], Leaf)
            and node.children[0].value == "("
            and node.children[2].value == ")")

Using match/case

def is_tuple(node: Node) -> bool:
    match node:
        case Node(children=[LParen(), RParen()]):
            return True
        case Node(children=[Leaf(value="("), Node(), Leaf(value=")")]):
            return True
        case _:
            return False

While it may be equivalent to a dict lookup in the most primitive cases, in general it is not so. Case patterns are designed to look like normal python code but actually they conceal isinstance and len calls and don't execute what you'd expect to be executed when you see code like Node().

Essentially this is equivalent to a chain of if ... elif ... else statements. Note that unlike for the previously proposed switch statement, the pre-computed dispatch dictionary semantics does not apply here.

Anticyclone answered 21/7, 2021 at 23:25 Comment(0)
T
3

I tried to replicate your experiment with another function call match_if :

def match_if(decimal):
    if decimal == '0':
        return "000"
    elif decimal == '1':
        return "001"
    elif decimal == '2':
        return "010"
    elif decimal == '3':
        return "011"
    elif decimal == '4':
        return "100"
    elif decimal == '5':
        return "101"
    elif decimal == '6':
        return "110"
    elif decimal == '7':
        return "111"
    else:
        return "NA"

It appears that if we use the if, elif, else statement is less efficient that the match / case method. Here my results :

for i in range(8):
    repeat(match_if, str(i))


Input: 0 Time: 1.6081502437591553
Input: 1 Time: 1.7993037700653076
Input: 2 Time: 2.094271659851074
Input: 3 Time: 2.3727521896362305
Input: 4 Time: 2.6943907737731934
Input: 5 Time: 2.922682285308838
Input: 6 Time: 3.3238701820373535
Input: 7 Time: 3.569467782974243

Results match / case :

for i in range(8):
    repeat(match_case, str(i))

 Input: 0 Time: 1.4507110118865967
 Input: 1 Time: 1.745032787322998
 Input: 2 Time: 1.988663911819458
 Input: 3 Time: 2.2570419311523438
 Input: 4 Time: 2.54061222076416
 Input: 5 Time: 2.7649216651916504
 Input: 6 Time: 3.1373682022094727
 Input: 7 Time: 3.3378067016601562

I don't have a precise answer about why these results, but this experiment show that if we use if statement is little bit longer than the match case.

Tiana answered 21/7, 2021 at 21:38 Comment(1)
It's interesting to note that the two functions generate very similar byte code.'Kuhn
H
1

Since python 3.10 it is possible to use match... case with default option (so called wildcard) denoted with case _

def http_error(status):
    match status:
        case 400:
            return "Bad request"
        case 404:
            return "Not found"
        case 418:
            return "I'm a teapot"
        case _:
            return "Something's wrong with the internet"

docs: https://docs.python.org/3/whatsnew/3.10.html#simple-pattern-match-to-a-literal

Performance

if you are concerned about PERFORMANCE of match... case vs if.. elif... else they are THE SAME.

As you can see, Python’s (andilabs: looking at assembly instructions) match-case statement is just a series of comparisons under the hood, exactly like the if-else method. This is why when we benchmarked the two approaches they performed equally.

source: https://betterprogramming.pub/pythons-match-case-is-too-slow-if-you-don-t-understand-it-8e8d0cf927d

what about approach with dictionary?

side note about using old pythonic approach to switch with dictionary - it is slower, so with python 3.10+ and match... case it might become really old-fashion approach.

Hymnist answered 23/11, 2022 at 10:48 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.