Get most significant digit in Python
Say I have list [34523, 55, 65, 2]

What is the most efficient way to get [3, 5, 6, 2], the most significant digit of each number? If possible, without converting each number with str().

Archoplasm answered 26/11, 2015 at 22:6 Comment(1)
If you only have integers, take a look at this answer. It's the C version of your question. - Oesophagus
M
17

Assuming you're only dealing with positive numbers, you can divide each number by the largest power of 10 that does not exceed it, and then take the floor of the result.

>>> from math import log10, floor
>>> lst = [34523, 55, 65, 2]
>>> [floor(x / (10**floor(log10(x)))) for x in lst]
[3, 5, 6, 2]

If you're using Python 3, where math.floor returns an int, you can use the floor division operator // instead of flooring the result:

>>> [x // (10**floor(log10(x))) for x in lst]
[3, 5, 6, 2]
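
One caveat with the log10 versions: log10 returns a float, so for large integers just below a power of ten the result may round up to the next whole number, and the exponent then comes out one too high. Below is a minimal sketch of a guard, assuming integer input. The function name is made up for illustration, and the handling of 0 and negative numbers is my own choice, not part of the answer above.

```python
from math import log10


def most_significant_digit(x):
    """Leading decimal digit of an int (hypothetical helper)."""
    x = abs(x)              # assumption: the sign is ignored
    if x == 0:
        return 0            # assumption: 0 is reported as 0
    exp = int(log10(x))     # float-based estimate of the exponent
    # Correct the estimate with exact integer arithmetic.
    while 10 ** exp > x:
        exp -= 1
    while 10 ** (exp + 1) <= x:
        exp += 1
    return x // 10 ** exp


print([most_significant_digit(x) for x in [34523, 55, 65, 2]])
```

The two while loops only run when the float estimate is off, so the common case costs one extra comparison per loop.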

However, I have no idea whether this is more efficient than just converting to a string and slicing the first character. (Note that you'll need to be a bit more sophisticated if you have to deal with numbers between 0 and 1.)

>>> [int(str(x)[0]) for x in lst]
[3, 5, 6, 2]
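
For numbers between 0 and 1 the first character of str(x) is '0', so the slice above is not enough. One possible sketch uses the standard decimal module, whose as_tuple() exposes the stored digits without leading zeros. The name leading_digit is hypothetical, and because the value goes through str(x), the result reflects Python's shortest repr of a float.

```python
from decimal import Decimal


def leading_digit(x):
    """First significant decimal digit of an int or float (hypothetical helper)."""
    # Decimal('0.05').as_tuple().digits is (5,): leading zeros are not stored,
    # and the sign is kept separately, so negatives work too.
    return Decimal(str(x)).as_tuple().digits[0]


print([leading_digit(x) for x in [34523, 55, 0.05, -0.0042]])
```

This is almost certainly slower than the plain slice, so it only makes sense when fractional inputs actually occur.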

If this is in a performance-critical piece of code, you should measure the two options and see which is faster. If it's not in a performance-critical piece of code, use whichever one is most readable to you.

Mcdaniels answered 26/11, 2015 at 22:10 Comment(2)
damn, you beat me to it - Giamo
Also consider a divide-by-ten loop, in case log and floor are expensive - Redeem
A
8

I did some timings using Python 3.6.1:

from timeit import timeit

from math import floor, log10


lst = list(range(1, 10_000_000))


# 3.6043569352230804 seconds
def most_significant_str(i):
    return int(str(i)[0])


# 3.7258850016013865 seconds
def most_significant_while_floordiv(i):
    while i >= 10:
        i //= 10
    return i


# 4.515933519736952 seconds
def most_significant_times_floordiv(i):
    n = 10
    # Use >= so that exact powers of ten (10, 100, ...) yield 1.
    while i >= n:
        n *= 10
    return i // (n//10)


# 4.661690454738387 seconds
# Note: returns a float (e.g. 3.0), because log10(i) // 1 is a float.
def most_significant_log10_floordiv(i):
    return i // (10 ** (log10(i) // 1))


# 4.961193803243334 seconds
def most_significant_int_log(i):
    return i // (10 ** int(log10(i)))


# 5.722346990002692 seconds
def most_significant_floor_log10(i):
    return i // (10 ** floor(log10(i)))


for f in (
    'most_significant_str',
    'most_significant_while_floordiv',
    'most_significant_times_floordiv',
    'most_significant_log10_floordiv',
    'most_significant_int_log',
    'most_significant_floor_log10',
):
    print(
        f,
        timeit(
            f"""
for i in lst:
    {f}(i)
            """,
            globals=globals(),
            number=1,
        ),
    )

As you can see, for numbers in range(1, 10_000_000), int(str(i)[0]) was the fastest of these methods. The closest I could get was a simple while loop:

def most_significant_while_floordiv(i):
    while i >= 10:
        i //= 10
    return i
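
Before trusting timings like these, it is worth checking that the candidates agree, especially at exact powers of ten, where off-by-one mistakes tend to hide. A small sketch comparing the two fastest variants (the short names by_str and by_floordiv are just for this example):

```python
def by_str(i):
    return int(str(i)[0])


def by_floordiv(i):
    while i >= 10:
        i //= 10
    return i


# Values just below, at, and just above each power of ten.
edge_cases = [1, 9]
for k in range(1, 8):
    edge_cases += [10**k - 1, 10**k, 10**k + 1]

mismatches = [i for i in edge_cases if by_str(i) != by_floordiv(i)]
print(mismatches)  # an empty list means the two functions agree
```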
Anastigmatic answered 2/7, 2017 at 2:39 Comment(1)
So, simplicity wins. - Spurgeon
