Python elegant inverse function of int(string,base)
S

12

61

Python can convert a string to an integer in any base in the range [2,36] with:

int(string,base)

I am looking for an elegant inverse function that takes an integer and a base and returns a string.

For example:

>>> str_base(224,15)
'ee'

I came up with the following solution:

def digit_to_char(digit):
    if digit < 10: return chr(ord('0') + digit)
    else: return chr(ord('a') + digit - 10)

def str_base(number,base):
    if number < 0:
        return '-' + str_base(-number,base)
    else:
        (d,m) = divmod(number,base)
        if d:
            return str_base(d,base) + digit_to_char(m)
        else:
            return digit_to_char(m)

Note: digit_to_char() works for bases <= 169 arbitrarily using ASCII characters after z as digits for bases above 36.
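
To illustrate the note, here is a small check (a sketch that repeats digit_to_char so it runs on its own) of which characters appear past base 36, and of the fact that int() itself refuses such bases:

```python
def digit_to_char(digit):
    # Same mapping as in the question: 0-9, then 'a' onward in ASCII order.
    if digit < 10:
        return chr(ord('0') + digit)
    return chr(ord('a') + digit - 10)

print(digit_to_char(35))  # 'z', the last digit int() understands
print(digit_to_char(36))  # '{', the ASCII character right after 'z'

try:
    int('10', 37)
except ValueError as exc:
    # int() only accepts base 0 or 2..36, so strings in larger bases
    # cannot be parsed back with the built-in.
    print('int() rejected base 37:', exc)
```

So anything above base 36 needs a hand-written parser for the reverse direction as well.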

Is there a Python built‑in, library function, or a more elegant inverse function of int(string,base)?

Sordid answered 14/1, 2010 at 10:28 Comment(2)
If efficiency is an issue you might also want to consider getting rid of digit_to_char and replacing the digit_to_char(m) calls with digits[m], where you define digits as "012...89ab...xyz". Simpler code is easier to read and understand, and I'd be very surprised if you didn't see speed gains too.Ferrol
Sorry, I'd been working on a base-36 problem when I wrote the above! The string doesn't need to be that long for hex! But I see someone has already pointed you to the format solution for that.Ferrol
U
15

This thread has some example implementations.

Actually I think your solution looks rather nice, it's even recursive which is somehow pleasing here.

I'd still simplify it to remove the else, but that's probably a personal style thing. I think if foo: return is very clear, and doesn't need an else after it to make it clear it's a separate branch.

def digit_to_char(digit):
    if digit < 10:
        return str(digit)
    return chr(ord('a') + digit - 10)

def str_base(number,base):
    if number < 0:
        return '-' + str_base(-number, base)
    (d, m) = divmod(number, base)
    if d > 0:
        return str_base(d, base) + digit_to_char(m)
    return digit_to_char(m)

I simplified the 0-9 case in digit_to_char(), I think str() is clearer than the chr(ord()) construct. To maximize the symmetry with the >= 10 case an ord() could be factored out, but I didn't bother since it would add a line and brevity felt better. :)

Uranus answered 14/1, 2010 at 10:48 Comment(2)
That's helpful, but many of these have bugs, and it would be better if you posted one or two directly in your answer in case the other site's message gets deleted.Reactionary
"it's even recursive which is somehow pleasing here" I think recursion is pleasing... in languages that handle recursion well. In python, recursion is never pleasing.Chrestomathy
R
35

Maybe this shouldn't be an answer, but it could be helpful for some: the built-in format function does convert numbers to string in a few bases:

>>> format(255, 'b') # base 2
'11111111'
>>> format(255, 'd') # base 10
'255'
>>> format(255, 'o') # base 8
'377'
>>> format(255, 'x') # base 16
'ff'
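
A short sketch of how these format codes round-trip through int(), plus zero padding and uppercase hex, which format also supports. The values are only illustrative:

```python
n = 255

# Each format code pairs with the matching base for int().
for code, base in (('b', 2), ('o', 8), ('d', 10), ('x', 16)):
    s = format(n, code)
    assert int(s, base) == n

# A width with a leading 0 pads with zeros; 'X' gives uppercase hex.
padded = format(n, '012b')
upper = format(n, 'X')
negative = format(-255, 'x')
print(padded, upper, negative)  # 000011111111 FF -ff
```

This only covers bases 2, 8, 10 and 16; other bases still need one of the hand-written functions in this thread.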
Rabbinical answered 28/2, 2013 at 16:35 Comment(0)
L
33

If you use NumPy, there is numpy.base_repr.

You can read the code under numpy/core/numeric.py. It is short and elegant.

Lotus answered 29/5, 2014 at 5:54 Comment(5)
And here is a link to that code: github.com/numpy/numpy/blob/…Hickson
That should be the accepted and most up voted answer. You know, reinventing the wheel and all .... :-)Urbanite
numpy.base_repr is good, but could not return a fixed-length stringSire
For those interested in the reverse of this function, have a look here.Petronella
Bases greater than 36 not handled in base_reprAntebellum
R
10

The above answers are really nice. They helped me a lot in prototyping an algorithm I had to implement in C.

I'd like to offer a small change (one I used) that converts a decimal number to a base defined by a symbol space, i.e. a string of digit symbols.

I also ignored negative values, both for brevity and because handling them this way would be mathematically incorrect: modular arithmetic follows other rules, and binary, octal, or hex representations differ between unsigned and signed values.

def str_base(number, base):
   (d,m) = divmod(number,len(base))
   if d > 0:
      return str_base(d,base)+base[m]
   return base[m]

That leads to the following output:

>>> str_base(13,'01')
'1101'
>>> str_base(255,'01')
'11111111'
>>> str_base(255,'01234567')
'377'
>>> str_base(255,'0123456789')
'255'
>>> str_base(255,'0123456789abcdef')
'ff'
>>> str_base(1399871903,'_helowrd')
'hello_world'

If you want to pad with the proper zero symbol, you can use:

symbol_space = 'abcdest'

>>> str_base(734,symbol_space).rjust(0,symbol_space[0])
'catt'
>>> str_base(734,symbol_space).rjust(6,symbol_space[0])
'aacatt'
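
To check outputs like these, a sketch of the reverse direction can help. The name int_base is hypothetical (it is not part of the answer above), and str_base is repeated so the snippet runs on its own:

```python
def str_base(number, base):
    # Same function as above: 'base' is the string of digit symbols.
    d, m = divmod(number, len(base))
    if d > 0:
        return str_base(d, base) + base[m]
    return base[m]

def int_base(text, base):
    # Hypothetical inverse: accumulate digit values left to right.
    value = 0
    for ch in text:
        value = value * len(base) + base.index(ch)
    return value

print(int_base('hello_world', '_helowrd'))             # 1399871903
print(int_base(str_base(734, 'abcdest'), 'abcdest'))   # 734
```

Leading zero symbols added by rjust do not change the decoded value, so padded strings decode to the same number.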
Reeba answered 15/7, 2014 at 16:19 Comment(0)
F
4

review this.

def int2str(num, base=16, sbl=None):
    if not sbl:
        sbl = '0123456789abcdefghijklmnopqrstuvwxyz'
    if len(sbl) < 2:
        raise ValueError('size of symbols should be >= 2')
    if base < 2 or base > len(sbl):
        raise ValueError('base must be in range 2-%d' % (len(sbl)))

    neg = False
    if num < 0:
        neg = True
        num = -num

    num, rem = divmod(num, base)
    ret = ''
    while num:
        ret = sbl[rem] + ret
        num, rem = divmod(num, base)
    ret = ('-' if neg else '') + sbl[rem] + ret

    return ret
Filip answered 12/1, 2011 at 3:20 Comment(0)
M
3

digit_to_char could be implemented like this:

import string

def digit_to_char(digit):
    return (string.digits + string.ascii_lowercase)[digit]
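
One possible variation builds the lookup string once at import time instead of on every call. This is a sketch; DIGITS is a name chosen here for illustration:

```python
import string

# Built once at import time (Python 3 spelling of the lowercase constant).
DIGITS = string.digits + string.ascii_lowercase

def digit_to_char(digit):
    # Plain index lookup; valid for digits 0..35.
    return DIGITS[digit]

print(digit_to_char(14))  # e
```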
Malapert answered 14/1, 2010 at 10:49 Comment(1)
And even better to avoid reconstructing the string string.digits + string.lowercase every call.Ferrol
C
3

Looks like this might be my time to shine. Believe it or not, the following is some ported and modified Scratch code I wrote nearly three years ago to see just how quickly I could convert from denary to hexadecimal.

Simply put, it works by first taking an integer, base, and an optional accompanying string of numerals, then calculating each digit of the converted integer beginning with the least significant.

def int2base(num, base, abc="0123456789abcdefghijklmnopqrstuvwxyz"):
  if num < 0:
    return '-' + int2base(-num, base, abc)

  output = abc[num % base] # rightmost digit

  while num >= base:
    num //= base # move to next digit to the left
    output = abc[num % base] + output # this digit

  return output

On my own PC, this code was able to complete 10 million iterations using the input range, 0-9999, and base, 36, in consistently below 5 seconds. Using the same test, I have found this to be at least 4 seconds faster than any other answer so far.

>>> timeit.timeit(lambda: [int2base(n, 36) for n in range(10000)], number=1000)
4.883068453882515
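
For anyone who wants to repeat the measurement, here is a self-contained sketch of the same test with fewer repetitions. The absolute numbers depend on the machine and Python version, so treat them as rough:

```python
import timeit

def int2base(num, base, abc="0123456789abcdefghijklmnopqrstuvwxyz"):
    # Copy of the function above so the snippet runs on its own.
    if num < 0:
        return '-' + int2base(-num, base, abc)
    output = abc[num % base]
    while num >= base:
        num //= base
        output = abc[num % base] + output
    return output

# 10 repetitions instead of 1000 to keep the run short; scale up as needed.
elapsed = timeit.timeit(lambda: [int2base(n, 36) for n in range(10000)], number=10)
print('seconds for 100,000 conversions: %.3f' % elapsed)
```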
Csc answered 7/4, 2018 at 18:24 Comment(0)
T
2

I once wrote my own function with the same goal, but it now looks embarrassingly complicated.

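from math import log, ceil, floor
from collections import deque
from itertools import repeat
from functools import reduce  # reduce is no longer a built-in in Python 3
from locale import atof  # assumption: stands in for Python 2's string.atof
from string import ascii_uppercase as uppercase, digits  # Python 3 name
import re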

__alphanumerals = (digits + uppercase)

class InvalidBaseError(ValueError): pass
class FloatConvertError(ValueError): pass
class IncorrectBaseError(ValueError): pass

def getbase(number, base=2, frombase = 10):
    if not frombase == 10:
        number = getvalue(number, frombase)
        #getvalue is also a personal function to replicate int(number, base)

    if 1 >= base or base >= len(__alphanumerals) or not floor(base) == base:
        raise InvalidBaseError("Invalid value: {} entered as base to convert to. \n{}".format(
            base, "Assert that the base to convert to is a decimal integer."))

    if isinstance(number, str):
        try:
            number = atof(number)
        except ValueError:
            #The first check of whether the base is 10 would have already corrected the number
            raise IncorrectBaseError("Incorrect base passed as base of number -> number: {} base: {}".format(number, frombase))
    #^ v was supporting float numbers incase number was the return of another operation
    if number > floor(number):
        raise FloatConvertError("The number to be converted must not be a float. {}".format(number))

    isNegative = False
    if number < 0:
        isNegative = True
        number = abs(number)

    logarithm = log(number, base) if number else 0 #get around number being zero easily

    ceiling = int(logarithm) + 1

    structure = deque(repeat(0, ceiling), maxlen = ceiling)

    while number:
        if number >= (base ** int(logarithm)):
            acceptable_digit = int(number / (base ** floor(logarithm)))
            structure.append(acceptable_digit if acceptable_digit < 10 else     __alphanumerals[acceptable_digit])
            number -= acceptable_digit * (base ** floor(logarithm))
        else:
            structure.append(0)

        logarithm -= 1

    while structure[0] == 0:
        #the result needs trailing zeros
        structure.rotate(-1)

    return ("-" if isNegative and number else "") + reduce(lambda a, b: a + b, map(lambda a: str(a), structure))

I think, though, that strbase should only support bases >= 2 and <= 36, to stay consistent with other Python tools such as int. I also think only one letter case should be used, preferably uppercase, again for consistency with int, which treats both "a" and "A" as 10.

from string import ascii_uppercase as uppercase  # Python 3 name

dig_to_chr = lambda num: str(num) if num < 10 else uppercase[num - 10]

def strbase(number, base):
    if not 2 <= base <= 36:
        raise ValueError("Base to convert to must be >= 2 and <= 36")

    if number < 0:
        return "-" + strbase(-number, base)

    d, m = divmod(number, base)
    if d:
        return strbase(d, base) + dig_to_chr(m)

    return dig_to_chr(m)
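
The consistency with int argued for above can be checked with a round trip. This is a sketch using a compact Python 3 rendering of strbase, not the exact code from the answer:

```python
from string import ascii_uppercase

def strbase(number, base):
    # Compact Python 3 rendering of the function above.
    if not 2 <= base <= 36:
        raise ValueError("Base to convert to must be >= 2 and <= 36")
    if number < 0:
        return "-" + strbase(-number, base)
    d, m = divmod(number, base)
    digit = str(m) if m < 10 else ascii_uppercase[m - 10]
    return strbase(d, base) + digit if d else digit

# int() should undo strbase() for every supported base, including negatives
# and lowercase input, since int() treats "a" and "A" alike.
for base in range(2, 37):
    for n in (-1000, -1, 0, 1, 35, 36, 224, 10**12):
        s = strbase(n, base)
        assert int(s, base) == n
        assert int(s.lower(), base) == n

print(strbase(224, 15))  # EE
```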
Tuscan answered 2/10, 2016 at 16:6 Comment(0)
J
1

Here's my solution:

def int2base(a, base, numerals="0123456789abcdefghijklmnopqrstuvwxyz"):
     baseit = lambda a=a, b=base: (not a) and numerals[0] or baseit(a-a%b,b*base)+numerals[a%b%(base-1) or (a%b) and (base-1)]
     return baseit()

Explanation

In any base, every number is equal to a0 + a1*base + a2*base**2 + a3*base**3 + .... The "mission" is to find all the a's.

For every N = 0, 1, 2, ..., the code isolates aN*base**N by taking the number modulo b, where b = base**(N+1), which slices off all the terms with an index bigger than N. The terms with an index smaller than N are already gone, because every call subtracts the current aN*base**N from a.

base % (base-1) == 1, therefore base**p % (base-1) == 1, and therefore q*base**p % (base-1) == q, with only one exception: when q == base-1 the result is 0. To fix that, when the result is 0 the function checks whether the term was 0 from the beginning.
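
The modular identity in the last paragraph can be checked numerically. This is only a sanity check of the arithmetic over a few arbitrarily chosen bases, not a proof:

```python
# Check q * base**p % (base - 1) for every digit q in a few bases.
for base in (3, 10, 16, 36):
    for p in range(6):
        for q in range(base):
            remainder = q * base**p % (base - 1)
            if q == base - 1:
                assert remainder == 0   # the one exceptional digit
            else:
                assert remainder == q   # the digit is recovered directly

# Worked instance in base 10: a digit in the hundreds place.
print(700 % 9)   # 7
print(900 % 9)   # 0, the exceptional digit 9
```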


Advantages

In this sample, there is only one multiplication (instead of division) and some instances of modulus which take relatively small amounts of time.

Javelin answered 26/4, 2016 at 20:52 Comment(0)
P
1

Here is a recursive function:

def encode(nIn, nBase):
   n = nIn // nBase
   s = '0123456789abcdefghijklmnopqrstuvwxyz'[nIn % nBase]
   return encode(n, nBase) + s if n > 0 else s

n = 1577858399
s = encode(n, 36)
print(s == 'q3ezbz')
Pedestal answered 3/11, 2020 at 3:38 Comment(0)
B
1

To summarise what I could find on Stack Overflow, the shortest way is something like the following:

base = lambda integer, b, \
  numerals='0123456789abcdefghijklmnopqrstuvwxyz': \
  numerals[0] if integer == 0 \
  else \
  base(integer // b, b, numerals).lstrip(numerals[0]) \
  + numerals[integer % b]
>>> base(983779, 36)
'l337'

But you have to pay for brevity — no error checking and also recursion overflow is possible.

P.S.: In the IPython interpreter, remove the \ line breaks that were added for clarity here. These may, however, remain in place in a Python script file.
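
If the recursion overflow is a concern, a loop-based sketch avoids it at the cost of a few more lines. The name to_base is hypothetical, and this is one possible shape rather than a drop-in replacement for the lambda:

```python
def to_base(integer, b, numerals='0123456789abcdefghijklmnopqrstuvwxyz'):
    # Loop-based sketch: the digit count is not bounded by the
    # interpreter's recursion limit (the sign adds a single extra call).
    if integer < 0:
        return '-' + to_base(-integer, b, numerals)
    digits = []
    while True:
        integer, rem = divmod(integer, b)
        digits.append(numerals[rem])
        if integer == 0:
            break
    return ''.join(reversed(digits))

big = 2 ** 5000                 # 5001 binary digits
print(len(to_base(big, 2)))     # 5001
print(to_base(983779, 36))      # l337
```

An input with that many digits would need one recursive call per digit in the one-liner, which is more than CPython's default recursion limit of 1000 allows.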

Background answered 20/12, 2021 at 0:46 Comment(1)
Unfortunately, I learned that recursive programming is not a good idea to do on embedded microcontrollers: «Avoid deep recursive function calls. Individual recursive function calls don’t always add a lot of stack usage each time they are called, but if each function includes large stack-based variables then the overhead can get quite high.» — EspressifPetronella
S
0

numpy.base_repr is a quite good solution, however, in some cases, we want a fixed-length string with leading zeros.

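def base_repr(x: int, base: int, length: int):
    from numpy import base_repr
    # Pad the converted string instead of deriving the digit count from log(),
    # which is off by one for exact powers of the base and fails for x == 0.
    return base_repr(x, base).zfill(length)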
Sire answered 11/12, 2019 at 5:0 Comment(0)
