How can I encode an integer with base 36 in Python and then decode it again?
Have you tried Wikipedia's sample code?
def base36encode(number, alphabet='0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ'):
    """Converts an integer to a base36 string."""
    if not isinstance(number, int):
        raise TypeError('number must be an integer')
    base36 = ''
    sign = ''
    if number < 0:
        sign = '-'
        number = -number
    if 0 <= number < len(alphabet):
        return sign + alphabet[number]
    while number != 0:
        number, i = divmod(number, len(alphabet))
        base36 = alphabet[i] + base36
    return sign + base36

def base36decode(number):
    return int(number, 36)

print(base36encode(1412823931503067241))
print(base36decode('AQF8AA0006EH'))
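Note that base36decode above always calls int(number, 36), so it only round-trips strings produced with the default alphabet. Below is a minimal sketch of a decoder that takes the same alphabet argument as the encoder. The name base36decode_alphabet is hypothetical and not part of the Wikipedia sample.

```python
DEFAULT_ALPHABET = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ'

def base36decode_alphabet(text, alphabet=DEFAULT_ALPHABET):
    """Decode text that was encoded with the given alphabet (hypothetical helper)."""
    sign = 1
    if text.startswith('-'):
        sign, text = -1, text[1:]
    if not text:
        raise ValueError('nothing to decode')
    base = len(alphabet)
    # Map each symbol to its digit value once, instead of calling str.index per character.
    values = {ch: i for i, ch in enumerate(alphabet)}
    number = 0
    for ch in text:
        if ch not in values:
            raise ValueError('invalid digit: %r' % ch)
        number = number * base + values[ch]
    return sign * number

# With a reversed alphabet, 2017 encodes to 'YFY'; int() reads that as a different number.
reversed_alphabet = DEFAULT_ALPHABET[::-1]
print(base36decode_alphabet('YFY', reversed_alphabet))  # 2017
print(int('YFY', 36))                                   # 44638
```

For the default alphabet this gives the same result as int(text, 36), so int() remains the simpler choice whenever no custom alphabet is involved.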
... string and replace the alphabet value with string.digits+string.lowercase (string.ascii_lowercase in Python 3) – Aquila
base36encode and base36decode is broken, the latter will fail (possibly silently) to decode anything encoded with a custom alphabet argument – Hemimorphite
... long, which is not supported in Python3. You can simply remove the long type from the function call above or see @André C. Andersen's solution below. – Hypothesis
I wish I had read this before. Here is the answer:
def base36encode(number):
    # Python 3 has a single arbitrary-precision int type (Python 2 needed (int, long)).
    if not isinstance(number, int):
        raise TypeError('number must be an integer')
    is_negative = number < 0
    number = abs(number)
    alphabet, base36 = ['0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ', '']
    while number:
        number, i = divmod(number, 36)
        base36 = alphabet[i] + base36
    if is_negative:
        base36 = '-' + base36
    # An empty string means the input was 0.
    return base36 or alphabet[0]

def base36decode(number):
    return int(number, 36)

print(base36encode(1412823931503067241))
print(base36decode('AQF8AA0006EH'))
assert(base36decode(base36encode(-9223372036721928027)) == -9223372036721928027)
... base36 before you return it. – Bb
from numpy import base_repr
num = base_repr(num, 36)
num = int(num, 36)
Here is information about numpy.base_repr.
You can use numpy's base_repr(...) for this.
import numpy as np
num = 2017
num = np.base_repr(num, 36)
print(num) # 1K1
num = int(num, 36)
print(num) # 2017
Here is some information about numpy, int(x, base=10), and np.base_repr(number, base=2, padding=0).
(This answer was originally submitted as an edit to @christopher-beland's answer, but was rejected in favor of its own answer.)
You could use https://github.com/tonyseek/python-base36.
$ pip install base36
and then
>>> import base36
>>> assert base36.dumps(19930503) == 'bv6h3'
>>> assert base36.loads('bv6h3') == 19930503
... leftpad. Copying and pasting a trivial function that does what you want into a file is sometimes better than adding a new external dependency. – Overpowering
Terrible answer, but I was just playing around with this and thought I'd share.
import string, math

# Caveat: int(math.log(a, b)) relies on floating point, so the digit count can be
# off for some inputs, and a must be greater than 0.
int2base = lambda a, b: ''.join(
    [(string.digits +
      string.ascii_lowercase +
      string.ascii_uppercase)[(a // b ** i) % b]
     for i in range(int(math.log(a, b)), -1, -1)]
)

num = 1412823931503067241
test = int2base(num, 36)
test2 = int(test, 36)
print(test2 == num)
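The math.log call decides how many digits to emit, and floating-point rounding can make that count one too small (math.log(1000, 10), for instance, typically evaluates to slightly under 3). A sketch of the same idea using only integer arithmetic follows. The name int2base_exact and the restriction to bases 2 through 36 are assumptions made for this illustration.

```python
import string

DIGITS = string.digits + string.ascii_lowercase  # 36 symbols, enough for bases 2..36

def int2base_exact(a, b):
    """Float-free variant for non-negative a (hypothetical helper)."""
    if not 2 <= b <= len(DIGITS):
        raise ValueError('base must be between 2 and 36')
    if a < 0:
        raise ValueError('a must be non-negative')
    out = []
    while True:
        a, r = divmod(a, b)   # peel off the least significant digit
        out.append(DIGITS[r])
        if a == 0:
            break
    return ''.join(reversed(out))

num = 1412823931503067241
print(int(int2base_exact(num, 36), 36) == num)  # True
print(int2base_exact(1000, 10))                 # 1000
```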
I benchmarked the example encoders provided in answers to this question. On my Ubuntu 18.10 laptop, with Python 3.7, Jupyter, the %%timeit magic command, and the integer 4242424242424242 as the input, I got these results:
- Wikipedia's sample code: 4.87 µs ± 300 ns per loop
- @mistero's base36encode(): 3.62 µs ± 44.2 ns per loop
- @user1036542's int2base: 10 µs ± 400 ns per loop (after fixing py37 compatibility)
- @mbarkhau's int_to_base36(): 3.83 µs ± 28.8 ns per loop
All timings were mean ± std. dev. of 7 runs, 100000 loops each.
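To repeat a measurement like this outside Jupyter, the standard timeit module can be used directly. The sketch below is not the exact setup used for the numbers above: encode is a stand-in encoder written for this example, and the loop count is reduced to 10000 to keep the run short.

```python
import timeit

ALPHABET = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ'

def encode(number):
    """Stand-in encoder for non-negative ints; swap in the function you want to time."""
    out = ''
    while True:
        number, i = divmod(number, 36)
        out = ALPHABET[i] + out
        if number == 0:
            return out

LOOPS = 10000  # assumption: fewer loops than the 100000 that %%timeit picked
runs = timeit.repeat(lambda: encode(4242424242424242), repeat=7, number=LOOPS)
per_loop_us = [t / LOOPS * 1e6 for t in runs]  # seconds per run -> microseconds per call
print('best of 7: %.2f µs per loop' % min(per_loop_us))
```

Absolute numbers depend on the machine and Python version, so only the relative ordering of encoders measured in the same session is meaningful.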
Update on 2023-04-14:
I wanted to try out perfpy.com, and here are my results for https://perfpy.com/288:
If you are feeling functional
def b36_encode(i):
    if i < 0: return "-" + b36_encode(-i)
    if i < 36: return "0123456789abcdefghijklmnopqrstuvwxyz"[i]
    return b36_encode(i // 36) + b36_encode(i % 36)
test
n = -919283471029384701938478
s = "-45p3wubacgd6s0fi"
assert int(s, base=36) == n
assert b36_encode(n) == s
b36 = lambda n: "-" + b36(-n) if n < 0 else "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"[n] if n < 36 else b36(n // 36) + b36(n % 36)
then use e.g. b36(-12345) – Enwomb
This works if you only care about positive integers.
def int_to_base36(num):
    """Converts a positive integer into a base36 string."""
    assert num >= 0
    digits = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ'
    res = ''
    while not res or num > 0:
        num, i = divmod(num, 36)
        res = digits[i] + res
    return res
To convert back to int, just use int(num, 36). For a conversion of arbitrary bases see https://gist.github.com/mbarkhau/1b918cb3b4a2bdaf841c
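A short illustration of how the built-in int(num, 36) behaves when decoding: it accepts upper- and lower-case digits, a leading sign, and surrounding whitespace, and raises ValueError on anything else. The sample value 2017 ('1K1') is taken from the numpy answer in this thread.

```python
encoded = '1K1'

decoded = int(encoded, 36)
print(decoded)                       # 2017

# Letter case does not matter to int().
print(int(encoded.lower(), 36))      # 2017

# A leading sign and surrounding whitespace are accepted too.
print(int('-1k1', 36))               # -2017
print(int('  1k1\n', 36))            # 2017

# Characters outside 0-9 and a-z are rejected.
try:
    int('1K!', 36)
    rejected = False
except ValueError:
    rejected = True
print(rejected)                      # True
```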
Class that can encode and decode using an arbitrary alphabet (might be useful to someone):
class BaseAlphabet:
    alphabet = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ'

    def __init__(self, alphabet=None) -> None:
        if alphabet:
            # Custom alphabets are upper-cased, so encoded strings are upper-case too.
            self.alphabet = alphabet.upper()
        self.len = len(self.alphabet)

    def encode(self, number):
        if not isinstance(number, int):
            raise TypeError('num must be an integer')
        if number == 0:
            # The loop below emits no digits for zero, so return the zero symbol directly.
            return self.alphabet[0]
        result = []
        sign = ''
        if number < 0:
            sign = '-'
            number = -number
        while number:
            number, i = divmod(number, self.len)
            result.append(self.alphabet[i])
        result.reverse()
        return f'{sign}{"".join(result)}'

    def decode(self, value):
        sign = 1
        if value[0] == '-':
            value = value[1:]
            sign = -1
        number = 0
        # Walk the digits from least to most significant.
        for n, i in enumerate(value[::-1]):
            number = number + self.alphabet.index(i) * (self.len ** n)
        return number * sign
test:
b = BaseAlphabet('CBA')
def test(n):
    c = b.encode(n)
    print(n, c, b.decode(c))
test(100000)
test(111111)
test(-100000)
test(999999)
test(-93756210)
>> 100000 BACCACBBACB 100000
>> 111111 BABAABCACAC 111111
>> -100000 -BACCACBBACB -100000
>> 999999 BABAABCACACCC 999999
>> -93756210 -ACBBABCACAABCCCAC -93756210