Constructing an object that's greater than any string
Asked Answered
C

3

5

In Python 3, I have a list of strings and would find it useful to be able to append a sentinel that would compare greater than all elements in the list.

Is there a straightforward way to construct such an object?

I can define a class, possibly subclassing str, but it feels like there ought to be a simpler way.

For this to be useful in simplifying my algorithm, I need to do this ahead of time, before I know what the strings contained in the list are going to be (and so it can't be a function of those strings).

Cartier answered 8/11, 2019 at 4:4 Comment(2)
Do you need a specific implementation of str.__gt__() or you want to construct a string that results to greater than any other string?Ewell
You don't need to subclass str, and object would do, that implements the comparison operators accordingly.Haughty
G
4

This is kind of a naïve answer, but when you're dealing with numbers and need a sentinel value for comparison purposes, it's not uncommon to use the largest (or smallest) number that a specific number type can hold.

Python strings are compared lexicographically, so to create a "max string", you'd simply need to create a long string of the "max char":

MAX_CHAR = chr(sys.maxunicode)

# One million is entirely arbitrary here.
# It should ideally be 1 + the length of the longest possible string that you'll compare against 
MAX_STRING = MAX_CHAR * int(1e6)

Unless there's weird corner cases that I'm not aware of, MAX_STRING should now be considered greater than any other string (other than itself); provided that it's long enough.

Gainey answered 8/11, 2019 at 4:41 Comment(2)
Why bother making it long? If that's correct at all, which it probably is because of the probably not otherwise used character, then length 1 suffices.Fructify
@KellyBundy I made it long for the odd chance that they had a string starting with \U0010ffff characters.Gainey
F
2

You can inline objects of a custom class:

from functools import total_ordering

M = total_ordering(type('maxstring', (), {'__lt__': lambda *_: False}))()

print(sorted([M, 'abc', M, 'zyx']))
# ['abc', 'zyx', <__main__.maxstring object>, <__main__.maxstring object>]

Unfortunately, if you're going to inline this object more than once, and want to make sure that multiple maxstring objects compare equal, you should fix its comparison == operator:

M1 = total_ordering(type('maxstring', (), {'__lt__': lambda *_: False}))()
M2 = total_ordering(type('maxstring', (), {'__lt__': lambda *_: False}))()

print(M1 == M1)
# True
print(M1 == M2)
# False

M1 = total_ordering(type('maxstring', (), {'__lt__': lambda *_: False, '__eq__': lambda s,o: type(o).__name__=='maxstring'}))()
M2 = total_ordering(type('maxstring', (), {'__lt__': lambda *_: False, '__eq__': lambda s,o: type(o).__name__=='maxstring'}))()

print(M1 == M2)
# True

Important note: Do not make your maxstring an instance of str, otherwise it will be treated half like a maxstring and half like an empty string, and this will mess up the comparisons:

M = total_ordering(type('maxstring', (str,), {'__lt__': lambda *_: False}))()

print(isinstance(M, str))
# True
print(sorted([M, 'abc', M, 'zyx']))
# ['', 'abc', '', 'zyx']

print((M < 'abc'), (M > 'abc'), ('abc' < M), ('abc' > M))
# False False False False
Faxan answered 26/2, 2023 at 16:47 Comment(0)
F
2

For fun, here's an hack that depends on implementation details. I build a character/string that's not possible without such a hack:

import ctypes

# From https://stackoverflow.com/a/62188431
def deref(addr, typ):
    return ctypes.cast(addr, ctypes.POINTER(typ))

# Largest allowed character
S = chr(0x10FFFF)

# Make it larger
deref(id(S), ctypes.c_int)[18] += 1

Let's take it for a spin:

import random

# Show that it's indeed 0x110000
print(hex(ord(S)))

# Test that it's larger than all others
chars = [S, *map(chr, range(0x110000))]
for _ in range(3):
    random.shuffle(chars)
    print(max(chars) is S)

# Show that it's normally impossible
chr(ord(S))

Output (try it):

0x110000
True
True
True
Traceback (most recent call last):
  File "/ATO/code", line 18, in <module>
    chr(ord(S))
ValueError: chr() arg not in range(0x110000)
Fructify answered 26/2, 2023 at 17:1 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.