Python Data Model, Type Protocols, & Magic Methods
Is there a "mind map", UML diagram, graphic, or some solid reference for the different Python types and the magic methods they must implement?

I'm using Python 3.8. The data model docs and built-in types docs are very terse, and it would help me to see a high-level overview of the different protocols that exist for different types in Python. (I've "learned" (read?) that, e.g., to implement an "immutable-like" object, you must adhere to the Immutable Protocol in Python, which means you must implement __len__ and __getitem__. To make it "mutable-like", you must further add __setitem__ and __delitem__.)

I don't see (as far as I've searched) the word "protocol" being used in the Python docs, and the closest good description seems to come from the collections abstract base classes module. However, as a newcomer, I'm not sure if collections.abc is something else entirely, or if the information provided there applies to the Python builtin types (i.e. list, tuple, dict, etc.), particularly because the collections docs state that the module "provid[es] alternatives to Python’s general purpose built-in containers, dict, list, set, and tuple", and also because I wouldn't think to go to the collections module if I wanted to learn about the default builtins of the Python language itself.

Does anyone have a solid resource? It would be helpful if the information were in a kind of inheritance relationship format so that info doesn't get repeated. For instance, in my example above, rather than say that a mutable-like object implements __len__, __getitem__, __setitem__, and __delitem__, it would be easier for my puny brain to hold if it were "chunked" and the info was something like "a mutable-like object inherits from immutable-like and adds __setitem__ and __delitem__".

Are there any thoughts? I think this is a huge source of confusion for newcomers and many errors result from a misunderstanding of the data types in python.

Pachton answered 30/3, 2021 at 15:51 Comment(0)

Python Data Model

Drawing heavily on the glossary, this is the best I could do for now. I hope this can be expanded by answers from others.

Definitions

  • Mutable: an object whose value can change after instantiation
  • Immutable: an object whose value cannot change after instantiation
  • Container: an object containing references to other objects
  • Sequence: a finite ordered set indexed by non-negative numbers
  • Mapping: a finite set of objects indexed by arbitrary index sets
  • Callable: types to which the function call operation can be applied
  • Number (numbers): an immutable value supporting numeric operations
  • Range: an immutable sequence of numbers
  • String: an immutable sequence of characters accessed by index
  • Bytes: an immutable sequence of integers in the range 0 to 255
  • List ([]): a mutable sequence of object references accessed by index
  • Dictionary ({key: value}): a mutable mapping of object references accessed by key
  • Tuple (()): an immutable sequence of object references accessed by index
  • Set ({a, b} or set(); note that {} is an empty dictionary): a mutable unordered collection of unique and hashable objects
  • Frozen Set: an immutable and hashable collection of unique and hashable objects
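The relationship between these definitions and the built-in types can be checked against the abstract base classes in collections.abc, which the built-ins are registered with. The following is a minimal sketch; the `checks` dictionary and its descriptions are only illustrative names, not part of any API.

```python
from collections.abc import (
    Hashable,
    Mapping,
    MutableMapping,
    MutableSequence,
    MutableSet,
    Sequence,
    Set,
)

# The ABCs form an inheritance hierarchy: each mutable variant
# extends its read-only counterpart.
assert issubclass(MutableSequence, Sequence)
assert issubclass(MutableMapping, Mapping)
assert issubclass(MutableSet, Set)

# Built-in types are registered against these ABCs, so issubclass()
# reports which "protocol" each one follows.
checks = {
    "list is a MutableSequence": issubclass(list, MutableSequence),
    "tuple is a Sequence": issubclass(tuple, Sequence),
    "tuple is a MutableSequence": issubclass(tuple, MutableSequence),
    "str is a Sequence": issubclass(str, Sequence),
    "range is a Sequence": issubclass(range, Sequence),
    "dict is a MutableMapping": issubclass(dict, MutableMapping),
    "set is a MutableSet": issubclass(set, MutableSet),
    "frozenset is a Set": issubclass(frozenset, Set),
    "frozenset is a MutableSet": issubclass(frozenset, MutableSet),
    "frozenset is Hashable": issubclass(frozenset, Hashable),
    "list is Hashable": issubclass(list, Hashable),
}

for description, result in checks.items():
    print(f"{description}: {result}")
```

The table of abstract methods and mixin methods in the collections.abc documentation is probably the closest thing to the "inheritance relationship format" asked about in the question.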

Types

  • Orderable/Comparable: implements __eq__() and the ordering methods __lt__(), __le__(), __gt__(), and __ge__(). For containment checking (i.e. use with in), implements __contains__() or is an Iterable
  • Iterable: an object capable of returning its members one at a time; implements __iter__() (or __getitem__() with sequence semantics)
  • Iterator: implements __iter__() (returns the iterator object itself) and __next__() (returns the next item; raises StopIteration when exhausted)
  • Generator: an iterator created by calling a function that contains yield; __iter__() and __next__() are provided automatically, and StopIteration is raised when the function returns
  • Immutable Sequences: implement __hash__(), which makes them hashable (a tuple is hashable only if all of its items are hashable)
  • Context Manageable: implements __enter__() and __exit__() (to be used in a with statement)
  • Descriptor: implements __get__(), __set__(), and/or __delete__()
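As a rough illustration of the iterator, generator, and context manager entries above, here is a small sketch. The names `Countdown`, `countdown`, and `Tracker` are hypothetical and exist only for this example.

```python
class Countdown:
    """Iterator that counts down from `start` to 1 (hypothetical example)."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself from __iter__().
        return self

    def __next__(self):
        if self.current <= 0:
            # Signal that the iterator is exhausted.
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


def countdown(start):
    """Generator function: calling it returns a generator (also an iterator)."""
    while start > 0:
        yield start
        start -= 1


class Tracker:
    """Context manager that records enter/exit events (hypothetical example)."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("enter")
        return self  # bound to the name after `as`

    def __exit__(self, exc_type, exc_value, traceback):
        self.events.append("exit")
        return False  # do not suppress exceptions raised in the body


from_class = list(Countdown(3))
from_generator = list(countdown(3))

with Tracker() as tracker:
    tracker.events.append("body")

print(from_class)       # [3, 2, 1]
print(from_generator)   # [3, 2, 1]
print(tracker.events)   # ['enter', 'body', 'exit']
```

The class and the generator function produce the same values; the generator simply lets Python supply __iter__() and __next__() for you.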

Overloaded Operations

  • Sequences:
    • + is concatenation
    • * is repetition [2]

[2] NOTE: Repeated items in the sequence are not copied, but referenced multiple times. To make unique copies, use a list comprehension or generator expression.
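The aliasing described in the note is easiest to see with a nested list. This is a minimal sketch; the variable names are only illustrative.

```python
# Repetition copies the reference, not the inner list:
# all three slots point at the same object.
shared = [[]] * 3
shared[0].append("x")
print(shared)  # [['x'], ['x'], ['x']]

# A list comprehension evaluates [] once per iteration,
# so each slot gets its own list.
independent = [[] for _ in range(3)]
independent[0].append("x")
print(independent)  # [['x'], [], []]
```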

Additional Comments

  • The built-in types have lowercase names (e.g. list, tuple, str, etc.)
  • All Sequences are Iterables
  • An immutable object cannot be modified in place; an operation that appears to change it creates a new object instead
  • Sets and Dictionaries also support comprehensions
  • The operators |, &, ^, and - can be used on Sets to implement set-theoretic operations (i.e. Union, Intersection, Symmetric Difference, and Difference)
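A short sketch of the set operators, the set and dictionary comprehensions, and the "new object" behavior of immutables mentioned above. All variable names are illustrative.

```python
a = {1, 2, 3}
b = {3, 4}

union = a | b                 # {1, 2, 3, 4}
intersection = a & b          # {3}
difference = a - b            # {1, 2}
symmetric_difference = a ^ b  # {1, 2, 4}

# Set and dictionary comprehensions use braces.
squares = {n * n for n in range(4)}       # {0, 1, 4, 9}
square_of = {n: n * n for n in range(4)}  # {0: 0, 1: 1, 2: 4, 3: 9}

# "Changing" an immutable tuple builds a new object; the original is untouched.
original = (1, 2)
extended = original + (3,)
print(original, extended)  # (1, 2) (1, 2, 3)
```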

Conceptual Groupings of Sequence Types

  1. Container/Flat Sequences

    • Container Sequences: can hold items of different types (list, tuple, collections.deque)
    • Flat Sequences: can only hold items of one type (str, bytes, bytearray, memoryview, array.array)
  2. Mutable/Immutable Sequences

    • Mutable Sequences: list, bytearray, array.array, collections.deque, memoryview
    • Immutable Sequences: tuple, str, bytes
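The two groupings can be probed directly. This is a small sketch under the assumption that raising TypeError is a fair proxy for "not allowed"; the variable names are illustrative.

```python
from array import array

# Container sequence: items may be of different types.
mixed = [1, "two", 3.0]

# Flat sequence: array("i") stores signed integers only.
numbers = array("i", [1, 2, 3])
flat_error = None
try:
    numbers.append("four")
except TypeError as exc:
    flat_error = type(exc).__name__

# Immutable sequence: item assignment is rejected.
point = (1, 2)
immutable_error = None
try:
    point[0] = 10
except TypeError as exc:
    immutable_error = type(exc).__name__

# Mutable flat sequence: item assignment works in place.
buffer = bytearray(b"abc")
buffer[0] = ord("A")

print(flat_error)       # TypeError
print(immutable_error)  # TypeError
print(buffer)           # bytearray(b'Abc')
```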

Another good reference is Fluent Python, by Luciano Ramalho.

Pachton answered 3/4, 2021 at 19:39 Comment(0)
