Is there a Python equivalent for C++ "multiset<int>"?
Asked Answered
M

6

41

I am porting some C++ code to Python and one of the data structures is a multiset, but I am not sure how to model this in Python.

Let ms be the C++ multiset<int>

How ms is used (posting some examples)

multiset<int>::iterator it = ms.find(x)
ms.erase(it)

ms.insert(x)
ms.end()
ms.lower_bound(x)
ms.clear()
Merissa answered 27/6, 2013 at 15:11 Comment(16)
You may check Counter class in python: docs.python.org/2/library/collections.html#collections.CounterGastrotrich
Are there equivalents for the functions/methods I've listed?Merissa
Counter is not an equivalent to std::multiset.Embolism
Looking through it -- I don't think this has what I need unfortunatelyMerissa
You don't need exact equivalents for specific functions. 'end' is a C++ way of doing things - it will differ in python. The C++ is using find, I guess on the ordered set for speed.Choplogic
Is there a difference between a multiset and a Python list or set (or dict)? I am just trying to find a way to emulate this.Merissa
Do you need log-time existence check? If not, how about going just with python's list()?Disconnect
Well, multiset is ordered, and has logarithmic lookup.Embolism
@Merissa Well yes, please study the documentation. Time complexity aside, python's set is what is says - no duplicities allowed.Disconnect
So is a multiset just an ordered list with a built-in logarithmic lookup? i.e. a Python list where I can use the bisect module or a binary search function?Merissa
I did Google it and didn't find what I was looking for. If you are just going to respond with "snark," please do not reply in this thread.Merissa
The C++ multiset seems to have a richer interface than the Python Counter. Additionally, the C++ multiset is ordered, so methods like lower_bound don't have any meaning in Python's Counter. Basically, they their use cases overlap somewhat, but they are not the same thing.Dialyser
@Merissa No I am not used to do that, sorry if you took it this way. It's just that when you google "c++ multiset", the first two results give you just what you were wondering about in your previous post ("So is a multiset just an ordered list with a built-in logarithmic lookup?")Disconnect
A sorted list data type fits your criteria. Python has a couple implementations of those on the Python Package Index (PyPI).Nereidanereids
@Gastrotrich You should post this as an answer (with a Python 3 link).Statist
Boost Python can help address your challenges of porting without a Python substitute for multiset in C++. boost.org/doc/libs/1_81_0/libs/python/doc/html/index.htmlRelish
D
10

There isn't. See Python's standard library - is there a module for balanced binary tree? for a general discussion of the equivalents of C++ tree containers (map, set, multimap, multiset) in Python.

The closest I can think of is to use a dictionary mapping integers to counts (also integers). However this doesn't get you the keys in order, so you can't search using lower_bound. An alternative is a to use an ordered list, as suggested by others already, maybe a list of (integer, count) tuples? If you only need to search after you've done all your insertions, you could use the dictionary as a temporary structure for construction, build the list after you've done all the insertions, then use the list for searching.

Deutschland answered 27/6, 2013 at 15:39 Comment(2)
Yes, the code does a lot of insertions first, and then searches/removes various items afterwards. Both processes use a lot of lower_bound calls.Merissa
@Merissa Do you know why there are lower_bound calls during the insertion process? Re lower_bound calls during the removal process, if you had a list of (integer, count) tuples, you could remove items by decrementing the count, and if a count reached zero, you could leave the item there and clean up sometime later. (Insertions of completely new integers would be more of a problem!)Deutschland
N
9

There are a couple implementations of sorted list data types which would fit your criteria. Two popular choices are SortedContainers and blist modules. Each of these modules provides a SortedList data type which automatically maintains the elements in sorted order and would allow for fast insertion and lower/upper bound lookups. There's a performance comparison that's helpful too.

The equivalent code using the SortedList type from the SortedContainers module would be:

from sortedcontainers import SortedList
sl = SortedList()

# Start index of `x` values
start = sl.bisect_left(x)

# End index of `x` values
end = sl.bisect_right(x)

# Iterator for those values
iter(sl[start:end])

# Erase an element
del sl[start:end]

# Insert an element
sl.add(x)

# Iterate from lower bound
start = sl.bisect_left(x)
iter(sl[x] for x in range(start, len(sl)))

# Clear elements
sl.clear()

All of those operations should work efficiently on a sorted list data type.

Nereidanereids answered 28/4, 2014 at 18:53 Comment(1)
Can SortedList from sortedcontainers can perform set operations like union , difference and intersection ?Avisavitaminosis
C
4

You can keep a list ordered using the bisect functions. For example find will become

def index(a, x):
    'Locate the leftmost value exactly equal to x'
    i = bisect_left(a, x)
    if i != len(a) and a[i] == x:
        return i
    raise ValueError

You will find other equivalents in the docs. Instead of checking against end you will now get a ValueError

Choplogic answered 27/6, 2013 at 15:30 Comment(0)
B
4

There are a couple of data-structures which come close.

  • python collections:

    • Ordered dict: dict subclass that remembers the order entries were added. link
    • Counter: dict subclass for counting hashable objects. link
  • provided by django framework:

    • A dict with multiple keys with same value: link
    • Sorted dict which is deprecated as python collection includes an ordered dict now: link
Benitobenjamen answered 10/1, 2015 at 20:50 Comment(0)
V
-1

If you don't need sorting, you can use this as a multiset<int> (or unordered_multiset<int>):

from collections import Counter

def multiset(array):
    return set(Counter(array).items())
Victualler answered 11/8, 2021 at 13:23 Comment(1)
You would just use Counter directly. Converting the Counter's items() to a set doesn't make any sense. But in any case this would be an equivalent to unorderered_multiset, not to multiset.Terefah
D
-2

Yes, There is a model like cpp u can:-> You can use https://pypi.org/project/sortedmultiset/ same as the multiset like cpp
User Guide for this https://github.com/rishiko/sortedmultiset/blob/main/User%20Guide

$ pip install orderdmultiset
import sortedmultiset
mset=sortedmultiset.orderdmultiset()--->To create an object of sorted multiset
mset.append(5)#-----> add an element to the multiset
mset.append(4)
print(mset.multiset)#-------->To print the multiset #output [4,5]
mset.erase(4) #---------> True element present and erased #output True
print(mset.multiset)#-------->To print the multiset #output [5]
mset.erase(56)-----------------> False as element not present in multiset #output False.
mset.append(6)
mset.append(3)
print(mset.multiset)#-------->To print the multiset #output [3,5,6]
chq=mset.search(5)#----------> will give a memory address as element present #output 
                              <sortedmultiset.sortedmultiset.TreeNode object at 0x000002C6C69E44F0>
chq=mset.search(678) #-----------> output will be None as element is not present

It is an just for a integer numbers i am working on string

Dingy answered 10/2, 2024 at 16:43 Comment(1)
SO is not the place to promote your personal projects.Terefah

© 2022 - 2025 — McMap. All rights reserved.