multiset

This package provides a multiset implementation for Python.

A multiset is similar to the builtin set, but it allows an element to occur multiple times. It is an unordered collection of element which have to be hashable just like in a set. It supports the same methods and operations as set does, e.g. membership test, union, intersection, and (symmetric) difference. Multisets can be used in combination with regular sets for those operations.

The implementation is based on a dict that maps the elements to their multiplicity in the multiset. Hence, some dictionary operations are supported.

In contrast to the collections.Counter from the standard library, it has proper support for set operations and only allows positive counts. Also, elements with a zero multiplicity are automatically removed from the multiset.

There is currently no immutable version of the multiset, because there is no immutable version of dict.

The package also uses the typing module for type hints (see PEP 484) so you can specify the type of a multiset like Multiset[ElementType]. Furthermore, the Multiset class also inherits from collections.abc.MutableSet so that it can be used anywhere a set can be used.

API Overview

The following is an overview over the methods of the Multiset class. For more details on each method and some examples see the autogenerated API Documentation.

class Multiset([mapping or iterable])

Return a new multiset object whose elements are taken from the given optional iterable or mapping. If a mapping is given, it must have positive int values representing each element’s multiplicity. If no iterable or mapping is specified, a new empty multiset is returned.

The elements of a set must be hashable. In contrast to regular sets, duplicate elements will not be removed from the multiset.

The Multiset class provides the following operations which the builtin set also supports:

len(s)

Return the total number of elements in multiset s (cardinality of s).

Note that this is the sum of the multiplicities and not not the number of distinct elements. Since the multiset is essentially a dict, the following can be used to get the number of distinct elements:

>>> len(Multiset('aab').keys())
2
x in s

Test x for membership in s.

x not in s

Test x for non-membership in s.

isdisjoint(other)

Return True if the multiset has no elements in common with other.

issubset(other)
multiset <= other

Test whether every element in the multiset is in other and each element’s multiplicity in the multiset is less than or equal to its multiplicity in other.

multiset < other

Test whether the multiset is a proper subset of other, that is, multiset <= other and multiset != other.

issuperset(other)
multiset >= other

Test whether every element in other is in the multiset and each element’s multiplicity in other is less than or equal to its multiplicity in the multiset.

multiset > other

Test whether the multiset is a proper superset of other, that is, multiset >= other and multiset != other.

union(*others)
multiset | other | ...

Return a new multiset with elements from the multiset and all others. The maximal multiplicity over all sets is used for each element.

combine(*others)
multiset + other + ...

Return a new multiset with elements from the multiset and all others. Each element’s multiplicities are summed up for the new set.

intersection(*others)
multiset & other & ...

Return a new multiset with elements common to the multiset and all others. The minimal multiplicity over all sets is used for each element.

difference(*others)
multiset - other - ...

Return a new multiset with elements in the multiset that are not in the others. This will subtract all the others’ multiplicities and remove the element from the multiset if its multiplicity reaches zero.

symmetric_difference(other)
multiset ^ other

Return a new multiset with elements from either the multiset or other where their multiplicity is the absolute difference of the two multiplicities.

copy()
Multiset(multiset)

Return a new shallow copy of the multiset.

union_update(*others)
multiset |= other | ...

Update the multiset, adding elements from all others. The maximal multiplicity over all sets is used for each element. For a version of this method that works more closely to dict.update(), see update().

intersection_update(*others)
multiset &= other & ...

Update the multiset, keeping only elements found in it and all others. The minimal multiplicity over all sets is used for each element.

difference_update(*others)
multiset -= other | ...

Update the multiset, removing elements found in others. This will subtract all the others’ multiplicities and remove the element from the multiset if its multiplicity reaches zero.

symmetric_difference_update(other)
multiset ^= other

Update the multiset, keeping only elements found in either the multiset or other but not both. The new multiplicity is the absolute difference of the two original multiplicities.

add(elem, multiplicity=1)
s[elem] += multiplicity
s[elem] = multiplicity

Add element elem to the multiset s. If the optional multiplicity is specified, more than one element can be added at the same time by adding to its.

Note that adding the same element multiple times will increase its multiplicity and thus change the multiset, whereas for a regular set this would have no effect.

You can also set the element’s multiplicity directly via key assignment.

remove(elem, multiplicity=None)
del s[elem]

Remove all elements elem from the multiset. Raises KeyError if elem is not contained in the set.

If the optional multiplicity is specified, only the given number is subtracted from the element’s multiplicity. This might still completely remove the element from the multiset depending on its original multiplicity.

This method returns the original multiplicity of the element before it was removed.

You can also delete the element directly via key access.

discard(elem, multiplicity=None)
s[elem] -= multiplicity
s[elem] = 0

Remove element elem from the set if it is present. If the optional multiplicity is specified, only the given number is subtracted from the element’s multiplicity. This might still completely remove the element from the multiset depending on its original multiplicity.

This method returns the original multiplicity of the element before it was removed.

You can also set the element’s multiplicity directly via key assignment.

clear()

Remove all elements from the multiset.

Note, the non-operator versions of union(), intersection(), difference(), and symmetric_difference(), issubset(), and issuperset(), update(), intersection_update(), difference_update(), and symmetric_difference_update() methods will accept any iterable as an argument. In contrast, their operator based counterparts require their arguments to be sets. This precludes error-prone constructions like Multiset('abc') & 'cbs' in favor of the more readable Multiset('abc').intersection('cbs').

The Multiset supports set to set comparisons. Two multisets are equal if and only if every element of each multiset is contained in the other and each element’s multiplicity is the same in both multisets (each is a subset of the other). A multiset is less than another set if and only if it is a proper subset of the second set (is a subset, but is not equal). A multiset is greater than another set if and only if it is a proper superset of the second set (is a superset, but is not equal). These comparisons work with both sets and multisets:

>>> Multiset('ab') == {'a', 'b'}
True

Multiset elements, like set elements and dictionary keys, must be hashable.

Binary operations that mix set or frozenset instances with Multiset instances will always return a Multiset.

Since the Multiset is also a dict, it inherits some of its methods:

s[elem]

Return the multiplicity of elem in s. Returns 0 for elements that are not in the multiset.

s[elem] = value

Set the multiplicity of elem to value. Setting the multiplicity to 0 removes the element.

del s[elem]

See remove().

iter(s)

Return an iterator over the elements in the multiset.

In contrast to both the dict and set implementations, this will repeat elements whose multiplicity is greater than 1:

>>> sorted(Multiset('aab'))
['a', 'a', 'b']

To only get distinct elements, use the keys() method.

classmethod fromkeys(elements, multiplicity)

Create a new multiset with elements from elements and all multiplicities set to multiplicity.

get(elem, default)

Return the multiplicity for elem if it is in the multiset, else default.

items()

Return a new view of the multiset’s items ((elem, multiplicity) pairs). See the documentation of dict view objects.

Note that this view is unordered.

keys()

Return a new view of the multiset’s distinct elements. See the documentation of dict view objects.

Note that this view is unordered.

update(*others)
multiset += other + ...

Update the multiset, adding elements from all others. Each element’s multiplicities is summed up for the new multiset. For a version of this method that works more closely to the original set.update(), see union_update().

pop(elem, default)

If elem is in the multiset, remove it and return its multiplicity, else return default.

popitem()

Remove and return an arbitrary (elem, multiplicity) pair from the multiset.

popitem() is useful to destructively iterate over a multiset. If the multiset is empty, calling popitem() raises a KeyError.

setdefault(elem, default)

If elem is in the multiset, return its multiplicity. If not, insert elem with a multiplicity of default and return default.

values()

Return a new view of the multiset’s multiplicities. See the documentation of dict view objects.

Note that this view is unordered.

The multiset also adds some new methods for multiplying a multiset with an int factor:

times(factor)
multiset * factor

Return a copy of the multiset where each multiplicity is multiplied by factor.

times_update(factor)
multiset *= factor

Update the multiset, multiplying each multiplicity with factor.