multiset¶
This package provides a multiset implementation for Python.
A multiset is similar to the builtin set
, but it allows an element to occur multiple times.
It is an unordered collection of element which have to be hashable just like in a set
.
It supports the same methods and operations as set
does, e.g. membership test, union
, intersection
, and
(symmetric
) difference
. Multisets can be used in combination with regular sets for those operations.
The implementation is based on a dict
that maps the elements to their multiplicity in the multiset.
Hence, some dictionary operations are supported.
In contrast to the collections.Counter
from the standard library, it has proper support for set
operations and only allows positive counts. Also, elements with a zero multiplicity are automatically
removed from the multiset.
There is currently no immutable version of the multiset, because there is no immutable version of dict
.
The package also uses the typing
module for type hints (see PEP 484) so you can specify the type of a multiset like
Multiset[ElementType]
. Furthermore, the Multiset
class also inherits from collections.abc.MutableSet
so that
it can be used anywhere a set
can be used.
API Overview¶
The following is an overview over the methods of the Multiset
class. For more details on each method and some examples see
the autogenerated API Documentation.
-
class
Multiset
([mapping or iterable])¶ Return a new multiset object whose elements are taken from the given optional iterable or mapping. If a mapping is given, it must have positive int values representing each element’s multiplicity. If no iterable or mapping is specified, a new empty multiset is returned.
The elements of a set must be hashable. In contrast to regular sets, duplicate elements will not be removed from the multiset.
The
Multiset
class provides the following operations which the builtinset
also supports:-
len(s)
Return the total number of elements in multiset s (cardinality of s).
Note that this is the sum of the multiplicities and not not the number of distinct elements. Since the multiset is essentially a dict, the following can be used to get the number of distinct elements:
>>> len(Multiset('aab').keys()) 2
-
x in s
Test x for membership in s.
-
x not in s
Test x for non-membership in s.
-
isdisjoint
(other)¶ Return
True
if the multiset has no elements in common with other.
-
issubset
(other)¶ -
multiset <= other
Test whether every element in the multiset is in other and each element’s multiplicity in the multiset is less than or equal to its multiplicity in other.
-
multiset < other
Test whether the multiset is a proper subset of other, that is,
multiset <= other and multiset != other
.
-
issuperset
(other)¶ -
multiset >= other
Test whether every element in other is in the multiset and each element’s multiplicity in other is less than or equal to its multiplicity in the multiset.
-
multiset > other
Test whether the multiset is a proper superset of other, that is,
multiset >= other and multiset != other
.
-
union
(*others)¶ -
multiset | other | ...
Return a new multiset with elements from the multiset and all others. The maximal multiplicity over all sets is used for each element.
-
combine
(*others)¶ -
multiset + other + ...
Return a new multiset with elements from the multiset and all others. Each element’s multiplicities are summed up for the new set.
-
intersection
(*others)¶ -
multiset & other & ...
Return a new multiset with elements common to the multiset and all others. The minimal multiplicity over all sets is used for each element.
-
difference
(*others)¶ -
multiset - other - ...
Return a new multiset with elements in the multiset that are not in the others. This will subtract all the others’ multiplicities and remove the element from the multiset if its multiplicity reaches zero.
-
symmetric_difference
(other)¶ -
multiset ^ other
Return a new multiset with elements from either the multiset or other where their multiplicity is the absolute difference of the two multiplicities.
-
union_update
(*others)¶ -
multiset |= other | ...
Update the multiset, adding elements from all others. The maximal multiplicity over all sets is used for each element. For a version of this method that works more closely to
dict.update()
, seeupdate()
.
-
intersection_update
(*others)¶ -
multiset &= other & ...
Update the multiset, keeping only elements found in it and all others. The minimal multiplicity over all sets is used for each element.
-
difference_update
(*others)¶ -
multiset -= other | ...
Update the multiset, removing elements found in others. This will subtract all the others’ multiplicities and remove the element from the multiset if its multiplicity reaches zero.
-
symmetric_difference_update
(other)¶ -
multiset ^= other
Update the multiset, keeping only elements found in either the multiset or other but not both. The new multiplicity is the absolute difference of the two original multiplicities.
-
add
(elem, multiplicity=1)¶ -
s[elem] += multiplicity
-
s[elem] = multiplicity
Add element elem to the multiset s. If the optional multiplicity is specified, more than one element can be added at the same time by adding to its.
Note that adding the same element multiple times will increase its multiplicity and thus change the multiset, whereas for a regular
set
this would have no effect.You can also set the element’s multiplicity directly via key assignment.
-
remove
(elem, multiplicity=None)¶ -
del s[elem]
Remove all elements elem from the multiset. Raises
KeyError
if elem is not contained in the set.If the optional multiplicity is specified, only the given number is subtracted from the element’s multiplicity. This might still completely remove the element from the multiset depending on its original multiplicity.
This method returns the original multiplicity of the element before it was removed.
You can also delete the element directly via key access.
-
discard
(elem, multiplicity=None)¶ -
s[elem] -= multiplicity
-
s[elem] = 0
Remove element elem from the set if it is present. If the optional multiplicity is specified, only the given number is subtracted from the element’s multiplicity. This might still completely remove the element from the multiset depending on its original multiplicity.
This method returns the original multiplicity of the element before it was removed.
You can also set the element’s multiplicity directly via key assignment.
-
clear
()¶ Remove all elements from the multiset.
Note, the non-operator versions of
union()
,intersection()
,difference()
, andsymmetric_difference()
,issubset()
, andissuperset()
,update()
,intersection_update()
,difference_update()
, andsymmetric_difference_update()
methods will accept any iterable as an argument. In contrast, their operator based counterparts require their arguments to be sets. This precludes error-prone constructions likeMultiset('abc') & 'cbs'
in favor of the more readableMultiset('abc').intersection('cbs')
.The
Multiset
supports set to set comparisons. Two multisets are equal if and only if every element of each multiset is contained in the other and each element’s multiplicity is the same in both multisets (each is a subset of the other). A multiset is less than another set if and only if it is a proper subset of the second set (is a subset, but is not equal). A multiset is greater than another set if and only if it is a proper superset of the second set (is a superset, but is not equal). These comparisons work with both sets and multisets:>>> Multiset('ab') == {'a', 'b'} True
Multiset elements, like set elements and dictionary keys, must be hashable.
Binary operations that mix
set
orfrozenset
instances withMultiset
instances will always return aMultiset
.Since the
Multiset
is also adict
, it inherits some of its methods:-
s[elem]
Return the multiplicity of elem in s. Returns
0
for elements that are not in the multiset.
-
s[elem] = value
Set the multiplicity of elem to value. Setting the multiplicity to
0
removes the element.
-
del s[elem]
See
remove()
.
-
iter(s)
Return an iterator over the elements in the multiset.
In contrast to both the
dict
andset
implementations, this will repeat elements whose multiplicity is greater than1
:>>> sorted(Multiset('aab')) ['a', 'a', 'b']
To only get distinct elements, use the
keys()
method.
-
classmethod
fromkeys
(elements, multiplicity)¶ Create a new multiset with elements from elements and all multiplicities set to multiplicity.
-
get
(elem, default)¶ Return the multiplicity for elem if it is in the multiset, else default.
-
items
()¶ Return a new view of the multiset’s items (
(elem, multiplicity)
pairs). See the documentation of dict view objects.Note that this view is unordered.
-
keys
()¶ Return a new view of the multiset’s distinct elements. See the documentation of dict view objects.
Note that this view is unordered.
-
update
(*others)¶ -
multiset += other + ...
Update the multiset, adding elements from all others. Each element’s multiplicities is summed up for the new multiset. For a version of this method that works more closely to the original
set.update()
, seeunion_update()
.
-
pop
(elem, default)¶ If elem is in the multiset, remove it and return its multiplicity, else return default.
-
popitem
()¶ Remove and return an arbitrary
(elem, multiplicity)
pair from the multiset.popitem()
is useful to destructively iterate over a multiset. If the multiset is empty, callingpopitem()
raises aKeyError
.
-
setdefault
(elem, default)¶ If elem is in the multiset, return its multiplicity. If not, insert elem with a multiplicity of default and return default.
-
values
()¶ Return a new view of the multiset’s multiplicities. See the documentation of dict view objects.
Note that this view is unordered.
The multiset also adds some new methods for multiplying a multiset with an
int
factor:-
times
(factor)¶ -
multiset * factor
Return a copy of the multiset where each multiplicity is multiplied by factor.
-
times_update
(factor)¶ -
multiset *= factor
Update the multiset, multiplying each multiplicity with factor.
-