sPickle
The module sPickle is an enhanced version of the standard module pickle. It provides an improved Pickler class and a utility class SPickleTools.
The sPickle package tries to push the limits for pickling. The implementation tries to create correct pickles, but it does not try to be efficient or portable or nice to read or ... Consider it a proof of concept, a demonstration, that shows what could be done.
Warning
Although the author is using the sPickle package in production, it is more or less untested outside the specific environment it was written for.
Note
The sPickle package currently requires Python 2.7.
-
sPickle.MODULE_TO_BE_PICKLED_FLAG_NAME = '__module_must_be_pickled__'
If the global (module-level) variable __module_must_be_pickled__ is true, the module gets pickled by value.
-
class sPickle.Pickler(file, protocol=2, serializeableModules=None, mangleModuleName=None, logger=None, object_dispatch=None)
Bases: pickle.Pickler
The sPickle Pickler.
This Pickler is a subclass of pickle.Pickler that adds the ability to pickle modules, most classes and program state. It is intended to be API-compatible with pickle.Pickler, so you can use it as a drop-in replacement. However, its constructor has more optional arguments.
Parameters:
- file – The file argument must be either an instance of collections.MutableSequence or have a write(str) method that accepts a single string argument. It can thus be an open file object, a StringIO object, or any other custom object that meets this interface. As an alternative you can use a list or any other instance of collections.MutableSequence.
- protocol (int) – The optional protocol argument tells the pickler to use the given protocol. For this implementation, the only supported protocol is 2, which is also the value of pickle.HIGHEST_PROTOCOL in Python 2.7. Specifying a negative protocol version selects the highest protocol version supported. The higher the protocol used, the more recent the version of Python needed to read the pickle produced.
- serializeableModules – The optional argument serializeableModules must be an iterable collection of modules and strings. If the pickler needs to serialize a module, it checks this collection to decide whether the module needs to be pickled by value or by name. The module gets pickled by value if at least one of the following conditions is true. Otherwise it gets pickled by reference:
  - The module has a global variable named __module_must_be_pickled__ and the value of this variable is true.
  - The module object is contained in serializeableModules.
  - The name of the module starts with a string contained in serializeableModules.
  - The module has the attribute __file__ and serializeableModules contains a string that is a substring of __file__ after applying path and case normalization as appropriate for the current system.
- logger (logging.Logger) – The optional argument logger must be an instance of class logging.Logger. If given, it is used instead of the default logger.
- mangleModuleName – Experimental feature: the optional argument mangleModuleName must be a callable with three arguments. The first argument is this pickler, the second is the name of the module, and the third is None or, if the caller is going to pickle a module reference, the module object. The callable must return a pickleable object that unpickles as a string. You can use this callable to rename modules in the pickle. For instance, you may want to replace “posixpath” by “os.path”.
Note
In order to be able to unpickle a module pickled by name, the module must be importable. If this is not the case or if the content of the module might change, you should tell the pickler to pickle the module by value.
- object_dispatch – Experimental feature: the optional argument object_dispatch must be either an ObjectDispatchBuilder or a MutableMapping from numeric object ids (as returned by id()) to callables that take two arguments: first the pickler and then the object to be pickled. It is used to initialize the attributes object_dispatch_builder and object_dispatch. If no value is given, the Pickler updates the global default ObjectDispatchBuilder (as returned by get_default_instance()) and then sets object_dispatch to a shallow copy of the global ObjectDispatchBuilder.
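The by-value rules listed for serializeableModules can be sketched with the standard library alone. This is a minimal illustration, not sPickle's actual code: the function name must_pickle_by_value is hypothetical, and applying the path normalization to both the listed string and __file__ is an assumption.

```python
import os
import types


def must_pickle_by_value(module, serializeable_modules=()):
    """Return True if `module` would be pickled by value (sketch only)."""
    # Rule 1: the module opts in through its own flag variable.
    if getattr(module, "__module_must_be_pickled__", False):
        return True

    def normalize(path):
        # Path and case normalization as appropriate for the current system.
        return os.path.normcase(os.path.normpath(path))

    module_file = getattr(module, "__file__", None)
    for entry in serializeable_modules:
        if isinstance(entry, types.ModuleType):
            # Rule 2: the module object itself is listed.
            if entry is module:
                return True
        elif module.__name__.startswith(entry):
            # Rule 3: the module name starts with a listed string.
            return True
        elif module_file and normalize(entry) in normalize(module_file):
            # Rule 4: a listed string is a substring of the normalized __file__.
            return True
    return False


# A synthetic module whose name starts with a listed prefix.
demo = types.ModuleType("myapp.plugins.demo")
print(must_pickle_by_value(demo, ["myapp."]))
```

Any module that matches none of the rules falls through to the final return and would be pickled by reference.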
Attributes and Methods
-
dispatch
A per-instance version of the global dispatch table pickle.Pickler.dispatch. Using a per-instance dispatch table keeps the global table unchanged.
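The idea of a per-instance dispatch table can be tried out with the pure-Python pickler from the standard library. The sketch below is written for Python 3, where that class is available as pickle._Pickler; the class InstanceDispatchPickler and the saver save_fraction are hypothetical names and not part of sPickle.

```python
import io
import pickle
from fractions import Fraction

# Python 2.7 exposes the pure-Python pickler as pickle.Pickler; Python 3 keeps
# it as pickle._Pickler. Only the pure-Python class has a `dispatch` table.
PurePickler = getattr(pickle, "_Pickler", pickle.Pickler)


class InstanceDispatchPickler(PurePickler):
    def __init__(self, file, protocol=2):
        PurePickler.__init__(self, file, protocol)
        # Shadow the class-level table with a private copy.
        self.dispatch = dict(PurePickler.dispatch)


calls = []


def save_fraction(pickler, obj):
    # Custom saver: record the call, then pickle as Fraction(num, den).
    calls.append(obj)
    pickler.save_reduce(Fraction, (obj.numerator, obj.denominator), obj=obj)


buf = io.BytesIO()
pickler = InstanceDispatchPickler(buf)
pickler.dispatch[Fraction] = save_fraction  # affects this instance only
pickler.dump(Fraction(3, 4))

restored = pickle.loads(buf.getvalue())
```

Because the entry is added to the copy, other picklers created later still see the unmodified class-level table.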
-
object_dispatch
Certain “global” objects require special treatment, because the values of their attributes __module__ and/or __name__ are missing, wrong or otherwise not useful. Examples include platform-dependent implementations of standard functions like os.getcwd(), which claims to be nt.getcwd, or various types from the module types. The attribute object_dispatch is a mapping from a numeric object id (as returned by id()) to a callable that takes two arguments: first the pickler and then the object to be pickled. If the pickler finds the id of an object to be pickled in object_dispatch, it dispatches pickling to the callable.
If the constructor argument object_dispatch was a MutableMapping, the pickler sets this attribute to object_dispatch. Otherwise the pickler sets object_dispatch to object_dispatch.object_dispatch. Finally the pickler adds a few additional entries to the mapping for special cases.
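A mapping keyed by id() can be illustrated with a small subclass of the pure-Python pickler. The following is a sketch for Python 3 under stated assumptions: IdDispatchPickler and save_as_global are hypothetical names, and the way sPickle consults its table internally may differ.

```python
import io
import os
import pickle

PurePickler = getattr(pickle, "_Pickler", pickle.Pickler)


def save_as_global(module_name, name):
    """Build a saver that writes a reference to module_name.name."""
    def saver(pickler, obj):
        reference = (module_name + "\n" + name + "\n").encode("ascii")
        pickler.write(pickle.GLOBAL + reference)
        pickler.memoize(obj)
    return saver


class IdDispatchPickler(PurePickler):
    def __init__(self, file, protocol=2, object_dispatch=None):
        PurePickler.__init__(self, file, protocol)
        self.object_dispatch = dict(object_dispatch or {})

    def save(self, obj, *args, **kwargs):
        saver = self.object_dispatch.get(id(obj))
        # Objects already in the memo are left to the regular code path,
        # which emits a memo lookup instead of a second copy.
        if saver is not None and id(obj) not in self.memo:
            saver(self, obj)
            return
        PurePickler.save(self, obj, *args, **kwargs)


# os.getcwd reports a platform module (for example posix or nt) in
# __module__; the table entry forces the portable name "os" instead.
table = {id(os.getcwd): save_as_global("os", "getcwd")}
buf = io.BytesIO()
IdDispatchPickler(buf, 2, table).dump([os.getcwd, os.getcwd])
data = buf.getvalue()
restored = pickle.loads(data)
```

Keying by id() rather than by type makes it possible to single out one specific object, which a type-based dispatch table cannot do.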
-
object_dispatch_builder
If the constructor argument object_dispatch was not a MutableMapping, this attribute is the value of object_dispatch or, if object_dispatch was None, a copy of the default ObjectDispatchBuilder as returned by get_default_instance(). Otherwise object_dispatch_builder is None.
-
classmethod analysePicklerStack(traceback_or_frame, stopObjectId=None)
Analyse the stack of a Pickler.
This method creates a list of dictionaries, one for each object currently being serialised. (That is, objects already serialised or objects not yet started are not in this list.) The first list item represents the object whose processing started last; the last entry represents the object whose processing started first. The pickler reorders the sequence of objects to be pickled if required. Therefore it is not guaranteed that the last list item represents the object that was initially given to the pickler.
Possible entries of the dictionaries in the returned list are:
- Key ANALYSE_OBJECT_KEY – the object to be pickled. This item is always present.
- Key ANALYSE_DICT_OF_KEY – This item is present if the object to be pickled is the __dict__ attribute of another object. The value is the object that has the __dict__ attribute.
- Key ANALYSE_MEMO_KEY – If the object to be pickled has already been added to the memo, the value of this item is the memo key.
Parameters:
- traceback_or_frame (types.TracebackType or types.FrameType) – a traceback object or a frame object. In case of a traceback object, the method follows the chain of traceback objects and extracts the innermost frame object.
- stopObjectId (int) – the id of the topmost object the caller is interested in. If this method encounters an object with the given id, it stops building the result list.
Returns: a list of dictionaries
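Following a traceback chain down to its innermost frame, as described for the traceback_or_frame parameter, looks roughly like this in plain Python. The helper innermost_frame is a hypothetical name used for illustration; it does not reproduce the stack analysis itself.

```python
import sys
import types


def innermost_frame(traceback_or_frame):
    """Return the innermost frame of a traceback, or the frame itself."""
    if isinstance(traceback_or_frame, types.TracebackType):
        tb = traceback_or_frame
        # Each traceback object links to the next deeper level via tb_next.
        while tb.tb_next is not None:
            tb = tb.tb_next
        return tb.tb_frame
    return traceback_or_frame


def outer():
    inner()


def inner():
    raise ValueError("boom")


try:
    outer()
except ValueError:
    frame = innermost_frame(sys.exc_info()[2])
```

After the except block, frame is the frame of inner(), the function that raised the exception.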
-
class sPickle.ObjectDispatchBuilder(modules=None)
Bases: object
A builder for the object_dispatch table of a Pickler.
An ObjectDispatchBuilder has a list of names of API modules. It inspects these modules and looks for public objects that won’t be pickled by value (strings, numbers, lists, dicts, sets, modules) and have unusable values of their attributes __module__ and/or __name__. The builder adds an item to the object_dispatch mapping for each problematic object.
Parameters: modules – see method extend_pending_analysis_queue()
-
ALL_PORTABLE_STDLIB_MODULES
A constant list of portable modules in the Python 2.7.9 standard library.
-
DEFAULT_PRIO1
The default priority for module entries that are enumerated in the module variable __all__. Because the author of the module explicitly declared those names, they get a higher priority.
-
DEFAULT_PRIO2
The default priority for public module entries from a module without an __all__ variable.
-
object_dispatch
The attribute object_dispatch is a mapping from a numeric object id (as returned by id()) to a callable. The callable takes two arguments: first the pickler and then the object to be pickled. The callable must use the pickler to pickle the object.
The callable has a numerical priority. The priority is the value of the callable’s attribute priority or, if the callable lacks the attribute, 1000. The priority indicates how “good” an API module is. This is a heuristic approach to the problem that many API modules without an __all__ variable accidentally export objects imported from other modules. If an object is exported by more than one API module, the ObjectDispatchBuilder uses the module with the highest priority.
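The priority rule can be sketched as follows. The fallback of 1000 comes from the text above; everything else is an assumption made for illustration: the helper names, the module names, and in particular the reading that a numerically larger value means a higher priority.

```python
DEFAULT_PRIORITY = 1000  # documented fallback for callables without `priority`


def priority_of(saver):
    return getattr(saver, "priority", DEFAULT_PRIORITY)


def register(object_dispatch, obj, saver):
    """Keep the saver with the larger priority for obj (assumed rule)."""
    current = object_dispatch.get(id(obj))
    if current is None or priority_of(saver) > priority_of(current):
        object_dispatch[id(obj)] = saver


def make_saver(module_name, priority):
    def saver(pickler, obj):
        raise NotImplementedError("placeholder; a real saver pickles obj")
    saver.module_name = module_name  # hypothetical bookkeeping attribute
    saver.priority = priority
    return saver


table = {}
shared = object()  # stands in for an object exported by several modules
register(table, shared, make_saver("accidental_reexporter", 10))
register(table, shared, make_saver("defining_module", 20))
register(table, shared, make_saver("another_reexporter", 5))
winner = table[id(shared)].module_name
```

With these made-up priorities, the entry from defining_module survives both the earlier and the later registration.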
-
acceptable_module_names
Intentionally undocumented.
-
append_pending_analysis_queue(name, prio1=None, prio2=None)
Append a single module to the list of API modules.
Parameters:
-
build()
Update the content of object_dispatch.
-
extend_pending_analysis_queue(modules)
Extend the list of API modules by modules.
Parameters: modules (an iterable collection) – the iterable collection of module specifications to be considered. A module specification is one of
- name (str) of a module
- a sequence (name, priority) of a module
- a sequence (name, prio1, prio2) of a module
For the meaning of the priority values see method append_pending_analysis_queue().
-
classmethod get_default_instance()
Return a global default instance of class ObjectDispatchBuilder.
Returns: the default object dispatch builder
Return type: ObjectDispatchBuilder
-
-
class sPickle.SPickleTools(serializeableModules=None, pickler_class=None)
Bases: object
A collection of simple utility methods.
Warning
This class is still under development. Don’t rely on its methods. If you need a stable API, use the class Pickler directly or copy the code.
The optional argument serializeableModules is passed on to the class Pickler.
The optional argument pickler_class can be used to set a different pickler class. It must accept the same arguments as class Pickler.
-
classmethod dis(str_, out=None, memo=None, indentlevel=4)
Disassemble an optionally compressed pickle.
See function pickletools.dis() for details.
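For orientation, this is what the underlying standard library function does with an uncompressed pickle. The sketch calls pickletools.dis() directly under Python 3 and does not involve sPickle or its handling of compressed pickles.

```python
import io
import pickle
import pickletools

# An ordinary protocol 2 pickle, the protocol used by sPickle.
data = pickle.dumps(["spam", 42], 2)

# Collect the opcode listing in a text buffer instead of printing it.
listing = io.StringIO()
pickletools.dis(data, out=listing, indentlevel=4)
text = listing.getvalue()
print(text)
```

The listing shows one opcode per line, starting with PROTO and ending with STOP.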
-
dumps(obj, persistent_id=None, persistent_id_method=None, doCompress=True, mangleModuleName=None, object_dispatch=None)
Pickle an object and return the pickle.
This method works similarly to the regular dumps method, but also optimizes and optionally compresses the pickle.
Parameters:
- obj (object) – the object to be pickled
- persistent_id – the persistent_id function (or another callable) for the pickler. The function is called with a single positional argument and must return None or the persistent id for its argument. See the section “Pickling and unpickling external objects” of the documentation of the module pickle.
- persistent_id_method – a variant of the persistent_id function that takes the pickler object as its first argument and an object as its second argument.
- doCompress – If doCompress is true in a boolean context, the pickle is compressed, provided that the compression actually reduces the size of the pickle. The compression method depends on the exact value of doCompress. If doCompress is callable, it is called to perform the compression; it must be a function (or method) that takes a single string parameter and returns a compressed version. Otherwise, if doCompress is not callable, the function bz2.compress() is used.
- mangleModuleName – Unless mangleModuleName is None, it must be a callable with three arguments: the first receives the pickler, the second the module name of the object to be pickled. If the caller is going to save a module reference, the third argument is the module. The callable must return an object to be pickled instead of the module name. This can be a different string or an object that gets unpickled as a string.
Example:
    import os.path
    from sPickle import SPickleTools

    def mangleOsPath(pickler, name, module):
        '''use 'os.path' instead of the platform specific module name'''
        if module is os.path:
            return "os.path"
        return name

    spt = SPickleTools()
    p = spt.dumps(object_to_be_pickled, mangleModuleName=mangleOsPath)
- object_dispatch – the optional argument object_dispatch must be either a MutableMapping or an ObjectDispatchBuilder. It is passed on to the constructor of the pickler. See Pickler for details.
Returns: the pickle, optionally compressed
Return type:
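The doCompress rule (compress only if the result is smaller, with bz2.compress() as the default) can be sketched like this. The function maybe_compress is a hypothetical stand-in written for Python 3; how sPickle marks or detects a compressed pickle is not covered here.

```python
import bz2
import zlib


def maybe_compress(data, do_compress=True):
    """Return a compressed copy of data only if that is actually smaller."""
    if not do_compress:
        return data
    # A callable value supplies the compression function; anything else
    # that is true selects bz2.compress().
    compress = do_compress if callable(do_compress) else bz2.compress
    compressed = compress(data)
    return compressed if len(compressed) < len(data) else data


small = b"abc"
big = b"spam " * 1000

unchanged = maybe_compress(small)                    # compression would grow it
packed = maybe_compress(big)                         # default: bz2
packed_zlib = maybe_compress(big, do_compress=zlib.compress)
```

Very short inputs come back unchanged because the compressed form carries a fixed overhead that exceeds the original size.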
-
dumps_with_external_ids(obj, idmap, matchResources=False, matchNetref=False, additionalResourceObjects=(), **kw)
Pickle an object that references objects that can’t be pickled.
If you want to pickle an object that references a resource (files, sockets, etc.) or an RPyC proxy for an object on a remote system, you can’t pickle the referenced object. But if you are going to transfer the pickle to a remote system using the package RPyC, you can replace the resources by RPyC proxy objects and replace RPyC proxy objects by the real objects.
This method creates a Pickler object with a persistent_id method that optionally replaces resources and proxy objects by their object id. It stores the mapping between ids and objects in the idmap dictionary (or any other mutable mapping).
Parameters:
- obj (object) – the object to be pickled
- idmap (dict) – receives the id-to-object mapping
- matchResources (object) – if true in a boolean context, replace resource objects.
- matchNetref (object) – if true in a boolean context, replace RPyC proxies (technically objects of class rpyc.core.netref.BaseNetref).
- additionalResourceObjects – a collection of objects that encapsulate some kind of resource and must be replaced by an RPyC proxy.
- kw – other keyword arguments that are passed on to dumps().
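The mechanism of replacing an unpickleable object by its id and collecting the ids in idmap can be reproduced with the persistent id hooks of the standard pickle module. The sketch below runs on Python 3 without RPyC; ExternalIdPickler and ExternalIdUnpickler are hypothetical names, and treating only lock objects as resources is an assumption chosen for brevity.

```python
import io
import pickle
import threading

LockType = type(threading.Lock())  # a typical unpickleable resource


class ExternalIdPickler(pickle.Pickler):
    def __init__(self, file, idmap, protocol=2):
        pickle.Pickler.__init__(self, file, protocol)
        self.idmap = idmap

    def persistent_id(self, obj):
        if isinstance(obj, LockType):
            # Store the object in idmap and pickle only its id.
            self.idmap[id(obj)] = obj
            return id(obj)
        return None  # everything else is pickled as usual


class ExternalIdUnpickler(pickle.Unpickler):
    def __init__(self, file, idmap):
        pickle.Unpickler.__init__(self, file)
        self.idmap = idmap

    def persistent_load(self, pid):
        # Resolve the id back to the object supplied in idmap.
        return self.idmap[pid]


lock = threading.Lock()
idmap = {}
buf = io.BytesIO()
ExternalIdPickler(buf, idmap).dump({"name": "job", "lock": lock})
restored = ExternalIdUnpickler(io.BytesIO(buf.getvalue()), idmap).load()
```

On a remote system the idmap passed to the unpickler would hold proxies or local substitutes instead of the original objects.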
-
classmethod getImportList(str_)
Return a list containing all imported modules from the pickle str_.
Sometimes useful for debugging.
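A comparable listing can be produced with pickletools.genops() from the standard library. This is a sketch for Python 3 and not the implementation of getImportList(): the function imported_modules is hypothetical and only looks at the GLOBAL opcode, which is what protocol 2 pickles use for references to importable objects.

```python
import pickle
import pickletools
from fractions import Fraction


def imported_modules(data):
    """List the module names referenced by GLOBAL opcodes, in order."""
    modules = []
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # pickletools reports the argument as "module name".
            module_name = arg.split(" ", 1)[0]
            if module_name not in modules:
                modules.append(module_name)
    return modules


data = pickle.dumps({"ratio": Fraction(1, 3)}, 2)
modules = imported_modules(data)
print(modules)
```

Unpickling this data has to import every module in the list, so the list is a quick check of what the receiving side must provide.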
-
classmethod loads(str_, persistent_load=None, useCPickle=True, unpickler_class=None)
Unpickle an object from a string.
Parameters:
- str_ (str) – the pickle
- persistent_load – the persistent_load method for the unpickler. See the section “Pickling and unpickling external objects” of the documentation of the module pickle.
- useCPickle (object) – if true in a boolean context, use the Unpickler from the module cPickle. Otherwise use the much slower Unpickler from the module pickle.
- unpickler_class – the unpickler class to be used. If this parameter is given, the value of useCPickle is ignored.
Returns: the reconstructed object
-
classmethod loads_with_external_ids(str_, idmap, useCPickle=True, unpickler_class=None)
Unpickle an object from a string.
Replace ids for external objects with the objects provided in idmap.
Parameters:
- str_ (str) – the pickle
- idmap (dict) – the mapping that contains the objects for the id values used in the pickle
- useCPickle (object) – if true in a boolean context, use the Unpickler from the module cPickle. Otherwise use the much slower Unpickler from the module pickle.
- unpickler_class – the unpickler class to be used. If this parameter is given, the value of useCPickle is ignored.
Returns: the reconstructed object
-
classmethod module_for_globals(callable_or_moduledict, withDefiningModules=False)
Get the module associated with a callable or a module dictionary.
If you pickle a module, make sure to keep a reference to the unpickled module. Otherwise the destruction of the module will clear the module’s dictionary. Usually the sPickle code for serializing modules preserves a reference to modules created from a pickle but not imported into sys.modules. However, there might be cases where you need to identify the relevant modules yourself. This method can be used to find the relevant module(s).
Parameters:
- callable_or_moduledict – a function, a method or a module dictionary
- withDefiningModules (object) – if true and callable_or_moduledict is a callable, also return the module defining the callable.
Returns: None, a single module or a set of modules
-
classmethod reducer(*args)
Get an object with a method __reduce__.
This method creates an object that has a custom method __reduce__. The __reduce__ method returns the given arguments when called.
This method can be used to implement complex __reduce__ methods that need more than one function call on unpickling.
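The effect of such an object can be shown with a few lines of plain Python. The class Reducer below is a hypothetical stand-in for whatever reducer() returns; it only demonstrates how nesting these objects chains several calls at unpickling time.

```python
import pickle


class Reducer(object):
    """Object whose __reduce__ returns the arguments it was created with."""

    def __init__(self, *args):
        self._args = args

    def __reduce__(self):
        return self._args


# Unpickling first calls list("cba") and then sorted(...) on the result,
# that is, two function calls for a single pickled object.
surrogate = Reducer(sorted, (Reducer(list, ("cba",)),))
restored = pickle.loads(pickle.dumps(surrogate, 2))
```

The pickle never references the Reducer class itself, only the callables and arguments that were passed in.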
-
remotemethod(rpycconnection, method=None, create_only_once=None, **kw)
Create a remote function.
This method takes an active RPyC connection and a locally defined function (or method) and returns a proxy for an equivalent function on the remote side. If you invoke the proxy, it creates a pickle containing the function, transfers this pickle to the remote side, unpickles it and invokes the function. It then pickles the result and transfers the result back to the local side. It does not pickle the function arguments. If you need to transfer the function arguments by value, use functools.partial() to apply them to your function prior to the call of remotemethod.
Parameters:
- rpycconnection (rpyc.core.protocol.Connection) – an active RPyC connection. If set to None, execute method locally.
- method (object) – a callable object. If you do not give this argument, you can use remotemethod as a decorator.
- create_only_once – controls the creation of the function on the remote side. If you want to create the function during the execution of remotemethod(), pass CREATE_IMMEDIATELY. Otherwise, if you want to create the remote function on its first invocation, set create_only_once to a value that is true in a boolean context. Otherwise, if create_only_once evaluates to false, the local proxy creates the remote function on every invocation.
- kw – other keyword arguments that are passed on to dumps_with_external_ids().
Returns: the proxy for the remote function
Note
If you use remotemethod as a decorator, do not apply it to regular methods of a class. It does not work in the desired way, because decorators work on the underlying function object, not on the method object. Therefore you will end up with a remote function that receives an RPyC proxy for self.
-
CREATE_EVERYTIME = False
Constant to be given to the create_only_once argument of method remotemethod(). Create the function on the remote side on every invocation of the function returned by remotemethod(). The actual value is False.
-
CREATE_IMMEDIATELY = 'immedately'
Constant to be given to the create_only_once argument of method remotemethod(). Create the function on the remote side during the invocation of remotemethod().
-
CREATE_LAZY = True
Constant to be given to the create_only_once argument of method remotemethod(). Create the function on the remote side on the first invocation of the function returned by remotemethod(). The actual value is True.
-
-
class sPickle.FailSavePickler(file, protocol=2, serializeableModules=None, mangleModuleName=None, logger=None, object_dispatch=None)
Bases: sPickle._sPickle.Pickler
A fail-safe variant of class Pickler.
If this pickler detects an unpickleable object, it calls its method get_replacement() to retrieve a surrogate object to be pickled instead of the unpickleable object.
To use this feature you must either assign a suitable callable as attribute ‘get_replacement’ or create your own subclass of FailSavePickler and override method get_replacement().
Parameters:
- file – The file argument must be either an instance of collections.MutableSequence or have a write(str) method that accepts a single string argument. It can thus be an open file object, a StringIO object, or any other custom object that meets this interface. As an alternative you can use a list or any other instance of collections.MutableSequence.
- protocol (int) – The optional protocol argument tells the pickler to use the given protocol. For this implementation, the only supported protocol is 2, which is also the value of pickle.HIGHEST_PROTOCOL in Python 2.7. Specifying a negative protocol version selects the highest protocol version supported. The higher the protocol used, the more recent the version of Python needed to read the pickle produced.
- serializeableModules – The optional argument serializeableModules must be an iterable collection of modules and strings. If the pickler needs to serialize a module, it checks this collection to decide whether the module needs to be pickled by value or by name. The module gets pickled by value if at least one of the following conditions is true. Otherwise it gets pickled by reference:
  - The module has a global variable named __module_must_be_pickled__ and the value of this variable is true.
  - The module object is contained in serializeableModules.
  - The name of the module starts with a string contained in serializeableModules.
  - The module has the attribute __file__ and serializeableModules contains a string that is a substring of __file__ after applying path and case normalization as appropriate for the current system.
- logger (logging.Logger) – The optional argument logger must be an instance of class logging.Logger. If given, it is used instead of the default logger.
- mangleModuleName – Experimental feature: the optional argument mangleModuleName must be a callable with three arguments. The first argument is this pickler, the second is the name of the module, and the third is None or, if the caller is going to pickle a module reference, the module object. The callable must return a pickleable object that unpickles as a string. You can use this callable to rename modules in the pickle. For instance, you may want to replace “posixpath” by “os.path”.
Note
In order to be able to unpickle a module pickled by name, the module must be importable. If this is not the case or if the content of the module might change, you should tell the pickler to pickle the module by value.
- object_dispatch – Experimental feature: the optional argument object_dispatch must be either an ObjectDispatchBuilder or a MutableMapping from numeric object ids (as returned by id()) to callables that take two arguments: first the pickler and then the object to be pickled. It is used to initialize the attributes object_dispatch_builder and object_dispatch. If no value is given, the Pickler updates the global default ObjectDispatchBuilder (as returned by get_default_instance()) and then sets object_dispatch to a shallow copy of the global ObjectDispatchBuilder.
Attributes and Methods
-
dispatch
A per-instance version of the global dispatch table pickle.Pickler.dispatch. Using a per-instance dispatch table keeps the global table unchanged.
-
object_dispatch
Certain “global” objects require special treatment, because the values of their attributes __module__ and/or __name__ are missing, wrong or otherwise not useful. Examples include platform-dependent implementations of standard functions like os.getcwd(), which claims to be nt.getcwd, or various types from the module types. The attribute object_dispatch is a mapping from a numeric object id (as returned by id()) to a callable that takes two arguments: first the pickler and then the object to be pickled. If the pickler finds the id of an object to be pickled in object_dispatch, it dispatches pickling to the callable.
If the constructor argument object_dispatch was a MutableMapping, the pickler sets this attribute to object_dispatch. Otherwise the pickler sets object_dispatch to object_dispatch.object_dispatch. Finally the pickler adds a few additional entries to the mapping for special cases.
-
object_dispatch_builder
If the constructor argument object_dispatch was not a MutableMapping, this attribute is the value of object_dispatch or, if object_dispatch was None, a copy of the default ObjectDispatchBuilder as returned by get_default_instance(). Otherwise object_dispatch_builder is None.
-
get_replacement(pickler, obj, exception)
Get a surrogate for an unpickleable object.
This method is called if the pickler encounters an otherwise unpickleable object. The method can return a replacement object or its argument ‘exception’ if the function is unwilling to provide a replacement.
This implementation always returns ‘exception’.
Parameters:
- pickler (FailSavePickler or a subclass thereof) – the pickler
- obj – the unpickleable object
- exception – the exception raised on pickling obj
Returns: a pickleable surrogate for obj or ‘exception’.
-
exception sPickle.RecursionDetectedError(msg, oid, level)
Bases: pickle.PicklingError
Raised by FailSavePickler on infinite recursions.
Parameters:
-
exception sPickle.UnpicklingWillFailError
Bases: pickle.PicklingError
This error indicates that an object can be pickled, but unpickling will probably fail.
This is usually caused by an incomplete implementation of the pickling protocol or by a hostile __getattr__ or __getattribute__ method.