gc3libs.utils

Generic Python programming utility functions.

This module collects general utility functions, not specifically related to GC3Libs. A good rule of thumb for determining if a function or class belongs in here is the following: place a function or class in this module if you could copy its code into the sources of a different project and it would not stop working.

class gc3libs.utils.Enum

A generic enumeration class. Inspired by: http://stackoverflow.com/questions/36932/whats-the-best-way-to-implement-an-enum-in-python/2182437#2182437 with some more syntactic sugar added.

An Enum class must be instanciated with a list of strings, that make the enumeration “label”:

>>> Animal = Enum('CAT', 'DOG')

Each label is available as an instance attribute, evaluating to itself:

>>> Animal.DOG
'DOG'

>>> Animal.CAT == 'CAT'
True

As a consequence, you can test for presence of an enumeration label by string value:

>>> 'DOG' in Animal
True

Finally, enumeration labels can also be iterated upon:

>>> for a in Animal: print a
DOG
CAT
class gc3libs.utils.ExponentialBackoff(slot_duration=0.05, max_retries=5)

Generate waiting times with the exponential backoff algorithm.

Returned times are in seconds (or fractions thereof); they are integral multiples of the basic time slot, which is set with the slot_duration constructor parameter.

After max_retries have been attempted, any call to this iterator will raise a StopIteration exception.

The ExponentialBackoff class implements the iterator protocol, so you can just retrieve waiting times with the .next() method, or by looping over it:

>>> random.seed(314) # not-so-random for testing purposes...
>>> for wt in ExponentialBackoff():
...   print wt,
...
0.0 0.0 0.0 0.25 0.15 0.3
next()

Return next waiting time.

wait()

Wait for another while.

class gc3libs.utils.History

A list of messages with timestamps and (optional) tags.

The append method should be used to add a message to the History:

>>> L = History()
>>> L.append('first message')
>>> L.append('second one')

The last method returns the text of the last message appended, with its timestamp:

>>> L.last().startswith('second one at')
True

Iterating over a History instance returns message texts in the temporal order they were added to the list, with their timestamp:

>>> for msg in L: print(msg) 
first message ...
append(message, *tags)

Append a message to this History.

The message is timestamped with the time at the moment of the call.

The optional tags argument is a sequence of strings. Tags are recorded together with the message and may be used to filter log messages given a set of labels. (This feature is not yet implemented.)

format_message(message)

Return a formatted message, appending to the message its timestamp in human readable format.

last()

Return text of last message appended. If log is empty, return empty string.

gc3libs.utils.Log

alias of History

class gc3libs.utils.PlusInfinity

An object that is greater-than any other object.

>>> x = PlusInfinity()
>>> x > 1
True
>>> 1 < x
True
>>> 1245632479102509834570124871023487235987634518745 < x
True
>>> x > sys.maxint
True
>>> x < sys.maxint
False
>>> sys.maxint < x
True

PlusInfinity objects are actually larger than any given Python object:

>>> x > 'azz'
True
>>> x > object()
True

Note that PlusInfinity is a singleton, therefore you always get the same instance when calling the class constructor:

>>> x = PlusInfinity()
>>> y = PlusInfinity()
>>> x is y
True

Relational operators try to return the correct value when comparing PlusInfinity to itself:

>>> x < y
False
>>> x <= y
True
>>> x == y
True
>>> x >= y
True
>>> x > y
False
class gc3libs.utils.Singleton

Derived classes of Singleton can have only one instance in the running Python interpreter.

>>> x = Singleton()
>>> y = Singleton()
>>> x is y
True
class gc3libs.utils.Struct(initializer=None, **extra_args)

A dict-like object, whose keys can be accessed with the usual ‘[...]’ lookup syntax, or with the ‘.’ get attribute syntax.

Examples:

>>> a = Struct()
>>> a['x'] = 1
>>> a.x
1
>>> a.y = 2
>>> a['y']
2

Values can also be initially set by specifying them as keyword arguments to the constructor:

>>> a = Struct(z=3)
>>> a['z']
3
>>> a.z
3

Like dict instances, Struct`s have a `copy method to get a shallow copy of the instance:

>>> b = a.copy()
>>> b.z
3
copy()

Return a (shallow) copy of this Struct instance.

gc3libs.utils.backup(path)

Rename the filesystem entry at path by appending a unique numerical suffix; return new name.

For example,

  1. create a test file:
>>> import tempfile
>>> path = tempfile.mkstemp()[1]
  1. then make a backup of it; the backup will end in .~1~:
>>> path1 = backup(path)
>>> os.path.exists(path + '.~1~')
True

3. re-create the file, and make a second backup: this time the file will be renamed with a .~2~ extension:

>>> open(path, 'w').close()
>>> path2 = backup(path)
>>> os.path.exists(path + '.~2~')
True

cleaning up tests

>>> os.remove(path+'.~1~')
>>> os.remove(path+'.~2~')
gc3libs.utils.basename_sans(path)

Return base name without the extension.

gc3libs.utils.cache_for(lapse)

Cache the result of a (nullary) method invocation for a given amount of time. Use as a decorator on object methods whose results are to be cached.

Store the result of the first invocation of the decorated method; if another invocation happens before lapse seconds have passed, return the cached value instead of calling the real function again. If a new call happens after the grace period has expired, call the real function and store the result in the cache.

Note: Do not use with methods that take keyword arguments, as they will be discarded! In addition, arguments are compared to elements in the cache by identity, so that invoking the same method with equal but distinct object will result in two separate copies of the result being computed and stored in the cache.

Cache results and timestamps are stored into the objects’ _cache_value and _cache_last_updated attributes, so the caches are destroyed with the object when it goes out of scope.

The working of the cached method can be demonstrated by the following simple code:

>>> class X(object):
...     def __init__(self):
...         self.times = 0
...     @cache_for(2)
...     def foo(self):
...             self.times += 1
...             return ("times effectively run: %d" % self.times)
>>> x = X()
>>> x.foo()
'times effectively run: 1'
>>> x.foo()
'times effectively run: 1'
>>> time.sleep(3)
>>> x.foo()
'times effectively run: 2'
gc3libs.utils.cat(*args, **extra_args)

Concatenate the contents of all args into output. Both output and each of the args can be a file-like object or a string (indicating the path of a file to open).

If append is True, then output is opened in append-only mode; otherwise it is overwritten.

gc3libs.utils.copy_recursively(src, dst, overwrite=False)

Copy src to dst, descending it recursively if necessary.

gc3libs.utils.copyfile(src, dst, overwrite=False, link=False)

Copy a file from src to dst; return True if the copy was actually made. If overwrite is False (default), an existing destination entry is left unchanged and False is returned.

If link is True, an attempt at hard-linking is done first; failing that, we copy the source file onto the destination one. Permission bits and modification times are copied as well.

If dst is a directory, a file with the same basename as src is created (or overwritten) in the directory specified.

gc3libs.utils.copytree(src, dst, overwrite=False)

Recursively copy an entire directory tree rooted at src. If overwrite is False (default), entries that already exist in the destination tree are left unchanged and not overwritten.

See also: shutil.copytree.

gc3libs.utils.count(seq, predicate)

Return number of items in seq that match predicate. Argument predicate should be a callable that accepts one argument and returns a boolean.

gc3libs.utils.defproperty(fn)

Decorator to define properties with a simplified syntax in Python 2.4. See http://code.activestate.com/recipes/410698-property-decorator-for-python-24/#c6 for details and examples.

gc3libs.utils.deploy_configuration_file(filename, template_filename=None)

Ensure that configuration file filename exists; possibly copying it from the specified template_filename.

Return True if a file with the specified name exists in the configuration directory. If not, try to copy the template file over and then return False; in case the copy operations fails, a NoConfigurationFile exception is raised.

The template_filename is always resolved relative to GC3Libs’ ‘package resource’ directory (i.e., the etc/ directory in the sources. If template_filename is None, then it is assumed to be the base name of filename.

gc3libs.utils.dirname(pathname)

Same as os.path.dirname but return . in case of path names with no directory component.

gc3libs.utils.first(seq)

Return the first element of sequence or iterator seq. Raise TypeError if the argument does not implement either of the two interfaces.

Examples:

>>> s = [0, 1, 2]
>>> first(s)
0

>>> s = {'a':1, 'b':2, 'c':3}
>>> first(sorted(s.keys()))
'a'
gc3libs.utils.from_template(template, **extra_args)

Return the contents of template, substituting all occurrences of Python formatting directives ‘%(key)s’ with the corresponding values taken from dictionary extra_args.

If template is an object providing a read() method, that is used to gather the template contents; else, if a file named template exists, the template contents are read from it; otherwise, template is treated like a string providing the template contents itself.

gc3libs.utils.getattr_nested(obj, name)

Like Python’s getattr, but perform a recursive lookup if name contains any dots.

gc3libs.utils.ifelse(test, if_true, if_false)

Return if_true is argument test evaluates to True, return if_false otherwise.

This is just a workaround for Python 2.4 lack of the conditional assignment operator:

>>> a = 1
>>> b = ifelse(a, "yes", "no"); print b
yes
>>> b = ifelse(not a, 'yay', 'nope'); print b
nope
gc3libs.utils.irange(start, stop, step=1)

Iterate over all values greater or equal than start and less than stop. (Or the reverse, if step < 0.)

Example:

>>> list(irange(1, 5))
[1, 2, 3, 4]
>>> list(irange(0, 8, 3))
[0, 3, 6]
>>> list(irange(8, 0, -2))
[8, 6, 4, 2]

Unlike the built-in range function, irange also accepts floating-point values:

>>> list(irange(0.0, 1.0, 0.5))
[0.0, 0.5]

Also unlike the built-in range, both start and stop have to be specified:

>>> irange(42)
Traceback (most recent call last):
  ...
TypeError: irange() takes at least 2 arguments (1 given)

Of course, a null step is not allowed:

>>> list(irange(1, 2, 0))
Traceback (most recent call last):
  ...
AssertionError: Null step in irange.
gc3libs.utils.lock(path, timeout, create=True)

Lock the file at path. Raise a LockTimeout error if the lock cannot be acquired within timeout seconds.

Return a lock object that should be passed unchanged to the gc3libs.utils.unlock function.

If no path points to a non-existent location, an empty file is created before attempting to lock (unless create is False). An attempt is made to remove the file in case an error happens.

See also: gc3libs.utils.unlock()

gc3libs.utils.mkdir(path, mode=511)

Like os.makedirs, but does not throw an exception if PATH already exists.

gc3libs.utils.mkdir_with_backup(path, mode=511)

Like os.makedirs, but if path already exists and is not empty, rename the existing one to a backup name (see the backup function).

Unlike os.makedirs, no exception is thrown if the directory already exists and is empty, but the target directory permissions are not altered to reflect mode.

gc3libs.utils.prettyprint(D, indent=0, width=0, maxdepth=None, step=4, only_keys=None, output=<open file '<stdout>', mode 'w' at 0x2aaaaab17150>, _key_prefix='', _exclude=None)

Print dictionary instance D in a YAML-like format. Each output line consists of:

  • indent spaces,
  • the key name,
  • a colon character :,
  • the associated value.

If the total line length exceeds width, the value is printed on the next line, indented by further step spaces; a value of 0 for width disables this line wrapping.

Optional argument only_keys can be a callable that must return True when called with keys that should be printed, or a list of key names to print.

Dictionary instances appearing as values are processed recursively (up to maxdepth nesting). Each nested instance is printed indented step spaces from the enclosing dictionary.

gc3libs.utils.progressive_number(qty=None, id_filename='/home/rmurri/.gc3/next_id.txt')

Return a positive integer, whose value is guaranteed to be monotonically increasing across different invocations of this function, and also across separate instances of the calling program.

Example:

(create a temporary directory to avoid bug #)
>>> import tempfile, os
>>> (fd, tmp) = tempfile.mkstemp()
>>> n = progressive_number(id_filename=tmp)
>>> m = progressive_number(id_filename=tmp)
>>> m > n
True

If you specify a positive integer as argument, then a list of monotonically increasing numbers is returned. For example:

>>> ls = progressive_number(5, id_filename=tmp)
>>> len(ls)
5
(clean up test environment)
>>> os.remove(tmp)

In other words, progressive_number(N) is equivalent to:

nums = [ progressive_number() for n in range(N) ]

only more efficient, because it has to obtain and release the lock only once.

After every invocation of this function, the last returned number is stored into the file passed as argument id_filename. If the file does not exist, an attempt to create it is made before allocating an id; the method can raise an IOError or OSError if id_filename cannot be opened for writing.

Note: as file-level locking is used to serialize access to the counter file, this function may block (default timeout: 30 seconds) while trying to acquire the lock, or raise a LockTimeout exception if this fails.

Raise :LockTimeout, IOError, OSError
Returns:A positive integer number, monotonically increasing with every call. A list of such numbers if argument qty is a positive integer.
gc3libs.utils.read_contents(path)

Return the whole contents of the file at path as a single string.

Example:

>>> read_contents('/dev/null')
''

>>> import tempfile
>>> (fd, tmpfile) = tempfile.mkstemp()
>>> w = open(tmpfile, 'w')
>>> w.write('hey')
>>> w.close()
>>> read_contents(tmpfile)
'hey'

(If you run this test, remember to do cleanup afterwards)

>>> os.remove(tmpfile)
gc3libs.utils.safe_repr(obj)

Return a string describing Python object obj.

Avoids calling any Python magic methods, so should be safe to use as a ‘last resort’ in implementation of __str__ and __repr__.

gc3libs.utils.same_docstring_as(referenced_fn)

Function decorator: sets the docstring of the following function to the one of referenced_fn.

Intended usage is for setting docstrings on methods redefined in derived classes, so that they inherit the docstring from the corresponding abstract method in the base class.

gc3libs.utils.samefile(path1, path2)

Like os.path.samefile but return False if either one of the paths does not exist.

gc3libs.utils.sh_quote_safe(text)

Escape a string for safely passing as argument to a shell command.

Return a single-quoted string that expands to the exact literal contents of text when used as an argument to a shell command. Examples (note that backslashes are doubled because of Python’s string read syntax):

>>> print(sh_quote_safe("arg"))
'arg'
>>> print(sh_quote_safe("'arg'"))
''\''arg'\'''
gc3libs.utils.sh_quote_unsafe(text)

Double-quote a string for passing as argument to a shell command.

Return a double-quoted string that expands to the contents of text but still allows variable expansion and \-escapes processing by the UNIX shell. Examples (note that backslashes are doubled because of Python’s string read syntax):

>>> print(sh_quote_unsafe("arg"))
"arg"
>>> print(sh_quote_unsafe('"arg"'))
"\"arg\""
>>> print(sh_quote_unsafe(r'"\"arg\""'))
"\"\\\"arg\\\"\""
gc3libs.utils.string_to_boolean(word)

Convert word to a Python boolean value and return it. The strings true, yes, on, 1 (with any capitalization and any amount of leading and trailing spaces) are recognized as meaning Python True:

>>> string_to_boolean('yes')
True
>>> string_to_boolean('Yes')
True
>>> string_to_boolean('YES')
True
>>> string_to_boolean(' 1 ')
True
>>> string_to_boolean('True')
True
>>> string_to_boolean('on')
True

Any other word is considered as boolean False:

>>> string_to_boolean('no')
False
>>> string_to_boolean('No')
False
>>> string_to_boolean('Nay!')
False
>>> string_to_boolean('woo-hoo')
False

This includes also the empty string and whitespace-only:

>>> string_to_boolean('')
False
>>> string_to_boolean('  ')
False
gc3libs.utils.stripped(iterable)

Iterate over lines in iterable and return each of them stripped of leading and trailing blanks.

gc3libs.utils.test_file(path, mode, exception=<type 'exceptions.RuntimeError'>, isdir=False)

Test for access to a path; if access is not granted, raise an instance of exception with an appropriate error message. This is a frontend to os.access(), which see for exact semantics and the meaning of path and mode.

Parameters:
  • path – Filesystem path to test.
  • mode – See os.access()
  • exception – Class of exception to raise if test fails.
  • isdir – If True then also test that path points to a directory.

If the test succeeds, True is returned:

>>> test_file('/bin/cat', os.F_OK)
True
>>> test_file('/bin/cat', os.R_OK)
True
>>> test_file('/bin/cat', os.X_OK)
True
>>> test_file('/tmp', os.X_OK)
True

However, if the test fails, then an exception is raised:

>>> test_file('/bin/cat', os.W_OK)
Traceback (most recent call last):
  ...
RuntimeError: Cannot write to file '/bin/cat'.

If the optional argument isdir is True, then additionally test that path points to a directory inode:

>>> test_file('/tmp', os.F_OK, isdir=True)
True

>>> test_file('/bin/cat', os.F_OK, isdir=True)
Traceback (most recent call last):
  ...
RuntimeError: Expected '/bin/cat' to be a directory, but it's not.
gc3libs.utils.to_bytes(s)

Convert string s to an integer number of bytes. Suffixes like ‘KB’, ‘MB’, ‘GB’ (up to ‘YB’), with or without the trailing ‘B’, are allowed and properly accounted for. Case is ignored in suffixes.

Examples:

>>> to_bytes('12')
12
>>> to_bytes('12B')
12
>>> to_bytes('12KB')
12000
>>> to_bytes('1G')
1000000000

Binary units ‘KiB’, ‘MiB’ etc. are also accepted:

>>> to_bytes('1KiB')
1024
>>> to_bytes('1MiB')
1048576
gc3libs.utils.uniq(seq)

Iterate over all unique elements in sequence seq.

Distinct values are returned in a sorted fashion.

gc3libs.utils.unlock(lock)

Release a previously-acquired lock.

Argument lock should be the return value of a previous gc3libs.utils.lock call.

See also: gc3libs.utils.lock()

gc3libs.utils.update_parameter_in_file(path, var_in, new_val, regex_in)

Updates a parameter value in a parameter file using predefined regular expressions in _loop_regexps.

Parameters:
  • path – Full path to the parameter file.
  • var_in – The variable to modify.
  • new_val – The updated parameter value.
  • regex – Name of the regular expression that describes the format of the parameter file.
gc3libs.utils.write_contents(path, data)

Overwrite the contents of the file at path with the given data. If the file does not exist, it is created.

Example:

>>> import tempfile
>>> (fd, tmpfile) = tempfile.mkstemp()
>>> write_contents(tmpfile, 'big data here')
>>> read_contents(tmpfile)
'big data here'

(If you run this test, remember to clean up afterwards)

>>> os.remove(tmpfile)

Previous topic

gc3libs.url

Next topic

gc3libs.workflow