# util Package

## util Package

The util package implements utilities that are commonly used across the other packages.

## coord_utils Module

Utilities for manipulating coordinates or lists of coordinates, under periodic boundary conditions or otherwise. Many of these are heavily vectorized in numpy for performance.

barycentric_coords(coords, simplex)

Converts a list of coordinates to barycentric coordinates, given a simplex with d+1 points. Only works for d >= 2.

Args:
coords:
list of n coords to transform, shape should be (n,d)
simplex:
list of coordinates that form the simplex, shape should be (d+1, d)
Returns:
a LIST of barycentric coordinates (even if the original input was 1d)
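
For d = 2 the conversion can be sketched by hand. The pure-Python function below is a hypothetical illustration (`barycentric_2d` is not part of the package, and the ordering of the returned weights is an assumption), not the vectorized numpy implementation described above.

```python
def barycentric_2d(coords, simplex):
    """Barycentric coordinates of 2D points with respect to a triangle.

    Hypothetical helper for illustration only; the weight for each point is
    listed in the same order as the vertices of `simplex`.
    """
    (x1, y1), (x2, y2), (x3, y3) = simplex
    # Twice the signed area of the triangle; zero means a degenerate simplex.
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    result = []
    for x, y in coords:
        l1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
        l2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
        # The three weights always sum to one.
        result.append([l1, l2, 1.0 - l1 - l2])
    return result


triangle = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
weights = barycentric_2d([(0.25, 0.25)], triangle)
```
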
find_in_coord_list(coord_list, coord, atol=1e-08)

Find the indices of matches of a particular coord in a coord_list.

Args:
coord_list:
List of coords to test
coord:
Specific coordinates
atol:
Absolute tolerance. Defaults to 1e-8. Accepts both scalar and array
Returns:
Indices of matches, e.g., [0, 1, 2, 3]. Empty list if not found.
find_in_coord_list_pbc(fcoord_list, fcoord, atol=1e-08)

Get the indices of all points in a fractional coord list that are equal to a fractional coord (with a tolerance), taking into account periodic boundary conditions.

Args:
fcoord_list:
List of fractional coords
fcoord:
A specific fractional coord to test.
atol:
Absolute tolerance. Defaults to 1e-8.
Returns:
Indices of matches, e.g., [0, 1, 2, 3]. Empty list if not found.
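
The periodic comparison can be sketched in plain Python: two fractional coordinates match when their difference is within atol of an integer in every component. The name `find_in_coord_list_pbc_sketch` is hypothetical, and this loop is only an assumption about the behaviour, not the package's numpy code.

```python
def find_in_coord_list_pbc_sketch(fcoord_list, fcoord, atol=1e-8):
    """Indices of entries equal to fcoord modulo lattice translations."""
    matches = []
    for i, candidate in enumerate(fcoord_list):
        diffs = [a - b for a, b in zip(candidate, fcoord)]
        # Remove the nearest integer so that 1.1 and 0.1 compare as equal.
        wrapped = [d - round(d) for d in diffs]
        if all(abs(d) <= atol for d in wrapped):
            matches.append(i)
    return matches


fcoords = [[0.1, 0.2, 0.3], [0.9, 0.5, 0.5], [1.1, 0.2, -0.7]]
hits = find_in_coord_list_pbc_sketch(fcoords, [0.1, 0.2, 0.3])
```
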
get_angle(v1, v2, units='degrees')

Calculates the angle between two vectors.

Args:
v1:
Vector 1
v2:
Vector 2
units:
“degrees” or “radians”. Defaults to “degrees”.
Returns:
Angle between them, in the requested units.
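
A minimal sketch of the calculation, using only the standard library. The helper name `angle_between` is hypothetical, and raising ValueError for an unknown unit is an assumption.

```python
import math


def angle_between(v1, v2, units="degrees"):
    """Angle between two vectors from the normalized dot product."""
    dot = sum(a * b for a, b in zip(v1, v2))
    norm1 = math.sqrt(sum(a * a for a in v1))
    norm2 = math.sqrt(sum(b * b for b in v2))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cosine = max(-1.0, min(1.0, dot / (norm1 * norm2)))
    angle = math.acos(cosine)
    if units == "degrees":
        return math.degrees(angle)
    if units == "radians":
        return angle
    raise ValueError("Invalid units: {}".format(units))
```
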
get_linear_interpolated_value(x_values, y_values, x)

Returns an interpolated value by linear interpolation between two values. This method is written to avoid dependency on scipy, which causes issues on threading servers.

Args:
x_values:
Sequence of x values.
y_values:
Corresponding sequence of y values
x:
Get value at particular x
Returns:
Value at x.
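
A scipy-free interpolation of this kind might look like the sketch below. The name `linear_interpolate` is hypothetical, and raising ValueError for an x outside the sampled range is an assumption rather than documented behaviour.

```python
def linear_interpolate(x_values, y_values, x):
    """Linearly interpolate y at x from paired samples."""
    points = sorted(zip(x_values, y_values))
    if not points[0][0] <= x <= points[-1][0]:
        raise ValueError("x is out of range of provided x_values")
    if x == points[0][0]:
        return points[0][1]
    # Walk over neighbouring sample pairs until one brackets x.
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= x <= x1 and x1 != x0:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
```
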
get_points_in_sphere_pbc(lattice, frac_points, center, r)

Find all points within a sphere from the point taking into account periodic boundary conditions. This includes sites in other periodic images.

Algorithm:

1. Place a sphere of radius r in the crystal and determine the minimum supercell (parallelepiped) that would contain it. For this we need the projection of a_1 on a unit vector perpendicular to a_2 and a_3 (i.e., the unit vector in the direction of b_1) to determine how many a_1's it will take to contain the sphere.

Nxmax = r * length_of_b_1 / (2 Pi)

2. Keep points falling within r.

Args:
lattice:
The lattice/basis for the periodic boundary conditions.
frac_points:
All points in the lattice in fractional coordinates.
center:
cartesian coordinates of center of sphere.
r:
Radius of the sphere.
Returns:
[(fcoord, dist) ...] since most of the time, subsequent processing requires the distance.
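
The two steps of the algorithm can be sketched for the special case of an orthorhombic cell, where |b_i| = 2 Pi / |a_i| and the image count per axis reduces to r / |a_i|. The function `points_in_sphere_orthorhombic` and its `abc` argument (the three edge lengths) are hypothetical; a general lattice would need the full reciprocal vectors.

```python
import itertools
import math


def points_in_sphere_orthorhombic(abc, frac_points, center, r):
    """Return [(fcoord, dist), ...] for all periodic images within r of center."""
    # Step 1: number of cells needed along each axis to contain the sphere.
    nmax = [int(math.ceil(r / length)) for length in abc]
    center_frac = [c / length for c, length in zip(center, abc)]
    image_ranges = []
    for cf, n in zip(center_frac, nmax):
        base = int(math.floor(cf))
        # One extra cell on each side covers points wrapped into [0, 1).
        image_ranges.append(range(base - n - 1, base + n + 2))

    found = []
    for fp in frac_points:
        wrapped = [x % 1.0 for x in fp]
        for image in itertools.product(*image_ranges):
            fcoord = [w + shift for w, shift in zip(wrapped, image)]
            dist = math.sqrt(sum(
                ((f - cf) * length) ** 2
                for f, cf, length in zip(fcoord, center_frac, abc)
            ))
            # Step 2: keep points falling within r.
            if dist <= r:
                found.append((fcoord, dist))
    return found


neighbours = points_in_sphere_orthorhombic(
    (1.0, 1.0, 1.0), [[0.0, 0.0, 0.0]], (0.0, 0.0, 0.0), 1.01
)
```

For a simple cubic cell with unit edge, the sphere of radius 1.01 around the origin contains the origin itself and its six nearest periodic images.
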
in_coord_list(coord_list, coord, atol=1e-08)

Tests if a particular coord is within a coord_list.

Args:
coord_list:
List of coords to test
coord:
Specific coordinates
atol:
Absolute tolerance. Defaults to 1e-8. Accepts both scalar and array
Returns:
True if coord is in the coord list.
in_coord_list_pbc(fcoord_list, fcoord, atol=1e-08)

Tests if a particular fractional coord is within a fractional coord_list.

Args:
fcoord_list:
List of fractional coords to test
fcoord:
A specific fractional coord to test.
atol:
Absolute tolerance. Defaults to 1e-8.
Returns:
True if coord is in the coord list.
pbc_all_distances(lattice, fcoords1, fcoords2)

Returns the distances between two lists of coordinates taking into account periodic boundary conditions and the lattice. Note that this computes an MxN array of distances (i.e. the distance between each point in fcoords1 and every coordinate in fcoords2). This is different functionality from pbc_diff.

Args:
lattice:
lattice to use
fcoords1:
First set of fractional coordinates. e.g., [0.5, 0.6, 0.7] or [[1.1, 1.2, 4.3], [0.5, 0.6, 0.7]]. It can be a single coord or any array of coords.
fcoords2:
Second set of fractional coordinates.
Returns:
2d array of cartesian distances. E.g., the distance between fcoords1[i] and fcoords2[j] is distances[i, j].
pbc_diff(fcoords1, fcoords2)

Returns the ‘fractional distance’ between two coordinates taking into account periodic boundary conditions.

Args:
fcoords1:
First set of fractional coordinates. e.g., [0.5, 0.6, 0.7] or [[1.1, 1.2, 4.3], [0.5, 0.6, 0.7]]. It can be a single coord or any array of coords.
fcoords2:
Second set of fractional coordinates.
Returns:
Fractional distance. Each coordinate must have the property that abs(a) <= 0.5. Examples:

pbc_diff([0.1, 0.1, 0.1], [0.3, 0.5, 0.9]) = [-0.2, -0.4, 0.2]

pbc_diff([0.9, 0.1, 1.01], [0.3, 0.5, 0.9]) = [-0.4, -0.4, 0.11]
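
The wrap-around can be sketched for a single pair of coordinates by subtracting the nearest integer from each component of the difference. `pbc_diff_sketch` is a hypothetical pure-Python stand-in, not the array-based function documented here.

```python
def pbc_diff_sketch(fcoords1, fcoords2):
    """Fractional difference wrapped so that each component lies in [-0.5, 0.5]."""
    diffs = [a - b for a, b in zip(fcoords1, fcoords2)]
    # Subtracting the nearest integer picks the closest periodic image.
    return [d - round(d) for d in diffs]


first = pbc_diff_sketch([0.1, 0.1, 0.1], [0.3, 0.5, 0.9])
second = pbc_diff_sketch([0.9, 0.1, 1.01], [0.3, 0.5, 0.9])
```
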
pbc_shortest_vectors(lattice, fcoords1, fcoords2)

Returns the shortest vectors between two lists of coordinates taking into account periodic boundary conditions and the lattice.

Args:
lattice:
lattice to use
fcoords1:
First set of fractional coordinates. e.g., [0.5, 0.6, 0.7] or [[1.1, 1.2, 4.3], [0.5, 0.6, 0.7]]. It can be a single coord or any array of coords.
fcoords2:
Second set of fractional coordinates.
Returns:
array of displacement vectors

## decorators Module

This module contains useful decorators for a variety of functions.

cached_class(klass)

Decorator to cache class instances by constructor arguments. This results in a class that behaves like a singleton for each set of constructor arguments, ensuring efficiency.

Note that this should be used for immutable classes only. Having a cached mutable class makes very little sense. For efficiency, avoid using this decorator in situations where there are many permutations of constructor arguments.

The keywords argument dictionary is converted to a tuple because dicts are mutable; keywords themselves are strings and so are always hashable, but if any arguments (keyword or positional) are non-hashable, that set of arguments is not cached.
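
The caching rule described above can be sketched as follows. `cached_class_sketch` is a hypothetical simplification: it replaces the class with a factory function, so `isinstance` checks against the decorated name no longer work, which the real decorator may handle differently.

```python
import functools


def cached_class_sketch(klass):
    """Cache instances of klass by constructor arguments."""
    cache = {}

    @functools.wraps(klass, updated=())
    def get_instance(*args, **kwargs):
        # The kwargs dict is turned into a tuple so the key can be hashed.
        key = (args, tuple(sorted(kwargs.items())))
        try:
            return cache[key]
        except KeyError:
            instance = cache[key] = klass(*args, **kwargs)
            return instance
        except TypeError:
            # Non-hashable arguments: build a fresh, uncached instance.
            return klass(*args, **kwargs)

    return get_instance


@cached_class_sketch
class Element(object):
    """Hypothetical immutable class used only for this example."""

    def __init__(self, symbol, tags=None):
        self.symbol = symbol
        self.tags = tags
```
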

deprecated(replacement=None)

Decorator to mark classes or functions as deprecated, with a possible replacement.

Args:
replacement:
A replacement class or method.
Returns:
Original function, but with a warning to use the updated class.
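
A decorator of this shape can be sketched with the standard `warnings` module. `deprecated_sketch`, `old_sum` and `new_sum` are hypothetical names, and the warning category and message wording are assumptions.

```python
import functools
import warnings


def deprecated_sketch(replacement=None):
    """Wrap a callable so that each call emits a DeprecationWarning."""
    def decorator(old):
        @functools.wraps(old)
        def wrapped(*args, **kwargs):
            msg = "{} is deprecated".format(old.__name__)
            if replacement is not None:
                msg += "; use {} instead".format(replacement.__name__)
            # stacklevel=2 points the warning at the caller, not the wrapper.
            warnings.warn(msg, DeprecationWarning, stacklevel=2)
            return old(*args, **kwargs)
        return wrapped
    return decorator


def new_sum(values):
    return sum(values)


@deprecated_sketch(replacement=new_sum)
def old_sum(values):
    return sum(values)
```
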
logged(level=10)

Useful logging decorator. If a method is logged, the beginning and end of the method call will be logged at a pre-specified level.

Args:
level:
Level to log method at. Defaults to DEBUG.
class requires(condition, message)

Bases: object

Decorator to mark classes or functions as requiring a specified condition to be true. This can be used to present useful error messages for optional dependencies. For example, decorating the following code will check if scipy is present and if not, a runtime error will be raised if someone attempts to call the use_scipy function:

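```python
# Import path assumed from this page's package/module layout.
from pymatgen.util.decorators import requires

try:
    import scipy
except ImportError:
    scipy = None


@requires(scipy is not None, "scipy is not present.")
def use_scipy():
    # Only reached when the scipy import above succeeded.
    print(scipy.__version__)
```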
Args:
condition:
Condition necessary to use the class or function.
message:
A message to be displayed if the condition is not True.
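
How such a class-based decorator might work internally is sketched below. `requires_sketch` is hypothetical; the choice of RuntimeError follows the description above, while checking the condition at call time rather than at decoration time is an assumption.

```python
import functools


class requires_sketch(object):
    """Refuse to call the decorated function unless condition is true."""

    def __init__(self, condition, message):
        self.condition = condition
        self.message = message

    def __call__(self, func):
        @functools.wraps(func)
        def wrapped(*args, **kwargs):
            if not self.condition:
                raise RuntimeError(self.message)
            return func(*args, **kwargs)
        return wrapped


@requires_sketch(False, "optional dependency is not present.")
def unavailable():
    return "never reached"


@requires_sketch(True, "unused message")
def available():
    return "ok"
```
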
singleton(cls)

This decorator can be used to create a singleton out of a class.

## io_utils Module

This module provides utility classes for io operations.

class FileLock(file_name, timeout=10, delay=0.05)

Bases: object

A file locking mechanism that has context-manager support so you can use it in a with statement. This should be relatively cross-platform as it doesn’t rely on msvcrt or fcntl for the locking. Taken from http://www.evanfosmark.com/2009/01/cross-platform-file-locking-support-in-python/

Prepare the file locker. Specify the file to lock and optionally the maximum timeout and the delay between each attempt to lock.

Args:
file_name:
Name of file to lock.
timeout:
Maximum timeout for locking. Defaults to 10.
delay:
Delay between each attempt to lock. Defaults to 0.05.
acquire()

Acquire the lock, if possible. If the lock is in use, it checks again every delay seconds. It does this until it either gets the lock or exceeds timeout seconds, in which case it throws an exception.

release()

Get rid of the lock by deleting the lockfile. When working in a with statement, this gets automatically called at the end.

exception FileLockException

Bases: exceptions.Exception
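
The locking scheme described for FileLock (a lockfile that is created exclusively and deleted on release) can be sketched as below. `FileLockSketch`, `FileLockSketchError` and the `.lock` suffix are hypothetical; this is an assumption about the approach, not the class's actual source.

```python
import errno
import os
import time


class FileLockSketchError(Exception):
    pass


class FileLockSketch(object):
    def __init__(self, file_name, timeout=10, delay=0.05):
        self.lockfile = file_name + ".lock"
        self.timeout = timeout
        self.delay = delay
        self.fd = None

    def acquire(self):
        start = time.time()
        while True:
            try:
                # O_EXCL makes creation fail if the lockfile already exists.
                flags = os.O_CREAT | os.O_EXCL | os.O_RDWR
                self.fd = os.open(self.lockfile, flags)
                return
            except OSError as exc:
                if exc.errno != errno.EEXIST:
                    raise
                if time.time() - start >= self.timeout:
                    raise FileLockSketchError("Timeout occurred.")
                time.sleep(self.delay)

    def release(self):
        # Deleting the lockfile is what frees the lock for other processes.
        if self.fd is not None:
            os.close(self.fd)
            os.remove(self.lockfile)
            self.fd = None

    def __enter__(self):
        self.acquire()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.release()
```

Used in a with statement, the lockfile exists for the duration of the block and is removed on exit.
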

clean_json(input_json, strict=False)

This method cleans an input JSON-like object (a list or a dict, nested or otherwise) by converting all non-string dictionary keys (such as int and float) to strings.

Args:
input_json:
Input list or dict.
strict:
This parameter sets the behavior when clean_json encounters an object it does not understand. If strict is True, clean_json will try to get the to_dict attribute of the object. If no such attribute is found, an attribute error will be thrown. If strict is False, clean_json will simply call str(object) to convert the object to a string representation.
Returns:
Sanitized dict that can be json serialized.
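
The recursion and the strict switch can be sketched as follows. `clean_json_sketch` is hypothetical; which primitive types pass through unchanged, and whether to_dict is a property or a method, are assumptions.

```python
import json


def clean_json_sketch(obj, strict=False):
    """Recursively convert obj into something json.dumps accepts."""
    if isinstance(obj, (list, tuple)):
        return [clean_json_sketch(item, strict=strict) for item in obj]
    if isinstance(obj, dict):
        # Non-string keys such as ints and floats become strings.
        return {str(k): clean_json_sketch(v, strict=strict)
                for k, v in obj.items()}
    if obj is None or isinstance(obj, (bool, int, float, str)):
        return obj
    if strict:
        # Raises AttributeError when the object has no to_dict attribute.
        converted = obj.to_dict
        return converted() if callable(converted) else converted
    return str(obj)


cleaned = clean_json_sketch({1: [2.5, {3: complex(1, 2)}]})
serialized = json.dumps(cleaned)
```
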
clean_lines(string_list, remove_empty_lines=True)

Strips whitespace, carriage returns and empty lines from a list of strings.

Args:
string_list:
List of strings
remove_empty_lines:
Set to True to skip lines which are empty after stripping.
Returns:
List of clean strings with no whitespaces.
micro_pyawk(filename, search, results=None, debug=None, postdebug=None)

Small awk-mimicking search routine.

‘filename’ is the file to search through. ‘search’ is the “search program”, a list of lists/tuples with 3 elements, i.e. [[regex,test,run],[regex,test,run],...]. ‘results’ is an object that your search program will have access to for storing results.

Here regex is either a Regex object or a string that we compile into a Regex. test and run are callable objects.

This function goes through each line in filename, and if regex matches that line and test(results,line)==True (or test == None) we execute run(results,match), where match is the match object from running Regex.match.

The default results is an empty dictionary. Passing a results object lets you interact with it in run() and test(). Hence, it is often convenient to use results=self.

Author: Rickard Armiento

Returns:
results
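
The control flow of the search program can be sketched as below. `micro_pyawk_sketch` is hypothetical and omits the debug and postdebug hooks; using `search` instead of `match` so that patterns may hit anywhere in the line is an assumption.

```python
import re


def micro_pyawk_sketch(filename, search, results=None):
    """Run [regex, test, run] triples against every line of a file."""
    if results is None:
        results = {}
    # Compile string patterns once up front; Regex objects pass through.
    compiled = [
        (re.compile(rx) if isinstance(rx, str) else rx, test, run)
        for rx, test, run in search
    ]
    with open(filename) as handle:
        for line in handle:
            for regex, test, run in compiled:
                match = regex.search(line)
                if match and (test is None or test(results, line)):
                    run(results, match)
    return results


def store_energy(results, match):
    # Hypothetical run callable: collect every captured number in a list.
    results.setdefault("energies", []).append(float(match.group(1)))


energy_program = [[r"energy\s*=\s*(-?\d+\.\d+)", None, store_energy]]
```
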

Generator method to read a file line-by-line, but backwards. This allows one to efficiently get data at the end of a file.

Based on code by Peter Astrand <astrand@cendio.se>, using modifications by Raymond Hettinger and Kevin German. http://code.activestate.com/recipes/439045-read-a-text-file-backwards-yet-another-implementat/

Reads file forwards and reverses in memory for files smaller than the max_mem parameter, or for gzip files where reverse seeks are not supported.

Files larger than max_mem are dynamically read backwards.

Args:
m_file:
blk_size:
The buffer size. Defaults to 4096.
max_mem:
The maximum amount of memory to involve in this operation. This is used to determine when to reverse a file in-memory versus seeking portions of a file. For bz2 files, this sets the maximum block size.
Returns:
Generator that returns lines from the file. Similar behavior to the file.readline() method, except the lines are returned from the back of the file.
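
The backwards-seeking branch can be sketched for plain (uncompressed) files as below. `read_lines_backwards` is a hypothetical name, the sketch ignores max_mem and compressed files, and unlike file.readline() it yields lines without their trailing newline.

```python
import os


def read_lines_backwards(path, blk_size=4096):
    """Yield the lines of a plain text file from last to first."""
    with open(path, "rb") as handle:
        handle.seek(0, os.SEEK_END)
        pos = handle.tell()
        tail = b""
        skip_trailing_empty = True
        while pos > 0:
            step = min(blk_size, pos)
            pos -= step
            handle.seek(pos)
            # Prepend the new block to the unfinished piece from last time.
            pieces = (handle.read(step) + tail).split(b"\n")
            # The first piece may continue into the previous block.
            tail = pieces[0]
            for piece in reversed(pieces[1:]):
                if skip_trailing_empty:
                    skip_trailing_empty = False
                    if piece == b"":
                        # Empty piece after the file's final newline.
                        continue
                yield piece.decode()
        if tail or not skip_trailing_empty:
            yield tail.decode()
```
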
which(program)

Returns the full path to an executable.

zopen(filename, *args, **kwargs)

This wrapper wraps around bz2, gzip and Python’s standard open function to deal intelligently with bzipped, gzipped or standard text files.

Args:
filename:
filename
args:
Standard args for python open(..). E.g., ‘r’ for read, ‘w’ for write.
kwargs:
Standard kwargs for python open(..).
Returns:
File handler
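
A dispatch on the file extension might look like the sketch below. `zopen_sketch` is hypothetical, and the exact set of extensions recognised by the real function is an assumption; `bz2.open` and `gzip.open` are the Python 3 standard-library calls.

```python
import bz2
import gzip


def zopen_sketch(filename, *args, **kwargs):
    """Open a bzipped, gzipped or plain file with the matching opener."""
    lower = filename.lower()
    if lower.endswith(".bz2"):
        return bz2.open(filename, *args, **kwargs)
    if lower.endswith(".gz"):
        return gzip.open(filename, *args, **kwargs)
    # Anything else is treated as an ordinary file.
    return open(filename, *args, **kwargs)
```
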
zpath(filename)

Returns an existing (zipped or unzipped) file path given the unzipped version. If no path exists, returns the filename unmodified.

Args:
filename:
filename without zip extension
Returns:
filename with a zip extension (unless an unzipped version exists)

## num_utils Module

This module provides utilities for basic math operations.

chunks(items, n)

Yield successive n-sized chunks from a list-like object.

```python
>>> import pprint
>>> pprint.pprint(list(chunks(range(1, 25), 10)))
[[1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
 [11, 12, 13, 14, 15, 16, 17, 18, 19, 20],
 [21, 22, 23, 24]]
```
iterator_from_slice(s)

Constructs an iterator given a slice object s.

Note

The function returns an infinite iterator if s.stop is None
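
The finite and infinite cases can be sketched with `range` and `itertools.count`. `iterator_from_slice_sketch` is hypothetical, and the defaults of 0 for a missing start and 1 for a missing step are assumptions.

```python
import itertools


def iterator_from_slice_sketch(s):
    """Build an iterator over the integers described by slice s."""
    start = s.start if s.start is not None else 0
    step = s.step if s.step is not None else 1
    if s.stop is None:
        # No upper bound: count forever, so callers must stop iterating.
        return itertools.count(start=start, step=step)
    return iter(range(start, s.stop, step))


finite = list(iterator_from_slice_sketch(slice(1, 10, 3)))
first_three = list(itertools.islice(iterator_from_slice_sketch(slice(5, None)), 3))
```
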

min_max_indexes(seq)

Uses enumerate, max, and min to return the indices of the values in a list with the maximum and minimum value.
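
A sketch of the enumerate-based lookup follows. `min_max_indexes_sketch` is hypothetical, and returning the pair as (minimum index, maximum index) is an assumption about the ordering.

```python
def min_max_indexes_sketch(seq):
    """Return (index of the minimum, index of the maximum) of seq."""
    # enumerate pairs each value with its index; key compares the values.
    min_index = min(enumerate(seq), key=lambda pair: pair[1])[0]
    max_index = max(enumerate(seq), key=lambda pair: pair[1])[0]
    return min_index, max_index
```
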

sort_dict(d, key=None, reverse=False)

Sorts a dict by value.

Args:
d:
Input dictionary
key:
Function which takes a tuple (key, object) and returns a value to compare and sort by. By default, the function compares the values of the dict, i.e. key = lambda t : t[1]
reverse:
If True, reverses the sort order.
Returns:
OrderedDict object whose keys are ordered according to their value.
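
The default behaviour can be sketched with `sorted` and `collections.OrderedDict`. `sort_dict_sketch` is a hypothetical stand-in that mirrors the documented signature.

```python
from collections import OrderedDict


def sort_dict_sketch(d, key=None, reverse=False):
    """Return an OrderedDict of d's items sorted by value (by default)."""
    if key is None:
        # Each item is a (key, value) tuple; sort on the value.
        key = lambda item: item[1]
    return OrderedDict(sorted(d.items(), key=key, reverse=reverse))


amounts = {"a": 3, "b": 1, "c": 2}
ascending = list(sort_dict_sketch(amounts).keys())
descending = list(sort_dict_sketch(amounts, reverse=True).keys())
```
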

## plotting_utils Module

Utilities for generating nicer plots.

get_publication_quality_plot(width=8, height=None, plt=None)

Provides a publication quality plot, with nice defaults for font sizes etc.

Args:
width:
Width of plot in inches. Defaults to 8in.
height:
Height of plot in inches. Defaults to width * golden ratio.
plt:
If plt is supplied, changes will be made to an existing plot. Otherwise, a new plot will be created.
Returns:
Matplotlib plot object with properly sized fonts.

## string_utils Module

This module provides utility classes for string operations.

class StringColorizer(stream)

Bases: object

colours = {'default': '', 'blue': '\x1b[01;34m', 'green': '\x1b[01;32m', 'red': '\x1b[01;31m', 'cyan': '\x1b[01;36m'}
formula_double_format(afloat, ignore_ones=True, tol=1e-08)

This function is used to make pretty formulas by formatting the amounts. Instead of Li1.0 Fe1.0 P1.0 O4.0, you get LiFePO4.

Args:
afloat:
a float
ignore_ones:
if true, floats of 1 are ignored.
tol:
Tolerance to round to nearest int. i.e. 2.0000000001 -> 2
Returns:
A string representation of the float for formulas.
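
The formatting rule can be sketched as below. `formula_double_format_sketch` is hypothetical, and rounding non-integer amounts to 8 decimal places is an assumption.

```python
def formula_double_format_sketch(afloat, ignore_ones=True, tol=1e-8):
    """Format an amount for a chemical formula string."""
    if ignore_ones and afloat == 1:
        return ""
    nearest = round(afloat)
    if abs(afloat - nearest) < tol:
        # Close enough to an integer: drop the decimal part entirely.
        return str(int(nearest))
    return str(round(afloat, 8))


composition = [("Li", 1.0), ("Fe", 1.0), ("P", 1.0), ("O", 4.0)]
formula = "".join(
    symbol + formula_double_format_sketch(amount)
    for symbol, amount in composition
)
```
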

Generates a LaTeX table string from a sequence of sequences.

Args:
Returns:
String representation of Latex table with data.
latexify(formula)

Generates a latex formatted formula. E.g., Fe2O3 is transformed to Fe\$_{2}\$O\$_{3}\$.

Args:
formula:
Input formula.
Returns:
Formula suitable for display as in LaTeX with proper subscripts.
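
One way to produce such subscripts is a single regular-expression substitution, sketched below. `latexify_sketch` and its pattern are assumptions for illustration, not necessarily the expression the package uses.

```python
import re


def latexify_sketch(formula):
    """Wrap every amount that follows a letter or bracket in a subscript."""
    # Group 1 is the preceding character, group 2 the numeric amount.
    return re.sub(r"([A-Za-z\(\)])([\d\.]+)", r"\1$_{\2}$", formula)
```
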
latexify_spacegroup(spacegroup_symbol)

Generates a latex formatted spacegroup. E.g., P2_1/c is converted to P2\$_{1}\$/c and P-1 is converted to P\$\overline{1}\$.

Args:
spacegroup_symbol:
A spacegroup symbol
Returns:
A latex formatted spacegroup with proper subscripts and overlines.
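
The two conversions (screw-axis subscripts and overlines for rotoinversions) can be sketched with two substitutions. `latexify_spacegroup_sketch` and its patterns are hypothetical.

```python
import re


def latexify_spacegroup_sketch(spacegroup_symbol):
    """Convert _n to a subscript and -n to an overlined digit."""
    with_subscripts = re.sub(r"_(\d+)", r"$_{\1}$", spacegroup_symbol)
    # The doubled backslash emits a literal backslash before "overline".
    return re.sub(r"-(\d)", r"$\\overline{\1}$", with_subscripts)
```
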
list_strings(arg)

Always return a list of strings, given a string or list of strings as input.

```python
>>> list_strings('A single string')
['A single string']
```
```python
>>> list_strings(['A single string in a list'])
['A single string in a list']
```
```python
>>> list_strings(['A','list','of','strings'])
['A', 'list', 'of', 'strings']
```
pprint_table(table, out=sys.stdout, rstrip=False)

Prints out a table of data, padded for alignment. Each row must have the same number of columns.

Args:
table:
The table to print. A list of lists.
out:
Output stream (file-like object)
rstrip:
If True, trailing whitespace is removed from the entries.

Given a tuple, generate a nicely aligned string form.

    >>> results = [["a","b","cz"],["d","ez","f"],[1,2,3]]
    >>> print(str_aligned(results))
    a b cz
    d ez f
    1 2 3

Args:
Returns:
Aligned string output in a table-like format.

Given a tuple of tuples, generate a delimited string form.

    >>> results = [["a","b","c"],["d","e","f"],[1,2,3]]
    >>> print(str_delimited(results, delimiter=","))
    a,b,c
    d,e,f
    1,2,3

Args:
Returns:
Aligned string output in a table-like format.
stream_has_colours(stream)

True if stream supports colours. Python cookbook, #475186

## testing Module

Common test support for pymatgen test scripts.

This single module should provide all the common functionality for pymatgen tests in a single location, so that test scripts can just import it and work right away.

class PymatgenTest(methodName='runTest')

Bases: unittest.case.TestCase

Extends unittest.TestCase with functions (taken from numpy.testing.utils) that support the comparison of arrays.

Create an instance of the class that will use the named test method when executed. Raises a ValueError if the instance does not have a method with the specified name.

static assertArrayAlmostEqual(actual, desired, decimal=7, err_msg='', verbose=True)

Tests if two arrays are almost equal to a tolerance. The CamelCase naming is so that it is consistent with standard unittest methods.

static assertArrayEqual(actual, desired, err_msg='', verbose=True)

Tests if two arrays are equal. The CamelCase naming is so that it is consistent with standard unittest methods.

static assert_almost_equal(actual, desired, decimal=7, err_msg='', verbose=True)

Alternative naming for assertArrayAlmostEqual.

static assert_equal(actual, desired, err_msg='', verbose=True)

Alternative naming for assertArrayEqual.