Developer reference


The functionality described below may be useful, but it is not part of the public API and may change over time.

loader module

The module contains a layer of functionality that allows abstract saving and loading of files.

A loader class inherits from Loader. The singleton class LoaderSet is the main public interface of this module and is available as the global variable LOADERS. It keeps track of all registered loaders and takes care of them (presents them with options, requests etc.). Loaders are registered as classes using the decorator loader().

The concept is that there is one loader instance per file loaded. When we want to save a file, we use a loading loader to provide data to save and then we instantiate a saving loader (if needed) and save the data.

Individual loaders absolutely have to implement methods Loader._save() and Loader._load2reg().

This module facilitates integration of its functionality by defining update_parser() and settle_loaders(). The first one adds capabilities to a parser (or parser group); the second one updates LOADERS according to the parsed arguments.
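The registration and autodetection concept can be sketched without the library. The following is a minimal illustration only: register_loader, SketchLoader, guess_can_load and choose_loader are hypothetical names, and the rule that a higher priority wins during autodetection is an assumption, not something this reference states.

```python
import os

# Hypothetical registry, standing in for what LOADERS keeps track of.
_REGISTRY = []


def register_loader(lname, priority):
    """Class decorator: attach a nickname and a priority, then register."""
    def decorate(cls):
        cls.lname = lname
        cls.priority = priority
        _REGISTRY.append(cls)
        return cls
    return decorate


class SketchLoader:
    """Stand-in base class; real loaders would inherit from Loader."""
    extensions = ()

    def guess_can_load(self, fname):
        # Autodetection looks at the file extension only.
        return os.path.splitext(fname)[1].lower() in self.extensions

    def _load2reg(self, fname):
        raise NotImplementedError("derived classes load data here")

    def _save(self, fname, data):
        raise NotImplementedError("derived classes save data here")


@register_loader("text", priority=10)
class TextLoader(SketchLoader):
    extensions = (".txt", ".dat")

    def _load2reg(self, fname):
        # Parse whitespace-separated numbers into a 2D list of floats.
        with open(fname) as stream:
            return [[float(tok) for tok in line.split()]
                    for line in stream if line.strip()]

    def _save(self, fname, data):
        with open(fname, "w") as stream:
            for row in data:
                stream.write(" ".join(str(val) for val in row) + "\n")


@register_loader("fallback", priority=1)
class FallbackLoader(SketchLoader):
    def guess_can_load(self, fname):
        return True  # Accepts anything, hence the low priority.


def choose_loader(fname):
    """Return an instance of the highest-priority loader accepting fname."""
    # Assumption: a higher priority value is tried first.
    for cls in sorted(_REGISTRY, key=lambda c: c.priority, reverse=True):
        instance = cls()
        if instance.guess_can_load(fname):
            return instance
    raise ValueError("no loader accepts %r" % fname)
```

In this sketch, choose_loader("image.txt") yields a TextLoader instance, while any other extension falls through to FallbackLoader.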

Rough edges (but not rough enough to be worth the trouble):

  • You can’t force different loaders for image, template and output. If you need this, you have to rely on autodetection based on file extension.
  • Similarly, there is a problem with loader options — they are shared among all loaders. This is both a bug and a feature though.
  • To show the loaders help, you have to satisfy the parser by specifying template and image file strings (they don’t have to be real filenames, though).
class imreg_dft.loader.Loader

To be implemented by derived class. Save data to fname, possibly taking into account previous loads and/or options passed upon the class creation.


To be implemented by derived class. Load data from fname in a way that they can be used in the registration process (so it is a 2D array). Possibly take into account options passed upon the class creation.


Guess whether we can load a filename just according to the name (extension)


Given a filename, it loads it and returns in a form suitable for registration (i.e. float, flattened, ...).

save(fname, what, loader)

Given the registration result, save the transformed input.


Makes a new instance of the object’s class BUT it conserves vital data.

imreg_dft.loader.loader(lname, priority)

A decorator interconnecting an abstract loader with the rest of imreg_dft. It sets the “nickname” of the loader and its priority during autodetection.

imreg_dft.loader.settle_loaders(args, fnames=None)

The function to be called as soon as args are parsed. It:

  1. If requested by passed args, it prints loaders help and then exits the app.

  2. If filenames are supplied, it returns list of respective loaders.

  • args (namespace) – The output of argparse.parse_args()
  • fnames (list, optional) – List of filenames to load

list - list of loaders to load respective fnames.

utils module

This module contains various support functions closely related to image registration. They are used mainly by the ird tool.

FFT based image registration. — utility functions


Transform angle in degrees to complex phasor
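A minimal sketch of why a phasor representation is handy for angles: averaging 350° and 10° arithmetically gives 180°, whereas averaging their phasors gives 0°. The function names below are hypothetical and only illustrate the idea; they are not the module's implementation.

```python
import cmath
import math


def ang2complex_sketch(angle_deg):
    """Unit complex number (phasor) pointing in the direction of the angle."""
    return cmath.exp(1j * math.radians(angle_deg))


def complex2ang_sketch(phasor):
    """Inverse operation: angle of a phasor in degrees, within (-180, 180]."""
    return math.degrees(cmath.phase(phasor))


def mean_angle(angles_deg):
    """Average angles by summing their phasors, which avoids wrap-around."""
    return complex2ang_sketch(sum(ang2complex_sketch(a) for a in angles_deg))
```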

imreg_dft.utils._apodize(what, aporad=None, ratio=None)

Given an image, it apodizes it (so it becomes quasi-seamless). When ratio is None, colors near the edges converge to a single color, whereas when ratio is a float number, a blurred original image serves as the background.

  • what – The original image
  • aporad (int) – Radius [px], width of the band near the edges that will get modified
  • ratio (float or None) – When None, the apodization background will be a flat color. When a float number, the background will be the image itself convolved with Gaussian kernel of sigma (aporad / ratio).

The apodized image

imreg_dft.utils._argmax2D(array, reports=None)

Simple 2D argmax function with simple sharpness indication

imreg_dft.utils._argmax_ext(array, exponent)

Calculate coordinates of the COM (center of mass) of the provided array.

  • array (ndarray) – The array to be examined.
  • exponent (float or 'inf') – The exponent we power the array with. If the value ‘inf’ is given, the coordinate of the array maximum is taken.

The COM coordinate tuple, float values are allowed!
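The effect of the exponent can be illustrated with a plain-Python sketch: the larger the exponent, the more the center of mass is pulled towards the maximum, and 'inf' degenerates to the position of the maximum itself. The helper center_of_mass is hypothetical, works on nested lists rather than ndarrays, and assumes non-negative values with a nonzero sum.

```python
def center_of_mass(array, exponent):
    """COM (row, column) of a 2D list of non-negative numbers
    after raising every element to `exponent`."""
    cells = [(r, c, val)
             for r, row in enumerate(array)
             for c, val in enumerate(row)]
    if exponent == "inf":
        # Limit case: only the (first) maximum carries weight.
        r, c, _ = max(cells, key=lambda cell: cell[2])
        return (float(r), float(c))
    weights = [(r, c, val ** exponent) for r, c, val in cells]
    total = sum(w for _, _, w in weights)
    row_com = sum(r * w for r, _, w in weights) / total
    col_com = sum(c * w for _, c, w in weights) / total
    return (row_com, col_com)


peak = [[0, 0, 0],
        [0, 1, 3],
        [0, 0, 0]]
print(center_of_mass(peak, 1))      # (1.0, 1.75)
print(center_of_mass(peak, 2))      # (1.0, 1.9)
print(center_of_mass(peak, "inf"))  # (1.0, 2.0)
```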

Return type:


imreg_dft.utils._calc_tform(shape, orig, scale, angle, tvec, newshape=None)

probably not used

imreg_dft.utils._calc_tform_complete(shape, scale, angle, tvec, newshape=None)
imreg_dft.utils._compensate_fftshift(vec, shape)

Inversion of _ang2complex()

imreg_dft.utils._extend_array(arr, point, radius)
imreg_dft.utils._getCut(big, small, offset)

Given a big array length and small array length and an offset, output a list of starts of small arrays, so that they cover the big one and their offset is <= the required offset.

  • big (int) – The source length array
  • small (float) – The small length

list - list of possible start locations
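One possible way to compute such starts is sketched below. It is an assumption about the approach, not the module's code: cut_starts is a hypothetical name, the offset is assumed to be a positive integer, and the starts are spread evenly between 0 and big - small.

```python
import math


def cut_starts(big, small, offset):
    """Starts of windows of length `small` that cover range(big) while
    consecutive starts are at most `offset` apart."""
    if small >= big:
        return [0]  # A single window already covers everything.
    span = big - small  # The last window has to start exactly here.
    count = math.ceil(span / offset) + 1
    step = span / (count - 1)
    return [round(i * step) for i in range(count)]


print(cut_starts(10, 4, 3))  # [0, 3, 6]
```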


In the log-polar spectrum, the (first) coord corresponds to an angle. This function returns a mapping of (the two) coordinates to the respective angle.

imreg_dft.utils._get_constraint_mask(shape, log_base, constraints=None)

Prepare a mask that applies constraints to a cross-power spectrum.

imreg_dft.utils._get_dst1(pt, pts)

Given a point in 2D and a vector of points, return a vector of distances according to the Manhattan metric.

imreg_dft.utils._get_emslices(shape1, shape2)

Common code used by embed_to() and undo_embed()

imreg_dft.utils._get_lograd(shape, log_base)

In the log-polar spectrum, the (second) coord corresponds to a scale. This function returns a mapping of (the two) coordinates to the respective scale.

2D np.ndarray of shape shape, -1 coord contains scales from 0 to log_base ** (shape[1] - 1)
imreg_dft.utils._get_subarr(array, center, rad)
  • array (ndarray) – The array to search
  • center (2-tuple) – The point in the array to search around
  • rad (int) – Search radius, no radius (i.e. get the single point) implies rad == 0
imreg_dft.utils._get_success(array, coord, radius=2)

Given a coord, examine the array around it and return a number signifying how good the “match” is.

  • radius – Get the success as a sum over the neighborhood of coord with this radius
  • coord – Coordinates of the maximum. Float numbers are allowed (and converted to int inside)

Success as float between 0 and 1 (can get slightly higher than 1). The meaning of the number is loose, but the higher the better.

imreg_dft.utils._highpass(dft, lo, hi)
imreg_dft.utils._interpolate(array, rough, rad=2)

Returns index that is in the array after being rounded.

The result index tuple is in each of its components between zero and the array’s shape.

imreg_dft.utils._lowpass(dft, lo, hi)
imreg_dft.utils._xpass(shape, lo, hi)

Compute a pass-filter mask with values ranging from 0 to 1.0. The mask is low-pass; its application has to be handled by the calling function.

imreg_dft.utils.argmax_angscale(array, log_base, exponent, constraints=None, reports=None)

Given a power spectrum, we choose the best fit.

The power spectrum is treated with constraint masks and then passed to _argmax_ext().

imreg_dft.utils.argmax_translation(array, filter_pcorr, constraints=None, reports=None)
imreg_dft.utils.decompose(what, outshp, coef)

Given an array and a shape, it creates a decomposition of the array in form of subarrays and their respective position

  • what (np.ndarray) – The array to be decomposed
  • outshp (tuple-like) – The shape of decompositions

Decomposition — a list of tuples (subarray (np.ndarray), coordinate (np.ndarray))

Return type:


imreg_dft.utils.embed_to(where, what)

Given source and destination arrays, put the source into the destination so that it is centered, performing all necessary operations (cropping or aligning).

  • where – The destination array (also modified inplace)
  • what – The source array

The destination array

imreg_dft.utils.extend_by(what, dst)

Given a source array, extend it by given number of pixels and try to make the extension smooth (not altering the original array).

imreg_dft.utils.extend_to(what, newdim)

Given an image, it puts it in a (typically larger) array. To prevent rough edges from appearing, the containing array has a color that is close to the image’s border color, and image edges smoothly blend into the background.

  • what (ndarray) – What to extend
  • newdim (tuple) – The resulting dimension
imreg_dft.utils.extend_to_3D(what, newdim_2D)

Extend 2D and 3D arrays (when being supplied with their x–y shape).

imreg_dft.utils.frame_img(img, mask, dst, apofield=None)

Given an array, a mask (floats between 0 and 1), and a distance, alter the area where the mask is low (and roughly within dst from the edge) so it blends well with the area where the mask is high. The purpose of this is removal of spurious frequencies in the image’s Fourier spectrum.

  • img (np.array) – What we want to alter
  • mask (np.array) – The indicator what can be altered (0) and what can not (1)
  • dst (int) – Parameter controlling behavior near edges, value could be probably deduced from the mask.
imreg_dft.utils.getCuts(shp0, shp1, coef=0.5)

Given an array shape, tile shape and density coefficient, return list of possible points of the array decomposition.

  • shp0 (np.ndarray) – Shape of the big array
  • shp1 (np.ndarray) – Shape of the tile
  • coef (float) – Density coefficient — lower means higher density and 1.0 means no overlap, 0.5 50% overlap, 0.1 90% overlap etc.

List of tuples (y, x) coordinates of possible tile corners.

Return type:


imreg_dft.utils.getSlices(inshp, outshp, coef)
imreg_dft.utils.get_apofield(shape, aporad)

Returns an array between 0 and 1 that goes to zero close to the edges.

imreg_dft.utils.get_best_cluster(points, scores, rad=0)

Given some additional data, choose the best cluster and the index of the best point in the best cluster. Score of a cluster is sum of scores of points in it.

Note that the point with the best score may not be in the best cluster, and a point may be a member of multiple clusters.

  • points – Array of bools, indices that belong to the cluster are True
  • scores – Rates a point by a number — higher is better.
imreg_dft.utils.get_borderval(img, radius=None)

Given an image and a radius, examine the average value of the image at most radius pixels from the edge

imreg_dft.utils.get_clusters(points, rad=0)

Given set of points and radius upper bound, return a binary matrix telling whether a given point is close to other points according to _get_dst1(). (point = matrix row).

  • points (np.ndarray) – Shifts.
  • rad (float) – What is closer than rad is considered close.

The result matrix has always True on diagonals.
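The clustering idea can be sketched in plain Python: build the closeness matrix row by row using the Manhattan distance, then score every row (cluster) by the sum of its members' scores. All names below are hypothetical, and treating a distance equal to rad as "close" is an assumption chosen so that the diagonal is True even for rad == 0.

```python
def manhattan(pt, pts):
    """Manhattan distances between one 2D point and a list of 2D points."""
    return [abs(pt[0] - q[0]) + abs(pt[1] - q[1]) for q in pts]


def closeness_matrix(points, rad=0):
    """Row i marks which points lie within `rad` of point i."""
    return [[dist <= rad for dist in manhattan(p, points)] for p in points]


def best_cluster(clusters, scores):
    """Pick the row with the highest summed score and its best member."""
    totals = [sum(s for s, member in zip(scores, row) if member)
              for row in clusters]
    best_row = max(range(len(clusters)), key=totals.__getitem__)
    members = [i for i, member in enumerate(clusters[best_row]) if member]
    best_point = max(members, key=scores.__getitem__)
    return clusters[best_row], best_point


shifts = [(0, 0), (1, 0), (10, 10)]
scores = [1.0, 1.5, 2.0]
cluster, index = best_cluster(closeness_matrix(shifts, rad=1), scores)
print(cluster, index)  # [True, True, False] 1
```

Here the isolated point has the best individual score (2.0), yet the cluster formed by the first two points wins with a total of 2.5.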

imreg_dft.utils.get_values(cluster, shifts, scores, angles, scales)

Given a cluster and some vectors, return average values of the data in the cluster. Treat the angular data carefully.

imreg_dft.utils.imfilter(img, low=None, high=None, cap=None)

Given an image, it applies high-pass and/or low-pass filters to its Fourier spectrum.

  • img (ndarray) – The image to be filtered
  • low (tuple) – The low-pass filter parameters, 0..1
  • high (tuple) – The high-pass filter parameters, 0..1
  • cap (tuple) – The quantile cap parameters, 0..1. A filtered image will have extremes below the lower quantile and above the upper one cut.

The real component of the image after filtering

Return type:


imreg_dft.utils.mkCut(shp0, dims, start)

Make a cut from shp0 and keep the given dimensions. Also obey the start, but if it is not possible, shift it backwards

Returns:list - List of slices defining the subarray.

Rotate the input array over 180°


Convenience function. Given a tuple of slices, it returns an array of their starts.


Given starts of tiles, deduce the shape of the decomposition from them.

Parameters:starts (list of ints) –
Returns:shape of the decomposition
Return type:tuple
imreg_dft.utils.undo_embed(what, orig_shape)

Undo an embed operation

  • what – What has once been the destination array
  • orig_shape – The shape of the once original array

The closest we got to the undo

imreg_dft.utils.unextend_by(what, dst)

Try to undo as much as possible of what extend_by() does. Some things can’t be undone, though.

imreg_dft.utils.wrap_angle(angles, ceil=6.283185307179586)
  • angles (float or ndarray, unit depends on kwarg ceil) –
  • ceil (float) – Turnaround value
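A common way to wrap angles around a turnaround value is sketched below. That the result lies in the symmetric interval [-ceil / 2, ceil / 2) is an assumption; this reference only says that ceil is the turnaround value. The function name is hypothetical and the sketch handles scalars only.

```python
import math


def wrap_angle_sketch(angle, ceil=2 * math.pi):
    """Map an angle into [-ceil / 2, ceil / 2) by shifting whole turns."""
    half = ceil / 2.0
    return (angle + half) % ceil - half


print(wrap_angle_sketch(190.0, ceil=360.0))   # -170.0
print(wrap_angle_sketch(-190.0, ceil=360.0))  # 170.0
```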

tiles module

This module contains generic functionality for phase correlation, so it can be reused easily.

imreg_dft.tiles._distribute_resdict(resdict, ii)
imreg_dft.tiles._fill_globals(tiles, poss, image, opts)
imreg_dft.tiles._postprocess_unextend(ims, im2, extend, rcoef=1)
imreg_dft.tiles._preprocess_extend(ims, extend, low, high, cut, rcoef)
imreg_dft.tiles._preprocess_extend_single(im, extend, low, high, cut, rcoef, bigshape)
imreg_dft.tiles.filter_images(imgs, low, high, cut)
imreg_dft.tiles.process_images(ims, opts, tosa=None, get_unextended=False, reports=None)
  • tosa (np.ndarray) – An array where to save the transformed subject.
  • get_unextended (bool) – Whether to get the transformed subject in the same shape and coord origin as the template.
imreg_dft.tiles.process_tile(ii, reports=None)
imreg_dft.tiles.resample(img, coef)
imreg_dft.tiles.settle_tiles(imgs, tiledim, opts, reports=None)

imreg module

This module contains mostly high-level functions.

FFT based image registration. — main functions

imreg_dft.imreg._get_ang_scale(ims, bgval, exponent='inf', constraints=None, reports=None)

Given two images, return their scale and angle difference.

  • ims (2-tuple-like of 2D ndarrays) – The images
  • bgval – We also pad here in the map_coordinates()
  • exponent (float or 'inf') – The exponent stuff, see similarity()
  • constraints (dict, optional) –
  • reports (optional) –

Scale, angle. Describes the relationship of the subject image to the first one.

Return type:


imreg_dft.imreg._get_log_base(shape, new_r)

Basically common functionality of _logpolar() and _get_ang_scale()

This value can be considered fixed; if you want to mess with the logpolar transform, mess with the shape.

  • shape – Shape of the original image.
  • new_r (float) – The r-size of the log-polar transform array dimension.

Base of the log-polar transform. The following holds: \(\mathit{log\_base} = \exp( \ln [ \mathit{spectrum\_dim} ] / \mathit{logpolar\_scale\_dim} )\), or the equivalent \(\mathit{log\_base}^{\mathit{logpolar\_scale\_dim}} = \mathit{spectrum\_dim}\).
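The relation can be checked numerically with a short sketch. The function name and the sample dimensions are illustrative assumptions; how the library derives spectrum_dim from the image shape is not covered here.

```python
import math


def log_base_sketch(spectrum_dim, logpolar_scale_dim):
    """Base b such that b ** logpolar_scale_dim equals spectrum_dim."""
    return math.exp(math.log(spectrum_dim) / logpolar_scale_dim)


base = log_base_sketch(256, 128)
# Raising the base to the scale dimension recovers the spectrum dimension.
assert abs(base ** 128 - 256) < 1e-6
```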

Return type:


imreg_dft.imreg._get_odds(angle, target, stdev)

Determine whether we are more likely to choose the angle, or angle + 180°

  • angle (float, degrees) – The base angle.
  • target (float, degrees) – The angle we think is the right one. Typically, we take this from constraints.
  • stdev (float, degrees) – The relevance of the target value. Also typically taken from constraints.

The greater the odds are, the higher is the preference of the angle + 180 over the original angle. Odds of -1 are the same as infinity.

Return type:


imreg_dft.imreg._get_precision(shape, scale=1)

Given the parameters of the log-polar transform, get width of the interval where the correct values are.

  • shape (tuple) – Shape of images
  • scale (float) – The scale difference (precision varies)
imreg_dft.imreg._logpolar(image, shape, log_base, bgval=None)

Return the log-polar transformed image. Takes into account the anisotropy of the frequency spectrum of rectangular images.

  • image – The image to be transformed
  • shape – Shape of the transformed image
  • log_base – Parameter of the transformation, get it via _get_log_base()
  • bgval – The background value. If None, use minimum of the image.

The transformed image


Make a radial cosine filter for the logpolar transform. This filter suppresses low frequencies and completely removes the zero freq.

imreg_dft.imreg._phase_correlation(im0, im1, callback=None, *args)

Computes phase correlation between im0 and im1

  • im0
  • im1
  • callback (function) – Process the cross-power spectrum (i.e. choose coordinates of the best element, usually of the highest one). Defaults to imreg_dft.utils._argmax2D()

The translation vector (Y, X). Translation vector of (0, 0) means that the two images match.
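For orientation, textbook phase correlation can be sketched with NumPy as follows. This is not the library's implementation: the function name is hypothetical, the plain argmax stands in for the callback, and the returned vector follows the convention of telling how to translate im1 to get im0.

```python
import numpy as np


def phase_correlation_sketch(im0, im1, eps=1e-15):
    """Return the (Y, X) shift that maps im1 onto im0 (cyclic shifts)."""
    f0 = np.fft.fft2(im0)
    f1 = np.fft.fft2(im1)
    # Normalized cross-power spectrum: only the phase difference is kept.
    cross = f0 * f1.conj()
    cross /= np.abs(cross) + eps
    corr = np.fft.ifft2(cross).real
    # The highest peak marks the shift; a callback could refine this step.
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    shape = np.array(corr.shape)
    tvec = np.array(peak, dtype=float)
    # Indices beyond the half-size correspond to negative shifts.
    wrapped = tvec > shape / 2
    tvec[wrapped] -= shape[wrapped]
    return tvec


rng = np.random.default_rng(0)
template = rng.random((32, 32))
subject = np.roll(template, (3, -5), axis=(0, 1))
print(phase_correlation_sketch(template, subject))  # [-3.  5.]
```

Rolling the subject by the returned vector (-3, 5) brings it back onto the template.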

Return type:


imreg_dft.imreg._similarity(im0, im1, numiter=1, order=3, constraints=None, filter_pcorr=0, exponent='inf', bgval=None, reports=None)

This function takes some input and returns mutual rotation, scale and translation. It does these things during the process:

  • Handles constraints correctly (defaults etc.).
  • Performs angle-scale determination iteratively. This involves keeping constraints in sync.
  • Performs translation determination.
  • Calculates precision.
Returns:Dictionary with results.
imreg_dft.imreg._translation(im0, im1, filter_pcorr=0, constraints=None, reports=None)

The plain wrapper for translation phase correlation, no big deal.

imreg_dft.imreg.imshow(im0, im1, im2, cmap=None, fig=None, **kwargs)

Plot images using matplotlib. Opens a new figure with four subplots:

|                      |                     |
|   <template image>   |   <subject image>   |
|                      |                     |
| <difference between  |                     |
|  template and the    |<transformed subject>|
| transformed subject> |                     |
  • im0 (np.ndarray) – The template image
  • im1 (np.ndarray) – The subject image
  • im2 – The transformed subject — it is supposed to match the template
  • cmap (optional) – colormap
  • fig (optional) – The figure you would like to have this plotted on

The figure with subplots

Return type:

matplotlib figure

imreg_dft.imreg.similarity(im0, im1, numiter=1, order=3, constraints=None, filter_pcorr=0, exponent='inf', reports=None)

Return similarity transformed image im1 and transformation parameters. Transformation parameters are: isotropic scale factor, rotation angle (in degrees), and translation vector.

A similarity transformation is an affine transformation with isotropic scale and without shear.

  • im0 (2D numpy array) – The first (template) image
  • im1 (2D numpy array) – The second (subject) image
  • numiter (int) – How many times to iterate when determining scale and rotation
  • order (int) – Order of approximation (when doing transformations). 1 = linear, 3 = cubic etc.
  • filter_pcorr (int) – Radius of a spectrum filter for translation detection
  • exponent (float or 'inf') – The exponent value used during processing. Refer to the docs for a thorough explanation. Generally, pass “inf” when feeling conservative. Otherwise, experiment; values below 5 are not even supposed to work.
  • constraints (dict or None) –

    Specify preference of sought values. Pass None (default) for no constraints, otherwise pass a dict with keys angle, scale, tx and/or ty (i.e. you can pass all, some of them or none of them, all is fine). The value of a key is supposed to be a mutable 2-tuple (e.g. a list), where the first value is related to the constraint center and the second one to the softness of the constraint (the higher the number, the softer the constraint).

    More specifically, constraints may be regarded as weights in form of a shifted Gaussian curve. However, for precise meaning of keys and values, see the documentation section Using constraints. Names of dictionary keys map to names of command-line arguments.
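The Gaussian-weight view of a constraint can be sketched as follows. The sketch assumes that the second value acts as the standard deviation of the Gaussian and that zero softness means a hard constraint; the precise meaning is given in the documentation section Using constraints. The helper constraint_weight and the sample values are hypothetical.

```python
import math

# Keys follow the names listed above; each value is a mutable
# [center, softness] pair. The numbers are illustrative only.
constraints = {"angle": [0.0, 10.0], "scale": [1.0, 0.2]}


def constraint_weight(value, center, softness):
    """Weight of a candidate value under a shifted Gaussian curve."""
    if softness == 0:
        # Assumption: zero softness accepts only the center itself.
        return 1.0 if value == center else 0.0
    return math.exp(-0.5 * ((value - center) / softness) ** 2)


center, softness = constraints["angle"]
for candidate in (0.0, 10.0, 30.0):
    print(candidate, constraint_weight(candidate, center, softness))
```

Candidates near the center keep a weight close to 1, while candidates several softness units away are suppressed.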


Contains following keys: scale, angle, tvec (Y, X), success and timg (the transformed subject image)

Return type:



There are limitations

  • Scale change must be less than 2.
  • No subpixel precision (but you can use resampling to get around this).
imreg_dft.imreg.similarity_matrix(scale, angle, vector)

Return homogeneous transformation matrix from similarity parameters.

Transformation parameters are: isotropic scale factor, rotation angle (in degrees), and translation vector (of size 2).

The order of transformations is: scale, rotate, translate.
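A minimal sketch of composing such a matrix in the stated order (scale, rotate, translate) is shown below. The counter-clockwise rotation sense and the (x, y) component ordering are assumptions made for the illustration, and both function names are hypothetical.

```python
import math


def similarity_matrix_sketch(scale, angle_deg, vector):
    """3x3 homogeneous matrix equal to translate @ rotate @ scale."""
    ang = math.radians(angle_deg)
    cos_s = scale * math.cos(ang)
    sin_s = scale * math.sin(ang)
    # Isotropic scaling commutes with rotation, so both fold into one block.
    return [[cos_s, -sin_s, vector[0]],
            [sin_s, cos_s, vector[1]],
            [0.0, 0.0, 1.0]]


def apply_to_point(matrix, point):
    """Transform a 2D point given in (x, y) order."""
    x, y = point
    return (matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
            matrix[1][0] * x + matrix[1][1] * y + matrix[1][2])


# (1, 0) is scaled to (2, 0), rotated to (0, 2), then shifted to (1, 2).
result = apply_to_point(similarity_matrix_sketch(2.0, 90.0, (1.0, 0.0)), (1.0, 0.0))
```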

imreg_dft.imreg.transform_img(img, scale=1.0, angle=0.0, tvec=(0, 0), mode='constant', bgval=None, order=1)

Return the image transformed by the given scale, rotation and translation.

  • img (2D or 3D numpy array) – What will be transformed. If a 3D array is passed, it is treated in a manner in which RGB images are supposed to be handled - i.e. assume that coordinates are (Y, X, channels).
  • scale (float) – The scale factor (scale > 1.0 means zooming in)
  • angle (float) – Degrees of rotation (clock-wise)
  • tvec (2-tuple) – Pixel translation vector, Y and X component.
  • mode (string) – The transformation mode (refer to e.g. scipy.ndimage.shift() and its kwarg mode).
  • bgval (float) – Shade of the background (filling during transformations) If None is passed, imreg_dft.utils.get_borderval() with radius of 5 is used to get it.
  • order (int) – Order of approximation (when doing transformations). 1 = linear, 3 = cubic etc. Linear works surprisingly well.

The transformed img; it may have a different (i.e. bigger) shape than the source.

Return type:


imreg_dft.imreg.transform_img_dict(img, tdict, bgval=None, order=1, invert=False)

Wrapper of transform_img(), works well with the similarity() output.

  • img
  • tdict (dictionary) – Transformation dictionary — supposed to contain keys “scale”, “angle” and “tvec”
  • bgval
  • order
  • invert (bool) – Whether to perform inverse transformation — doesn’t work very well with the translation.

See also


Return type:


imreg_dft.imreg.translation(im0, im1, filter_pcorr=0, odds=1, constraints=None, reports=None)

Return translation vector to register images. It tells how to translate the im1 to get im0.

  • im0 (2D numpy array) – The first (template) image
  • im1 (2D numpy array) – The second (subject) image
  • filter_pcorr (int) – Radius of the minimum spectrum filter for translation detection, use the filter when detection fails. Values > 3 are likely not useful.
  • constraints (dict or None) – Specify preference of sought values. For more detailed documentation, refer to similarity(). The only difference is that here, only keys tx and/or ty (i.e. both or any of them or none of them) are used.
  • odds (float) – The greater the odds are, the higher is the preference of the angle + 180 over the original angle. Odds of -1 are the same as infinity. The value 1 is neutral, the converse of 2 is 1 / 2 etc.

Contains the following keys: angle, tvec (Y, X), and success.

Return type:


How to release

The build process in Python is not straightforward (as of 2014). Generally, you want this to be taken care of:

  • The version mentioned in src/imreg_dft/ is the right one.
  • Documentation can be generated (after make clean) and tests run OK too.
  • The source tree is tagged (this is obviously the last step).

For this, there is a bash script tests/ It accepts one argument — the version string. It runs everything and although it doesn’t do anything, it helps you to keep track of what is OK and what still needs to be worked on.

You can execute it from anywhere, for example from the project root:

[user@linuxbox imreg_dft]$ bash tests/ 1.0.5

The output should be self-explanatory. The script is not supposed to rewrite anything important; however, it may run the documentation generation and tests. Those, however, can.

Become part of it!

Do you like the project? Do you feel inspired? Do you want to help out?

You are warmly welcome to do so!

How to contribute

The process is pretty standard if you are used to Github.


  1. Become familiar with git and learn how to use it properly, i.e. tell git who you are so it can label the commits you’ve made:

    git config --global user.email you@yourdomain.example.com
    git config --global user.name "Your Name Comes Here"
  2. You can do two things now:

    1. Fork imreg_dft using Github web interface and clone it.

    2. If you want to make a minor modification and/or don’t have a Github account, just clone imreg_dft:

      git clone
      cd imreg_dft
  3. Make a ‘feature branch’. This will be where you work on your bug fix or whatever. It’s nice and safe and leaves you with access to an unmodified copy of the code in the main branch:

    git branch the-fix-im-thinking-of
    git checkout the-fix-im-thinking-of

    Then, do some edits, and commit them as you go.

  4. Finally, you have to deliver your precious work in a smart way to the project. How to do this depends on whether you have created a pull request using Github or whether you went the simpler, but hardcore way. So, you have to do either

    1. use again the Github interface, select your feature branch there and do some clicking stuff to create a pull request,

    2. or make your commits into patches. You want all the commits since you branched from the master branch:

      git format-patch -M -C master

      You will now have several files named for the commits:


      Send these files to the current project maintainer.


If you hack the code, remember these things:

  • Add yourself into the AUTHORS file and briefly describe your contribution.
  • If your contribution affects how imreg_dft works (this is very likely), mention this in the documentation.