Python API

This section includes information for using the pure Python API of bob.ip.base.

Classes

bob.ip.base.GeomNorm Objects of this class, after configuration, can perform a geometric
bob.ip.base.FaceEyesNorm Objects of this class, after configuration, can perform a geometric
bob.ip.base.LBP A class that extracts local binary patterns in various types
bob.ip.base.LBPTop A class that extracts local binary patterns (LBP) in three orthogonal
bob.ip.base.DCTFeatures Objects of this class, after configuration, can extract DCT features.
bob.ip.base.TanTriggs Objects of this class, after configuration, can preprocess images
bob.ip.base.Gaussian Objects of this class, after configuration, can perform Gaussian
bob.ip.base.Wiener A Wiener filter
bob.ip.base.MultiscaleRetinex This class allows after configuration to apply the Multiscale Retinex
bob.ip.base.WeightedGaussian This class performs weighted gaussian smoothing (anisotropic filtering)
bob.ip.base.SelfQuotientImage This class allows after configuration to apply the Self Quotient Image
bob.ip.base.GaussianScaleSpace This class allows after configuration the generation of Gaussian
bob.ip.base.GSSKeypoint Structure to describe a keypoint on the
bob.ip.base.GSSKeypointInfo This is a companion structure to the
bob.ip.base.SIFT This class allows after configuration the extraction of SIFT
bob.ip.base.VLSIFT Computes SIFT features using the VLFeat library
bob.ip.base.VLDSIFT Computes dense SIFT features using the VLFeat library
bob.ip.base.GradientMagnitude Gradient ‘magnitude’ used
bob.ip.base.BlockNorm Enumeration that defines the norm that is used for normalizing the
bob.ip.base.HOG Objects of this class, after configuration, can extract Histogram of Oriented Gradients (HOG) descriptors.
bob.ip.base.GLCMProperty Enumeration that defines the properties of GLCM, to be used in
bob.ip.base.GLCM(*args, **kwargs) Objects of this class, after configuration, can compute Grey-Level

Functions

bob.ip.base.flip((src, [dst]) -> dst) Flip a 2D or 3D array/image upside-down.
bob.ip.base.flop((src, [dst]) -> dst) Flip a 2D or 3D array/image left-right.
bob.ip.base.crop((src, crop_offset, ...) Crops the given src image to the given offset (might be negative) and to the given size (might be greater than the src image).
bob.ip.base.shift((src, offset, [dst], ...) Shifts the given src image by the given offset (might be negative).
bob.ip.base.scale
  • scale(src, scaling_factor) -> dst
bob.ip.base.scaled_output_shape((src, ...) This function returns the shape of the scaled image for the given
bob.ip.base.rotate
  • rotate(src, rotation_angle) -> dst
bob.ip.base.rotated_output_shape((src, ...) This function returns the shape of the rotated image for the given
bob.ip.base.angle_to_horizontal((right, ...) Get the angle needed to level out (horizontally) two points.
bob.ip.base.block((input, block_size, ...) Performs a block decomposition of a 2D array/image
bob.ip.base.block_output_shape((input, ...) Returns the shape of the output image that is required to compute the
bob.ip.base.extrapolate_mask
  • extrapolate_mask(mask, img) -> None
bob.ip.base.max_rect_in_mask((mask) -> rect) Given a 2D mask (a 2D blitz array of booleans), compute the maximum rectangle which only contains true values.
bob.ip.base.histogram
  • histogram(src, [bin_count]) -> hist
bob.ip.base.lbphs((input, lbp, block_size, ...) Computes a local binary pattern histogram sequence from the given
bob.ip.base.lbphs_output_shape((input, lbp, ...) Returns the shape of the output image that is required to compute the
bob.ip.base.histogram_equalization
  • histogram_equalization(src) -> None
bob.ip.base.gamma_correction((src, gamma, ...) Performs a power-law gamma correction of a given 2D image
bob.ip.base.integral((src, dst, [sqr], ...) Computes an integral image for the given input image
bob.ip.base.zigzag((src, dst, ...) Extracts a 1D array using a zigzag pattern from a 2D array
bob.ip.base.median((src, radius, [dst]) -> dst) Performs a median filtering of the input image with the given radius
bob.ip.base.sobel((src, [border], [dst]) -> dst) Performs a Sobel filtering of the input image

Detailed Information

bob.ip.base.get_config()

Returns a string containing the configuration information.

class bob.ip.base.BlockNorm

Bases: object

Enumeration that defines the norm that is used for normalizing the descriptor blocks

Possible values are:

  • L2: Euclidean norm
  • L2Hys: L2 norm with clipping of high values
  • L1: L1 norm (Manhattan distance)
  • L1sqrt: Square root of the L1 norm
  • Nonorm: no norm used

Class Members:

L1 = 2
L1sqrt = 3
L2 = 0
L2Hys = 1
Nonorm = 4
entries = {'Nonorm': 4, 'L2Hys': 1, 'L2': 0, 'L1sqrt': 3, 'L1': 2}
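
The following is a minimal sketch of what the listed norms do to a single descriptor block. It does not call bob.ip.base; the helper name normalize_block, the eps regularizer, the clipping threshold of 0.2 for L2Hys and the exact definition of L1sqrt (square root of the L1-normalized entries) are assumptions taken from the usual HOG formulation, not from this documentation.

```python
import math

def normalize_block(values, norm="L2", eps=1e-10, clip=0.2):
    """Normalize one descriptor block (hypothetical helper, not the bob API)."""
    if norm == "Nonorm":
        return list(values)
    if norm in ("L1", "L1sqrt"):
        total = sum(abs(v) for v in values) + eps
        scaled = [v / total for v in values]
        if norm == "L1":
            return scaled
        # L1sqrt: square root of the L1-normalized entries (assumed non-negative)
        return [math.sqrt(v) for v in scaled]
    if norm in ("L2", "L2Hys"):
        length = math.sqrt(sum(v * v for v in values) + eps * eps)
        scaled = [v / length for v in values]
        if norm == "L2":
            return scaled
        # L2Hys: clip high values, then renormalize with the Euclidean norm
        clipped = [min(v, clip) for v in scaled]
        length = math.sqrt(sum(v * v for v in clipped) + eps * eps)
        return [v / length for v in clipped]
    raise ValueError("unknown norm: %r" % (norm,))

block = [3.0, 4.0, 0.0, 0.0]
l2 = normalize_block(block, "L2")
l1 = normalize_block(block, "L1")
l2hys = normalize_block(block, "L2Hys")
print(l2, l1, l2hys)
```

With L2Hys, the two dominant entries are clipped to the same value and end up equal after renormalization, which is the intended damping of large gradients.
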
class bob.ip.base.DCTFeatures

Bases: object

Objects of this class, after configuration, can extract DCT features.

The DCT feature extraction is described in more detail in [Sanderson2002]. This class also supports block normalization and DCT coefficient normalization.

Constructor Documentation:

  • bob.ip.base.DCTFeatures (coefficients, block_size, [block_overlap], [normalize_block], [normalize_dct], [square_pattern])
  • bob.ip.base.DCTFeatures (dct_features)

Constructs a new DCT features extractor

Todo

Explain DCTFeatures constructor in more detail.

Parameters:

coefficients : int

The number of DCT coefficients;

Note

The actual number of DCT coefficients returned by the extractor is coefficients-1 when block normalization is enabled by setting normalize_block=True (as the first coefficient is always 0 in this case)

block_size : (int, int)

The size of the blocks, in which the image is decomposed

block_overlap : (int, int)

[default: (0, 0)] The overlap of the blocks

normalize_block : bool

[default: False] Whether to normalize each block to zero mean and unit variance before extracting DCT coefficients. In this case, the first coefficient is always zero and is therefore not returned

normalize_dct : bool

[default: False] Whether to normalize the DCT coefficients to zero mean and unit variance after the DCT extraction

square_pattern : bool

[default: False] Selects whether a zigzag pattern or a square pattern is used for the DCT extraction; for a square pattern, the number of DCT coefficients must be a square integer

dct_features : bob.ip.base.DCTFeatures

The DCTFeatures object to use for copy-construction

Class Members:

block_overlap

(int, int) <– The block overlap in both vertical and horizontal direction of the Multi-Block-DCTFeatures extractor, with read and write access

Note

The block_overlap must be smaller than the block_size.

block_size

(int, int) <– The size of each block for the block decomposition, with read and write access

coefficients

int <– The number of DCT coefficients, with read and write access

Note

The actual number of DCT coefficients returned by the extractor is coefficients-1 when block normalization is enabled (as the first coefficient is always 0 in this case)

extract()
  • extract(input, [flat]) -> output
  • extract(input, output) -> None

Extracts DCT features from either uint8, uint16 or double arrays

The input array is a 2D array/grayscale image. The destination array, if given, should be a 2D or 3D array of type float64 and allocated with the correct dimensions (see output_shape()). If the destination array is not given (first version), it is generated in the required size. The blocks can be split into either a 2D array of shape (block_index, coefficients) by setting flat=True, or into a 3D array of shape (block_index_y, block_index_x, coefficients) with flat=False.

Note

The __call__ function is an alias for this method.

Parameters:

input : array_like (2D)

The input image for which DCT features should be extracted

flat : bool

[default: True] The flat parameter is used to decide whether 2D (flat = True) or 3D (flat = False) output shape is generated

output : array_like (2D or 3D, float)

The output array, which needs to have the shape returned by output_shape()

Returns:

output : array_like (2D or 3D, float)

The resulting DCT features
normalization_epsilon

float <– The epsilon value to avoid division-by-zero when performing block or DCT coefficient normalization (read and write access)

The default value for this epsilon is 10 * sys.float_info.min, and usually there is little necessity to change that.

normalize_block

bool <– Normalize each block to zero mean and unit variance before extracting DCT coefficients (read and write access)

Note

In case normalize_block is set to True the first coefficient will always be zero and, hence, will not be returned.
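
The reason the first coefficient vanishes can be checked with a small stand-alone sketch: the first (DC) coefficient of a 2D DCT is proportional to the sum of the block, which is zero once the block has zero mean. The functions dct2 and standardize below are illustrative helpers written for this example (a naive, unscaled DCT-II), not part of bob.ip.base.

```python
import math

def dct2(block):
    """Naive, unscaled 2D DCT-II of a small block given as a list of rows."""
    h, w = len(block), len(block[0])
    out = [[0.0] * w for _ in range(h)]
    for u in range(h):
        for v in range(w):
            acc = 0.0
            for y in range(h):
                for x in range(w):
                    acc += (block[y][x]
                            * math.cos(math.pi * (2 * y + 1) * u / (2 * h))
                            * math.cos(math.pi * (2 * x + 1) * v / (2 * w)))
            out[u][v] = acc
    return out

def standardize(block):
    """Shift and scale a block to zero mean and unit variance."""
    flat = [p for row in block for p in row]
    mean = sum(flat) / len(flat)
    std = math.sqrt(sum((p - mean) ** 2 for p in flat) / len(flat))
    return [[(p - mean) / std for p in row] for row in block]

block = [[10.0, 20.0], [30.0, 80.0]]
dc_raw = dct2(block)[0][0]                 # equals the sum of the pixels
dc_norm = dct2(standardize(block))[0][0]   # vanishes up to rounding error
print(dc_raw, dc_norm)
```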

normalize_dct

bool <– Normalize DCT coefficients to zero mean and unit variance after the DCT extraction (read and write access)

output_shape()
  • output_shape(input, [flat]) -> dct_shape
  • output_shape(shape, [flat]) -> dct_shape

This function returns the shape of the DCT output for the given input

The blocks can be split into either a 2D array of shape (block_index, coefficients) by setting flat=True, or into a 3D array of shape (block_index_y, block_index_x, coefficients) with flat=False.

Parameters:

input : array_like (2D)

The input image for which DCT features should be extracted

shape : (int, int)

The shape of the input image for which DCT features should be extracted

flat : bool

[default: True] The flat parameter is used to decide whether 2D (flat = True) or 3D (flat = False) output shape is generated

Returns:

dct_shape : (int, int) or (int, int, int)

The shape of the DCT features image that is required in a call to extract()
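
As a rough illustration of how such a shape can be derived, the sketch below counts overlapping blocks with the common sliding-window formula (size - overlap) // (block - overlap). This formula, the function name dct_output_shape and its argument order are assumptions made for the example; the authoritative values are the ones returned by output_shape().

```python
def dct_output_shape(image_shape, block_size, block_overlap=(0, 0),
                     coefficients=15, normalize_block=False, flat=True):
    """Estimate the DCT feature shape (hypothetical helper, not the bob API)."""
    blocks = []
    for size, block, overlap in zip(image_shape, block_size, block_overlap):
        if overlap >= block:
            raise ValueError("block_overlap must be smaller than block_size")
        # number of block positions along this axis
        blocks.append((size - overlap) // (block - overlap))
    # with block normalization the first coefficient is dropped
    n_coef = coefficients - 1 if normalize_block else coefficients
    if flat:
        return (blocks[0] * blocks[1], n_coef)
    return (blocks[0], blocks[1], n_coef)

flat_shape = dct_output_shape((80, 64), (8, 8), coefficients=15)
grid_shape = dct_output_shape((80, 64), (8, 8), (4, 4), 15,
                              normalize_block=True, flat=False)
print(flat_shape, grid_shape)
```
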
square_pattern

bool <– Tells whether a zigzag pattern or a square pattern is used for the DCT extraction (read and write access)

Note

For a square pattern, the number of DCT coefficients must be a square integer.
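
The difference between the two patterns can be sketched as follows. The zigzag order shown is the JPEG-style scan along anti-diagonals; whether bob.ip.base traverses the diagonals in exactly this direction is an assumption, and both function names are made up for this example.

```python
import math

def zigzag_indices(count, block_size):
    """First `count` (row, col) positions of a zigzag scan over a block."""
    height, width = block_size
    order = []
    for diag in range(height + width - 1):
        cells = [(y, diag - y) for y in range(height) if 0 <= diag - y < width]
        # alternate the traversal direction on every anti-diagonal
        if diag % 2 == 0:
            cells.reverse()
        order.extend(cells)
    return order[:count]

def square_indices(count):
    """Positions of the top-left square holding `count` coefficients."""
    side = math.isqrt(count)
    if side * side != count:
        raise ValueError("count must be a square integer")
    return [(y, x) for y in range(side) for x in range(side)]

zigzag = zigzag_indices(6, (8, 8))
square = square_indices(4)
print(zigzag)
print(square)
```

Requesting, say, 5 coefficients from square_indices raises an error, which mirrors the restriction stated in the note above.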

class bob.ip.base.FaceEyesNorm

Bases: object

Objects of this class, after configuration, can perform a geometric normalization of facial images based on their eye positions

The geometric normalization is a combination of rotation, scaling and cropping an image. The underlying implementation relies on a bob.ip.base.GeomNorm object to perform the actual geometric normalization.

Constructor Documentation:

  • bob.ip.base.FaceEyesNorm (crop_size, eyes_distance, eyes_center)
  • bob.ip.base.FaceEyesNorm (crop_size, right_eye, left_eye)
  • bob.ip.base.FaceEyesNorm (other)

Constructs a FaceEyesNorm object.

There are two ways to define a FaceEyesNorm, and both require the resulting crop_size. The first constructor takes the inter-eye distance and the center of the eyes, which will be used as the transformation center. The second constructor takes the image resolution (crop_size) and two arbitrary positions in the face, with which the image will be aligned. Usually, these positions are the eyes, but any other pair (such as mouth and eye for profile faces) can be specified.

Parameters:

crop_size : (int, int)

The resolution of the normalized face

eyes_distance : float

The inter-eye-distance in the normalized face

eyes_center : (float, float)

The center point between the eyes in the normalized face

right_eye : (float, float)

The location of the right eye (or another fix point) in the normalized image

left_eye : (float, float)

The location of the left eye (or another fix point) in the normalized image

other : FaceEyesNorm

Another FaceEyesNorm object to copy
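
The two parameterizations carry the same information, which a short sketch can make concrete. The helper below converts two (y, x) landmarks into a center, a distance and an angle; its name and the sign convention of the angle (0 degrees when the left eye lies at the same height and at a larger x than the right eye) are assumptions for illustration and may differ from the conventions used inside bob.ip.base.

```python
import math

def eyes_to_center_distance_angle(right_eye, left_eye):
    """Convert two (y, x) landmarks into center, distance and angle in degrees."""
    dy = left_eye[0] - right_eye[0]
    dx = left_eye[1] - right_eye[1]
    center = ((right_eye[0] + left_eye[0]) / 2.0,
              (right_eye[1] + left_eye[1]) / 2.0)
    distance = math.hypot(dy, dx)
    angle = math.degrees(math.atan2(dy, dx))
    return center, distance, angle

# two eye positions in a normalized face, given as (y, x)
center, distance, angle = eyes_to_center_distance_angle((16, 15), (16, 48))
print(center, distance, angle)
```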

Class Members:

crop_offset

(float, float) <– The transformation center in the processed image, which is usually the center between the eyes; with read and write access

crop_size

(int, int) <– The size of the normalized image, with read and write access

extract()
  • extract(input, right_eye, left_eye) -> output
  • extract(input, output, right_eye, left_eye) -> None
  • extract(input, input_mask, output, output_mask, right_eye, left_eye) -> None

This function extracts and normalizes the facial image

This function extracts the facial image based on the eye locations (or the locations of other fixed points, see note below). The geometric normalization is applied such that the eyes are placed at fixed positions in the normalized image. The image is cropped at the same time, so that no unnecessary operations are executed.

Note

Instead of the eyes, any two fixed positions can be used to normalize the face. This can simply be achieved by selecting two other nodes in the constructor (see FaceEyesNorm) and in this function. Just make sure that ‘right’ and ‘left’ refer to the same landmarks in both functions.

Note

The __call__ function is an alias for this method.

Parameters:

input : array_like (2D or 3D)

The input image to which FaceEyesNorm should be applied

output : array_like (2D or 3D, float)

The output image, which must be of size crop_size

right_eye : (float, float)

The position of the right eye (or another landmark) in input image coordinates.

left_eye : (float, float)

The position of the left eye (or another landmark) in input image coordinates.

input_mask : array_like (2D, bool)

An input mask of valid pixels before geometric normalization, must be of same size as input

output_mask : array_like (2D, bool)

The output mask of valid pixels after geometric normalization, must be of same size as output

Returns:

output : array_like(2D or 3D, float)

The resulting normalized face image, which is of size crop_size
eyes_angle

float <– The angle between the eyes in the normalized image (relative to the horizontal line), with read and write access

eyes_distance

float <– The distance between the eyes in the normalized image, with read and write access

geom_norm

bob.ip.base.GeomNorm <– The geometric normalization class that was used to compute the last normalization, read access only

last_angle

float <– The rotation angle that was applied on the latest normalized image, read access only

last_offset

(float, float) <– The original transformation offset (eye center) in the normalization process, read access only

last_scale

float <– The scale that was applied on the latest normalized image, read access only
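
To give an intuition for last_angle, last_scale and last_offset, the sketch below derives comparable quantities from the eye positions in an input image. It is a plain-geometry illustration under stated assumptions: the function name is hypothetical, points are (y, x) tuples, and the sign of the rotation angle is a convention chosen for this example rather than the one documented for bob.ip.base.

```python
import math

def normalization_parameters(right_eye, left_eye, eyes_distance, eyes_angle=0.0):
    """Return (angle, scale, offset) mapping input landmarks to the target layout."""
    dy = left_eye[0] - right_eye[0]
    dx = left_eye[1] - right_eye[1]
    source_distance = math.hypot(dy, dx)
    source_angle = math.degrees(math.atan2(dy, dx))
    angle = source_angle - eyes_angle        # rotation needed to reach the target eye angle
    scale = eyes_distance / source_distance  # below 1 shrinks the input, above 1 enlarges it
    offset = ((right_eye[0] + left_eye[0]) / 2.0,
              (right_eye[1] + left_eye[1]) / 2.0)  # eye center in the input image
    return angle, scale, offset

angle, scale, offset = normalization_parameters((100, 80), (100, 146), eyes_distance=33.0)
tilted_angle, tilted_scale, _ = normalization_parameters((100, 100), (110, 110), 33.0)
print(angle, scale, offset)
print(tilted_angle, tilted_scale)
```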

class bob.ip.base.GLCM(*args, **kwargs)

Bases: bob.ip.base.GLCM

Objects of this class, after configuration, can compute the Grey-Level Co-occurrence Matrix of an image

This class extracts a Grey-Level Co-occurrence Matrix (GLCM) [Haralick1973]. A thorough tutorial about GLCM and the textural (so-called Haralick) properties that can be derived from it can be found at: http://www.fp.ucalgary.ca/mhallbey/tutorial.htm. A MATLAB implementation can be found at: http://www.mathworks.ch/ch/help/images/ref/graycomatrix.html

Constructor Documentation:

  • bob.ip.base.GLCM ([levels], [min_level], [max_level], [dtype])
  • bob.ip.base.GLCM (quantization_table)
  • bob.ip.base.GLCM (glcm)

Constructor

Todo

The parameter(s) ‘levels, max_level, min_level, quantization_table’ are used, but not documented.

Parameters:

dtype : numpy.dtype

[default: numpy.uint8] The data-type for the GLCM class

glcm : bob.ip.base.GLCM

The GLCM object to use for copy-construction

Class Members:

angular_second_moment(input) → property

Computes the angular_second_moment property

Parameters

input : array_like (3D, float)

The result of the extract() function

Returns

property : array_like (1D, float)

The resulting ‘angular_second_moment’ property
auto_correlation(input) → property

Computes the auto_correlation property

Parameters

input : array_like (3D, float)

The result of the extract() function

Returns

property : array_like (1D, float)

The resulting ‘auto_correlation’ property
cluster_prominence(input) → property

Computes the cluster_prominence property

Parameters

input : array_like (3D, float)

The result of the extract() function

Returns

property : array_like (1D, float)

The resulting ‘cluster_prominence’ property
cluster_shade(input) → property

Computes the cluster_shade property

Parameters

input : array_like (3D, float)

The result of the extract() function

Returns

property : array_like (1D, float)

The resulting ‘cluster_shade’ property
contrast(input) → property

Computes the contrast property

Parameters

input : array_like (3D, float)

The result of the extract() function

Returns

property : array_like (1D, float)

The resulting ‘contrast’ property
correlation(input) → property

Computes the correlation property

Parameters

input : array_like (3D, float)

The result of the extract() function

Returns

property : array_like (1D, float)

The resulting ‘correlation’ property
correlation_matlab(input) → property

Computes the correlation_matlab property

Parameters

input : array_like (3D, float)

The result of the extract() function

Returns

property : array_like (1D, float)

The resulting ‘correlation_matlab’ property
difference_entropy(input) → property

Computes the difference_entropy property

Parameters

input : array_like (3D, float)

The result of the extract() function

Returns

property : array_like (1D, float)

The resulting ‘difference_entropy’ property
difference_variance(input) → property

Computes the difference_variance property

Parameters

input : array_like (3D, float)

The result of the extract() function

Returns

property : array_like (1D, float)

The resulting ‘difference_variance’ property
dissimilarity(input) → property

Computes the dissimilarity property

Parameters

input : array_like (3D, float)

The result of the extract() function

Returns

property : array_like (1D, float)

The resulting ‘dissimilarity’ property
dtype

numpy.dtype <– The data type, which was used in the constructor

Only images of this data type can be processed in the extract() function.

energy(input) → property

Computes the energy property

Parameters

input : array_like (3D, float)

The result of the extract() function

Returns

property : array_like (1D, float)

The resulting ‘energy’ property
entropy(input) → property

Computes the entropy property

Parameters

input : array_like (3D, float)

The result of the extract() function

Returns

property : array_like (1D, float)

The resulting ‘entropy’ property
extract(input[, output]) → output

Extracts the GLCM matrix from the given input image

If given, the output array should have the expected type (numpy.float64) and the size as defined by output_shape().

Note

The __call__ function is an alias for this method.

Parameters:

input : array_like (2D)

The input image to extract GLCM features from

output : array_like (3D, float)

[default: None] If given, the output will be saved into this array; it must have the shape returned by output_shape()

Returns:

output : array_like (3D, float)

The resulting output data, which is the same as the parameter output (if given)
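
For readers new to co-occurrence matrices, the following pure-Python sketch counts grey-level pairs for a single offset. It is not the bob.ip.base implementation: the function name glcm, the (dx, dy) interpretation of the offset, and the symmetric and normalized keyword arguments are assumptions for illustration, and the real extract() returns a 3D array with one such matrix per offset.

```python
def glcm(image, levels, offset, symmetric=False, normalized=False):
    """Co-occurrence matrix of an already quantized image for one (dx, dy) offset."""
    dx, dy = offset
    height, width = len(image), len(image[0])
    matrix = [[0.0] * levels for _ in range(levels)]
    for y in range(height):
        for x in range(width):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < height and 0 <= x2 < width:
                i, j = image[y][x], image[y2][x2]
                matrix[i][j] += 1
                if symmetric:
                    # also count the pair in the opposite direction
                    matrix[j][i] += 1
    if normalized:
        total = sum(sum(row) for row in matrix)
        if total > 0:
            matrix = [[v / total for v in row] for row in matrix]
    return matrix

image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 2, 2, 2],
         [2, 2, 3, 3]]
counts = glcm(image, levels=4, offset=(1, 0))
probabilities = glcm(image, levels=4, offset=(1, 0), symmetric=True, normalized=True)
for row in counts:
    print(row)
```
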
homogeneity(input) → property

Computes the homogeneity property

Parameters

input : array_like (3D, float)

The result of the extract() function

Returns

property : array_like (1D, float)

The resulting ‘homogeneity’ property
information_measure_of_correlation_1(input) → property

Computes the information_measure_of_correlation_1 property

Parameters

input : array_like (3D, float)

The result of the extract() function

Returns

property : array_like (1D, float)

The resulting ‘information_measure_of_correlation_1’ property
information_measure_of_correlation_2(input) → property

Computes the information_measure_of_correlation_2 property

Parameters

input : array_like (3D, float)

The result of the extract() function

Returns

property : array_like (1D, float)

The resulting ‘information_measure_of_correlation_2’ property
inverse_difference(input) → property

Computes the inverse_difference property

Parameters

input : array_like (3D, float)

The result of the extract() function

Returns

property : array_like (1D, float)

The resulting ‘inverse_difference’ property
inverse_difference_moment(input) → property

Computes the inverse_difference_moment property

Parameters

input : array_like (3D, float)

The result of the extract() function

Returns

property : array_like (1D, float)

The resulting ‘inverse_difference_moment’ property
inverse_difference_moment_normalized(input) → property

Computes the inverse_difference_moment_normalized property

Parameters

input : array_like (3D, float)

The result of the extract() function

Returns

property : array_like (1D, float)

The resulting ‘inverse_difference_moment_normalized’ property
inverse_difference_normalized(input) → property

Computes the inverse_difference_normalized property

Parameters

input : array_like (3D, float)

The result of the extract() function

Returns

property : array_like (1D, float)

The resulting ‘inverse_difference_normalized’ property
levels

int <– Specifies the number of gray-levels to use when scaling the gray values in the input image

This is the number of the values in the first and second dimension in the GLCM matrix. The default is the total number of gray values permitted by the type of the input image.

max_level

int <– Gray values greater than or equal to this value are scaled to levels. The default is the maximum gray-level permitted by the type of input image.

maximum_probability(input) → property

Computes the maximum_probability property

Parameters

input : array_like (3D, float)

The result of the extract() function

Returns

property : array_like (1D, float)

The resulting ‘maximum_probability’ property
min_level

int <– Gray values smaller than or equal to this value are scaled to 0. The default is the minimum gray-level permitted by the type of input image.

normalized

bool <– Tells whether the GLCM matrix for each offset is normalized, i.e., divided by the total number of accumulated co-occurrences (read and write access)

offset

array_like (2D, int) <– The offset specifying the column and row distance between pixel pairs

The shape of this array is (num_offsets, 2), where num_offsets is the total number of offsets to be taken into account when computing GLCM.

output_shape() → shape

Get the shape of the GLCM matrix given the input image

The shape has 3 dimensions: two for the number of gray levels, and one for the number of offsets

Returns:

shape : (int, int, int)

The shape of the output array required to call extract()
properties_by_name(glcm_matrix, prop_names) → prop_values

Query the properties of GLCM by specifying a name

Returns a list of numpy.array of the queried properties. Please see the documentation of bob.ip.base.GLCMProperty for details on the possible properties.

Parameters:

glcm_matrix : array_like (3D, float)

The result of the GLCM extraction

prop_names : [bob.ip.base.GLCMProperty]

[default: None] A list of GLCM properties; either by value (int) or by name (str)

Returns:

prop_values : [array_like (1D, float)]

The GLCM properties for the given prop_names
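
As a hedged illustration of what a few of these properties measure, the sketch below evaluates the textbook Haralick formulas for angular second moment, contrast and entropy on a single 2D co-occurrence matrix. The function name, the use of the natural logarithm and the restriction to one offset are assumptions of this example; the values returned by bob.ip.base may use different conventions.

```python
import math

def glcm_properties(matrix):
    """Textbook Haralick properties of one co-occurrence matrix (list of rows)."""
    total = sum(sum(row) for row in matrix)
    n = len(matrix)
    # normalize counts to joint probabilities p(i, j)
    cells = [(i, j, matrix[i][j] / total) for i in range(n) for j in range(n)]
    return {
        "angular_second_moment": sum(p * p for _, _, p in cells),
        "contrast": sum((i - j) ** 2 * p for i, j, p in cells),
        "entropy": -sum(p * math.log(p) for _, _, p in cells if p > 0),
    }

uniform_texture = glcm_properties([[2, 0], [0, 2]])      # only equal neighbours
alternating_texture = glcm_properties([[0, 1], [1, 0]])  # only differing neighbours
print(uniform_texture)
print(alternating_texture)
```
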
quantization_table

array_like (1D) <– The thresholds of the quantization. Each element corresponds to the lower boundary of the particular quantization level. E.g., array([ 0, 5, 10]) means quantization in 3 levels. Input values in the range [0, 4] will be quantized to level 0, input values in the range [5, 9] will be quantized to level 1 and input values in the range [10, max_level] will be quantized to level 2.
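
The lookup described here can be sketched with a binary search over the lower boundaries. The function quantize is a hypothetical helper for illustration, and the table is the one from the example above.

```python
from bisect import bisect_right

def quantize(value, table):
    """Map a grey value to its quantization level given sorted lower boundaries."""
    level = bisect_right(table, value) - 1
    if level < 0:
        raise ValueError("value lies below the first quantization boundary")
    return level

table = [0, 5, 10]
levels = [quantize(v, table) for v in (0, 4, 5, 9, 10, 255)]
print(levels)  # → [0, 0, 1, 1, 2, 2]
```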

sum_average(input) → property

Computes the sum_average property

Parameters

input : array_like (3D, float)

The result of the extract() function

Returns

property : array_like (1D, float)

The resulting ‘sum_average’ property
sum_entropy(input) → property

Computes the sum_entropy property

Parameters

input : array_like (3D, float)

The result of the extract() function

Returns

property : array_like (1D, float)

The resulting ‘sum_entropy’ property
sum_variance(input) → property

Computes the sum_variance property

Parameters

input : array_like (3D, float)

The result of the extract() function

Returns

property : array_like (1D, float)

The resulting ‘sum_variance’ property
symmetric

bool <– Tells whether the GLCM matrix for each offset is symmetric, i.e., whether both (i, j) and (j, i) are accumulated when the pair (i, j) is encountered (read and write access)

variance(input) → property

Computes the variance property

Parameters

input : array_like (3D, float)

The result of the extract() function

Returns

property : array_like (1D, float)

The resulting ‘variance’ property
class bob.ip.base.GLCMProperty

Bases: object

Enumeration that defines the properties of GLCM, to be used in bob.ip.base.GLCM.properties_by_name()

Possible values are:

  • 'angular_second_moment' [1] / energy [6]
  • 'energy' [4]
  • 'variance' (sum of squares) [1]
  • 'contrast' [1], [6]
  • 'auto_correlation' [2]
  • 'correlation' [1]
  • 'correlation_matlab' as in MATLAB Image Processing Toolbox method graycoprops() [6]
  • 'inverse_difference_moment' [1] = homogeneity [2], homop[5]
  • 'sum_average' [1]
  • 'sum_variance' [1]
  • 'sum_entropy' [1]
  • 'entropy' [1]
  • 'difference_variance' [4]
  • 'difference_entropy' [1]
  • 'dissimilarity' [4]
  • 'homogeneity' [6]
  • 'cluster_prominence' [2]
  • 'cluster_shade' [2]
  • 'maximum_probability' [2]
  • 'information_measure_of_correlation_1' [1]
  • 'information_measure_of_correlation_2' [1]
  • 'inverse_difference' (INV) is homom [3]
  • 'inverse_difference_normalized' (INN) [3]
  • 'inverse_difference_moment_normalized' [3]

The references from above are as follows:

Class Members:

angular_second_moment = 0
auto_correlation = 4
cluster_prominence = 16
cluster_shade = 17
contrast = 3
correlation = 5
correlation_matlab = 6
difference_entropy = 13
difference_variance = 12
dissimilarity = 14
energy = 1
entries = {'cluster_prominence': 16, 'energy': 1, 'homogeneity': 15, 'entropy': 11, 'difference_variance': 12, 'inverse_difference_normalized': 22, 'inverse_difference_moment': 7, 'sum_entropy': 10, 'angular_second_moment': 0, 'difference_entropy': 13, 'correlation_matlab': 6, 'sum_variance': 9, 'contrast': 3, 'cluster_shade': 17, 'auto_correlation': 4, 'maximum_probability': 18, 'inverse_difference_moment_normalized': 23, 'information_measure_of_correlation_1': 19, 'dissimilarity': 14, 'sum_average': 8, 'correlation': 5, 'inverse_difference': 21, 'variance': 2, 'information_measure_of_correlation_2': 20}
entropy = 11
homogeneity = 15
information_measure_of_correlation_1 = 19
information_measure_of_correlation_2 = 20
inverse_difference = 21
inverse_difference_moment = 7
inverse_difference_moment_normalized = 23
inverse_difference_normalized = 22
maximum_probability = 18
sum_average = 8
sum_entropy = 10
sum_variance = 9
variance = 2
class bob.ip.base.GSSKeypoint

Bases: object

Structure to describe a keypoint on the bob.ip.base.GaussianScaleSpace

It consists of a scale sigma, a location (y,x) and an orientation.

Constructor Documentation:

bob.ip.base.GSSKeypoint (sigma, location, [orientation])

Creates a GSS keypoint

Parameters:

sigma : float

The floating point value describing the scale of the keypoint

location : (float, float)

The location of the keypoint

orientation : float

[default: 0] The orientation of the keypoint (in degrees)

Class Members:

location

(float, float) <– The location (y, x) of the keypoint, with read and write access

orientation

float <– The orientation of the keypoint (in degrees), with read and write access

sigma

float <– The floating point value describing the scale of the keypoint, with read and write access

class bob.ip.base.GSSKeypointInfo

Bases: object

This is a companion structure to the bob.ip.base.GSSKeypoint

It provides additional practical information such as the octave and scale indices, the integer location (y, x), and possibly the scores associated with the detection step (peak_score and edge_score).

Constructor Documentation:

bob.ip.base.GSSKeypointInfo ([octave_index], [scale_index], [location], [peak_score], [edge_score])

Creates a GSS keypoint info object

Parameters:

octave_index : int

[default: 0] The octave index associated with the keypoint in the bob.ip.base.GaussianScaleSpace object

scale_index : int

[default: 0] The scale index associated with the keypoint in the bob.ip.base.GaussianScaleSpace object

location : (int, int)

[default: (0, 0)] The integer unnormalized location (y,x) of the keypoint

peak_score : float

[default: 0] The peak score of the keypoint during the SIFT-like detection step

edge_score : float

[default: 0] The edge score of the keypoint during the SIFT-like detection step

Class Members:

edge_score

float <– The edge score of the keypoint during the SIFT-like detection step, with read and write access

location

(int, int) <– The integer unnormalized location (y, x) of the keypoint, with read and write access

octave_index

int <– The octave index associated with the keypoint in the bob.ip.base.GaussianScaleSpace object, with read and write access

peak_score

float <– The peak score of the keypoint during the SIFT-like detection step, with read and write access

scale_index

int <– The scale index associated with the keypoint in the bob.ip.base.GaussianScaleSpace object, with read and write access

class bob.ip.base.Gaussian

Bases: object

Objects of this class, after configuration, can perform Gaussian filtering (smoothing) on images

The Gaussian smoothing is done by convolving the image with a vertical and a horizontal smoothing filter.

Constructor Documentation:

  • bob.ip.base.Gaussian (sigma, [radius], [border])
  • bob.ip.base.Gaussian (gaussian)

Constructs a new Gaussian filter

The Gaussian kernel is generated in both directions independently, using the given standard deviation and the given radius, where the size of the kernels is actually 2*radius+1. When the radius is not given or negative, it will be automatically computed as 3*sigma.

Note

Since the Gaussian smoothing is done by convolution, a larger radius will lead to longer execution time.

Parameters:

sigma : (double, double)

The standard deviation of the Gaussian along the y- and x-axes in pixels

radius : (int, int)

[default: (-1, -1) -> 3*sigma ] The radius of the Gaussian in both directions – the size of the kernel is 2*radius+1

border : bob.sp.BorderType

[default: bob.sp.BorderType.Mirror] The extrapolation method used by the convolution at the border

gaussian : bob.ip.base.Gaussian

The Gaussian object to use for copy-construction
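
A minimal sketch of how one such 1D kernel could be built is given below. The helper name gaussian_kernel, the rounding of 3*sigma to the nearest integer (with a minimum radius of 1) and the normalization to unit sum are assumptions made for this example; the kernels actually used can be read from the kernel_y and kernel_x members.

```python
import math

def gaussian_kernel(sigma, radius=-1):
    """Build a normalized 1D Gaussian kernel of size 2*radius+1."""
    if radius < 0:
        # assumed rounding rule for the automatic radius of 3*sigma
        radius = max(int(3 * sigma + 0.5), 1)
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

kernel_y = gaussian_kernel(1.0)            # automatic radius
kernel_x = gaussian_kernel(2.0, radius=2)  # explicit, truncated radius
print(len(kernel_y), len(kernel_x))
```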

Class Members:

border

bob.sp.BorderType <– The extrapolation method used by the convolution at the border, with read and write access

filter(src[, dst]) → dst

Smooths an image (2D/grayscale or 3D/color)

If given, the dst array should have the expected type (numpy.float64) and the same size as the src array.

Note

The __call__ function is an alias for this method.

Parameters:

src : array_like (2D or 3D)

The input image which should be smoothed

dst : array_like (2D or 3D, float)

[default: None] If given, the output will be saved into this image; must be of the same shape as src

Returns:

dst : array_like (2D or 3D, float)

The resulting output image, which is the same as dst (if given)
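
Separable smoothing, i.e. one horizontal and one vertical 1D convolution, can be sketched in pure Python as follows. This is an illustration only: the helper names are hypothetical, the image is a list of rows, and the mirror rule used at the border (the edge pixel is repeated) is an assumption that may differ from bob.sp.BorderType.Mirror.

```python
def mirror(index, size):
    """Reflect an out-of-range index back into the range [0, size)."""
    while index < 0 or index >= size:
        index = -index - 1 if index < 0 else 2 * size - index - 1
    return index

def convolve_1d(line, kernel):
    """Convolve one row or column with a symmetric kernel and mirrored borders."""
    radius = len(kernel) // 2
    size = len(line)
    return [sum(kernel[k + radius] * line[mirror(i + k, size)]
                for k in range(-radius, radius + 1))
            for i in range(size)]

def separable_filter(image, kernel_y, kernel_x):
    """Apply kernel_x along the rows, then kernel_y along the columns."""
    rows = [convolve_1d(row, kernel_x) for row in image]             # horizontal pass
    cols = [convolve_1d(list(col), kernel_y) for col in zip(*rows)]  # vertical pass
    return [list(row) for row in zip(*cols)]                         # transpose back

kernel = [0.25, 0.5, 0.25]
impulse = [[0.0, 0.0, 0.0],
           [0.0, 4.0, 0.0],
           [0.0, 0.0, 0.0]]
smoothed = separable_filter(impulse, kernel, kernel)
for row in smoothed:
    print(row)
```

Because each pass touches only 2*radius+1 pixels per output value, the cost grows linearly with the radius instead of quadratically as for a full 2D kernel.
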
kernel_x

array_like (1D, float) <– The values of the kernel in horizontal direction; read only access

kernel_y

array_like (1D, float) <– The values of the kernel in vertical direction; read only access

radius

(int, int) <– The radius of the Gaussian along the y- and x-axes (size of the kernel=2*radius+1); with read and write access

When setting the radius to a negative value, it will be automatically computed as 3*sigma.

sigma

(float, float) <– The standard deviation of the Gaussian along the y- and x-axes; with read and write access

Note

The radius of the kernel is not reset by setting the sigma value.

class bob.ip.base.GaussianScaleSpace

Bases: object

Objects of this class, after configuration, can generate Gaussian pyramids that can be used to extract SIFT features

For details, please read [Lowe2004].

Constructor Documentation:

  • bob.ip.base.GaussianScaleSpace (size, scales, octaves, octave_min, [sigma_n], [sigma0], [kernel_radius_factor], [border])
  • bob.ip.base.GaussianScaleSpace (gss)

Constructs a new GaussianScaleSpace object

Todo

Explain GaussianScaleSpace constructor in more detail.

Warning

The order of the parameters scales and octaves has changed compared to the old implementation, in order to keep it consistent with bob.ip.base.VLSIFT!

Parameters:

size : (int, int)

The height and width of the images to process

scales : int

The number of intervals of the pyramid. Three additional scales will be computed in practice, as this is required for extracting SIFT features

octaves : int

The number of octaves of the pyramid

octave_min : int

The index of the minimum octave

sigma_n : float

[default: 0.5] The value sigma_n of the standard deviation for the nominal/initial octave/scale

sigma0 : float

[default: 1.6] The value sigma0 of the standard deviation for the image of the first octave and first scale

kernel_radius_factor : float

[default: 4.] Factor used to determine the kernel radii: size=2*radius+1. For each Gaussian kernel, the radius is equal to ceil(kernel_radius_factor*sigma_{octave,scale})

border : bob.sp.BorderType

[default: bob.sp.BorderType.Mirror] The extrapolation method used by the convolution at the border

gss : bob.ip.base.GaussianScaleSpace

The GaussianScaleSpace object to use for copy-construction

Class Members:

allocate_output() → pyramid

Allocates a python list of arrays for the Gaussian pyramid

Returns:

pyramid : [array_like(3D, float)]

A list of output arrays in the size required to call process()
border

bob.sp.BorderType <– The extrapolation method used by the convolution at the border; with read and write access

get_gaussian(index) → gaussian

Returns the Gaussian at index/interval/scale i

Parameters:

index : int

The index of the scale for which the Gaussian should be retrieved

Returns:

gaussian : bob.ip.base.Gaussian

The Gaussian at the given index
kernel_radius_factor

float <– Factor used to determine the kernel radii size=2*radius+1

For each Gaussian kernel, the radius is equal to ceil(kernel_radius_factor*sigma_{octave,scale})
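
The relation between kernel_radius_factor, the per-scale sigma and the kernel size can be sketched in a few lines of plain Python. The function names are hypothetical, and the sigma values passed in are arbitrary examples rather than values computed by the library.

```python
import math

def kernel_radius(sigma, kernel_radius_factor=4.0):
    # radius = ceil(kernel_radius_factor * sigma), as stated above
    return int(math.ceil(kernel_radius_factor * sigma))

def kernel_size(sigma, kernel_radius_factor=4.0):
    # size = 2 * radius + 1
    return 2 * kernel_radius(sigma, kernel_radius_factor) + 1

print(kernel_radius(1.6), kernel_size(1.6))
```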

octave_max

int <– The index of the maximum octave, read only access

This is equal to octave_min+octaves-1.

octave_min

int <– The index of the minimum octave, with read and write access

octaves

int <– The number of octaves of the pyramid, with read and write access

process(src[, dst]) → dst

Computes a Gaussian Pyramid for an input 2D image

If given, the results are put in the output dst, which should already be allocated with the correct size (using the allocate_output() method).

Note

The __call__ function is an alias for this method.

Parameters:

src : array_like (2D)

The input image which should be processed

dst : [array_like (3D, float)]

The Gaussian pyramid that should have been allocated with allocate_output()

Returns:

dst : [array_like (3D, float)]

The resulting Gaussian pyramid, if given it will be the same as the dst parameter
scales

int <– The number of intervals of the pyramid, with read and write access

Three additional scales will be computed in practice, as this is required for extracting SIFT features
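
The following sketch lists which octave indices a pyramid covers and how many levels each octave would hold. It assumes that the three additional scales are added once per octave; the helper name pyramid_layout is hypothetical and no library call is involved.

```python
def pyramid_layout(scales, octaves, octave_min):
    """Hypothetical helper returning (octave index, number of levels) pairs."""
    octave_max = octave_min + octaves - 1
    # Assumption: the three additional scales are added to every octave.
    levels_per_octave = scales + 3
    return [(o, levels_per_octave) for o in range(octave_min, octave_max + 1)]

layout = pyramid_layout(scales=3, octaves=4, octave_min=-1)
print(layout)
```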

set_sigma0_no_init_smoothing() → None

Sets sigma0 such that there is no smoothing at the first scale of octave_min

sigma0

float <– The value sigma0 of the standard deviation for the image of the first octave and first scale

sigma_n

float <– The value sigma_n of the standard deviation for the nominal/initial octave/scale; with read and write access

size

(int, int) <– The shape of the images to process, with read and write access

class bob.ip.base.GeomNorm

Bases: object

Objects of this class, after configuration, can perform a geometric normalization of images

The geometric normalization is a combination of rotation, scaling and cropping an image.

Constructor Documentation:

  • bob.ip.base.GeomNorm (rotation_angle, scaling_factor, crop_size, crop_offset)
  • bob.ip.base.GeomNorm (other)

Constructs a GeomNorm object with the given scale, angle, size of the new image and transformation offset in the new image

When the GeomNorm is applied to an image, it is rotated and scaled such that it is visually rotated counter-clockwise (mathematically positive) by the given angle, i.e., to mimic the behavior of ImageMagick. Since the origin of the image is in the top-left corner, this means that the rotation is actually clockwise (mathematically negative). The same holds for positions (landmarks) passed to process(), which are rotated mathematically negative as well, to keep them consistent with the image.

Warning

The behavior of the landmark rotation has changed from Bob version 1.x, where the landmarks were mistakenly rotated mathematically positive.
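
To make the geometry concrete, here is a minimal sketch of how a single (y, x) position could be mapped under the description above. It is not the library code: the function name transform_position is hypothetical, and the sign convention (visually counter-clockwise for positive angles, with the y-axis pointing down) is this example's reading of the text and should be checked against the actual implementation.

```python
import math

def transform_position(position, center, rotation_angle, scaling_factor, crop_offset):
    """Hypothetical sketch; all positions are (y, x) tuples in image coordinates."""
    theta = math.radians(rotation_angle)
    c = math.cos(theta) * scaling_factor
    s = math.sin(theta) * scaling_factor
    dy = position[0] - center[0]
    dx = position[1] - center[1]
    # Assumed convention: positive angles look counter-clockwise on screen,
    # where y grows downwards; the center is moved to crop_offset.
    return (dy * c - dx * s + crop_offset[0],
            dy * s + dx * c + crop_offset[1])

# The transformation center always lands on crop_offset.
print(transform_position((5, 5), (5, 5), 30.0, 2.0, (10, 10)))
# A point to the right of the center moves above it for a 90 degree angle.
print(transform_position((5, 6), (5, 5), 90.0, 1.0, (0, 0)))
```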

Parameters:

rotation_angle : float

The rotation angle in degrees that should be applied

scaling_factor : float

The scale factor to apply

crop_size : (int, int)

The resolution of the processed images

crop_offset : (float, float)

The transformation offset in the processed images

other : GeomNorm

Another GeomNorm object to copy

Class Members:

crop_offset

(float, float) <– The transformation center in the processed image, with read and write access

crop_size

(int, int) <– The size of the processed image, with read and write access

process()
  • process(input, output, center) -> None
  • process(input, input_mask, output, output_mask, center) -> None
  • process(position, center) -> transformed

This function geometrically normalizes an image or a position in the image

The function rotates and scales the given image, or a position in image coordinates, such that the result is visually rotated and scaled with the rotation_angle and scaling_factor.

Note

The __call__ function is an alias for this method.

Parameters:

input : array_like (2D or 3D)

The input image to which GeomNorm should be applied

output : array_like (2D or 3D, float)

The output image, which must be of size crop_size

center : (float, float)

The transformation center in the given image; this will be placed to crop_offset in the output image

input_mask : array_like (bool, 2D or 3D)

An input mask of valid pixels before geometric normalization, must be of same size as input

output_mask : array_like (bool, 2D or 3D)

The output mask of valid pixels after geometric normalization, must be of same size as output

position : (float, float)

A position in input image space that will be transformed to output image space (might be outside of the crop area)

Returns:

transformed : (float, float)

The resulting position in output image space
rotation_angle

float <– The rotation angle, with read and write access

scaling_factor

float <– The scale factor, with read and write access

class bob.ip.base.GradientMagnitude

Bases: object

Gradient ‘magnitude’ used

Possible values are:

  • Magnitude: L2 magnitude over X and Y
  • MagnitudeSquare: Square of the L2 magnitude
  • SqrtMagnitude: Square root of the L2 magnitude
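
The three variants differ only in the final step applied to the L2 norm of the gradient. The sketch below computes them for a single pixel in plain Python; the function name and the use of strings instead of the enumeration values are choices made for this example only.

```python
import math

def gradient_magnitude(gx, gy, kind="Magnitude"):
    """Hypothetical helper mirroring the three enumeration values by name."""
    l2 = math.hypot(gx, gy)  # L2 magnitude over X and Y
    if kind == "Magnitude":
        return l2
    if kind == "MagnitudeSquare":
        return l2 ** 2
    if kind == "SqrtMagnitude":
        return math.sqrt(l2)
    raise ValueError("unknown magnitude type: %r" % kind)

for kind in ("Magnitude", "MagnitudeSquare", "SqrtMagnitude"):
    print(kind, gradient_magnitude(3.0, 4.0, kind))
```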

Class Members:

Magnitude = 0
MagnitudeSquare = 1
SqrtMagnitude = 2
entries = {'MagnitudeSquare': 1, 'Magnitude': 0, 'SqrtMagnitude': 2}
class bob.ip.base.HOG

Bases: object

Objects of this class, after configuration, can extract Histogram of Oriented Gradients (HOG) descriptors.

This implementation relies on the article of [Dalal2005]. A few remarks:

  • Only single channel inputs (a.k.a. grayscale) are considered. Therefore, it does not take the maximum gradient over several channels as proposed in the above article.
  • Gamma/Color normalization is not part of the descriptor computation. However, this can easily be done (using this library) before extracting the descriptors.
  • Gradients are computed using the standard 1D centered gradient (except at the borders, where the uncentered gradient [-1 1] is used). This is the method that achieved the best performance reported in the article. To limit the number of uncentered gradients, the gradients are computed on the full image prior to the cell decomposition. This implies that extra pixels at each boundary of the cell contribute to the gradients, although these pixels are not located inside the cell.
  • R-HOG blocks (rectangular) normalization is supported, but not C-HOG blocks (circular).
  • Due to the similarity with the SIFT descriptors, this can also be used to extract dense-SIFT features.
  • The first bin of each histogram is always centered around 0. This implies that the orientations are in [0-e,180-e] rather than [0,180], with e being half the angle size of a bin (same with [0,360]).
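
The last remark, about the first bin being centered around 0, can be illustrated with a small binning function. This is a simplified sketch: it assigns each orientation to exactly one bin (no interpolation between neighboring bins, which is an assumption), and the name orientation_bin is hypothetical.

```python
def orientation_bin(angle_deg, bins=8, full_orientation=False):
    """Hypothetical hard-assignment binning with bin 0 centered on 0 degrees."""
    span = 360.0 if full_orientation else 180.0
    width = span / bins
    # Shift by half a bin (the 'e' above) so bin 0 covers [-e, e) modulo span.
    shifted = (angle_deg + width / 2.0) % span
    return int(shifted // width)

print(orientation_bin(0.0), orientation_bin(179.0), orientation_bin(90.0))
```

With 8 bins over 180 degrees, e is 11.25 degrees, so an orientation of 179 degrees wraps around into bin 0.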

Constructor Documentation:

  • bob.ip.base.HOG (image_size, [bins], [full_orientation], [cell_size], [cell_overlap], [block_size], [block_overlap])
  • bob.ip.base.HOG (hog)

Constructs a new HOG extractor

Parameters:

image_size : (int, int)

The size of the input image to process.

bins : int

[default: 8] Dimensionality of a cell descriptor (i.e. the number of bins)

full_orientation : bool

[default: False] Whether the range [0,360] is used or only [0,180]

cell_size : (int, int)

[default: (4,4)] The size of a cell.

cell_overlap : (int, int)

[default: (0,0)] The overlap between cells.

block_size : (int, int)

[default: (4,4)] The size of a block (in terms of cells).

block_overlap : (int, int)

[default: (0,0)] The overlap between blocks (in terms of cells).

hog : bob.ip.base.HOG

Another HOG object to copy

Class Members:

bins

int <– Dimensionality of a cell descriptor (i.e. the number of bins), with read and write access

block_norm

bob.ip.base.BlockNorm <– The type of norm used for normalizing blocks, with read and write access

block_norm_eps

float <– Epsilon value used to avoid division by zeros when normalizing the blocks, read and write access

block_norm_threshold

float <– Threshold used to perform the clipping during the block normalization, with read and write access

block_overlap

(int, int) <– Overlap between blocks (in terms of cells), with read and write access

block_size

(int, int) <– Size of a block (in terms of cells), with read and write access

cell_overlap

(int, int) <– Overlap between cells, with read and write access

cell_size

(int, int) <– Size of a cell, with read and write access

compute_histogram(magnitude, orientation[, histogram]) → histogram

Computes a Histogram of Gradients for a given ‘cell’

The inputs are the gradient magnitudes and the orientations for each pixel of the cell

Parameters:

magnitude : array_like (2D, float)

The input array with the gradient magnitudes

orientation : array_like (2D, float)

The input array with the orientations

histogram : array_like (1D, float)

[default = None] If given, the result will be written to this histogram; must be of size bins

Returns:

histogram : array_like (1D, float)

The resulting histogram; same as input histogram, if given
disable_block_normalization() → None

Disable block normalization

This is performed by setting parameters such that the cells are not further processed, i.e.:

extract(input[, output]) → output

Extract the HOG descriptors

This extracts HOG descriptors from the input image. The output is 3D, the first two dimensions being the y- and x- indices of the block, and the last one the index of the bin (among the concatenated cell histograms for this block).

Note

The __call__ function is an alias for this method.

Parameters:

input : array_like (2D)

The input image to extract HOG features from

output : array_like (3D, float)

[default: None] If given, the container to extract the HOG features to; must be of size output_shape()

Returns:

output : array_like (3D, float)

The resulting HOG features, same as parameter output, if given
full_orientation

bool <– Whether the range [0,360] is used or not ([0,180] otherwise), with read and write access

image_size

(int, int) <– The size of the input image to process., with read and write access

magnitude_type

bob.ip.base.GradientMagnitude <– Type of the magnitude to consider for the descriptors, with read and write access

output_shape() → shape

Gets the descriptor output size given the current parameters and size

In detail, it returns (number of blocks along Y, number of blocks along X, number of bins)

Returns:

shape : (int, int, int)

The shape of the output array required to call extract()
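
A back-of-the-envelope version of this shape computation is sketched below. It assumes that cells and blocks are laid out as sliding windows, that windows which do not fit completely are dropped, and that the last dimension is the number of concatenated cell bins per block; the helper names are hypothetical and the result may differ from output_shape() in edge cases.

```python
def n_windows(length, size, overlap):
    """Number of windows of `size` with the given overlap that fit in `length`."""
    return (length - size) // (size - overlap) + 1

def hog_output_shape(image_size, bins=8, cell_size=(4, 4), cell_overlap=(0, 0),
                     block_size=(4, 4), block_overlap=(0, 0)):
    """Hypothetical estimate of (blocks along Y, blocks along X, bins per block)."""
    cells = [n_windows(image_size[i], cell_size[i], cell_overlap[i]) for i in range(2)]
    blocks = [n_windows(cells[i], block_size[i], block_overlap[i]) for i in range(2)]
    return (blocks[0], blocks[1], block_size[0] * block_size[1] * bins)

print(hog_output_shape((16, 16)))
print(hog_output_shape((32, 48), block_overlap=(2, 2)))
```
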
class bob.ip.base.LBP

Bases: object

A class that extracts local binary patterns in various types

The implementation is based on [Atanasoaei2012], where all the different types of LBP features are defined in more detail.

Constructor Documentation:

  • bob.ip.base.LBP (neighbors, [radius], [circular], [to_average], [add_average_bit], [uniform], [rotation_invariant], [elbp_type], [border_handling])
  • bob.ip.base.LBP (neighbors, radius_y, radius_x, [circular], [to_average], [add_average_bit], [uniform], [rotation_invariant], [elbp_type], [border_handling])
  • bob.ip.base.LBP (neighbors, block_size, [block_overlap], [to_average], [add_average_bit], [uniform], [rotation_invariant], [elbp_type], [border_handling])
  • bob.ip.base.LBP (lbp)
  • bob.ip.base.LBP (hdf5)

Creates an LBP extractor with the given parametrization

Basically, the LBP configuration can be split into three parts.

  1. Which pixels are compared how:
    • The number of neighbors (might be 4, 8 or 16)
    • Circular or rectangular offset positions around the center, or even Multi-Block LBP (MB-LBP)
    • Compare the pixels to the center pixel or to the average
  2. How to generate the bit strings from the pixels (this is handled by the elbp_type parameter):
    • 'regular': Choose one bit for each comparison of the neighboring pixel with the central pixel
    • 'transitional': Compare only the neighboring pixels and skip the central one
    • 'direction-coded': Compute a 2-bit code for four directions
  3. How to cluster the generated bit strings to compute the final LBP code:
    • uniform: Only uniform LBP codes (with at most two bit-changes between 0 and 1) are considered; all other strings are combined into one LBP code
    • rotation_invariant: Rotation invariant LBP codes are generated, e.g., bit strings 00110000 and 00000110 will lead to the same LBP code

This clustering is done using a look-up-table, which you can also set yourself using the look_up_table attribute. The maximum code that will be generated can be read from the max_label attribute.
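
The two clustering criteria can be checked on bit strings directly. The sketch below counts bit changes circularly (the usual definition, assumed here) and uses the smallest circular rotation as a representative for rotation invariance. The helper names are hypothetical, and the actual label values stored in look_up_table are not reproduced.

```python
def transitions(bits):
    """Count circular 0/1 changes in a bit string such as '00111000'."""
    return sum(bits[i] != bits[(i + 1) % len(bits)] for i in range(len(bits)))

def is_uniform(bits):
    # Uniform patterns have at most two bit changes.
    return transitions(bits) <= 2

def rotation_invariant_key(bits):
    """Smallest circular rotation; strings with equal keys share one code."""
    return min(bits[i:] + bits[:i] for i in range(len(bits)))

print(is_uniform("00111000"), is_uniform("00110011"))
print(rotation_invariant_key("00110000") == rotation_invariant_key("00000110"))
```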

Finally, the border handling of the image can be selected. With the 'shrink' option, no LBP code is computed for the border pixels and the resulting image is 2*radius or 3*block_size - 1 pixels smaller in both directions, see lbp_shape(). The 'wrap' option will wrap around the border and no truncation is performed.

Note

To compute MB-LBP features, it is possible to compute an integral image beforehand to speed up the calculation.

Parameters:

neighbors : int

The number of neighboring pixels that should be taken into account; possible values: 4, 8, 16

radius : float

[default: 1.] The radius of the LBP in both vertical and horizontal direction together

radius_y, radius_x : float

The radius of the LBP in both vertical and horizontal direction separately

block_size : (int, int)

If set, multi-block LBP’s with the given block size will be extracted

block_overlap : (int, int)

[default: (0, 0)] Multi-block LBP’s with the given block overlap will be extracted

circular : bool

[default: False] Extract neighbors on a circle or on a square?

to_average : bool

[default: False] Compare the neighbors to the average of the pixels instead of the central pixel?

add_average_bit : bool

[default: False] (only useful if to_average is True) Add another bit to compare the central pixel to the average of the pixels?

uniform : bool

[default: False] Extract uniform LBP features?

rotation_invariant : bool

[default: False] Extract rotation invariant LBP features?

elbp_type : str

[default: 'regular'] Which type of LBP codes should be computed; possible values: (‘regular’, ‘transitional’, ‘direction-coded’), see elbp_type

border_handling : str

[default: 'shrink'] How should the borders of the image be treated; possible values: (‘shrink’, ‘wrap’), see border_handling

lbp : bob.ip.base.LBP

Another LBP object to copy

hdf5 : bob.io.base.HDF5File

An HDF5 file to read the LBP configuration from

Class Members:

add_average_bit

bool <– Should the bit for the comparison of the central pixel with the average be added as well (read and write access)?

block_overlap

(int, int) <– The block overlap in both vertical and horizontal direction of the Multi-Block-LBP extractor, with read and write access

Note

The block_overlap must be smaller than the block_size. To set both the block size and the block overlap at the same time, use the set_block_size_and_overlap() function.

block_size

(int, int) <– The block size in both vertical and horizontal direction of the Multi-Block-LBP extractor, with read and write access

border_handling

str <– The type of border handling that should be applied (read and write access)

Possible values are: (‘shrink’, ‘wrap’)

circular

bool <– Should circular or rectangular LBP’s be extracted (read and write access)?

elbp_type

str <– The type of LBP bit string that should be extracted (read and write access)

Possible values are: (‘regular’, ‘transitional’, ‘direction-coded’)

extract()
  • extract(input, [is_integral_image]) -> output
  • extract(input, position, [is_integral_image]) -> code
  • extract(input, output, [is_integral_image]) -> None

This function extracts the LBP features from an image

LBP features can be extracted either for the whole image, or at a single location in the image. When MB-LBP features will be extracted, an integral image will be computed to speed up the calculation. The integral image calculation can be done before this function is called, and the integral image can be passed to this function directly. In this case, please set the is_integral_image parameter to True.

Note

The __call__ function is an alias for this method.

Parameters:

input : array_like (2D)

The input image for which LBP features should be extracted

position : (int, int)

The position in the input image where the LBP code should be extracted; make sure that the position lies inside the valid range defined by offset

output : array_like (2D, uint16)

The output image, which needs to be of shape lbp_shape()

is_integral_image : bool

[default: False] Is the given input image an integral image?

Returns:

output : array_like (2D, uint16)

The resulting image of LBP codes

code : uint16

The resulting LBP code at the given position in the image
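
For orientation, the sketch below computes a 'regular' 8-neighbor code with a rectangular radius of 1 at one position of a nested-list image. The neighbor ordering, the bit order and the use of >= for the comparison are assumptions of this example, so the numeric codes need not match those returned by extract().

```python
def lbp8_code(image, y, x):
    """Hypothetical regular 8-neighbor LBP code at position (y, x)."""
    center = image[y][x]
    # Assumed neighbor order: clockwise, starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for dy, dx in offsets:
        # Assumption: a neighbor that is >= the center contributes a 1 bit.
        code = (code << 1) | int(image[y + dy][x + dx] >= center)
    return code

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
print(lbp8_code(image, 1, 1))  # bits 00011110
```
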
is_multi_block_lbp

bool <– Is the current configuration of the LBP extractor set up to extract Multi-Block LBP’s (read access only)?

lbp_shape()
  • lbp_shape(input, is_integral_image) -> lbp_shape
  • lbp_shape(shape, is_integral_image) -> lbp_shape

This function returns the shape of the LBP image for the given image

In case the border_handling is 'shrink' the image resolution will be reduced, depending on the LBP configuration. This function will return the desired output shape for the given input image or input shape.

Parameters:

input : array_like (2D)

The input image for which LBP features should be extracted

shape : (int, int)

The shape of the input image for which LBP features should be extracted

is_integral_image : bool

[default: False] Is the given image (shape) an integral image?

Returns:

lbp_shape : (int, int)

The shape of the LBP image that is required in a call to extract()
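
For the 'shrink' border handling, the reduction described in the constructor documentation (2*radius, or 3*block_size - 1 for multi-block LBP) can be sketched as follows. The helper assumes an integer radius that is identical in both directions and ignores integral images; its name is hypothetical.

```python
def shrunk_shape(shape, radius=None, block_size=None):
    """Hypothetical estimate of the LBP image shape for 'shrink' borders."""
    if block_size is not None:
        # Multi-block LBP: 3*block_size - 1 pixels are lost per direction.
        return tuple(s - (3 * b - 1) for s, b in zip(shape, block_size))
    # Regular LBP: 2*radius pixels are lost per direction.
    return tuple(s - 2 * radius for s in shape)

print(shrunk_shape((10, 12), radius=1))
print(shrunk_shape((10, 12), block_size=(2, 3)))
```
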
load(hdf5) → None

Loads the parametrization of the LBP extractor from the given HDF5 file

Parameters:

hdf5 : bob.io.base.HDF5File

An HDF5 file opened for reading
look_up_table

array_like (1D, uint16) <– The look up table that defines, which bit string is converted into which LBP code (read and write access)

Depending on the values of uniform and rotation_invariant, bit strings might be converted into different LBP codes. Since this attribute is writable, you can define a look-up-table for LBP codes yourself.

Warning

For the time being, the look up tables are not saved by the save() function!

max_label

int <– The number of different LBP codes that are extracted (read access only)

The codes themselves are uint16 numbers in the range [0, max_label - 1]. Depending on the values of uniform and rotation_invariant, bit strings might be converted into different LBP codes.

offset

(int, int) <– The offset in the image, where the first LBP code can be extracted (read access only)

Note

When extracting LBP features from an image with a specific shape, positions might be in range [offset, shape - offset[ only. Otherwise, an exception will be raised.

points

int <– The number of neighbors (usually 4, 8 or 16), with read and write access

radii

(float, float) <– The radii in both vertical and horizontal direction of the elliptical or rectangular LBP extractor, with read and write access

radius

float <– The radius of the round or square LBP extractor, with read and write access

relative_positions

array_like (2D, float) <– The list of neighbor positions, with which the central pixel is compared (read access only)

The list is defined as relative positions, where the central pixel is considered to be at (0, 0).

rotation_invariant

bool <– Should rotation invariant LBP patterns be extracted (read and write access)?

Rotation invariant LBP codes collect all patterns that have the same bit string up to circular shifts. Hence, 00111000 and 10000011 will result in the same LBP code.

save(hdf5) → None

Saves the parametrization of the LBP extractor to the given HDF5 file

Warning

For the time being, the look-up-table is not saved. If you have set the look_up_table by hand, it is lost.

Parameters:

hdf5 : bob.io.base.HDF5File

An HDF5 file open for writing
set_block_size_and_overlap(block_size, block_overlap) → None

This function sets the block size and the block overlap for MB-LBP features at the same time

Parameters:

block_size : (int, int)

Multi-block LBP’s with the given block size will be extracted

block_overlap : (int, int)

Multi-block LBP’s with the given block overlap will be extracted
to_average

bool <– Should the neighboring pixels be compared with the average of all pixels, or to the central one (read and write access)?

uniform

bool <– Should uniform LBP patterns be extracted (read and write access)?

Uniform LBP patterns are those bit strings, where only up to two changes from 0 to 1 and vice versa are allowed. Hence, 00111000 is a uniform pattern, while 00110011 is not. All non-uniform bit strings will be collected in a single LBP code.

class bob.ip.base.LBPTop

Bases: object

A class that extracts local binary patterns (LBP) in three orthogonal planes (TOP)

The LBPTop class is designed to calculate the LBP-Top coefficients given a set of images. The workflow is as follows:

Todo

UPDATE as this is not true

  1. You initialize the class, defining the radius and number of points in each of the three directions: XY, XT, YT for the LBP calculations
  2. For each image you have in the frame sequence, you push into the class
  3. An internal FIFO queue (length = radius in T direction) keeps track of the current image and their order. As a new image is pushed in, the oldest on the queue is pushed out.
  4. After pushing an image, you read the current LBP-Top coefficients and may save it somewhere.

Constructor Documentation:

bob.ip.base.LBPTop (xy, xt, yt)

Constructs a new LBPTop object

For all three directions, the LBP objects need to be specified. The radii for the three LBP classes must be consistent, i.e., xy.radii[1] == xt.radii[1], xy.radii[0] == yt.radii[1] and xt.radii[0] == yt.radii[0].
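
The consistency condition can be written as a small predicate. The sketch below works on plain (radius_y, radius_x) tuples, ordered like the radii attribute, instead of real LBP objects; the function name is hypothetical.

```python
def radii_are_consistent(xy_radii, xt_radii, yt_radii):
    """Each argument is an index-0/index-1 pair, ordered like LBP.radii."""
    return (xy_radii[1] == xt_radii[1]
            and xy_radii[0] == yt_radii[1]
            and xt_radii[0] == yt_radii[0])

print(radii_are_consistent((3, 2), (1, 2), (1, 3)))  # consistent
print(radii_are_consistent((3, 2), (1, 3), (1, 3)))  # xy and xt disagree
```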

Warning

The order of the radius parameters in the LBP constructor is not (radius_x, radius_y), but (radius_y, radius_x). Hence, to get an x radius of 2 and a y radius of 3, you need to use xy = bob.ip.base.LBP(8, 3, 2) or, more explicitly, xy = bob.ip.base.LBP(8, radius_x=2, radius_y=3). The same applies to xt and yt.

Parameters:

xy : bob.ip.base.LBP

The 2D LBP-XY plane configuration

xt : bob.ip.base.LBP

The 2D LBP-XT plane configuration

yt : bob.ip.base.LBP

The 2D LBP-YT plane configuration

Class Members:

process(input, xy, xt, yt) → None

This function processes the given set of images and extracts the three orthogonal planes

The given 3D input array represents a set of gray-scale images; the three calculated LBP planes are returned by argument. The 3D array has to be arranged in this way:

  1. First dimension: time
  2. Second dimension: frame height
  3. Third dimension: frame width

The central pixel is the point where the LBP planes intersect/have to be calculated from.

Parameters:

input : array_like (3D)

The input set of gray-scale images for which LBPTop features should be extracted

xy, xt, yt : array_like (3D, uint16)

The result of the LBP operator in the XY, XT and YT plane (frame), for the central frame of the input array
xt

bob.ip.base.LBP <– The 2D LBP-XT plane configuration

xy

bob.ip.base.LBP <– The 2D LBP-XY plane configuration

yt

bob.ip.base.LBP <– The 2D LBP-YT plane configuration

class bob.ip.base.MultiscaleRetinex

Bases: object

This class allows after configuration to apply the Multiscale Retinex algorithm to images

More information about this algorithm can be found in [Jobson1997].

Constructor Documentation:

  • bob.ip.base.MultiscaleRetinex ([scales], [size_min], [size_step], [sigma], [border])
  • bob.ip.base.MultiscaleRetinex (msrx)

Creates a MultiscaleRetinex object

Todo

Add documentation for MultiscaleRetinex

Parameters:

scales : int

[default: 1] The number of scales (bob.ip.base.Gaussian)

size_min : int

[default: 1] The radius of the kernel of the smallest bob.ip.base.Gaussian

size_step : int

[default: 1] The step used to set the kernel size of the other Gaussians: size_s = 2 * (size_min + s * size_step) + 1

sigma : double

[default: 2.] The standard deviation of the kernel of the smallest Gaussian; other sigmas: sigma_s = sigma * (size_min + s * size_step) / size_min

border : bob.sp.BorderType

[default: bob.sp.BorderType.Mirror] The extrapolation method used by the convolution at the border

msrx : bob.ip.base.MultiscaleRetinex

The MultiscaleRetinex object to use for copy-construction
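
The formulas given for size_step and sigma determine one kernel size and one standard deviation per scale. The sketch below evaluates them for s = 0 .. scales-1 in plain Python; the function name is hypothetical and nothing is taken from the library beyond the two formulas.

```python
def scale_parameters(scales=1, size_min=1, size_step=1, sigma=2.0):
    """Hypothetical helper returning (kernel size, sigma) for every scale s."""
    params = []
    for s in range(scales):
        radius = size_min + s * size_step
        size_s = 2 * radius + 1               # size_s = 2*(size_min + s*size_step) + 1
        sigma_s = sigma * radius / size_min   # sigma_s = sigma*(size_min + s*size_step)/size_min
        params.append((size_s, sigma_s))
    return params

print(scale_parameters(scales=3))
```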

Class Members:

border

bob.sp.BorderType <– The extrapolation method used by the convolution at the border; with read and write access

process(src[, dst]) → dst

Applies the Multiscale Retinex algorithm to an image (2D/grayscale or 3D/color) of type uint8, uint16 or double

Todo

Check if this documentation is correct (seems to be copied from bob.ip.base.SelfQuotientImage)

If given, the dst array should have the type float and the same size as the src array.

Note

The __call__ function is an alias for this method.

Parameters:

src : array_like (2D)

The input image which should be processed

dst : array_like (2D, float)

[default: None] If given, the output will be saved into this image; must be of the same shape as src

Returns:

dst : array_like (2D, float)

The resulting output image, which is the same as dst (if given)
scales

int <– The number of scales (Gaussian); with read and write access

sigma

float <– The variance of the kernel of the smallest Gaussian (variance_s = sigma^2 * (size_min+s*size_step)/size_min); with read and write access

size_min

int <– The radius (size=2*radius+1) of the kernel of the smallest Gaussian; with read and write access

size_step

int <– The step used to set the kernel size of the other Gaussians (size_s=2*(size_min+s*size_step)+1); with read and write access

class bob.ip.base.SIFT

Bases: object

This class allows after configuration the extraction of SIFT descriptors

For details, please read [Lowe2004].

Constructor Documentation:

  • bob.ip.base.SIFT (size, scales, octaves, octave_min, [sigma_n], [sigma0], [contrast_thres], [edge_thres], [norm_thres], [kernel_radius_factor], [border])
  • bob.ip.base.SIFT (sift)

Creates an object that allows the extraction of SIFT descriptors

Todo

Explain SIFT constructor in more detail.

Warning

The order of the parameters scales and octaves has changed compared to the old implementation, in order to keep it consistent with bob.ip.base.VLSIFT!

Parameters:

size : (int, int)

The height and width of the images to process

scales : int

The number of intervals of the pyramid. Three additional scales will be computed in practice, as this is required for extracting SIFT features

octaves : int

The number of octaves of the pyramid

octave_min : int

The index of the minimum octave

sigma_n : float

[default: 0.5] The value sigma_n of the standard deviation for the nominal/initial octave/scale

sigma0 : float

[default: 1.6] The value sigma0 of the standard deviation for the image of the first octave and first scale

contrast_thres : float

[default: 0.03] The contrast threshold used during keypoint detection

edge_thres : float

[default: 10.] The edge threshold used during keypoint detection

norm_thres : float

[default: 0.2] The norm threshold used during descriptor normalization

kernel_radius_factor : float

[default: 4.] Factor used to determine the kernel radii: size=2*radius+1. For each Gaussian kernel, the radius is equal to ceil(kernel_radius_factor*sigma_{octave,scale})

border : bob.sp.BorderType

[default: bob.sp.BorderType.Mirror] The extrapolation method used by the convolution at the border

sift : bob.ip.base.SIFT

The SIFT object to use for copy-construction
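
The role of norm_thres can be illustrated with the normalization scheme described by Lowe: normalize the descriptor to unit length, clip large components at the threshold, and normalize again. Whether bob.ip.base.SIFT performs exactly these steps is an assumption; the function name and the eps guard against division by zero are hypothetical.

```python
import math

def normalize_descriptor(values, norm_thres=0.2, eps=1e-10):
    """Hypothetical sketch: unit-normalize, clip at norm_thres, renormalize."""
    def unit(v):
        norm = math.sqrt(sum(x * x for x in v))
        return [x / (norm + eps) for x in v]
    clipped = [min(x, norm_thres) for x in unit(values)]
    return unit(clipped)

descriptor = normalize_descriptor([10.0, 1.0, 1.0, 1.0])
print(descriptor)
```

Clipping reduces the influence of a single dominant gradient bin: the first component was ten times larger than the others before normalization and is only about twice as large afterwards.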

Class Members:

bins

int <– The number of bins for the descriptor, with read and write access

blocks

int <– The number of blocks for the descriptor, with read and write access

border

bob.sp.BorderType <– The extrapolation method used by the convolution at the border; with read and write access

compute_descriptor(src, keypoints[, dst]) → dst

Computes SIFT descriptor for a 2D/grayscale image, at the given keypoints

If given, the results are put in the output dst, which should be of type float and allocated with the shape returned by output_shape().

Note

The __call__ function is an alias for this method.

Parameters:

src : array_like (2D)

The input image which should be processed

keypoints : [bob.ip.base.GSSKeypoint]

The keypoints at which the descriptors should be computed

dst : [array_like (4D, float)]

The descriptors that should have been allocated in size output_shape()

Returns:

dst : [array_like (4D, float)]

The resulting descriptors, if given it will be the same as the dst parameter
contrast_threshold

float <– The contrast threshold used during keypoint detection

edge_threshold

float <– The edge threshold used during keypoint detection

gaussian_window_size

float <– The Gaussian window size for the descriptor

kernel_radius_factor

float <– Factor used to determine the kernel radii size=2*radius+1

For each Gaussian kernel, the radius is equal to ceil(kernel_radius_factor*sigma_{octave,scale})

magnif

float <– The magnification factor for the descriptor

norm_epsilon

float <– The epsilon value used during descriptor normalization

norm_threshold

float <– The norm threshold used during descriptor normalization

octave_max

int <– The index of the maximum octave, read only access

This is equal to octave_min+octaves-1.

octave_min

int <– The index of the minimum octave, with read and write access

octaves

int <– The number of octaves of the pyramid, with read and write access

output_shape(keypoints) → shape

Returns the output shape for the given number of input keypoints

Parameters:

keypoints : int

The number of keypoints that you want to retrieve SIFT features for

Returns:

shape : (int, int, int, int)

The shape of the output array required to call compute_descriptor()
scales

int <– The number of intervals of the pyramid, with read and write access

Three additional scales will be computed in practice, as this is required for extracting SIFT features
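
As a rough illustration of how octave_min, octaves and scales relate, the sketch below lists the octave indices and the number of scales computed per octave. The helper name is hypothetical, and it assumes that the three additional scales are added to every octave.

```python
def pyramid_layout(octave_min, octaves, scales):
    """Return (octave_max, octave_indices, scales_per_octave).

    octave_max = octave_min + octaves - 1 (as documented);
    scales + 3 scales per octave is an assumption of this sketch.
    """
    octave_max = octave_min + octaves - 1
    octave_indices = list(range(octave_min, octave_max + 1))
    scales_per_octave = scales + 3
    return octave_max, octave_indices, scales_per_octave

print(pyramid_layout(octave_min=-1, octaves=5, scales=3))
```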

set_sigma0_no_init_smoothing() → None

Sets sigma0 such that there is no smoothing at the first scale of octave_min

sigma0

float <– The value sigma0 of the standard deviation for the image of the first octave and first scale

sigma_n

float <– The value sigma_n of the standard deviation for the nominal/initial octave/scale; with read and write access

size

(int, int) <– The shape of the images to process, with read and write access

class bob.ip.base.SelfQuotientImage

Bases: object

After configuration, objects of this class can apply the Self Quotient Image algorithm to images

Details of the Self Quotient Image algorithm are described in [Wang2004].

Constructor Documentation:

  • bob.ip.base.SelfQuotientImage ([scales], [size_min], [size_step], [sigma], [border])
  • bob.ip.base.SelfQuotientImage (sqi)

Creates an object to preprocess images with the Self Quotient Image algorithm

Todo

explain SelfQuotientImage constructor

Warning

Compared to the last Bob version, here the sigma parameter is the standard deviation and not the variance. This implies that the WeightedGaussian pyramid is different, see https://gitlab.idiap.ch/bob/bob.ip.base/issues/1.

Parameters:

scales : int

[default: 1] The number of scales (bob.ip.base.WeightedGaussian)

size_min : int

[default: 1] The radius of the kernel of the smallest bob.ip.base.WeightedGaussian

size_step : int

[default: 1] The step used to set the kernel size of other weighted Gaussians: size_s = 2 * (size_min + s * size_step) + 1

sigma : double

[default: math.sqrt(2.)] The standard deviation of the kernel of the smallest weighted Gaussian; other sigmas: sigma_s = sigma * (size_min + s * size_step) / size_min

border : bob.sp.BorderType

[default: bob.sp.BorderType.Mirror] The extrapolation method used by the convolution at the border

sqi : bob.ip.base.SelfQuotientImage

The SelfQuotientImage object to use for copy-construction
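
The kernel sizes and standard deviations that follow from size_min, size_step and sigma can be tabulated with a short sketch. The function name is hypothetical, and it assumes that the scale index s runs from 0 to scales - 1; the two formulas are the ones given in the parameter descriptions above.

```python
import math

def sqi_kernel_schedule(scales=1, size_min=1, size_step=1, sigma=math.sqrt(2.0)):
    """Return a list of (size_s, sigma_s), one entry per weighted Gaussian.

    size_s  = 2 * (size_min + s * size_step) + 1
    sigma_s = sigma * (size_min + s * size_step) / size_min
    """
    schedule = []
    for s in range(scales):  # assumption: s = 0 .. scales - 1
        radius = size_min + s * size_step
        schedule.append((2 * radius + 1, sigma * radius / size_min))
    return schedule

for size_s, sigma_s in sqi_kernel_schedule(scales=3, sigma=1.0):
    print(size_s, sigma_s)
```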

Class Members:

border

bob.sp.BorderType <– The extrapolation method used by the convolution at the border; with read and write access

process(src[, dst]) → dst

Applies the Self Quotient Image algorithm to an image (2D/grayscale or 3D/color) of type uint8, uint16 or double

If given, the dst array should have the type float and the same size as the src array.

Note

The __call__ function is an alias for this method.

Parameters:

src : array_like (2D)

The input image which should be processed

dst : array_like (2D, float)

[default: None] If given, the output will be saved into this image; must be of the same shape as src

Returns:

dst : array_like (2D, float)

The resulting output image, which is the same as dst (if given)
scales

int <– The number of scales (Weighted Gaussian); with read and write access

sigma

float <– The standard deviation of the kernel of the smallest weighted Gaussian (sigma_s = sigma * (size_min+s*size_step)/size_min); with read and write access

size_min

int <– The radius (size=2*radius+1) of the kernel of the smallest weighted Gaussian; with read and write access

size_step

int <– The step used to set the kernel size of other Weighted Gaussians (size_s=2*(size_min+s*size_step)+1); with read and write access

class bob.ip.base.TanTriggs

Bases: object

Objects of this class, after configuration, can preprocess images

It does this using the method described by Tan and Triggs in the paper [TanTriggs2007].

Constructor Documentation:

  • bob.ip.base.TanTriggs ([gamma], [sigma0], [sigma1], [radius], [threshold], [alpha], [border])
  • bob.ip.base.TanTriggs (tan_triggs)

Constructs a new Tan and Triggs filter

Todo

Explain TanTriggs constructor in more detail.

Parameters:

gamma : float

[default: 0.2] The value of gamma for the gamma correction

sigma0 : float

[default: 1.] The standard deviation of the inner Gaussian

sigma1 : float

[default: 2.] The standard deviation of the outer Gaussian

radius : int

[default: 2] The radius of the Difference of Gaussians filter along both axes (size of the kernel=2*radius+1)

threshold : float

[default: 10.] The threshold used for the contrast equalization

alpha : float

[default: 0.1] The alpha value used for the contrast equalization

border : bob.sp.BorderType

[default: bob.sp.BorderType.Mirror] The extrapolation method used by the convolution at the border

tan_triggs : bob.ip.base.TanTriggs

The TanTriggs object to use for copy-construction
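
To give an idea of what a Difference of Gaussians kernel of size 2*radius+1 looks like, here is a minimal sketch in plain Python. It assumes that each Gaussian is normalized to sum to one and that the kernel is the inner Gaussian minus the outer one; the library's exact normalization may differ, and the function names are hypothetical.

```python
import math

def gaussian_kernel_2d(radius, sigma):
    """Square Gaussian kernel of size 2*radius+1, normalized to sum to 1."""
    size = 2 * radius + 1
    kernel = [[math.exp(-((y - radius) ** 2 + (x - radius) ** 2) / (2.0 * sigma ** 2))
               for x in range(size)] for y in range(size)]
    total = sum(sum(row) for row in kernel)
    return [[value / total for value in row] for row in kernel]

def dog_kernel(radius=2, sigma0=1.0, sigma1=2.0):
    """Difference of Gaussians: inner (sigma0) minus outer (sigma1); an assumption."""
    inner = gaussian_kernel_2d(radius, sigma0)
    outer = gaussian_kernel_2d(radius, sigma1)
    return [[a - b for a, b in zip(row_in, row_out)]
            for row_in, row_out in zip(inner, outer)]

kernel = dog_kernel()
print(len(kernel), len(kernel[0]))  # kernel size is 2*radius+1 along both axes
```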

Class Members:

alpha

float <– The alpha value used for the contrast equalization, with read and write access

border

bob.sp.BorderType <– The extrapolation method used by the convolution at the border, with read and write access

gamma

float <– The value of gamma for the gamma correction, with read and write access

kernel

array_like (2D, float) <– The values of the DoG filter; read only access

process(input[, output]) → output

Preprocesses a 2D/grayscale image using the algorithm from Tan and Triggs.

The input array is a 2D array/grayscale image. The destination array, if given, should be a 2D array of type float64 and allocated in the same size as the input. If the destination array is not given, it is generated in the required size.

Note

The __call__ function is an alias for this method.

Parameters:

input : array_like (2D)

The input image which should be normalized

output : array_like (2D, float)

[default: None] If given, the output will be saved into this image; must be of the same shape as input

Returns:

output : array_like (2D, float)

The resulting output image, which is the same as output (if given)
radius

int <– The radius of the Difference of Gaussians filter along both axes (size of the kernel=2*radius+1)

sigma0

float <– The standard deviation of the inner Gaussian, with read and write access

sigma1

float <– The standard deviation of the outer Gaussian, with read and write access

threshold

float <– The threshold used for the contrast equalization, with read and write access

class bob.ip.base.VLDSIFT

Bases: object

Computes dense SIFT features using the VLFeat library

For details, please read [Lowe2004].

Constructor Documentation:

  • bob.ip.base.VLDSIFT (size, [step], [block_size])
  • bob.ip.base.VLDSIFT (sift)

Creates an object that allows the extraction of VLDSIFT descriptors

Todo

Explain VLDSIFT constructor in more detail.

Parameters:

size : (int, int)

The height and width of the images to process

step : (int, int)

[default: (5, 5)] The step along the y- and x-axes

block_size : (int, int)

[default: (5, 5)] The block size along the y- and x-axes

sift : bob.ip.base.VLDSIFT

The VLDSIFT object to use for copy-construction

Class Members:

block_size

(int, int) <– The block size in both directions, with read and write access

extract(src[, dst]) → dst

Computes the dense SIFT features from an input image, using the VLFeat library

If given, the results are put in the output dst, which should be of type float and allocated with the shape returned by the output_shape() method.

Todo

Describe the output of the VLDSIFT.extract() method in more detail.

Note

The __call__ function is an alias for this method.

Parameters:

src : array_like (2D, float32)

The input image which should be processed

dst : [array_like (2D, float32)]

The descriptors that should have been allocated in size output_shape()

Returns:

dst : array_like (2D, float32)

The resulting descriptors, if given it will be the same as the dst parameter
output_shape() → shape

Returns the output shape for the current setup

The output shape is a 2-element tuple consisting of the number of keypoints for the current size, and the size of the descriptors

Returns:

shape : (int, int)

The shape of the output array required to call extract()
size

(int, int) <– The shape of the images to process, with read and write access

step

(int, int) <– The step along both directions, with read and write access

use_flat_window

bool <– Whether to use a flat window or not (to boost the processing time), with read and write access

window_size

float <– The window size, with read and write access

class bob.ip.base.VLSIFT

Bases: object

Computes SIFT features using the VLFeat library

For details, please read [Lowe2004].

Constructor Documentation:

  • bob.ip.base.VLSIFT (size, scales, octaves, octave_min, [peak_thres], [edge_thres], [magnif])
  • bob.ip.base.VLSIFT (sift)

Creates an object that allows the extraction of VLSIFT descriptors

Todo

Explain VLSIFT constructor in more detail.

Parameters:

size : (int, int)

The height and width of the images to process

scales : int

The number of intervals in each octave

octaves : int

The number of octaves of the pyramid

octave_min : int

The index of the minimum octave

peak_thres : float

[default: 0.03] The peak threshold (minimum amount of contrast to accept a keypoint)

edge_thres : float

[default: 10.] The edge rejection threshold used during keypoint detection

magnif : float

[default: 3.] The magnification factor (descriptor size is determined by multiplying the keypoint scale by this factor)

sift : bob.ip.base.VLSIFT

The VLSIFT object to use for copy-construction

Class Members:

edge_threshold

float <– The edge rejection threshold used during keypoint detection, with read and write access

extract(src[, keypoints]) → dst

Computes the SIFT features from an input image

A keypoint is specified by a 3- or 4-tuple (y, x, sigma, [orientation]), stored as one row of the given keypoints parameter. If the keypoints are not given, they are detected first. It returns a list of descriptors, one for each keypoint and orientation. The first four values are the x, y, sigma and orientation of the keypoint. The 128 remaining values define the descriptor.

Note

The __call__ function is an alias for this method.

Parameters:

src : array_like (2D, uint8)

The input image which should be processed

keypoints : array_like (2D, float)

The keypoints at which the descriptors should be computed

Returns:

dst : [array_like (1D, float)]

The resulting descriptors; the first four values are the x, y, sigma and orientation of the keypoints, the 128 remaining values define the descriptor
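
The layout of one returned row can be unpacked as in the sketch below. The helper name is hypothetical and the row is a synthetic placeholder, not real extractor output. Note that, according to the description above, input keypoints are given as (y, x, ...) while the returned rows start with x, y.

```python
def split_vlsift_row(row):
    """Split one 132-element result row into keypoint and descriptor parts."""
    if len(row) != 4 + 128:
        raise ValueError("expected 4 keypoint values followed by 128 descriptor values")
    x, y, sigma, orientation = row[:4]
    descriptor = list(row[4:])
    return (x, y, sigma, orientation), descriptor

# Synthetic row for illustration only
row = [12.0, 34.0, 1.6, 0.5] + [0.0] * 128
keypoint, descriptor = split_vlsift_row(row)
print(keypoint, len(descriptor))
```
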
magnif

float <– The magnification factor for the descriptor

octave_max

int <– The index of the maximum octave, read only access

This is equal to octave_min+octaves-1.

octave_min

int <– The index of the minimum octave, with read and write access

octaves

int <– The number of octaves of the pyramid, with read and write access

peak_threshold

float <– The peak threshold (minimum amount of contrast to accept a keypoint), with read and write access

scales

int <– The number of intervals of the pyramid, with read and write access

Three additional scales will be computed in practice, as this is required for extracting VLSIFT features

size

(int, int) <– The shape of the images to process, with read and write access

class bob.ip.base.WeightedGaussian

Bases: object

This class performs weighted Gaussian smoothing (anisotropic filtering)

In particular, it is used by the Self Quotient Image (SQI) algorithm bob.ip.base.SelfQuotientImage.

Constructor Documentation:

  • bob.ip.base.WeightedGaussian (sigma, [radius], [border])
  • bob.ip.base.WeightedGaussian (weighted_gaussian)

Constructs a new weighted Gaussian filter

Todo

explain WeightedGaussian constructor

Warning

Compared to the last Bob version, here the sigma parameter is the standard deviation and not the variance.

Parameters:

sigma : (double, double)

The standard deviation of the WeightedGaussian along the y- and x-axes in pixels

radius : (int, int)

[default: (-1, -1) -> 3*sigma ] The radius of the Gaussian in both directions – the size of the kernel is 2*radius+1

border : bob.sp.BorderType

[default: bob.sp.BorderType.Mirror] The extrapolation method used by the convolution at the border

weighted_gaussian : bob.ip.base.WeightedGaussian

The weighted Gaussian object to use for copy-construction

Class Members:

border

bob.sp.BorderType <– The extrapolation method used by the convolution at the border, with read and write access

filter(src[, dst]) → dst

Smooths an image (2D/grayscale or 3D/color)

If given, the dst array should have the expected type (numpy.float64) and the same size as the src array.

Note

The __call__ function is an alias for this method.

Parameters:

src : array_like (2D)

The input image which should be smoothed

dst : array_like (2D, float)

[default: None] If given, the output will be saved into this image; must be of the same shape as src

Returns:

dst : array_like (2D, float)

The resulting output image, which is the same as dst (if given)
radius

(int, int) <– The radius of the WeightedGaussian along the y- and x-axes (size of the kernel=2*radius+1); with read and write access

When setting the radius to a negative value, it will be automatically computed as 3*sigma.

sigma

(float, float) <– The standard deviation of the weighted Gaussian along the y- and x-axes; with read and write access

Note

The radius of the kernel is not reset by setting the sigma value.

class bob.ip.base.Wiener

Bases: object

A Wiener filter

The Wiener filter is implemented after the description in Part 3.4.3 of [Szeliski2010]

Constructor Documentation:

  • bob.ip.base.Wiener (size, Pn, [variance_threshold])
  • bob.ip.base.Wiener (Ps, Pn, [variance_threshold])
  • bob.ip.base.Wiener (data, [variance_threshold])
  • bob.ip.base.Wiener (filter)
  • bob.ip.base.Wiener (hdf5)

Constructs a new Wiener filter

Several variants of constructors are available for constructing a Wiener filter. They are:

  1. Constructs a new Wiener filter dedicated to images of the given size. The filter is initialized with zero values
  2. Constructs a new Wiener filter from a set of variance estimates Ps and a noise level Pn
  3. Trains the new Wiener filter with the given data
  4. Copy constructs the given Wiener filter
  5. Reads the Wiener filter from bob.io.base.HDF5File

Parameters:

Ps : array_like<float, 2D>

Variance Ps estimated at each frequency

Pn : float

Noise level Pn

size : (int, int)

The shape of the newly created empty filter

data : array_like<float, 3D>

The training data, with dimensions (#data, height, width)

variance_threshold : float

[default: 1e-8] Variance flooring threshold (i.e., the minimum variance value)

filter : bob.ip.base.Wiener

The Wiener filter object to use for copy-construction

hdf5 : bob.io.base.HDF5File

The HDF5 file object to read the Wiener filter from

Class Members:

Pn

float <– Noise level Pn

Ps

array_like <float, 2D> <– Variance Ps estimated at each frequency

filter(src[, dst]) → dst

Filters the input image

If given, the dst array should have the expected type (numpy.float64) and the same size as the src array.

Note

The __call__ function is an alias for this method.

Parameters:

src : array_like (2D)

The input image which should be smoothed

dst : array_like (2D, float)

[default: None] If given, the output will be saved into this image; must be of the same shape as src

Returns:

dst : array_like (2D, float)

The resulting output image, which is the same as dst (if given)
is_similar_to(other[, r_epsilon][, a_epsilon]) → None

Compares this Wiener filter with the other one to be approximately the same

The optional values r_epsilon and a_epsilon refer to the relative and absolute precision, similarly to numpy.allclose().

Parameters:

other : bob.ip.base.Wiener

The other Wiener filter to compare with

r_epsilon : float

[Default: 1e-5] The relative precision

a_epsilon : float

[Default: 1e-8] The absolute precision
load(hdf5) → None

Loads the configuration of the Wiener filter from the given HDF5 file

Parameters:

hdf5 : bob.io.base.HDF5File

An HDF5 file opened for reading
save(hdf5) → None

Saves the configuration of the Wiener filter to the given HDF5 file

Parameters:

hdf5 : bob.io.base.HDF5File

An HDF5 file open for writing
size

(int, int) <– The size of the filter

variance_threshold

float <– Variance flooring threshold

w

array_like<2D, float> <– The Wiener filter W (W=1/(1+Pn/Ps)) (read-only)
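
The relation W = 1/(1+Pn/Ps) can be evaluated element-wise as in the following sketch. It assumes that variance flooring means replacing every Ps value below variance_threshold by the threshold before the division; the function name is hypothetical and nested lists stand in for 2D arrays.

```python
def wiener_weights(Ps, Pn, variance_threshold=1e-8):
    """Compute W = 1 / (1 + Pn / Ps) per frequency, flooring Ps first (assumption)."""
    return [[1.0 / (1.0 + Pn / max(p, variance_threshold)) for p in row]
            for row in Ps]

# Made-up variance estimates; a zero variance is floored, so no division by zero occurs
print(wiener_weights([[1.0, 3.0], [0.0, 4.0]], Pn=1.0))
```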

bob.ip.base.angle_to_horizontal(right, left) → angle[source]

Get the angle needed to level out (horizontally) two points.

Parameters

right, left
: (float, float)
The two points to level out horizontally.

Returns

angle
: float
The angle in degrees between the left and the right point
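
A possible way to compute such an angle is shown below. This is a sketch, not the library code: it assumes that both points are given as (y, x) and that the sign convention is that of atan2 applied to the coordinate differences right - left.

```python
import math

def angle_to_horizontal_sketch(right, left):
    """Angle in degrees of the line through two (y, x) points (assumed convention)."""
    dy = right[0] - left[0]
    dx = right[1] - left[1]
    return math.degrees(math.atan2(dy, dx))

print(angle_to_horizontal_sketch((5.0, 10.0), (5.0, 0.0)))   # points on one row
print(angle_to_horizontal_sketch((10.0, 10.0), (0.0, 0.0)))  # diagonal line
```
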
bob.ip.base.block(input, block_size[, block_overlap][, output][, flat]) → output

Performs a block decomposition of a 2D array/image

If given, the output 3D or 4D destination array should be allocated and of the correct size, see bob.ip.base.block_output_shape().

Parameters:

input : array_like (2D)

The source image to decompose into blocks

block_size : (int, int)

The size of the blocks in which the image is decomposed

block_overlap : (int, int)

[default: (0, 0)] The overlap of the blocks

output : array_like(3D or 4D)

[default: None] If given, the resulting blocks will be saved into this parameter; must be initialized in the correct size (see block_output_shape())

flat : bool

[default: False] If output is not specified, the flat parameter is used to decide whether 3D (flat = True) or 4D (flat = False) output is generated

Returns:

output : array_like(3D or 4D)

The resulting blocks that the image is decomposed into; the same array as the output parameter, when given.
bob.ip.base.block_output_shape(input, block_size[, block_overlap][, flat]) → shape

Returns the shape of the output image that is required to compute the bob.ip.base.block() function

Parameters:

input : array_like (2D)

The source image to decompose into blocks

block_size : (int, int)

The size of the blocks in which the image is decomposed

block_overlap : (int, int)

[default: (0, 0)] The overlap of the blocks

flat : bool

[default: False] The flat parameter is used to decide whether 3D (flat = True) or 4D (flat = False) output is generated

Returns:

shape : (int, int, int) or (int, int, int, int)

The shape of the blocks.
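
The relation between image size, block size, overlap and the 3D/4D output can be sketched in plain Python as follows. The function names are hypothetical, nested lists stand in for arrays, and the block count formula (size - overlap) // (block_size - overlap) is an assumption of this sketch (it requires the overlap to be smaller than the block size).

```python
def block_output_shape_sketch(image_shape, block_size, block_overlap=(0, 0), flat=False):
    """Shape of the block decomposition: 4D by default, 3D when flat is True."""
    height, width = image_shape
    block_h, block_w = block_size
    overlap_h, overlap_w = block_overlap
    n_y = (height - overlap_h) // (block_h - overlap_h)
    n_x = (width - overlap_w) // (block_w - overlap_w)
    if flat:
        return (n_y * n_x, block_h, block_w)
    return (n_y, n_x, block_h, block_w)

def block_sketch(image, block_size, block_overlap=(0, 0)):
    """Decompose a 2D list of lists into a 4D nested list of blocks."""
    n_y, n_x, block_h, block_w = block_output_shape_sketch(
        (len(image), len(image[0])), block_size, block_overlap)
    step_y = block_h - block_overlap[0]
    step_x = block_w - block_overlap[1]
    return [[[row[bx * step_x: bx * step_x + block_w]
              for row in image[by * step_y: by * step_y + block_h]]
             for bx in range(n_x)]
            for by in range(n_y)]

image = [[r * 6 + c for c in range(6)] for r in range(4)]
print(block_output_shape_sketch((4, 6), (2, 3)))
print(block_sketch(image, (2, 3))[0][1])
```
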
bob.ip.base.crop(src, crop_offset, crop_size[, dst][, src_mask][, dst_mask][, fill_pattern]) → dst[source]

Crops the given src image to the given offset (might be negative) and to the given size (might be greater than the src image).

Either crop_size or dst needs to be specified. When masks are given, they need to be of the same size as the src and dst parameters. When crop regions are outside the image, the cropped image will contain fill_pattern and the mask will be set to False.

Parameters

src
: array_like (2D or 3D)
The source image to crop.
crop_offset
: (int, int)
The position in src coordinates to start cropping; might be negative
crop_size
: (int, int)
The size of the cropped image; might be omitted when the dst is given
dst
: array_like (2D or 3D)
If given, the destination to crop src to.
src_mask, dst_mask: array_like(bool, 2D or 3D)
Masks that define, where src and dst are valid
fill_pattern: number
[default: 0] The value to set outside the croppable area

Returns

dst
: array_like (2D or 3D)
The cropped image
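
The handling of negative offsets and of regions outside the source image can be illustrated with a small 2D sketch. Unlike the real function, this hypothetical helper returns the destination mask together with the cropped image and does not take a source mask; nested lists stand in for arrays.

```python
def crop_sketch(src, crop_offset, crop_size, fill_pattern=0):
    """Crop a 2D list of lists; pixels outside src get fill_pattern and mask False."""
    offset_y, offset_x = crop_offset
    height, width = crop_size
    src_h, src_w = len(src), len(src[0])
    dst = [[fill_pattern] * width for _ in range(height)]
    dst_mask = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            sy, sx = y + offset_y, x + offset_x
            if 0 <= sy < src_h and 0 <= sx < src_w:
                dst[y][x] = src[sy][sx]
                dst_mask[y][x] = True
    return dst, dst_mask

dst, dst_mask = crop_sketch([[1, 2], [3, 4]], (-1, -1), (3, 3), fill_pattern=9)
for row in dst:
    print(row)
```
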
bob.ip.base.extrapolate_mask()
  • extrapolate_mask(mask, img) -> None
  • extrapolate_mask(mask, img, random_sigma, [neighbors], [rng]) -> None

Extrapolate a 2D array/image, taking a boolean mask into account

The img argument is used both as an input and an output. Only values where the mask is set to False are extrapolated. The region where the mask is set to True is expected to be convex.

This function can be called in two ways:

The first way is by giving only the mask and the image. Then a nearest neighbor technique is used as:

  1. The columns of the image are first extrapolated with respect to the nearest neighbour on the same column.
  2. The rows of the image are then extrapolated with respect to the nearest neighbour on the same row.

The second way extrapolates the image using randomly perturbed values of the border pixels. The image is scanned in a spiral way, starting at the center of the masked area. When a pixel of the unmasked area is reached:

  1. The next pixel of the masked area is searched perpendicular to the current spiral direction
  2. From that pixel, neighboring pixels are extracted from the image on both sides of the current spiral direction, and one of them is chosen at random
  3. A normally distributed random value with mean 1 and standard deviation random_sigma is multiplied with the pixel value
  4. The resulting value is written to the image at the current position

Any action considering a random number will use the given rng to create random numbers.

Note

For the second variant, images of type float are preferred.

Parameters:

mask : array_like (2D, bool)

The mask which has the valid pixel set to True and the invalid pixel set to False

img : array_like (2D)

The image that will be filled; must have the same shape as mask

random_sigma : float

The standard deviation of the random factor to multiply the valid pixel value from the border with; must be greater than or equal to 0

neighbors : int

[Default: 5] The number of neighbors of valid border pixels to choose one from; set neighbors=0 to disable random selection

rng : bob.core.random.mt19937

[Default: rng initialized with the system time] The random number generator to consider
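
The first (nearest neighbor) variant can be sketched in plain Python as below. The helper names are hypothetical, nested lists stand in for arrays, and the sketch relies on the convexity of the valid region: within each row or column, only the pixels before the first and after the last valid pixel are filled.

```python
def _fill_line(values, valid):
    """Fill values outside the valid range of one line in place; False if nothing is valid."""
    indices = [i for i, ok in enumerate(valid) if ok]
    if not indices:
        return False
    first, last = indices[0], indices[-1]
    for i in range(first):
        values[i] = values[first]
    for i in range(last + 1, len(values)):
        values[i] = values[last]
    return True

def extrapolate_mask_sketch(mask, img):
    """Nearest-neighbor extrapolation: columns first, then rows; img is modified in place."""
    height, width = len(img), len(img[0])
    filled = [list(row) for row in mask]  # tracks which pixels hold usable values
    for x in range(width):
        column = [img[y][x] for y in range(height)]
        if _fill_line(column, [mask[y][x] for y in range(height)]):
            for y in range(height):
                img[y][x] = column[y]
                filled[y][x] = True
    for y in range(height):
        _fill_line(img[y], filled[y])

mask = [[False] * 4, [False, True, True, False], [False] * 4]
img = [[0, 0, 0, 0], [0, 5, 7, 0], [0, 0, 0, 0]]
extrapolate_mask_sketch(mask, img)
for row in img:
    print(row)
```
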
bob.ip.base.flip(src[, dst]) → dst[source]

Flip a 2D or 3D array/image upside-down. If given, the destination array dst should have the same size and type as the source array.

Parameters

src
: array_like (2D or 3D)
The source image to flip.
dst
: array_like (2D or 3D)
If given, the destination to flip src to.

Returns

dst
: array_like (2D or 3D)
The flipped image
bob.ip.base.flop(src[, dst]) → dst[source]

Flip a 2D or 3D array/image left-right. If given, the destination array dst should have the same size and type as the source array.

Parameters

src
: array_like (2D or 3D)
The source image to flip.
dst
: array_like (2D or 3D)
If given, the destination to flip src to.

Returns

dst
: array_like (2D or 3D)
The flipped image
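
For 2D images, the difference between flip (upside-down) and flop (left-right) can be shown with nested lists. The function names below are hypothetical stand-ins; for a 2D NumPy array the same results are obtained with src[::-1] and src[:, ::-1].

```python
def flip_sketch(src):
    """Reverse the order of the rows (upside-down)."""
    return [list(row) for row in reversed(src)]

def flop_sketch(src):
    """Reverse the order of the pixels within each row (left-right)."""
    return [list(reversed(row)) for row in src]

image = [[1, 2, 3], [4, 5, 6]]
print(flip_sketch(image))
print(flop_sketch(image))
```
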
bob.ip.base.gamma_correction(src, gamma[, dst]) → dst

Performs a power-law gamma correction of a given 2D image

Todo

Explain gamma correction in more detail

Parameters:

src : array_like (2D)

The source image to apply the gamma correction to

gamma : float

The gamma value to apply

dst : array_like (2D, float)

The gamma-corrected image to write; if not specified, it will be created in the desired size

Returns:

dst : array_like (2D, float)

The gamma-corrected image; the same as the dst parameter, if specified
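
A power-law gamma correction raises every pixel value to the power gamma. The sketch below shows this on a nested list; it assumes that no additional rescaling of the pixel range is applied, and the function name is hypothetical.

```python
def gamma_correction_sketch(src, gamma):
    """Return a float image with dst = src ** gamma, computed per pixel."""
    return [[float(value) ** gamma for value in row] for row in src]

print(gamma_correction_sketch([[0, 1, 4], [9, 16, 25]], 0.5))
```
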
bob.ip.base.histogram()
  • histogram(src, [bin_count]) -> hist
  • histogram(src, hist) -> None
  • histogram(src, min_max, bin_count) -> hist
  • histogram(src, min_max, hist) -> None

Computes a histogram of the given input image

This function computes a histogram of the given input image, in several ways.

  • (version 1 and 2, only valid for uint8 and uint16 types – and uint32 and uint64 when bin_count is specified or hist is given as parameter): For each pixel value of the src image, a histogram bin is computed, using a fast implementation. The number of bins can be limited, and there will be a check that the source image pixels are actually in the desired range (0, bin_count-1)
  • (version 3 and 4, valid for many data types): The histogram is computed by defining regular bins between the provided minimum and maximum values.

Parameters:

src : array_like (2D)

The source image to compute the histogram for

hist : array_like (1D, uint64)

The histogram with the desired number of bins; the histogram will be cleaned before running the extraction

min_max : (scalar, scalar)

The minimum value and the maximum value in the source image

bin_count : int

[default: 256 or 65536] The number of bins in the histogram to create, defaults to the maximum number of values

Returns:

hist : array_like(1D, uint64)

The histogram with the desired number of bins, which is filled with the histogrammed source data
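
The two families of variants can be sketched as follows. Both functions are hypothetical stand-ins operating on nested lists. The first counts integer pixel values directly and checks the range (0, bin_count-1); the second uses regular bins between the given minimum and maximum, where assigning the maximum value to the last bin is an assumption of this sketch.

```python
def histogram_counts(src, bin_count):
    """One bin per integer pixel value, with a range check (versions 1 and 2)."""
    hist = [0] * bin_count
    for row in src:
        for value in row:
            if not 0 <= value < bin_count:
                raise ValueError("pixel value %r outside (0, %d)" % (value, bin_count - 1))
            hist[value] += 1
    return hist

def histogram_range(src, min_max, bin_count):
    """Regular bins between the minimum and maximum values (versions 3 and 4)."""
    low, high = min_max
    width = (high - low) / float(bin_count)
    hist = [0] * bin_count
    for row in src:
        for value in row:
            if not low <= value <= high:
                raise ValueError("pixel value %r outside the given range" % (value,))
            index = min(int((value - low) / width), bin_count - 1)
            hist[index] += 1
    return hist

print(histogram_counts([[0, 1, 1], [3, 3, 3]], 4))
print(histogram_range([[0.0, 0.5, 1.0], [0.25, 0.75, 0.9]], (0.0, 1.0), 2))
```
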
bob.ip.base.histogram_equalization()
  • histogram_equalization(src) -> None
  • histogram_equalization(src, dst) -> None

Performs a histogram equalization of a given 2D image

The first version computes the normalization in-place (in contrast to the old implementation, which returned an equalized image), while the second version fills the given dst array and leaves the input untouched.

Parameters:

src : array_like (2D, uint8 or uint16)

The source image to compute the histogram equalization for

dst : array_like (2D, uint8, uint16, uint32 or float)

The histogram-equalized image to write; if not specified, the equalization is computed in-place.
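
For readers unfamiliar with the technique, the sketch below implements the textbook histogram equalization based on the cumulative histogram. The exact mapping and rounding used by the library are not stated here and may differ; the function name, the levels parameter and the return-a-copy behaviour are choices of this sketch.

```python
def histogram_equalization_sketch(src, levels=256):
    """Textbook equalization: map each value through the normalized cumulative histogram."""
    pixels = [value for row in src for value in row]
    hist = [0] * levels
    for value in pixels:
        hist[value] += 1
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)  # cumulative count at the smallest present value
    total = len(pixels)
    if total == cdf_min:  # constant image: nothing to equalize
        return [list(row) for row in src]
    return [[int(round((cdf[value] - cdf_min) * (levels - 1) / float(total - cdf_min)))
             for value in row] for row in src]

print(histogram_equalization_sketch([[1, 1], [1, 2]], levels=4))
```
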
bob.ip.base.integral(src, dst[, sqr][, add_zero_border]) → None

Computes an integral image for the given input image

It is the responsibility of the user to select an appropriate type for the numpy array dst (and sqr), which will contain the integral image. By default, src and dst should have the same size. When the sqr matrix is given as well, it will be filled with the squared integral image (useful to compute variances of pixels).

Note

The sqr image is expected to have the same data type as the dst image.

If add_zero_border is set to True, dst (and sqr) should be one pixel larger than src in each dimension. In this case, an extra zero pixel will be added at the beginning of each row and column.

Parameters:

src : array_like (2D)

The source image

dst : array_like (2D)

The resulting integral image

sqr : array_like (2D)

The resulting squared integral image with the same data type as dst

add_zero_border : bool

If enabled, an extra zero pixel will be added at the beginning of each row and column
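
The integral image, the squared integral image and the effect of add_zero_border can be sketched in plain Python. The function below is a hypothetical stand-in that returns new nested lists instead of filling preallocated dst and sqr arrays.

```python
def integral_sketch(src, add_zero_border=False):
    """Return (integral, squared_integral) of a 2D list of lists."""
    height, width = len(src), len(src[0])
    # Work with a zero row and column in front, which simplifies the recurrence
    dst = [[0] * (width + 1) for _ in range(height + 1)]
    sqr = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        for x in range(width):
            value = src[y][x]
            dst[y + 1][x + 1] = value + dst[y][x + 1] + dst[y + 1][x] - dst[y][x]
            sqr[y + 1][x + 1] = value * value + sqr[y][x + 1] + sqr[y + 1][x] - sqr[y][x]
    if add_zero_border:
        return dst, sqr  # one pixel larger than src in each dimension
    return [row[1:] for row in dst[1:]], [row[1:] for row in sqr[1:]]

dst, sqr = integral_sketch([[1, 2], [3, 4]])
print(dst)
print(sqr)
```
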
bob.ip.base.lbphs(input, lbp, block_size[, block_overlap][, output]) → output

Computes local binary pattern histogram sequences (LBPHS) from the given image

Warning

This is a re-implementation of the old bob.ip.LBPHSFeatures class, but with a different handling of blocks. Before, the blocks were extracted from the image, and LBPs were extracted in the blocks. Hence, in each block, the border pixels were not taken into account, and the histogram contained far fewer elements. Now, the LBPs are extracted first, and then the image is split into blocks.

This function computes the LBP features for the whole image, using the given bob.ip.base.LBP instance. Afterwards, the resulting image is split into several blocks with the given block size and overlap, and local LBP histograms are extracted from each region.

Note

To get the required output shape, you can use the lbphs_output_shape() function.

Parameters:

input : array_like (2D)

The source image to compute the LBPHS for

lbp : bob.ip.base.LBP

The LBP class to be used for feature extraction

block_size : (int, int)

The size of the blocks in which the LBP histograms are split

block_overlap : (int, int)

[default: (0, 0)] The overlap of the blocks in which the LBP histograms are split

output : array_like(2D, uint64)

If given, the resulting LBPHS features will be written to this array; must have the size #output-blocks, #LBP-labels (see lbphs_output_shape())

Returns:

output : array_like(2D, uint64)

The resulting LBPHS features of the size #output-blocks, #LBP-labels; the same array as the output parameter, when given.
bob.ip.base.lbphs_output_shape(input, lbp, block_size[, block_overlap]) → shape

Returns the shape of the output image that is required to compute the bob.ip.base.lbphs() function

Parameters:

input : array_like (2D)

The source image to compute the LBPHS for

lbp : bob.ip.base.LBP

The LBP class to be used for feature extraction

block_size : (int, int)

The size of the blocks in which the LBP histograms are split

block_overlap : (int, int)

[default: (0, 0)] The overlap of the blocks in which the LBP histograms are split

Returns:

shape : (int, int)

The shape of the LBP histogram sequences, which is (#blocks, #labels).
bob.ip.base.max_rect_in_mask(mask) → rect

Given a 2D mask (a 2D blitz array of booleans), compute the maximum rectangle which only contains true values.

The resulting rectangle contains the coordinates in the following order:

  1. The y-coordinate of the top left corner
  2. The x-coordinate of the top left corner
  3. The height of the rectangle
  4. The width of the rectangle

Parameters:

mask : array_like (2D, bool)

The mask of boolean values, e.g., as a result of bob.ip.base.GeomNorm.process()

Returns:

rect : (int, int, int, int)

The resulting rectangle: (top, left, height, width)
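
One way to find such a rectangle is to track, for every column, the number of consecutive True values ending at the current row, and then to test all column ranges. The sketch below follows this idea; it is a hypothetical stand-in, not the library algorithm, and in case of ties it keeps the first rectangle found.

```python
def max_rect_in_mask_sketch(mask):
    """Return (top, left, height, width) of the largest all-True rectangle."""
    height, width = len(mask), len(mask[0])
    best, best_area = (0, 0, 0, 0), 0
    heights = [0] * width  # consecutive True values per column, ending at the current row
    for y in range(height):
        for x in range(width):
            heights[x] = heights[x] + 1 if mask[y][x] else 0
        for left in range(width):
            min_height = heights[left]
            for right in range(left, width):
                min_height = min(min_height, heights[right])
                if min_height == 0:
                    break
                area = min_height * (right - left + 1)
                if area > best_area:
                    best_area = area
                    best = (y - min_height + 1, left, min_height, right - left + 1)
    return best

T, F = True, False
print(max_rect_in_mask_sketch([[F, T, T, F], [T, T, T, F], [F, T, T, T]]))
```
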
bob.ip.base.median(src, radius[, dst]) → dst

Performs a median filtering of the input image with the given radius

This function performs a median filtering of the given src image with the given radius and writes the result to the given dst image. Both gray-level and color images are supported, and the input and output datatype must be identical.

Median filtering iterates with a mask of size (2*radius[0]+1, 2*radius[1]+1) over the input image. For each input region, the pixels under the mask are sorted and the median value (the middle element of the sorted list) is written into the dst image. Therefore, the dst is smaller than the src image, i.e., by 2*radius pixels.

Parameters:

src : array_like (2D or 3D)

The source image to filter, might be a gray level image or a color image

radius : (int, int)

The radius of the median filter; the final filter will have the size (2*radius[0]+1, 2*radius[1]+1)

dst : array_like (2D or 3D)

The median-filtered image to write; needs to be of size src.shape - 2*radius; if not specified, it will be created

Returns:

dst : array_like (2D or 3D)

The median-filtered image; the same as the dst parameter, if specified
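
The procedure described above, including the reduced output size, can be sketched for a 2D gray-level image as follows. The function name is a hypothetical stand-in and nested lists replace arrays.

```python
def median_sketch(src, radius):
    """Median filter over a (2*ry+1, 2*rx+1) window; output shrinks by 2*radius."""
    radius_y, radius_x = radius
    height, width = len(src), len(src[0])
    dst = []
    for y in range(radius_y, height - radius_y):
        row = []
        for x in range(radius_x, width - radius_x):
            window = sorted(src[yy][xx]
                            for yy in range(y - radius_y, y + radius_y + 1)
                            for xx in range(x - radius_x, x + radius_x + 1))
            row.append(window[len(window) // 2])  # middle element of the sorted list
        dst.append(row)
    return dst

print(median_sketch([[1, 2, 3], [4, 100, 6], [7, 8, 9]], (1, 1)))
```
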
bob.ip.base.rotate()
  • rotate(src, rotation_angle) -> dst
  • rotate(src, dst, rotation_angle) -> None
  • rotate(src, src_mask, dst, dst_mask, rotation_angle) -> None

Rotates an image.

This function rotates an image using bi-linear interpolation. It supports 2D and 3D input array/image (NumPy array) of type numpy.uint8, numpy.uint16 and numpy.float64. Basically, this function can be called in three different ways:

  1. Given a source image and a rotation angle, the rotated image is returned in the size bob.ip.base.rotated_output_shape()
  2. Given source and destination image and the rotation angle, the source image is rotated and filled into the destination image.
  3. Same as 2., but additionally boolean masks will be read and filled with the corresponding values.

Note

Since the implementation uses a different interpolation style than before, results might slightly differ.

Parameters:

src : array_like (2D or 3D)

The input image (gray or colored) that should be rotated

dst : array_like (2D or 3D, float)

The resulting rotated gray or color image, should be in size bob.ip.base.rotated_output_shape()

src_mask : array_like (bool, 2D or 3D)

An input mask of valid pixels before geometric normalization, must be of same size as src

dst_mask : array_like (bool, 2D or 3D)

The output mask of valid pixels after geometric normalization, must be of same size as dst

rotation_angle : float

the rotation angle that should be applied to the image

Returns:

dst : array_like (2D, float)

The resulting rotated image
bob.ip.base.rotated_output_shape(src, angle) → rotated_shape

This function returns the shape of the rotated image for the given image and angle

Parameters:

src : array_like (2D,3D)

The src image which should be rotated

angle : float

The rotation angle in degrees to rotate the src image with

Returns:

rotated_shape : (int, int) or (int, int, int)

The shape of the rotated dst image required in a call to bob.ip.base.rotate()
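
Geometrically, the rotated image must contain the bounding box of the rotated source rectangle. The sketch below computes that bounding box for a 2D shape; the rounding rule and the function name are assumptions, so the result may differ from the library by a pixel for some angles.

```python
import math

def rotated_output_shape_sketch(shape, angle):
    """Bounding box (height, width) of a 2D shape rotated by angle degrees."""
    height, width = shape
    radians = math.radians(angle)
    cos_a, sin_a = abs(math.cos(radians)), abs(math.sin(radians))
    new_height = int(round(height * cos_a + width * sin_a))  # rounding is an assumption
    new_width = int(round(width * cos_a + height * sin_a))
    return new_height, new_width

print(rotated_output_shape_sketch((10, 20), 90.0))
```
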
bob.ip.base.scale()
  • scale(src, scaling_factor) -> dst
  • scale(src, dst) -> None
  • scale(src, src_mask, dst, dst_mask) -> None

Scales an image.

This function scales an image using bi-linear interpolation. It supports 2D and 3D input array/image (NumPy array) of type numpy.uint8, numpy.uint16 and numpy.float64. Basically, this function can be called in three different ways:

  1. Given a source image and a scale factor, the scaled image is returned; its size is given by bob.ip.base.scaled_output_shape().
  2. Given a source image and a destination image, the source image is scaled such that it fits into the destination image.
  3. Same as 2., but boolean masks are additionally read and filled with the corresponding values.

Note

For 2. and 3., scale factors are computed for both directions independently. In practice, this means that the image might be stretched in either direction, i.e., the aspect ratio of the source image is not necessarily preserved. This can happen even for 1., e.g., when src.shape * scaling_factor does not result in integral values.
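
To make the independent scale factors concrete, the following plain-Python sketch derives one factor per direction from a source and a destination shape. The helper axis_scale_factors is hypothetical and only mirrors the description above; it is not a function of bob.ip.base.

```python
def axis_scale_factors(src_shape, dst_shape):
    """Return the (vertical, horizontal) factors that map src onto dst.

    Hypothetical helper; only the last two entries (height, width) of
    each shape are used, so 2D and 3D shapes are both accepted.
    """
    src_height, src_width = src_shape[-2:]
    dst_height, dst_width = dst_shape[-2:]
    return dst_height / src_height, dst_width / src_width


vertical, horizontal = axis_scale_factors((60, 80), (30, 60))
print(vertical, horizontal)    # 0.5 0.75
print(vertical == horizontal)  # False: the image is stretched
```

When both factors are equal, as for shapes (60, 80) and (30, 40), the aspect ratio is preserved.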

Parameters:

src : array_like (2D or 3D)

The input image (gray or colored) that should be scaled

dst : array_like (2D or 3D, float)

The resulting scaled gray or color image

src_mask : array_like (bool, 2D or 3D)

An input mask of valid pixels before geometric normalization, must be of same size as src

dst_mask : array_like (bool, 2D or 3D)

The output mask of valid pixels after geometric normalization, must be of same size as dst

scaling_factor : float

the scaling factor that should be applied to the image; can be negative, but cannot be 0.

Returns:

dst : array_like (2D, float)

The resulting scaled image
bob.ip.base.scaled_output_shape(src, scaling_factor) → scaled_shape

This function returns the shape of the scaled image for the given image and scale

The function tries its best to compute an integral-valued shape given the shape of the input image and the given scale factor. Nevertheless, for non-round scale factors this might not work out perfectly.

Parameters:

src : array_like (2D,3D)

The src image which should be scaled

scaling_factor : float

The scaling factor to scale the src image with

Returns:

scaled_shape : (int, int) or (int, int, int)

The shape of the scaled dst image required in a call to bob.ip.base.scale()
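
The rounding issue mentioned above can be sketched in a few lines of plain Python. The helper scaled_shape_sketch and its round-to-nearest rule are assumptions for illustration; bob.ip.base.scaled_output_shape() may round differently.

```python
import math


def scaled_shape_sketch(shape, scaling_factor):
    """Integral shape of an image scaled by scaling_factor.

    Hypothetical helper, not part of bob.ip.base. Leading entries of
    shape (color planes) are kept; height and width are scaled.
    """
    *planes, height, width = shape
    # Rounding to the nearest integer is an assumption of this sketch.
    new_height = math.floor(height * scaling_factor + 0.5)
    new_width = math.floor(width * scaling_factor + 0.5)
    return (*planes, new_height, new_width)


print(scaled_shape_sketch((100, 200), 0.5))  # (50, 100)

# 3 * 0.5 and 5 * 0.5 are not integral, so the shape has to be rounded
shape = scaled_shape_sketch((3, 5), 0.5)
print(shape)                                 # (2, 3)
# the effective factors per direction are no longer identical
print(shape[0] / 3, shape[1] / 5)
```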
bob.ip.base.shift(src, offset[, dst][, src_mask][, dst_mask][, fill_pattern]) → dst

Shifts the given src image by the given offset (which might be negative).

If dst is specified, the image is shifted into the dst image. Ideally, dst should have the same size as src, but other sizes work as well. When dst is None (the default), it is created in the same size as src. When masks are given, they need to be of the same size as the src and dst parameters. Where the shift reaches regions outside the image, the shifted image will contain fill_pattern and the mask will be set to False.

Parameters

src
: array_like (2D or 3D)
The source image to shift.
offset
: (int, int)
The offset by which src is shifted; might be negative
dst
: array_like (2D or 3D)
If given, the destination to shift src into.
src_mask, dst_mask: array_like(bool, 2D or 3D)
Masks that define where src and dst are valid
fill_pattern: number
[default: 0] The value to set in regions that lie outside the source image

Returns

dst
: array_like (2D or 3D)
The shifted image
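
The interplay of offset, fill_pattern and the destination mask can be illustrated with a small plain-Python sketch that works on lists of rows. The helper shift_sketch is hypothetical, and the direction convention (dst[y][x] is read from src[y + offset[0]][x + offset[1]]) is an assumption of this example rather than a documented property of bob.ip.base.shift().

```python
def shift_sketch(src, offset, fill_pattern=0):
    """Shift a 2D image given as a list of rows (hypothetical helper).

    Returns (dst, dst_mask), both of the same size as src.
    """
    height, width = len(src), len(src[0])
    off_y, off_x = offset
    dst = [[fill_pattern] * width for _ in range(height)]
    dst_mask = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # assumed convention: dst[y][x] = src[y + off_y][x + off_x]
            src_y, src_x = y + off_y, x + off_x
            if 0 <= src_y < height and 0 <= src_x < width:
                dst[y][x] = src[src_y][src_x]
                dst_mask[y][x] = True
    return dst, dst_mask


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
shifted, mask = shift_sketch(image, (1, -1))
for row in shifted:
    print(row)
# [0, 4, 5]
# [0, 7, 8]
# [0, 0, 0]
```

Pixels that have no counterpart in the source keep fill_pattern, and the corresponding mask entries stay False.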
bob.ip.base.sobel(src[, border][, dst]) → dst

Performs a Sobel filtering of the input image

This function performs Sobel filtering with both the vertical and the horizontal filter. A Sobel filter is an edge detector that responds to either horizontal or vertical edges. The two filters are given as:

S_y = \left\lgroup\begin{array}{ccc} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{array}\right\rgroup
\qquad
S_x = \left\lgroup\begin{array}{ccc} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{array}\right\rgroup

If given, the dst array should have the expected type (numpy.float64) and two layers of the same size as the input image. The result of the vertical filter is written to the first layer, dst[0], while the result of the horizontal filter is written to dst[1].

Parameters:

src : array_like (2D, float)

The source image to filter

border : bob.sp.BorderType

[default: bob.sp.BorderType.Mirror] The extrapolation method used by the convolution at the border

dst : array_like (3D, float)

The Sobel-filtered image to write; needs to be of size [2] + src.shape; if not specified, it will be created

Returns:

dst : array_like (3D, float)

The Sobel-filtered image; the same as the dst parameter, if specified
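
The two-layer output can be illustrated with a dependency-free sketch that applies both kernels to a small image stored as a list of rows. The helper sobel_sketch is hypothetical. It applies the kernels without flipping them (a correlation) and repeats the edge pixel at the border; both choices are assumptions, so signs and border values may differ from what bob.ip.base.sobel() returns.

```python
S_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]
S_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]


def sobel_sketch(src):
    """Return [S_y response, S_x response] for a 2D list of rows."""
    height, width = len(src), len(src[0])

    def pixel(y, x):
        # Repeat the edge pixel outside the image (assumed border handling).
        return src[min(max(y, 0), height - 1)][min(max(x, 0), width - 1)]

    def apply(kernel):
        # Weighted sum of the 3x3 neighborhood around every pixel.
        return [[float(sum(kernel[i][j] * pixel(y + i - 1, x + j - 1)
                           for i in range(3) for j in range(3)))
                 for x in range(width)]
                for y in range(height)]

    return [apply(S_Y), apply(S_X)]


# Intensity changes only along the horizontal axis (left 0, right 1).
image = [[0, 0, 1, 1] for _ in range(3)]
dst = sobel_sketch(image)
print(dst[0][1])  # [0.0, 0.0, 0.0, 0.0]  S_y sees no change
print(dst[1][1])  # [0.0, 4.0, 4.0, 0.0]  S_x responds at the step
```

As in the description above, the result has shape [2] + src.shape: one layer per filter.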
bob.ip.base.zigzag(src, dst, right_first) → None

Extracts a 1D array using a zigzag pattern from a 2D array

This function extracts a 1D array using a zigzag pattern from a 2D array. If right_first is set to True, the second element of the pattern is taken to the right of the upper-left element; otherwise it is taken below the upper-left element. The input is expected to be a 2D array and the output a 1D array. This method only supports arrays of the following data types:

  • numpy.uint8
  • numpy.uint16
  • numpy.float64 (or the native python float)

To create an object with a scalar type that will be accepted by this method, use a construction like the following:

>>> import numpy
>>> input_righttype = input_wrongtype.astype(numpy.float64)

Parameters:

src : array_like (uint8|uint16|float64, 2D)

The source matrix.

dst : array_like (uint8|uint16|float64, 1D)

The destination array.

right_first : scalar (bool)

Tells whether the zigzag pattern starts by moving to the right or not
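
The zigzag order itself can be sketched in plain Python by walking the anti-diagonals of the input and alternating the direction on each one. The helper zigzag_sketch is hypothetical; unlike bob.ip.base.zigzag(), it returns a new list instead of writing into a dst array.

```python
def zigzag_sketch(src, right_first):
    """Return the elements of a 2D list of rows in zigzag order."""
    height, width = len(src), len(src[0])
    out = []
    for d in range(height + width - 1):
        # cells on anti-diagonal d, listed from top-right to bottom-left
        cells = [(y, d - y) for y in range(height) if 0 <= d - y < width]
        # alternate the walking direction; right_first selects which
        # diagonals are walked from bottom-left to top-right
        if (d % 2 == 0) == right_first:
            cells.reverse()
        out.extend(src[y][x] for y, x in cells)
    return out


matrix = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]
print(zigzag_sketch(matrix, right_first=True))   # [1, 2, 4, 7, 5, 3, 6, 8, 9]
print(zigzag_sketch(matrix, right_first=False))  # [1, 4, 2, 3, 5, 7, 8, 6, 9]
```

With right_first=True the second element (2) lies to the right of the upper-left element; with right_first=False it is the element below it (4).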