\[\DeclareMathOperator{\erf}{erf} \DeclareMathOperator{\argmin}{argmin} \newcommand{\R}{\mathbb{R}} \newcommand{\n}{\boldsymbol{n}}\]

Module pyqt_fit.kernels

Author:Pierre Barbier de Reuille <pierre.barbierdereuille@gmail.com>

Module providing a set of kernels for use with either the pyqt_fit.kde or the kernel_smoothing module.

Kernels should be created following this template:

Helper class

This class provides default implementations of every method in terms of the PDF.

class pyqt_fit.kernels.Kernel1D[source]

A 1D kernel \(K(z)\) is a function with the following properties:

\[\begin{split}\begin{array}{rcl} \int_\mathbb{R} K(z) dz &=& 1 \\ \int_\mathbb{R} zK(z)dz &=& 0 \\ \int_\mathbb{R} z^2K(z) dz &<& \infty \quad (\approx 1) \end{array}\end{split}\]

In other words, the function should have:

  • a total integral of 1 (i.e. it is a valid probability density);
  • a mean of 0 (i.e. it is centered);
  • a finite variance. It is even recommended that the variance be close to 1, so that the bandwidth has a uniform meaning across kernels.
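The three properties above can be checked numerically for any candidate kernel. The following is a minimal sketch using only the standard library; the helper names `integrate` and `kernel_moments` are illustrative and are not part of pyqt_fit, and the standard normal density is used only as an example.

```python
import math

def gaussian_pdf(z):
    """Standard normal density, used here as an example kernel K(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def integrate(f, lo, hi, n=20000):
    """Composite trapezoidal rule on [lo, hi] with n sub-intervals."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h

def kernel_moments(pdf, lo, hi):
    """Return (integral, first moment, second moment) of pdf on [lo, hi]."""
    mass = integrate(pdf, lo, hi)
    mean = integrate(lambda z: z * pdf(z), lo, hi)
    second = integrate(lambda z: z * z * pdf(z), lo, hi)
    return mass, mean, second

# The interval [-10, 10] is an assumption: wide enough that the Gaussian
# tails outside it are negligible.
mass, mean, second = kernel_moments(gaussian_pdf, -10.0, 10.0)
```

For a centered kernel the second moment is the variance, so values close to 1, 0 and 1 indicate that all three properties hold.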
cut
Type:float

Cutting point beyond which only a negligible part of the probability remains. More formally, if \(c\) is the cutting point:

\[\int_{-c}^c p(x) dx \approx 1\]
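As an illustration of what a cutting point means, the sketch below searches for the smallest multiple of 0.5 such that the standard normal mass outside \([-c, c]\) falls below a tolerance. Both the tolerance and the step are arbitrary choices for this example and do not reflect the values pyqt_fit actually uses.

```python
import math

def mass_within(c):
    """Probability mass of the standard normal inside [-c, c]."""
    return math.erf(c / math.sqrt(2.0))

def smallest_cut(tolerance, step=0.5):
    """Smallest multiple of `step` leaving at most `tolerance` mass outside."""
    c = step
    while 1.0 - mass_within(c) > tolerance:
        c += step
    return c

# Hypothetical tolerance, chosen only for the example.
cut = smallest_cut(1e-6)
```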
lower
Type:float

Lower bound of the support of the PDF. Formally, if \(l\) is the lower bound:

\[\int_{-\infty}^l p(x)dx = 0\]
upper
Type:float

Upper bound of the support of the PDF. Formally, if \(u\) is the upper bound:

\[\int_u^\infty p(x)dx = 0\]
cdf(z, out=None)[source]

Returns the cumulative distribution function on the points z, i.e.:

\[K_0(z) = \int_{-\infty}^z K(t) dt\]
dct(z, out=None)[source]

DCT of the kernel on the points of z. The points will always be provided as a grid with \(2^n\) points, representing the whole frequency range to be explored.

fft(z, out=None)[source]

FFT of the kernel on the points of z. The points will always be provided as a grid with \(2^n\) points, representing the whole frequency range to be explored. For convenience, the second half of the points will be provided as negative values.

pdf(z, out=None)[source]

Returns the density of the kernel on the points z. This is the function \(K(z)\) itself.

Parameters:
  • z (ndarray) – Array of points to evaluate the function on. The method should accept any shape of array.
  • out (ndarray) – If provided, it will be of the same shape as z and the result should be stored in it. Ideally, it should also be used for as many intermediate computations as possible.
pm1(z, out=None)[source]

Returns the first partial moment of the density function, i.e.:

\[K_1(z) = \int_{-\infty}^z t K(t) dt\]
pm2(z, out=None)[source]

Returns the second partial moment of the density function, i.e.:

\[K_2(z) = \int_{-\infty}^z t^2 K(t) dt\]
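To illustrate how the CDF and the partial moments can all be derived from the PDF alone, here is a minimal, scalar-only sketch. `SketchKernel1D` and `TriangularKernel` are hypothetical classes written for this example; they do not reproduce the pyqt_fit implementation, which works on arrays and accepts an `out` argument.

```python
import math

class SketchKernel1D:
    """Hypothetical base class: subclasses provide pdf(), lower, upper, cut."""
    lower = -math.inf
    upper = math.inf
    cut = 5.0

    def pdf(self, z):
        raise NotImplementedError

    def _partial_moment(self, z, order, n=4000):
        # Integrate t**order * pdf(t) from the effective lower limit up to z
        # with the composite trapezoidal rule.
        start = max(self.lower, -self.cut)
        stop = min(z, self.upper, self.cut)
        if stop <= start:
            return 0.0
        h = (stop - start) / n
        f = lambda t: t ** order * self.pdf(t)
        total = 0.5 * (f(start) + f(stop))
        for i in range(1, n):
            total += f(start + i * h)
        return total * h

    def cdf(self, z):
        return self._partial_moment(z, 0)

    def pm1(self, z):
        return self._partial_moment(z, 1)

    def pm2(self, z):
        return self._partial_moment(z, 2)

class TriangularKernel(SketchKernel1D):
    """Triangular density on [-sqrt(6), sqrt(6)], which has variance 1."""
    lower = -math.sqrt(6.0)
    upper = math.sqrt(6.0)
    cut = math.sqrt(6.0)

    def pdf(self, z):
        w = math.sqrt(6.0)
        return max(0.0, (1.0 - abs(z) / w) / w)

kernel = TriangularKernel()
```

With this sketch, `kernel.cdf(0.0)` is close to 0.5, and evaluating `cdf`, `pm1` and `pm2` beyond the upper bound gives values close to 1, 0 and 1 respectively. Closed-form expressions, where available, are preferable to numerical integration for both speed and accuracy.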

Gaussian Kernels

class pyqt_fit.kernels.normal_kernel(dim)[source]

Returns a function object for the PDF of a normal kernel with zero mean and identity covariance in dimension dim.

pdf(xs)[source]

Return the probability density of the function.

Parameters:xs (ndarray) – Array of shape (D,N) where D is the dimension of the kernel and N the number of points.
Returns:an array of shape (N,) with the density on each point of xs
class pyqt_fit.kernels.normal_kernel1d[source]

1D normal density kernel with extra integrals for 1D bounded kernel estimation.

cdf(z, out=None)[source]

Cumulative distribution function. The formula used is:

\[\text{cdf}(z) \triangleq \int_{-\infty}^z \phi(t) dt = \frac{1}{2}\text{erf}\left(\frac{z}{\sqrt{2}}\right) + \frac{1}{2}\]
dct(z, out=None)[source]

Returns the DCT of the normal distribution

fft(z, out=None)[source]

Returns the FFT of the normal distribution

pdf(z, out=None)[source]

Return the probability density of the function. The formula used is:

\[\phi(z) = \frac{1}{\sqrt{2\pi}}e^{-\frac{z^2}{2}}\]
Parameters:z (ndarray) – Array of any shape
Returns:an array of shape identical to z
pm1(z, out=None)[source]

Partial moment of order 1:

\[\text{pm1}(z) \triangleq \int_{-\infty}^z t\phi(t) dt = -\frac{1}{\sqrt{2\pi}}e^{-\frac{z^2}{2}}\]
pm2(z, out=None)[source]

Partial moment of order 2:

\[\text{pm2}(z) \triangleq \int_{-\infty}^z t^2\phi(t) dt = \frac{1}{2}\text{erf}\left(\frac{z}{\sqrt{2}}\right) - \frac{z}{\sqrt{2\pi}} e^{-\frac{z^2}{2}} + \frac{1}{2}\]
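The closed forms for the normal kernel can be cross-checked against direct numerical integration. Integration by parts gives \(\text{pm2}(z) = \text{cdf}(z) - z\phi(z)\), so the error function in the second partial moment takes the same argument \(z/\sqrt{2}\) as in the CDF. The sketch below is scalar-only and standalone; the lower integration limit of -12 is an assumption standing in for \(-\infty\).

```python
import math

SQRT2 = math.sqrt(2.0)
SQRT2PI = math.sqrt(2.0 * math.pi)

def phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / SQRT2PI

def cdf(z):
    return 0.5 * math.erf(z / SQRT2) + 0.5

def pm1(z):
    return -phi(z)

def pm2(z):
    # Integration by parts: pm2(z) = cdf(z) - z * phi(z)
    return cdf(z) - z * phi(z)

def numeric_partial_moment(z, order, lo=-12.0, n=40000):
    """Trapezoidal estimate of the integral of t**order * phi(t) on [lo, z]."""
    h = (z - lo) / n
    f = lambda t: t ** order * phi(t)
    total = 0.5 * (f(lo) + f(z))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h

# Largest disagreement between closed forms and numerical integration.
points = (-1.5, 0.0, 0.7, 2.0)
max_error = max(
    abs(closed(z) - numeric_partial_moment(z, order))
    for z in points
    for order, closed in ((0, cdf), (1, pm1), (2, pm2))
)
```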

Tricube Kernel

class pyqt_fit.kernels.tricube[source]

Kernel corresponding to a tri-cube distribution. The tri-cube function is given by:

\[\begin{split}f_r(x) = \left\{\begin{array}{ll} \left(1-|x|^3\right)^3 & \text{, if } x \in [-1;1]\\ 0 & \text{, otherwise} \end{array}\right.\end{split}\]

As \(f_r\) is not a probability density and does not have variance 1, we use a normalized function:

\[f(x) = a b f_r(ax)\]\[a = \sqrt{\frac{35}{243}}\]\[b = \frac{70}{81}\]
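The constants \(a\) and \(b\) can be recovered exactly. On \([0;1]\), \(f_r(x) = 1 - 3x^3 + 3x^6 - x^9\), so its moments are sums of rational numbers. \(b\) is the inverse of the integral of \(f_r\), and substituting \(u = ax\) shows that \(f\) has variance 1 exactly when \(a^2\) equals the variance of \(b f_r\). The sketch below performs this computation with exact fractions; the helper names are illustrative.

```python
from fractions import Fraction

# Coefficients of (1 - x^3)^3 on [0, 1], keyed by the power of x.
TRICUBE_POLY = {0: 1, 3: -3, 6: 3, 9: -1}

def even_moment(order):
    """Exact integral of x**order * f_r(x) over [-1, 1] for an even order."""
    # f_r is even, so the integral is twice the integral over [0, 1].
    return 2 * sum(Fraction(c, p + order + 1) for p, c in TRICUBE_POLY.items())

mass = even_moment(0)            # integral of f_r: 81/70
b = 1 / mass                     # makes b * f_r integrate to 1: 70/81
a_squared = b * even_moment(2)   # variance of b * f_r: 35/243
```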
cdf(z, out=None)[source]

CDF of the distribution:

\[\begin{split}\text{cdf}(x) = \left\{\begin{array}{ll} \frac{1}{162} {\left(60 (ax)^{7} - 7 {\left(2 (ax)^{10} + 15 (ax)^{4}\right)} \mathrm{sgn}\left(ax\right) + 140 ax + 81\right)} & \text{, if } x\in[-1/a;1/a]\\ 0 & \text{, if } x < -1/a \\ 1 & \text{, if } x > 1/a \end{array}\right.\end{split}\]
pm1(z, out=None)[source]

Partial moment of order 1:

\[\begin{split}\text{pm1}(x) = \left\{\begin{array}{ll} \frac{7}{3564a} {\left(165 (ax)^{8} - 8 {\left(5 (ax)^{11} + 33 (ax)^{5}\right)} \mathrm{sgn}\left(ax\right) + 220 (ax)^{2} - 81\right)} & \text{, if } x\in [-1/a;1/a]\\ 0 & \text{, otherwise} \end{array}\right.\end{split}\]
pm2(z, out=None)[source]

Partial moment of order 2:

\[\begin{split}\text{pm2}(x) = \left\{\begin{array}{ll} \frac{35}{486a^2} {\left(4 (ax)^{9} + 4 (ax)^{3} - {\left((ax)^{12} + 6 (ax)^{6}\right)} \mathrm{sgn}\left(ax\right) + 1\right)} & \text{, if } x\in[-1/a;1/a] \\ 0 & \text{, if } x < -1/a \\ 1 & \text{, if } x > 1/a \end{array}\right.\end{split}\]
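The piecewise expressions for the tri-cube CDF and partial moments can be transcribed directly and compared with numerical integration of the normalized density. This is a scalar, standalone sketch of the formulas given here, not the pyqt_fit code; all function names are illustrative.

```python
import math

A = math.sqrt(35.0 / 243.0)
B = 70.0 / 81.0

def sgn(u):
    return (u > 0) - (u < 0)

def pdf(x):
    """Normalized tri-cube density f(x) = a * b * f_r(a * x)."""
    u = abs(A * x)
    return A * B * (1.0 - u ** 3) ** 3 if u <= 1.0 else 0.0

def cdf(x):
    u = A * x
    if u < -1.0:
        return 0.0
    if u > 1.0:
        return 1.0
    return (60 * u**7 - 7 * (2 * u**10 + 15 * u**4) * sgn(u) + 140 * u + 81) / 162

def pm1(x):
    u = A * x
    if abs(u) > 1.0:
        return 0.0
    poly = 165 * u**8 - 8 * (5 * u**11 + 33 * u**5) * sgn(u) + 220 * u**2 - 81
    return 7.0 / (3564.0 * A) * poly

def pm2(x):
    u = A * x
    if u < -1.0:
        return 0.0
    if u > 1.0:
        return 1.0
    poly = 4 * u**9 + 4 * u**3 - (u**12 + 6 * u**6) * sgn(u) + 1
    return 35.0 / (486.0 * A * A) * poly

def numeric_partial_moment(x, order, n=20000):
    """Trapezoidal estimate of the integral of t**order * pdf(t) up to x."""
    lo, hi = -1.0 / A, min(x, 1.0 / A)
    if hi <= lo:
        return 0.0
    h = (hi - lo) / n
    f = lambda t: t ** order * pdf(t)
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h

points = (-3.0, -2.0, -0.5, 0.0, 1.0, 2.4, 3.0)
max_error = max(
    abs(closed(x) - numeric_partial_moment(x, order))
    for x in points
    for order, closed in ((0, cdf), (1, pm1), (2, pm2))
)
```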

Epanechnikov Kernel

class pyqt_fit.kernels.Epanechnikov[source]

1D Epanechnikov density kernel with extra integrals for 1D bounded kernel estimation.

cdf(xs, out=None)[source]

CDF of the distribution. The CDF is defined on the interval \([-\sqrt{5}:\sqrt{5}]\) as:

\[\begin{split}\text{cdf}(x) = \left\{\begin{array}{ll} \frac{1}{2} + \frac{3}{4\sqrt{5}} x - \frac{1}{20\sqrt{5}}x^3 & \text{, if } x\in[-\sqrt{5}:\sqrt{5}] \\ 0 & \text{, if } x < -\sqrt{5} \\ 1 & \text{, if } x > \sqrt{5} \end{array}\right.\end{split}\]
pdf(xs, out=None)[source]

The PDF of the kernel is usually given by:

\[\begin{split}f_r(x) = \left\{\begin{array}{ll} \frac{3}{4} \left(1-x^2\right) & \text{, if } x \in [-1:1]\\ 0 & \text{, otherwise} \end{array}\right.\end{split}\]

As \(f_r\) is not of variance 1 (and therefore would need adjustments for the bandwidth selection), we use a normalized function:

\[f(x) = \frac{1}{\sqrt{5}}f_r\left(\frac{x}{\sqrt{5}}\right)\]
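The factor \(\sqrt{5}\) follows from the variance of \(f_r\), which is \(1/5\): stretching the variable by \(\sqrt{5}\), i.e. using the density \(\frac{1}{\sqrt{5}} f_r(x/\sqrt{5})\), multiplies the variance by 5. A short exact check, with an illustrative helper name:

```python
from fractions import Fraction

def raw_moment(order):
    """Exact integral of x**order * (3/4)(1 - x**2) over [-1, 1], even order."""
    # 2 * (3/4) * (1/(order + 1) - 1/(order + 3))
    return Fraction(3, 2) * (Fraction(1, order + 1) - Fraction(1, order + 3))

mass = raw_moment(0)            # 1: f_r is already a density
variance = raw_moment(2)        # 1/5
scale_squared = 1 / variance    # 5, hence the sqrt(5) rescaling
```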
pm1(xs, out=None)[source]

First partial moment of the distribution:

\[\begin{split}\text{pm1}(x) = \left\{\begin{array}{ll} -\frac{3\sqrt{5}}{16}\left(1-\frac{2}{5}x^2+\frac{1}{25}x^4\right) & \text{, if } x\in[-\sqrt{5}:\sqrt{5}] \\ 0 & \text{, otherwise} \end{array}\right.\end{split}\]
pm2(xs, out=None)[source]

Second partial moment of the distribution:

\[\begin{split}\text{pm2}(x) = \left\{\begin{array}{ll} \frac{5}{20}\left(2 + \frac{1}{\sqrt{5}}x^3 - \frac{3}{5^{5/2}}x^5 \right) & \text{, if } x\in[-\sqrt{5}:\sqrt{5}] \\ 0 & \text{, if } x < -\sqrt{5} \\ 1 & \text{, if } x > \sqrt{5} \end{array}\right.\end{split}\]
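One way to sanity-check the partial moments is to differentiate them: the derivative of \(\text{pm}_k\) must equal \(x^k f(x)\) inside the support, and both moments must vanish at the lower bound \(-\sqrt{5}\). The sketch below does this with central finite differences; the step size and the test points are arbitrary choices, and the functions are scalar transcriptions of the formulas rather than the pyqt_fit code.

```python
import math

S5 = math.sqrt(5.0)

def pdf(x):
    """Variance-1 Epanechnikov density on [-sqrt(5), sqrt(5)]."""
    return 0.75 / S5 * (1.0 - x * x / 5.0) if abs(x) <= S5 else 0.0

def pm1(x):
    if abs(x) > S5:
        return 0.0
    return -3.0 * S5 / 16.0 * (1.0 - 0.4 * x**2 + x**4 / 25.0)

def pm2(x):
    if x < -S5:
        return 0.0
    if x > S5:
        return 1.0
    return 5.0 / 20.0 * (2.0 + x**3 / S5 - 3.0 / 5.0**2.5 * x**5)

def derivative(f, x, h=1e-5):
    """Central finite-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

# Points strictly inside the support, so x - h and x + h stay inside too.
points = (-2.0, -0.3, 0.0, 1.1, 2.0)
max_error = max(
    max(abs(derivative(pm1, x) - x * pdf(x)),
        abs(derivative(pm2, x) - x * x * pdf(x)))
    for x in points
)
```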

Higher Order Kernels

Higher-order kernels are kernels that give up being valid probability densities (they may take negative values). We say a kernel \(K_{[n]}\) is of order \(n\) if:

\[\begin{split}\begin{array}{rcl} \int_\R K_{[n]}(x) dx & = & 1 \\ \forall\, 1 \leq k < n, \quad \int_\R x^k K_{[n]}(x) dx & = & 0 \\ \int_\R x^n K_{[n]}(x) dx & \neq & 0 \end{array}\end{split}\]

PyQt-Fit implements two high order kernels.

class pyqt_fit.kernels.Epanechnikov_order4[source]

Order 4 Epanechnikov kernel. That is:

\[K_{[4]}(x) = \frac{3}{2} K(x) + \frac{1}{2} x K'(x) = -\frac{15}{8}x^2+\frac{9}{8}\]

where \(K\) is the non-normalized Epanechnikov kernel.
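The order of this kernel can be verified exactly from the polynomial \(\frac{9}{8} - \frac{15}{8}x^2\). The sketch below assumes the support is \([-1, 1]\), the support of the non-normalized Epanechnikov kernel; the helper name is illustrative.

```python
from fractions import Fraction

def moment(k):
    """Exact integral of x**k * (9/8 - 15/8 * x**2) over [-1, 1]."""
    if k % 2 == 1:
        return Fraction(0)  # odd moments vanish by symmetry
    return 2 * (Fraction(9, 8) / (k + 1) - Fraction(15, 8) / (k + 3))

# Moments of order 0 to 4: [1, 0, 0, 0, -3/35]
moments = [moment(k) for k in range(5)]
```

The integral is 1, the moments of order 1 to 3 are zero, and the fourth moment is non-zero, which matches the definition of an order 4 kernel.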

class pyqt_fit.kernels.normal_order4[source]

Order 4 Normal kernel. That is:

\[\phi_{[4]}(x) = \frac{3}{2} \phi(x) + \frac{1}{2} x \phi'(x) = \frac{1}{2}(3-x^2)\phi(x)\]

where \(\phi\) is the normal kernel.
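The same check works for the normal case using the moments of the standard normal distribution, which are 0 for odd orders and \((k-1)!!\) for even orders \(k\). Since \(\phi_{[4]}(x) = \frac{1}{2}(3-x^2)\phi(x)\), its moment of order \(k\) is \(\frac{1}{2}(3 m_k - m_{k+2})\), where \(m_k\) is the normal moment. A minimal sketch with illustrative names:

```python
def normal_moment(k):
    """k-th moment of the standard normal: 0 if k is odd, (k - 1)!! if even."""
    if k % 2 == 1:
        return 0
    result = 1
    for j in range(1, k, 2):
        result *= j
    return result

def order4_moment(k):
    """k-th moment of phi_[4](x) = (3 - x**2) * phi(x) / 2."""
    return (3 * normal_moment(k) - normal_moment(k + 2)) / 2

# Moments of order 0 to 4: [1.0, 0.0, 0.0, 0.0, -3.0]
moments = [order4_moment(k) for k in range(5)]
```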
