\[\DeclareMathOperator{\erf}{erf}
\DeclareMathOperator{\argmin}{argmin}
\newcommand{\R}{\mathbb{R}}
\newcommand{\n}{\boldsymbol{n}}\]
Module implementing nonparametric regressions using kernel methods.
NonParametric Regression Methods
Methods must either inherit from, or follow the same interface as,
pyqt_fit.npr_methods.RegressionKernelMethod.

pyqt_fit.npr_methods.compute_bandwidth(reg)[source]
Compute the bandwidth and covariance for the model, based on its xdata attribute

class pyqt_fit.npr_methods.RegressionKernelMethod[source]
Base class for regression kernel methods
The following methods are interface methods that should be overridden with ones specific to the implemented method.

fit(reg)[source]
Fit the method and return the fitted object that will be used for the actual evaluation.
The object needs to call the pyqt_fit.nonparam_regression.NonParamRegression.set_actual_bandwidth()
method with the computed bandwidth and covariance.
Default:  Compute the bandwidth based on the real data and set it in the regression object 

evaluate(points, out)[source]
Evaluate the regression on the provided points.
Parameters: 
 points (ndarray) – 2D array of points at which to compute the regression. Each column is a point.
 out (ndarray) – 1D array in which to store the result

Return type:  ndarray

Returns:  The method must return the out array, updated with the regression values
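As a hedged illustration of this fit/evaluate contract, the sketch below uses stand-alone NumPy stand-ins rather than the real pyqt_fit classes (the `Reg` class, the rule-of-thumb bandwidth and the method name are all assumptions for the example, not library internals):

```python
import numpy as np

class Reg:
    """Minimal stand-in for the regression object (assumption: it
    carries xdata, ydata and a bandwidth attribute)."""
    def __init__(self, xdata, ydata):
        self.xdata, self.ydata = xdata, ydata
        self.bandwidth = None

class ToyConstantMethod:
    """Follows the fit/evaluate contract with a local-constant estimate."""
    def fit(self, reg):
        # Rule-of-thumb bandwidth (Scott-like; an assumption, not the
        # library's compute_bandwidth())
        reg.bandwidth = reg.xdata.std() * reg.xdata.size ** (-0.2)
        self._reg = reg
        return self          # the fitted object used for evaluation

    def evaluate(self, points, out):
        x, y, h = self._reg.xdata, self._reg.ydata, self._reg.bandwidth
        for i, p in enumerate(points):
            w = np.exp(-0.5 * ((p - x) / h) ** 2)   # Gaussian weights
            out[i] = np.sum(w * y) / np.sum(w)
        return out           # the contract requires returning `out`

xs = np.linspace(0.0, 1.0, 201)
method = ToyConstantMethod().fit(Reg(xs, 2.0 * xs + 1.0))
est = method.evaluate(np.array([0.5]), np.empty(1))
```

On noiseless linear data symmetric around the query point, the weighted average returns the exact value (here 2.0 at x = 0.5).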

Provided methods
Only the methods added beyond the base interface are described:

class pyqt_fit.npr_methods.SpatialAverage[source]
Perform a Nadaraya-Watson regression on the data (also called
local-constant regression) using a Gaussian kernel.
The NadarayaWatson estimate is given by:
\[f_n(x) \triangleq \frac{\sum_i K\left(\frac{x-X_i}{h}\right) Y_i}
{\sum_i K\left(\frac{x-X_i}{h}\right)}\]
Where \(K(x)\) is the kernel, which must satisfy \(E(K(x)) = 0\),
and \(h\) is the bandwidth of the method.
Parameters: 
 xdata (ndarray) – Explaining variables (at most 2D array)
 ydata (ndarray) – Explained variables (should be 1D array)
 cov (ndarray or callable) – If an ndarray, it should be a 2D array giving the
covariance matrix of the Gaussian kernel. Otherwise, it should be a function
cov(xdata, ydata) returning the covariance matrix.


q[source]
Degree of the fitted polynomial

correction()[source]
The correction coefficient allows the width of the kernel to change
depending on the point considered. It can be either a constant (to
globally correct the kernel width) or a 1D array of the same size as
the input.

set_density_correction()[source]
Add a correction coefficient depending on the density of the input
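The exact density correction is not spelled out here; one classical scheme with this flavour is Abramson's square-root law, sketched below purely as an assumption about what such a correction can look like (the function name and normalisation are illustrative, not PyQt-Fit's implementation):

```python
import numpy as np

def density_correction(xdata, h):
    """Per-point bandwidth multipliers: widen the kernel where a pilot
    density estimate is low (Abramson-style; illustrative only)."""
    d = xdata[:, None] - xdata[None, :]
    # Pilot Gaussian KDE evaluated at the data points themselves
    dens = np.exp(-0.5 * (d / h) ** 2).mean(axis=1) / (h * np.sqrt(2 * np.pi))
    g = np.exp(np.mean(np.log(dens)))      # geometric mean, for scaling
    return np.sqrt(g / dens)               # lambda_i in Abramson's law

lam = density_correction(np.linspace(0.0, 1.0, 21), 0.2)
```

The multipliers are normalised so their geometric mean is 1, leaving the overall bandwidth scale unchanged.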

class pyqt_fit.npr_methods.LocalLinearKernel1D[source]
Perform a local-linear regression using a Gaussian kernel.
The local-linear regression is the function that minimises, for each
position:
\[f_n(x) \triangleq \argmin_{a_0\in\mathbb{R}}
\sum_i K\left(\frac{x-X_i}{h}\right)
\left(Y_i - a_0 - a_1(x-X_i)\right)^2\]
Where \(K(x)\) is the kernel, which must satisfy \(E(K(x)) = 0\),
and \(h\) is the bandwidth of the method.

q[source]
Degree of the fitted polynomial
This class uses the following function:

pyqt_fit.py_local_linear.local_linear_1d(bw, xdata, ydata, points, kernel, out)[source]
We are trying to find the fit at points \(x\) given a Gaussian kernel.
Given the following definitions:
\[\begin{split}x_0 &=& x-x_i\end{split}\]\[\begin{split}\begin{array}{rlcrlc}
w_i &=& \mathcal{K}\left(\frac{x_0}{h}\right) & W &=& \sum_i w_i \\
X &=& \sum_i w_i x_0 & X_2 &=& \sum_i w_i x_0^2 \\
Y &=& \sum_i w_i y_i & Y_2 &=& \sum_i w_i y_i x_0
\end{array}\end{split}\]
The fitted value is given by:
\[f(x) = \frac{X_2 Y - X Y_2}{W X_2 - X^2}\]
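These sums translate directly into NumPy. The sketch below is a stand-alone reimplementation for illustration, not the library's local_linear_1d (it omits the kernel and out arguments), and simply evaluates the closed-form value above:

```python
import numpy as np

def local_linear_sketch(h, xdata, ydata, points):
    """Local-linear estimate via the W, X, X_2, Y, Y_2 sums above."""
    out = np.empty(len(points))
    for j, p in enumerate(points):
        x0 = p - xdata                        # x - x_i
        w = np.exp(-0.5 * (x0 / h) ** 2)      # Gaussian kernel weights
        W, X = w.sum(), (w * x0).sum()
        X2 = (w * x0 ** 2).sum()
        Y, Y2 = (w * ydata).sum(), (w * ydata * x0).sum()
        out[j] = (X2 * Y - X * Y2) / (W * X2 - X ** 2)
    return out

xs = np.linspace(0.0, 1.0, 100)
# A local-linear smoother reproduces straight lines exactly
est = local_linear_sketch(0.1, xs, 3.0 * xs + 2.0, np.array([0.3]))
```

On exactly linear data the weighted least-squares fit has zero residual, so the estimate at 0.3 is 3(0.3) + 2 = 2.9 up to rounding.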

class pyqt_fit.npr_methods.LocalPolynomialKernel1D(q=3)[source]
Perform a local-polynomial regression using a user-provided kernel
(Gaussian by default).
The local-polynomial regression is the function that minimises, for each
position:
\[f_n(x) \triangleq \argmin_{a_0\in\mathbb{R}}
\sum_i K\left(\frac{x-X_i}{h}\right)
\left(Y_i - a_0 - a_1(x-X_i) - \ldots -
a_q \frac{(x-X_i)^q}{q!}\right)^2\]
Where \(K(x)\) is the kernel such that \(E(K(x)) = 0\), \(q\)
is the order of the fitted polynomial and \(h\) is the bandwidth of
the method. It is also recommended to have \(\int_\mathbb{R} x^2K(x)dx
= 1\) (i.e. a kernel of variance 1); otherwise the effective bandwidth will
be scaled by the square root of this integral (i.e. the standard deviation
of the kernel).
Parameters: 
 xdata (ndarray) – Explaining variables (at most 2D array)
 ydata (ndarray) – Explained variables (should be 1D array)
 q (int) – Order of the polynomial to fit. Default: 3
 cov (float or callable) – If a float, it should be the variance of the
Gaussian kernel. Otherwise, it should be a function cov(xdata, ydata)
returning the variance.
Default: scotts_covariance


q[source]
Degree of the fitted polynomials
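A hedged sketch of the same idea using NumPy's weighted np.polyfit — a stand-alone illustration, not this class's implementation. Note that np.polyfit applies its w to the residuals, so the square root of the kernel weights is passed:

```python
import numpy as np

def local_poly_sketch(h, q, xdata, ydata, points):
    """Order-q local polynomial estimate: at each point, fit a weighted
    polynomial in (x_i - x) and keep its constant term."""
    out = np.empty(len(points))
    for j, p in enumerate(points):
        w = np.exp(-0.5 * ((p - xdata) / h) ** 2)   # Gaussian weights
        # np.polyfit minimises sum(|w_i (y_i - P(x_i))|^2)
        coeffs = np.polyfit(xdata - p, ydata, q, w=np.sqrt(w))
        out[j] = coeffs[-1]        # polynomial value at x_i - x = 0
    return out

xs = np.linspace(0.0, 1.0, 50)
# An order-3 local fit reproduces cubics exactly
est = local_poly_sketch(0.2, 3, xs, xs ** 3, np.array([0.5]))
```

Since a cubic is inside the model class for q = 3, the fit is exact: the estimate at 0.5 is 0.125 up to rounding.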

class pyqt_fit.npr_methods.LocalPolynomialKernel(q=3)[source]
Perform a local-polynomial regression in N-D using a user-provided kernel
(Gaussian by default).
The local-polynomial regression is the function that minimises,
for each position:
\[f_n(x) \triangleq \argmin_{a_0\in\mathbb{R}}
\sum_i K\left(\frac{x-X_i}{h}\right)
\left(Y_i - a_0 - \mathcal{P}_q(X_i-x)\right)^2\]
Where \(K(x)\) is the kernel such that \(E(K(x)) = 0\), \(q\)
is the order of the fitted polynomial, \(\mathcal{P}_q(x)\) is a
polynomial of order \(q\) in the \(d\)-dimensional variable \(x\),
and \(h\) is the bandwidth of the method.
The polynomial \(\mathcal{P}_q(x)\) is of the form:
\[\mathcal{F}_d(k) = \left\{ \n \in \mathbb{N}^d \,\middle|\,
\sum_{i=1}^d n_i = k \right\}\]\[\mathcal{P}_q(x_1,\ldots,x_d) = \sum_{k=1}^q
\sum_{\n\in\mathcal{F}_d(k)} a_{k,\n}
\prod_{i=1}^d x_i^{n_i}\]
For example we have:
\[\mathcal{P}_2(x,y) = a_{110} x + a_{101} y + a_{220} x^2 +
a_{211} xy + a_{202} y^2\]
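The index sets \(\mathcal{F}_d(k)\) are easy to enumerate; a small sketch (a hypothetical helper, not part of PyQt-Fit) recovering the five exponent tuples of \(\mathcal{P}_2(x,y)\) above:

```python
import itertools

def multi_indices(d, k):
    """All n in N^d with n_1 + ... + n_d = k, i.e. the set F_d(k)."""
    return [n for n in itertools.product(range(k + 1), repeat=d)
            if sum(n) == k]

# Exponent tuples for P_2(x, y): k = 1 gives y, x; k = 2 gives y^2, xy, x^2
terms = [n for k in (1, 2) for n in multi_indices(2, k)]
```

Each tuple \((n_1, n_2)\) corresponds to a monomial \(x^{n_1} y^{n_2}\), giving the five terms of the example.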
Parameters: 
 xdata (ndarray) – Explaining variables (at most 2D array).
The shape should be (N,D), with D the dimension of the problem
and N the number of points. For a 1D problem, the shape can be (N,),
in which case it will be converted to an (N,1) array.
 ydata (ndarray) – Explained variables (should be 1D array). The shape
must be (N,).
 q (int) – Order of the polynomial to fit. Default: 3
 kernel (callable) – Kernel to use for the weights. Call is
kernel(points) and should return an array of values the same size
as points. If None, the kernel will be normal_kernel(D).
 cov (float or callable) – If a float, it should be the variance of the
Gaussian kernel. Otherwise, it should be a function cov(xdata, ydata)
returning the variance.
Default: scotts_covariance


q[source]
Degree of the fitted polynomials

pyqt_fit.npr_methods.default_method
Default nonparametric regression method.
:Default: LocalPolynomialKernel(q=1)
Utility functions and classes

class pyqt_fit.npr_methods.PolynomialDesignMatrix1D(degree)[source]

class pyqt_fit.npr_methods.PolynomialDesignMatrix(dim, deg)[source]
Class used to create a design matrix for polynomial regression

__call__(x, out=None)[source]
Create the design matrix for polynomial fitting using the points x.
Parameters: 
 x (ndarray) – Points from which to create the design matrix.
The shape must be (D,N) or (N,), where D is the dimension of
the problem (1 if absent).
 deg (int) – Degree of the fitting polynomial
 factors (ndarray) – Scaling factors for the columns of the design
matrix. The shape should be (M,) or (M,1), where M is the number
of columns of out. This value can be obtained using
the designMatrixSize() function.

Returns:  The design matrix as a (M,N) matrix.
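For the 1-D case, the design matrix is just the monomials evaluated at each point; a sketch (a hypothetical equivalent, not the class itself) using np.vander:

```python
import numpy as np

def poly_design_matrix_1d(x, deg):
    """Rows are the monomials x^0 .. x^deg, columns are the points,
    matching the (M, N) layout described above."""
    x = np.asarray(x, dtype=float)
    return np.vander(x, deg + 1, increasing=True).T

M = poly_design_matrix_1d([1.0, 2.0, 3.0], 2)
```

Here M has one row per monomial (1, x, x^2) and one column per point.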
