
# Module pyqt_fit.bootstrap

Author: Pierre Barbier de Reuille

This module provides functions for bootstrapping a regression method.

## Bootstrap Shuffling Methods

pyqt_fit.bootstrap.bootstrap_residuals(fct, xdata, ydata, repeats=3000, residuals=None, add_residual=None, correct_bias=False, **kwrds)

This implements the residual bootstrapping method for non-linear regression.

Parameters:

- fct (callable) – Function evaluating the fitted function on xdata; it must accept at least the call fct(xdata)
- xdata (ndarray of shape (N,) or (k,N) for function with k predictors) – The independent variable where the data is measured
- ydata (ndarray) – The dependent data
- residuals (ndarray or callable or None) – Residuals for the estimation on each xdata. If callable, the call will be residuals(ydata, yopt).
- repeats (int) – Number of repeats for the bootstrapping
- add_residual (callable or None) – Function that adds a residual to a value. The call add_residual(yopt, residual) should return the new ydata, with the residuals ‘applied’. If None, the residuals are simply added.
- correct_bias (boolean) – If true, the additive bias of the residuals is computed and restored
- kwrds (dict) – Dictionary present to absorb unknown named parameters

Returns: (ndarray, ndarray)

1. xdata, with a new axis at position -2. This corresponds to the ‘shuffled’ xdata (as they are not shuffled here).
2. The shuffled ydata. There is one line per repeat, and each line is shuffled independently.

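
To illustrate the shapes described above, here is a minimal NumPy sketch of residual resampling. It is not the library's implementation: the helper name `residual_shuffle_sketch`, the `seed` argument and the synthetic data are hypothetical, and it assumes purely additive residuals drawn with replacement.

```python
import numpy as np

def residual_shuffle_sketch(fct, xdata, ydata, repeats=1000, seed=0):
    """Hypothetical helper: resample additive residuals around the fitted values."""
    rng = np.random.default_rng(seed)
    xdata = np.asarray(xdata, dtype=float)
    ydata = np.asarray(ydata, dtype=float)
    yopt = fct(xdata)                 # fitted values on the original x
    residuals = ydata - yopt          # additive residuals (assumed default case)
    n = residuals.shape[0]
    # One row of residual indices per repeat, drawn with replacement (assumption)
    idx = rng.integers(0, n, size=(repeats, n))
    shuffled_y = yopt + residuals[idx]        # shape (repeats, N)
    shuffled_x = xdata[..., np.newaxis, :]    # new axis at position -2, x unchanged
    return shuffled_x, shuffled_y

x = np.linspace(0.0, 1.0, 20)
y = 2.0 * x + 1.0 + 0.1 * np.sin(37.0 * x)    # synthetic data for illustration only
sx, sy = residual_shuffle_sketch(lambda t: 2.0 * t + 1.0, x, y, repeats=50)
print(sx.shape, sy.shape)
```

Because the x values are not resampled, `shuffled_x` only gains a length-one axis that broadcasts against the `repeats` rows of `shuffled_y`.
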
pyqt_fit.bootstrap.bootstrap_regression(fct, xdata, ydata, repeats=3000, **kwrds)

This implements the shuffling step of the standard bootstrapping method for non-linear regression.

Parameters:

- fct (callable) – This is the function to optimize
- xdata (ndarray of shape (N,) or (k,N) for function with k predictors) – The independent variable where the data is measured
- ydata (ndarray) – The dependent data
- repeats (int) – Number of repeats for the bootstrapping
- kwrds (dict) – Dictionary to absorb unknown named parameters

Returns: (ndarray, ndarray)

1. The shuffled xdata. The axis -2 has one element per repeat; the other axes are shuffled independently.
2. The shuffled ydata. There is one line per repeat, and each line is shuffled independently.
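
As a rough illustration of this kind of shuffling, the sketch below resamples (x, y) pairs with replacement using plain NumPy. The helper name `case_shuffle_sketch` and the `seed` argument are hypothetical, and the code only reproduces the output shapes described above, not the library's internals.

```python
import numpy as np

def case_shuffle_sketch(xdata, ydata, repeats=1000, seed=0):
    """Hypothetical helper: resample (x, y) pairs with replacement."""
    rng = np.random.default_rng(seed)
    xdata = np.asarray(xdata)
    ydata = np.asarray(ydata)
    n = ydata.shape[0]
    # One row of sample indices per repeat
    idx = rng.integers(0, n, size=(repeats, n))
    # Indexing the last axis keeps each x paired with its y;
    # the result has shape (repeats, N) or (k, repeats, N)
    shuffled_x = xdata[..., idx]
    shuffled_y = ydata[idx]
    return shuffled_x, shuffled_y

x1 = np.arange(10.0)
y1 = x1 ** 2
sx1, sy1 = case_shuffle_sketch(x1, y1, repeats=5)

x2 = np.vstack([x1, -x1])            # two predictors, shape (2, N)
sx2, sy2 = case_shuffle_sketch(x2, y1, repeats=5)
print(sx1.shape, sx2.shape, sy2.shape)
```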

## Main Bootstrap Functions

pyqt_fit.bootstrap.bootstrap(fit, xdata, ydata, CI, shuffle_method=bootstrap_residuals, shuffle_args=(), shuffle_kwrds={}, repeats=3000, eval_points=None, full_results=False, nb_workers=None, extra_attrs=(), fit_args=(), fit_kwrds={})

This function implements the bootstrap algorithm for a regression algorithm. It is capable of spreading the load across multiple workers using shared memory and the multiprocessing module.

Parameters:

- fit (callable) – Method used to compute the regression. The call is: f = fit(xdata, ydata, *fit_args, **fit_kwrds). fit should return an object that evaluates the regression on a set of points. The next call will be: f(eval_points)
- xdata (ndarray of shape (N,) or (k,N) for function with k predictors) – The independent variable where the data is measured
- ydata (ndarray) – The dependent data
- CI (tuple of float) – List of percentiles to extract
- shuffle_method (callable) – Creates the shuffled dataset. The call is: shuffle_method(xdata, ydata, y_est, repeat=repeats, *shuffle_args, **shuffle_kwrds), where y_est is the estimated dependent variable on the xdata.
- shuffle_args (tuple) – List of arguments for the shuffle method
- shuffle_kwrds (dict) – Dictionary of arguments for the shuffle method
- repeats (int) – Number of repeats for the bootstrapping
- eval_points (ndarray or None) – List of points to evaluate. If None, eval_points is xdata.
- full_results (bool) – If True, also output the whole set of evaluations
- nb_workers – Number of worker threads. If None, the number of detected CPUs will be used. If 1 or less, a single thread will be used.
- extra_attrs (tuple of str) – List of attributes of the fitting method to extract on top of the y values for confidence intervals
- fit_args (tuple) – List of extra arguments for the fit callable
- fit_kwrds (dict) – Dictionary of extra named arguments for the fit callable

Returns: (BootstrapResult) – Estimated y on the data, on the evaluation points, the requested confidence intervals and, if requested, the shuffled X, Y and the full estimated distributions.

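
The overall procedure (fit, shuffle, refit, extract percentiles) can be sketched in a few lines of single-threaded NumPy. This is an illustration under stated assumptions, not the library's code: `fit_line` and `bootstrap_sketch` are hypothetical names, the shuffling is a plain additive residual resampling, and a CI value of c is assumed to mean the central interval between percentiles (100 - c)/2 and 100 - (100 - c)/2.

```python
import numpy as np

def fit_line(x, y):
    """Hypothetical fit callable: returns a function evaluating a fitted line."""
    coeffs = np.polyfit(x, y, 1)
    return lambda pts: np.polyval(coeffs, pts)

def bootstrap_sketch(fit, xdata, ydata, CI, repeats=200, eval_points=None, seed=0):
    """Hypothetical single-threaded residual bootstrap returning (y_eval, CIs)."""
    rng = np.random.default_rng(seed)
    if eval_points is None:
        eval_points = xdata
    y_fit = fit(xdata, ydata)              # estimator fitted on the original data
    y_est = y_fit(xdata)
    residuals = ydata - y_est
    n = len(ydata)
    idx = rng.integers(0, n, size=(repeats, n))
    shuffled_ys = y_est + residuals[idx]   # one resampled data set per row
    # Refit on every shuffled data set and evaluate on eval_points
    evals = np.array([fit(xdata, ys)(eval_points) for ys in shuffled_ys])
    cis = np.empty((len(CI), 2, len(eval_points)))
    for q, c in enumerate(CI):
        low = (100.0 - c) / 2.0            # assumed central-interval convention
        cis[q, 0] = np.percentile(evals, low, axis=0)
        cis[q, 1] = np.percentile(evals, 100.0 - low, axis=0)
    return y_fit(eval_points), cis

noise_rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 40)
y = 3.0 * x - 2.0 + noise_rng.normal(scale=0.5, size=x.shape)  # synthetic data
y_eval, cis = bootstrap_sketch(fit_line, x, y, CI=(95, 99))
print(cis.shape)
```

The refits are independent of one another, which is why this step lends itself to being distributed across workers.
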
class pyqt_fit.bootstrap.BootstrapResult(y_fit, y_est, y_eval, CIs, shuffled_xs, shuffled_ys, full_results)

Note

This is a class created with pyqt_fit.utils.namedtuple().

y_fit

Estimator object, fitted on the original data. Type: fun(xs) -> ys

y_est

Y estimated on xdata. Type: ndarray

eval_points

Points on which the confidence intervals are evaluated

y_eval

Y estimated on eval_points

CIs_val

Tuple containing the list of percentiles extracted (i.e. this is a copy of the CI argument of the bootstrap function).

CIs

List of confidence intervals. The first element is for the estimated values on eval_points. The others are for the extra attributes specified in extra_attrs. Each array is a 3-dimensional array of shape (Q,2,N), where Q is the number of confidence intervals (i.e. the length of CIs_val) and N is the number of data points. Values at index (q,0,n) give the lower bounds and values at index (q,1,n) the upper bounds of the confidence intervals.

shuffled_xs

If full_results is True, the shuffled x’s used for the bootstrapping

shuffled_ys

If full_results is True, the shuffled y’s used for the bootstrapping

full_results

If full_results is True, the estimated y’s for each of the shuffled_ys
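
The (Q,2,N) layout documented for the CIs arrays can be unpacked as in the following sketch. The array here is built by hand purely for illustration; the names `ci_array`, `centre` and `half_widths` are hypothetical and the numbers are made up.

```python
import numpy as np

CIs_val = (95, 99)                         # requested percentiles (Q = 2)
centre = np.array([1.0, 2.0, 3.0, 4.0])    # made-up estimates on N = 4 points
half_widths = np.array([0.2, 0.3])         # made-up half-widths, one per interval

# Build an array of shape (Q, 2, N): index 0 on axis 1 is the lower bound,
# index 1 is the upper bound
ci_array = np.stack([np.stack([centre - h, centre + h]) for h in half_widths])

lower_95, upper_95 = ci_array[0]           # bounds of the first interval
lower_99 = ci_array[1, 0]                  # lower bounds of the second interval
upper_99 = ci_array[1, 1]                  # upper bounds of the second interval
print(ci_array.shape)
```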