Boosting Strong Classifiers
Several tasks can be achieved by a boosted classifier:
- A univariate classification task assigns each sample $\vec x$ one of two possible classes: $\{+1, -1\}$. In this implementation, class $+1$ is assigned when the (real-valued) outcome of the classifier is positive, or $-1$ otherwise.
- A multivariate classification task assigns each sample $\vec x$ one of $N$ possible classes: an $N$-dimensional output vector $\vec y = f(\vec x)$ is computed, and the class with the highest outcome is assigned: $n^* = \arg\max_n y_n$. To train the multi-variate classifier, target values for each training sample are assigned a $+1$ for the correct class, and a $-1$ for all other classes (see the sketch after this list).
- A (multivariate) regression task tries to learn a function $f(\vec x) = \vec y$ based on several training examples.
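As a minimal sketch of the multivariate convention in plain NumPy (illustrative only; the output vector below is made up, and this is not the bob.learn.boosting API):

```python
import numpy as np

# Hypothetical multivariate setting with N = 3 classes.
# Training targets: +1 for the correct class, -1 for all others.
correct_class = 1
targets = -np.ones(3)
targets[correct_class] = +1           # [-1., +1., -1.]

# Prediction: the class with the highest classifier outcome wins.
y = np.array([0.2, 1.3, -0.7])        # made-up output vector y = f(x)
predicted_class = int(np.argmax(y))   # -> 1
```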
To achieve this goal, a strong classifier $S$ is built out of a weighted list of $I$ weak classifiers $W_i$:

S(\vec x) = \sum_{i=1}^{I} w_i \cdot W_i(\vec x)
Note
For the univariate case, both $w_i$ and the weak classifier result $W_i(\vec x)$ are floating point values. In the multivariate case, $\vec w_i$ is a vector of weights – one for each output dimension – and the weak classifier returns a vector of floating point values as well.
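A minimal NumPy sketch of this weighted combination for the univariate case (the weak classifiers below are made-up stand-ins, not bob.learn.boosting machines):

```python
import numpy as np

def strong_classifier(x, weights, weak_classifiers):
    # S(x) = sum_i w_i * W_i(x)
    return sum(w * W(x) for w, W in zip(weights, weak_classifiers))

# Two made-up weak classifiers, each deciding on a single input element.
weak_classifiers = [lambda x: 1.0 if x[0] > 0.5 else -1.0,
                    lambda x: 1.0 if x[2] > 0.1 else -1.0]
weights = [0.7, 0.3]

x = np.array([0.8, 0.0, 0.4])
label = 1 if strong_classifier(x, weights, weak_classifiers) > 0 else -1   # -> 1
```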
Weak Classifiers
Currently, two types of weak classifiers are implemented in this boosting framework.
Stump classifier
The first classifier, which can only handle univariate classification tasks, is the bob.learn.boosting.StumpMachine.
For a given input vector $\vec x$, the classifier bases its decision on a single element $x_m$ of the input vector:

W(\vec x) = \begin{cases} +1 & \text{if } \phi \cdot (x_m - \theta) > 0 \\ -1 & \text{otherwise} \end{cases}
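In plain NumPy, this decision rule can be sketched as follows (an illustration of the formula above, not the StumpMachine implementation):

```python
import numpy as np

def stump_decision(x, m, theta, phi):
    # +1 if phi * (x[m] - theta) > 0, -1 otherwise
    return 1.0 if phi * (x[m] - theta) > 0 else -1.0

print(stump_decision(np.array([0.3, 0.9]), m=1, theta=0.5, phi=+1.0))   # 1.0
```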
Threshold $\theta$, polarity $\phi$ and index $m$ are parameters of the classifier, which are trained using the bob.learn.boosting.StumpTrainer.
For a given training set $\{\vec x_p \mid p = 1, \ldots, P\}$ and corresponding target values $\{t_p \mid p = 1, \ldots, P\}$, a threshold $\theta_m$ is computed for each input index $m$, such that the lowest classification error is obtained, and the index $m$ with the lowest training classification error is taken.
The polarity $\phi$ is set to $-1$ if values lower than the threshold should be considered as positive examples, or to $+1$ otherwise.
To compute the classification error for a given $\theta_m$, the gradient of a loss function is taken into consideration.
For the stump trainer, usually the bob.learn.boosting.ExponentialLoss is used as the loss function.
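The exhaustive search over thresholds can be sketched as follows (a simplified illustration that scores candidates by the plain misclassification rate; the actual StumpTrainer uses the loss gradient as described above):

```python
import numpy as np

def train_stump(X, t):
    # Exhaustive search for the best (index m, threshold theta, polarity phi).
    best_m, best_theta, best_phi, best_error = 0, 0.0, 1.0, np.inf
    for m in range(X.shape[1]):
        # Candidate thresholds: midpoints between sorted feature values.
        values = np.sort(X[:, m])
        for theta in (values[:-1] + values[1:]) / 2.0:
            for phi in (+1.0, -1.0):
                predictions = np.where(phi * (X[:, m] - theta) > 0, 1.0, -1.0)
                error = np.mean(predictions != t)
                if error < best_error:
                    best_m, best_theta, best_phi, best_error = m, theta, phi, error
    return best_m, best_theta, best_phi

X = np.array([[0.1, 0.9], [0.8, 0.2], [0.3, 0.7], [0.9, 0.1]])
t = np.array([+1.0, -1.0, +1.0, -1.0])
m, theta, phi = train_stump(X, t)   # selects a perfectly separating stump
```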
Look-Up-Table classifier
The second classifier, which can handle univariate and multivariate classification and regression tasks, is the bob.learn.boosting.LUTMachine.
This classifier is designed to handle input vectors with discrete values only.
Again, the decision of the weak classifier is based on a single element $x_m$ of the input vector $\vec x$.
In the univariate case, for each of the possible discrete values of $x_m$, a decision from $\{+1, -1\}$ is selected:

W(\vec x) = LUT[x_m]
This look-up-table $LUT$ and the feature index $m$ are trained by the bob.learn.boosting.LUTTrainer.
In the multivariate case, each output is handled independently, i.e., a separate look-up-table $LUT^o$ and a separate feature index $m^o$ are assigned for each output dimension $o$:

W^o(\vec x) = LUT^o[x_{m^o}]
Note
As a variant, the feature index $m^o$ can be selected to be shared for all outputs; see bob.learn.boosting.LUTTrainer for details.
A weak look-up-table classifier is learned using the bob.learn.boosting.LUTTrainer.
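Both cases can be sketched in plain NumPy (illustrative tables and indices, not the LUTMachine implementation):

```python
import numpy as np

# Univariate case: one table, one feature index m; x_m takes values 0..3 here.
LUT = np.array([-1.0, -1.0, +1.0, +1.0])
m = 2

def lut_decision(x):
    return LUT[x[m]]                  # W(x) = LUT[x_m]

# Multivariate case: a separate table LUT^o and index m^o per output dimension o.
LUTs = [np.array([+1.0, -1.0, -1.0, +1.0]), np.array([-1.0, +1.0, +1.0, -1.0])]
ms = [0, 3]

def lut_decision_multi(x):
    return np.array([lut[x[mo]] for lut, mo in zip(LUTs, ms)])

x = np.array([1, 0, 3, 2])            # discrete-valued input vector
print(lut_decision(x), lut_decision_multi(x))   # 1.0 [-1.  1.]
```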
Strong classifier
The strong classifier, which is of type bob.learn.boosting.BoostedMachine, is a weighted combination of weak classifiers, which are usually of the same type.
It can be trained with the bob.learn.boosting.Boosting trainer, which takes a list of training samples and a list of univariate or multivariate target values.
In several rounds, the trainer computes (here, only the univariate case is considered, but the multivariate case is similar – simply replace scores by score vectors):
1. The classification results (the so-called scores) for the current strong classifier:

   s_p = S(\vec x_p)

2. The derivative $L'$ of the loss function, based on the current scores and the target values:

   \nabla_p = L'(t_p, s_p)

3. This loss gradient is used to select a new weak machine $W_i$ using a weak trainer (see above):

   W_i = trainer.train([\vec x_p], [\nabla_p])

4. The scores of the weak machine are computed:

   r_p = W_i(\vec x_p)

5. The weight $w_i$ for the new machine is optimized using scipy.optimize.fmin_l_bfgs_b. This call will use both the loss $L$ and its derivative $L'$ to compute the optimal weight for the new classifier:

   w_i = scipy.optimize.fmin_l_bfgs_b(...)

6. The new weak machine is added to the strong classifier (one such round is sketched below).
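Putting these steps together, one round of the loop might look as follows (a simplified sketch assuming the exponential loss; weak_trainer and the callable machine interface are made-up stand-ins, not the bob.learn.boosting API):

```python
import numpy as np
from scipy.optimize import fmin_l_bfgs_b

def boosting_round(X, t, strong, weak_trainer):
    # 1. Scores of the current strong classifier: s_p = S(x_p).
    s = np.array([strong(x) for x in X])
    # 2. Gradient of the exponential loss L(t, s) = exp(-t * s) wrt the scores.
    grad = -t * np.exp(-t * s)
    # 3. Select a new weak machine from the loss gradient.
    W = weak_trainer(X, grad)
    # 4. Scores of the new weak machine: r_p = W(x_p).
    r = np.array([W(x) for x in X])
    # 5. Optimize the weight w by minimizing sum_p L(t_p, s_p + w * r_p).
    loss = lambda w: np.sum(np.exp(-t * (s + w[0] * r)))
    loss_prime = lambda w: np.array([np.sum(-t * r * np.exp(-t * (s + w[0] * r)))])
    w_opt, _, _ = fmin_l_bfgs_b(loss, x0=np.zeros(1), fprime=loss_prime)
    # 6. The caller adds (w_opt[0], W) to the strong classifier.
    return w_opt[0], W
```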
Loss functions
As shown above, the loss functions define how well the currently predicted scores $s_p$ fit to the target values $t_p$. Depending on the desired task, and on the type of classifier, different loss functions might be used:
- The bob.learn.boosting.ExponentialLoss can be used for the binary classification task, i.e., when target values are in $\{+1, -1\}$.
- The bob.learn.boosting.LogitLoss can be used for the multi-variate classification task, i.e., when target vectors have entries from $\{+1, -1\}$.
- The bob.learn.boosting.JesorskyLoss can be used for the particular multi-variate regression task of learning the locations of facial features.
Other loss functions, e.g., using the Euclidean distance for regression, should be easily implementable.
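For reference, the exponential and logit losses and their score derivatives can be written down directly (a plain NumPy sketch of the standard formulas, assuming these match the library's definitions):

```python
import numpy as np

def exponential_loss(t, s):
    # L(t, s) = exp(-t * s);  dL/ds = -t * exp(-t * s)
    return np.exp(-t * s), -t * np.exp(-t * s)

def logit_loss(t, s):
    # L(t, s) = log(1 + exp(-t * s));  dL/ds = -t / (1 + exp(t * s))
    return np.log1p(np.exp(-t * s)), -t / (1.0 + np.exp(t * s))
```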