Tutorial

A Simple Example

In this example, the performance of an SVM classifier is evaluated using a stratified k-fold resampling scheme.

First, import NumPy and mlpy modules:

>>> import numpy as np
>>> import mlpy

Then, load a data file (data.dat) containing 30 samples described by 100 features (x) and labels (y):

>>> x, y = mlpy.data_fromfile('data.dat') # import data file
>>> x.shape
(30, 100)

Initialize the SVM classifier, specifying the kernel type (linear) and the regularization parameter (C):

>>> classifier = mlpy.Svm(kernel='linear', C=1.0)  # initialize the SVM classifier

Define a stratified 10-fold resampling scheme. Here idx is a list of (train, test) pairs, each holding the sample indexes for one fold:

>>> idx = mlpy.kfoldS(cl = y, sets = 10)
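
To make the idea of stratification concrete, here is a minimal NumPy sketch of how such train/test pairs could be built. It is not mlpy's implementation: the function name stratified_kfold, its seed parameter, and the round-robin dealing strategy are assumptions made for illustration only. Each class is shuffled separately and dealt across the folds, so every test fold keeps roughly the class proportions of the full data set.

```python
import numpy as np

def stratified_kfold(labels, sets, seed=0):
    """Hypothetical helper: return a list of (train_idx, test_idx) pairs
    in which every test fold roughly preserves the class proportions."""
    labels = np.asarray(labels)
    rng = np.random.RandomState(seed)
    folds = [[] for _ in range(sets)]
    offset = 0
    for cls in np.unique(labels):
        cls_idx = np.where(labels == cls)[0]
        rng.shuffle(cls_idx)
        # Deal the samples of this class round-robin across the folds,
        # continuing where the previous class stopped.
        for i, sample in enumerate(cls_idx):
            folds[(offset + i) % sets].append(sample)
        offset += len(cls_idx)
    all_idx = np.arange(len(labels))
    pairs = []
    for fold in folds:
        test_idx = np.sort(np.array(fold, dtype=int))
        train_idx = np.setdiff1d(all_idx, test_idx)  # everything not in the test fold
        pairs.append((train_idx, test_idx))
    return pairs

# Toy labels: 20 samples of class 1 and 10 samples of class -1.
y_demo = np.array([1] * 20 + [-1] * 10)
pairs = stratified_kfold(y_demo, sets=10)
```

With these toy labels, each of the 10 test folds holds 3 samples, two from class 1 and one from class -1, which matches the 2:1 ratio of the full label vector.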

For each fold, build the training and test data, train the model on xtr, and test it on xts. Performance is evaluated by computing the average prediction error over the folds:

>>> pred_err = 0.0
>>> for idxtr, idxts in idx:
...     xtr, xts = x[idxtr], x[idxts]       # build training and test features
...     ytr, yts = y[idxtr], y[idxts]       # build training and test labels
...     ret = classifier.compute(xtr, ytr)  # train the model on the training data
...     pred = classifier.predict(xts)      # predict labels for the test data
...     pred_err += mlpy.err(yts, pred)     # accumulate the prediction error
>>> av_pred_err = pred_err / len(idx)       # average the error over the folds
>>> av_pred_err
0.17499999999999999
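
The averaging step can be sketched in plain NumPy. This assumes the per-fold error is the fraction of misclassified test samples, which this page does not confirm for mlpy.err. The helper name error_rate and the toy labels are hypothetical.

```python
import numpy as np

def error_rate(y_true, y_pred):
    """Hypothetical helper: fraction of samples whose predicted label
    differs from the true label (assumed meaning of the prediction error)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    return float(np.mean(y_true != y_pred))

# Two toy folds: one misclassification out of three, then none.
fold_errors = [
    error_rate([1, 1, -1], [1, -1, -1]),
    error_rate([1, -1, -1], [1, -1, -1]),
]

# Unweighted mean over folds, as in the loop above.
avg_error = sum(fold_errors) / len(fold_errors)
```

An unweighted mean gives every fold the same weight. That is reasonable when all folds have the same size, as they do with 30 samples split into 10 folds. With unequal folds, weighting each error by the fold size may be preferable.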
