nolearn.dataset

A Dataset is a simple abstraction around a data and a target matrix.

A Dataset's data and target matrices are available via attributes of the same name:

>>> data = np.array([[3, 2, 1], [2, 1, 0]] * 4)
>>> target = np.array([3, 2] * 4)
>>> dataset = Dataset(data, target)
>>> dataset.data is data
True
>>> dataset.target is target
True

The split_indices attribute gives us a cross-validation generator, which yields pairs of train and test indices:

>>> for train_index, test_index in dataset.split_indices:
...     X_train, X_test = data[train_index], data[test_index]
...     y_train, y_test = target[train_index], target[test_index]

sklearn.grid_search.GridSearchCV is one example of an API that expects a cross-validation generator like the one split_indices returns.
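
To make the idea of a cross-validation generator concrete, here is a minimal sketch of one: any iterable that yields (train_index, test_index) pairs. The helper shuffle_split_indices, its parameters, and its defaults are hypothetical and chosen for illustration only; this is not how nolearn builds split_indices.

```python
import numpy as np


def shuffle_split_indices(n_samples, n_iterations=3, test_size=0.25, seed=0):
    """Hypothetical helper: yield (train_index, test_index) pairs."""
    rng = np.random.RandomState(seed)
    n_test = int(np.ceil(test_size * n_samples))
    for _ in range(n_iterations):
        permutation = rng.permutation(n_samples)
        # The first n_test shuffled positions form the test set,
        # the remaining positions form the training set.
        yield permutation[n_test:], permutation[:n_test]


data = np.array([[3, 2, 1], [2, 1, 0]] * 4)
target = np.array([3, 2] * 4)

# Materialize the generator so the splits can be reused.
splits = list(shuffle_split_indices(len(data)))
for train_index, test_index in splits:
    X_train, X_test = data[train_index], data[test_index]
    y_train, y_test = target[train_index], target[test_index]
```

scikit-learn's cv parameter accepts an iterable of index pairs in this form, which is why an object like split_indices can be handed to a grid search directly.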

If all you want is a train/test split of your data, you can simply call Dataset.train_test_split():

>>> X_train, X_test, y_train, y_test = dataset.train_test_split()
>>> X_train.shape, X_test.shape, y_train.shape, y_test.shape
((6, 3), (2, 3), (6,), (2,))
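
The shapes above correspond to holding out 2 of the 8 samples, that is, a quarter of the data. The sketch below reproduces that arithmetic with plain NumPy. The function simple_train_test_split, the test_size value of 0.25, and the seed are assumptions made for illustration; they are inferred from the output shown above and are not taken from nolearn's source.

```python
import numpy as np


def simple_train_test_split(data, target, test_size=0.25, seed=42):
    """Hypothetical helper, not nolearn's implementation."""
    n_samples = len(data)
    # With 8 samples and test_size=0.25, 2 samples are held out.
    n_test = int(np.ceil(test_size * n_samples))
    permutation = np.random.RandomState(seed).permutation(n_samples)
    test_index, train_index = permutation[:n_test], permutation[n_test:]
    return (data[train_index], data[test_index],
            target[train_index], target[test_index])


data = np.array([[3, 2, 1], [2, 1, 0]] * 4)
target = np.array([3, 2] * 4)

X_train, X_test, y_train, y_test = simple_train_test_split(data, target)
shapes = (X_train.shape, X_test.shape, y_train.shape, y_test.shape)
```

Here shapes comes out as ((6, 3), (2, 3), (6,), (2,)), matching the result of Dataset.train_test_split() shown above.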
