Data Management

Importing and exporting data

mlpy.data_fromfile(file)

Read data file in the form:

x11 [TAB] x12 [TAB] ... x1n [TAB] y1
x21 [TAB] x22 [TAB] ... x2n [TAB] y2
 .         .        .    .        .
 .         .         .   .        .
 .         .          .  .        .
xm1 [TAB] xm2 [TAB] ... xmn [TAB] ym

where the xij are floats and the yi are integers.

Input

  • file - data file name

Output

  • x - data [2D numpy array float]
  • y - classes [1D numpy array integer]
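
A file in this layout can also be read with plain numpy, which may help when checking what data_fromfile returns. The sketch below is not mlpy's implementation; read_labeled is a hypothetical helper that assumes tab-separated columns with the class label last.

```python
import os
import tempfile

import numpy as np


def read_labeled(path):
    """Hypothetical stand-in for data_fromfile, built on np.loadtxt."""
    # ndmin=2 keeps a single-row file two-dimensional.
    table = np.loadtxt(path, delimiter="\t", ndmin=2)
    x = table[:, :-1]              # every column but the last: features
    y = table[:, -1].astype(int)   # last column: integer class labels
    return x, y


# Write a small file in the documented layout, then read it back.
rows = [
    "1.1\t2.0\t5.3\t3.1\t1",
    "3.7\t1.4\t2.3\t4.5\t-1",
    "1.4\t5.4\t3.1\t1.4\t1",
]
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data_example.dat")
    with open(path, "w") as fh:
        fh.write("\n".join(rows) + "\n")
    x, y = read_labeled(path)

print(x.shape, y)
```

The split into x and y relies only on column position, so a file whose label column is not last would need a different slice.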

Example:

>>> from numpy import *
>>> from mlpy import *
>>> x, y = data_fromfile('data_example.dat')
>>> x
array([[ 1.1,  2. ,  5.3,  3.1],
       [ 3.7,  1.4,  2.3,  4.5],
       [ 1.4,  5.4,  3.1,  1.4]])
>>> y
array([ 1, -1,  1])

mlpy.data_fromfile_wl(file)

Read data file in the form:

x11 [TAB] x12 [TAB] ... x1n [TAB]
x21 [TAB] x22 [TAB] ... x2n [TAB]
 .         .        .    .
 .         .         .   .       
 .         .          .  .       
xm1 [TAB] xm2 [TAB] ... xmn [TAB]

where the xij are floats.

Input

  • file - data file name

Output

  • x - data [2D numpy array float]
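
The layout above ends every row with a separator, which some generic readers treat as an extra empty column. The sketch below shows one way to parse such rows by hand; parse_rows is a hypothetical helper, not part of mlpy, and it assumes that empty fields can be ignored.

```python
import numpy as np


def parse_rows(text, sep="\t"):
    """Hypothetical parser for the label-free layout; not mlpy's code."""
    rows = []
    for line in text.splitlines():
        # Skip empty fields so a separator at the end of a row is harmless.
        fields = [f for f in line.split(sep) if f.strip()]
        if fields:
            rows.append([float(f) for f in fields])
    return np.array(rows, dtype=float)


# Three rows, each ending with a trailing tab as in the layout shown.
text = "1.1\t2.0\t5.3\t3.1\t\n3.7\t1.4\t2.3\t4.5\t\n1.4\t5.4\t3.1\t1.4\t\n"
x = parse_rows(text)
print(x.shape)
```

Dropping empty fields also hides missing values in the middle of a row, so this approach only suits files known to be complete.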

Example:

>>> from numpy import *
>>> from mlpy import *
>>> x = data_fromfile_wl('data_example.dat')
>>> x
array([[ 1.1,  2. ,  5.3,  3.1],
       [ 3.7,  1.4,  2.3,  4.5],
       [ 1.4,  5.4,  3.1,  1.4]])

mlpy.data_tofile(file, x, y, sep='\t')

Write data file in the form:

x11 [sep] x12 [sep] ... x1n [sep] y1
x21 [sep] x22 [sep] ... x2n [sep] y2
 .         .        .    .        .
 .         .         .   .        .
 .         .          .  .        .
xm1 [sep] xm2 [sep] ... xmn [sep] ym

where the xij are floats and the yi are integers.
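
For comparison, the same layout can be written with the standard library alone. The sketch below is an illustration under assumptions, not mlpy's code: write_labeled is a hypothetical helper, and the number formatting (repr for floats, plain integers for labels) is a choice made here.

```python
import os
import tempfile

import numpy as np


def write_labeled(path, x, y, sep="\t"):
    """Hypothetical stand-in for data_tofile; formatting is an assumption."""
    with open(path, "w") as fh:
        for features, label in zip(x, y):
            # Features first, then the integer class label as the last field.
            fields = [repr(float(v)) for v in features] + [str(int(label))]
            fh.write(sep.join(fields) + "\n")


x = np.array([[1.1, 2.0, 5.3, 3.1],
              [3.7, 1.4, 2.3, 4.5]])
y = np.array([1, -1])

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "out.dat")
    write_labeled(path, x, y)
    with open(path) as fh:
        lines = fh.read().splitlines()

print(lines[0])
```

Writing the label as an integer keeps the file readable by a loader that expects integer classes in the last column.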

Input

  • file - data file name
  • x - data [2D numpy array float]
  • y - classes [1D numpy array integer]
  • sep - separator

mlpy.data_tofile_wl(file, x, sep='\t')

Write data file in the form:

x11 [sep] x12 [sep] ... x1n [sep]
x21 [sep] x22 [sep] ... x2n [sep]
 .         .        .    .       
 .         .         .   .       
 .         .          .  .       
xm1 [sep] xm2 [sep] ... xmn [sep]

where the xij are floats.

Input

  • file - data file name
  • x - data [2D numpy array float]
  • sep - separator
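
Taken literally, the label-free layout puts a separator after every value, including the last one in each row. The sketch below formats rows that way in memory; format_rows is a hypothetical helper, and whether mlpy itself emits the trailing separator is an assumption based only on the layout shown.

```python
import numpy as np


def format_rows(x, sep="\t"):
    """Hypothetical formatter: every value is followed by sep."""
    lines = []
    for row in x:
        # Append sep after each value, so rows end with a separator.
        lines.append("".join(repr(float(v)) + sep for v in row))
    return "\n".join(lines) + "\n"


x = np.array([[1.1, 2.0],
              [3.7, 1.4]])
text = format_rows(x, sep=",")
print(text)
```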

Normalization and Standardization

mlpy.data_normalize(x)

Normalize the 2D numpy array x. In the example below, each row of the result has zero mean and unit sample standard deviation.

Input

  • x - data [2D numpy array float]

Output

  • normalized data
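
The numbers in the documented example are consistent with centring each row on its mean and dividing by its sample standard deviation (ddof=1). The sketch below reproduces them with plain numpy; normalize_rows is a hypothetical helper inferred from that example, not mlpy's source.

```python
import numpy as np


def normalize_rows(x):
    """Hypothetical helper: centre and scale each row (sample) of x."""
    x = np.asarray(x, dtype=float)
    mean = x.mean(axis=1, keepdims=True)
    # ddof=1 gives the sample standard deviation (divide by n - 1).
    std = x.std(axis=1, ddof=1, keepdims=True)
    return (x - mean) / std


x = np.array([[1.1, 2.0, 5.3, 3.1],
              [3.7, 1.4, 2.3, 4.5],
              [1.4, 5.4, 3.1, 1.4]])
z = normalize_rows(x)
print(np.round(z, 4))
```

A row whose values are all equal has zero standard deviation and would produce a division by zero, which this sketch does not guard against.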

Example:

>>> from numpy import *
>>> from mlpy import *
>>> x = array([[ 1.1,  2. ,  5.3,  3.1],
...            [ 3.7,  1.4,  2.3,  4.5],
...            [ 1.4,  5.4,  3.1,  1.4]])
>>> data_normalize(x)
array([[-0.9797065 , -0.48295391,  1.33847226,  0.12418815],
       [ 0.52197912, -1.13395464, -0.48598056,  1.09795608],
       [-0.75217354,  1.35919078,  0.1451563 , -0.75217354]])

mlpy.data_standardize(x, p=None)

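Standardize the 2D numpy array x and, optionally, standardize p using the mean and standard deviation of x. In the example below, each column of the result has zero mean and unit sample standard deviation.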

Input

  • x - data [2D numpy array float]
  • p - optional data [2D numpy array float]

Output

  • standardized data
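
The documented example matches a column-wise z-score using the sample standard deviation (ddof=1). The sketch below reproduces it and shows how a second array p could reuse the statistics of x; standardize_columns is a hypothetical helper, and returning a pair when p is given is an assumption, not documented mlpy behaviour.

```python
import numpy as np


def standardize_columns(x, p=None):
    """Hypothetical helper; the return value with p is an assumption."""
    x = np.asarray(x, dtype=float)
    mean = x.mean(axis=0)
    std = x.std(axis=0, ddof=1)   # sample standard deviation per column
    xs = (x - mean) / std
    if p is None:
        return xs
    # Reuse the mean and std of x so p is scaled exactly like x.
    ps = (np.asarray(p, dtype=float) - mean) / std
    return xs, ps


x = np.array([[1.1, 2.0, 5.3, 3.1],
              [3.7, 1.4, 2.3, 4.5],
              [1.4, 5.4, 3.1, 1.4]])
xs = standardize_columns(x)
xs2, ps = standardize_columns(x, p=x[:1])
print(np.round(xs, 4))
```

Scaling p with the statistics of x is the usual way to apply a transformation learned on training data to held-out data, so both sets end up on the same scale.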

Example:

>>> from numpy import *
>>> from mlpy import *
>>> x = array([[ 1.1,  2. ,  5.3,  3.1],
...            [ 3.7,  1.4,  2.3,  4.5],
...            [ 1.4,  5.4,  3.1,  1.4]])
>>> data_standardize(x)
array([[-0.67958381, -0.43266792,  1.1157668 ,  0.06441566],
       [ 1.1482623 , -0.71081158, -0.81536804,  0.96623494],
       [-0.46867849,  1.1434795 , -0.30039875, -1.0306506 ]])
