User’s Guide¶
After launching the python interpreter (assuming that the environment is properly set up), you could get the training set as follows:
>>> import bob.db.mnist
>>> db = bob.db.mnist.Database('PATH_TO_DATA_FROM_YANN_LECUN_WEBSITE') # 4 binary .gz compressed files
>>> images, labels = db.data(groups='train', labels=[0,1,2,3,4,5,6,7,8,9])
In this case, this should return two numpy.ndarray
s:
- images contain the raw data (60,000 samples of dimension 784 [28x28 pixels images])
- labels are the corresponding classes (digits 0 to 9) for each of the 60,000 samples
If you don’t have the data installed on your machine, you can also use the following set of commands that will:
- first look for the database in the
bob/db/mnist
subdirectory and use it if is available - automatically download it from Yann Lecun’s website into a temporary folder that will be erased when the destructor of the
bob.db.mnist.Database
is called. - automatically download it into the provided directory that will not be deleted.
>>> import bob.db.mnist
>>> db = bob.db.mnist.Database() # Check for the data files locally, and download them if required
>>> images, labels = db.data(groups='train', labels=[0,1,2,3,4,5,6,7,8,9])
>>> del db # delete the temporary downloaded files if any
or:
>>> db = bob.db.mnist.Database("Directory") # Persistently downloads files into the folder "Directory"
>>> images, labels = db.data(groups='train', labels=[0,1,2,3,4,5,6,7,8,9])
>>> del db # The download directory stays