A data file can be training or evaluation data. Both files are .tar files.
The training data file train.tar has to contain a x.hdf5 and a y.hdf5. Both HDF5 files have to contain matrices. Let’s call the matrix in x.hdf5 X and similarly the matrix in y.hdf5 Y. (The HDF files both contain exactly one object called x.hdf5 or y.hdf5)
X contains the features of all training examples and Y contains the labels.
Then $X in mathbb{R}^{n times m}$ and $L in mathbb{n times 1}$ where $n$ is the number of training examples and $m$ is the number of features. $L$ might be $mathbb{N}$ or simply strings.
To look at HDF5 files, you can use hdfview.