kaggler package

Submodules

kaggler.const module

kaggler.data_io module

class kaggler.data_io.Clock[source]
check()[source]
report()[source]
class kaggler.data_io.PathJoiner(filename='SETTINGS.json')[source]

Load directory names from SETTINGS.json.

Usage:
    # In SETTINGS.json, "data": "/path/to/data/".
    # To load "/path/to/data/targets.array" file to y:
    PATH = PathJoiner()
    y = load(PATH.data('targets.array'))
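One way to picture this behaviour is the minimal sketch below. It is not the kaggler source: `SettingsPaths` is a hypothetical stand-in, and it assumes that SETTINGS.json is a flat JSON object mapping names to directories and that each name is exposed as a callable which joins that directory with a file name.

```python
import json
import os
import tempfile


class SettingsPaths:
    """Hypothetical stand-in for PathJoiner (an assumption, not kaggler's code)."""

    def __init__(self, filename="SETTINGS.json"):
        # Assumed layout: {"data": "/path/to/data/", ...}
        with open(filename) as f:
            self._dirs = json.load(f)

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        try:
            base = self.__dict__["_dirs"][name]
        except KeyError:
            raise AttributeError(name)
        # Return a function that joins the configured directory with file names.
        return lambda *parts: os.path.join(base, *parts)


# Usage: write a throwaway settings file and resolve a path through it.
with tempfile.TemporaryDirectory() as tmp:
    settings_file = os.path.join(tmp, "SETTINGS.json")
    with open(settings_file, "w") as f:
        json.dump({"data": "/path/to/data/"}, f)
    PATH = SettingsPaths(settings_file)
    targets_path = PATH.data("targets.array")
```

Keeping directory names in one settings file means scripts refer to `PATH.data(...)` instead of hard-coded absolute paths.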
kaggler.data_io.beep(n=1)[source]
kaggler.data_io.is_number(s)[source]

Check if a string is a number or not.
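A common way to implement such a check is to let `float()` decide. The sketch below assumes that approach; `looks_like_number` is a hypothetical name, and whether `is_number` treats strings such as "nan" or "inf" as numbers is not stated here.

```python
def looks_like_number(s):
    """Hypothetical equivalent of is_number: True if float() accepts s."""
    try:
        float(s)
    except (TypeError, ValueError):
        # TypeError covers None and other non-string inputs.
        return False
    return True


checks = {s: looks_like_number(s) for s in ["3.14", "-1e5", "abc", ""]}
```

Under this approach, "nan" and "inf" also count as numbers, because `float()` parses them.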

kaggler.data_io.limit_stream(stream, count=1, skip=0)[source]
kaggler.data_io.load(filename)[source]
kaggler.data_io.load_array(filename)[source]
kaggler.data_io.load_csv(path)[source]

Load data from a CSV file.

Parameters:
  • path (str) – A path to the CSV format file containing data.
  • dense (boolean) – Optional. Whether the returned matrix should be dense. Defaults to False.
Returns:

Data matrix X and target vector y

kaggler.data_io.load_data(path, dense=False)[source]

Load data from a CSV, LibSVM or HDF5 file based on the file extension.

Parameters:
  • path (str) – A path to the CSV, LibSVM or HDF5 format file containing data.
  • dense (boolean) – Optional. Whether the returned matrix should be dense. Defaults to False.
Returns:

Data matrix X and target vector y
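Choosing a loader from the file extension can be sketched as a lookup table. The extension-to-format mapping below (`.csv`, `.sps`, `.h5`) is an assumption made for illustration, and `pick_format` is a hypothetical helper rather than part of kaggler.

```python
import os

# Assumed mapping from file extension to format name (illustrative only).
FORMATS = {".csv": "csv", ".sps": "libsvm", ".h5": "hdf5"}


def pick_format(path):
    """Return the format name implied by the extension of path."""
    ext = os.path.splitext(path)[1].lower()
    if ext not in FORMATS:
        raise ValueError("unsupported file extension: %r" % ext)
    return FORMATS[ext]


chosen = [pick_format(p) for p in ["train.csv", "train.sps", "train.h5"]]
```

A real dispatcher would map each extension to a loader function instead of a name and call it with `path`.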

kaggler.data_io.load_hdf5(path)[source]

Load data from an HDF5 file.

Parameters:
  • path (str) – A path to the HDF5 format file containing data.
  • dense (boolean) – Optional. Whether the returned matrix should be dense. Defaults to False.
Returns:

Data matrix X and target vector y

kaggler.data_io.load_obj(filename)[source]
kaggler.data_io.load_sparse(filename)[source]
kaggler.data_io.print_shape_type(*objs)[source]
kaggler.data_io.read_sps(path)[source]

Read a LibSVM file line-by-line.

Parameters: path (str) – A path to the LibSVM file to read.
Yields: data (list) and target (int).
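Line-by-line reading of a LibSVM file can be sketched with a generator. The version below is an illustration under assumptions, not the kaggler source: `read_libsvm_lines` is a hypothetical name, each line is taken to be `target index:value ...`, and features are yielded as (index, value) pairs, whereas `read_sps` may yield the data list in a different form.

```python
import os
import tempfile


def read_libsvm_lines(path):
    """Yield (features, target) for each non-blank line of a LibSVM file."""
    with open(path) as f:
        for line in f:
            tokens = line.split()
            if not tokens:
                continue  # skip blank lines
            target = int(float(tokens[0]))
            # Each remaining token is assumed to look like "index:value".
            features = []
            for token in tokens[1:]:
                index, value = token.split(":", 1)
                features.append((int(index), float(value)))
            yield features, target


# Usage: stream two rows from a temporary file without loading it whole.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.sps")
    with open(sample, "w") as f:
        f.write("1 1:0.5 3:2\n0 2:1.5\n")
    rows = list(read_libsvm_lines(sample))
```

Because the function is a generator, only one line is held in memory at a time, which suits files too large to load at once.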
kaggler.data_io.save(filename, data)[source]
kaggler.data_io.save_array(filename, array)[source]
kaggler.data_io.save_csv(X, y, path)[source]

Save data as a CSV file.

Parameters:
  • X (numpy or scipy sparse matrix) – Data matrix
  • y (numpy array) – Target vector.
  • path (str) – Path to the CSV file to save data.
kaggler.data_io.save_data(X, y, path)[source]

Save data as a CSV, LibSVM or HDF5 file based on the file extension.

Parameters:
  • X (numpy or scipy sparse matrix) – Data matrix
  • y (numpy array) – Target vector. If None, an all-zero vector will be saved.
  • path (str) – Path to the CSV, LibSVM or HDF5 file to save data.
kaggler.data_io.save_hdf5(X, y, path)[source]

Save data as an HDF5 file.

Parameters:
  • X (numpy or scipy sparse matrix) – Data matrix
  • y (numpy array) – Target vector.
  • path (str) – Path to the HDF5 file to save data.
kaggler.data_io.save_libsvm(X, y, path)[source]

Save data as a LibSVM file.

Parameters:
  • X (numpy or scipy sparse matrix) – Data matrix
  • y (numpy array) – Target vector.
  • path (str) – Path to the LibSVM file to save data.
kaggler.data_io.save_obj(filename, obj)[source]
kaggler.data_io.save_sparse(filename, array)[source]
kaggler.data_io.shuf_file(f, shuf_win)[source]
kaggler.data_io.stream_csv(filename, encoding='utf-8', ignore_errors=False)[source]
kaggler.data_io.stream_lines(filename, encoding='utf-8', ignore_errors=False)[source]

kaggler.util module

kaggler.util.get_downsampled_index()

Return the index that downsamples a vector x by the rate.

kaggler.util.get_downsampled_index0()

Return the index that downsamples 0s of a vector x by the rate.
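Downsampling only the zeros can be sketched as follows. The signature is not shown in this reference, so everything here is an assumption: `downsampled_index0_sketch` is a hypothetical name, `rate` is taken to be the probability of keeping each zero entry, and `seed` is added only to make the illustration reproducible.

```python
import random


def downsampled_index0_sketch(x, rate, seed=0):
    """Return indices that keep every non-zero entry of x and each zero
    entry with probability rate (an assumed meaning of "rate")."""
    rng = random.Random(seed)
    # The random draw happens only for zero entries (short-circuit "or").
    return [i for i, v in enumerate(x) if v != 0 or rng.random() < rate]


x = [0, 1, 0, 0, 1, 0, 0, 0]
keep_none = downsampled_index0_sketch(x, rate=0.0)  # drops every zero
keep_all = downsampled_index0_sketch(x, rate=1.0)   # keeps every zero
keep_half = downsampled_index0_sketch(x, rate=0.5)  # keeps roughly half
```

Indexing the feature matrix and the target vector with the same returned indices keeps rows aligned after downsampling.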

kaggler.util.point()

Calculate Kaggle points to earn after a competition.

Parameters:
  • rank (int) – final ranking in the private leaderboard.
  • n_team (int) – the number of teams that participated in the competition.
  • n_teammate (int) – the number of members in my team.
  • t (int) – the number of days since the competition ended.
Returns:

Kaggle points earned after the competition.
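As a worked illustration of how the four parameters could combine, the sketch below assumes the points formula Kaggle published for its 2015 ranking system: 100000 / sqrt(n_teammate) × rank^(-0.75) × log10(1 + log10(n_team)) × exp(-t / 500). Whether `kaggler.util.point` uses exactly these constants is an assumption; `kaggle_points` and its default arguments are hypothetical.

```python
import math


def kaggle_points(rank, n_team, n_teammate=1, t=0):
    """Points under the assumed 2015 Kaggle formula (not verified against kaggler)."""
    return (
        100000.0 / math.sqrt(n_teammate)        # shared among teammates
        * rank ** -0.75                          # better rank, more points
        * math.log10(1 + math.log10(n_team))     # larger competitions count more
        * math.exp(-t / 500.0)                   # points decay over time
    )


winner_now = kaggle_points(rank=1, n_team=1000)
winner_later = kaggle_points(rank=1, n_team=1000, t=500)
tenth_place = kaggle_points(rank=10, n_team=1000)
```

Under this formula, a solo win in a 1000-team competition is worth 100000 × log10(4) points on the day the competition ends, and that value shrinks by a factor of e after 500 days.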

kaggler.util.rank()

Rank a vector x. Ties will be averaged.
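Average-tie ranking can be sketched in plain Python as below. `average_rank` is a hypothetical name, and starting ranks at 1 is an assumption; the reference does not say whether `rank` is 0-based or 1-based.

```python
def average_rank(x):
    """Return 1-based ranks of x, giving tied values the mean of their ranks."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    ranks = [0.0] * len(x)
    start = 0
    while start < len(order):
        # Extend [start, end] over the run of values equal to x[order[start]].
        end = start
        while end + 1 < len(order) and x[order[end + 1]] == x[order[start]]:
            end += 1
        # Mean of the 1-based positions start+1 .. end+1.
        avg = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = avg
        start = end + 1
    return ranks


tied = average_rank([10, 20, 20, 30])
untied = average_rank([3, 1, 2])
```

In the first call the two values of 20 occupy positions 2 and 3, so both receive the average rank 2.5.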

kaggler.util.set_column_width()

Set the column width of a matrix X to n_col.

kaggler.util.set_min_max()

Module contents