Cloud Storage

GcsUtility

class gwrappy.storage.GcsUtility(**kwargs)[source]

Initializes object for interacting with the Cloud Storage API.

By default, Application Default Credentials are used.
If the gcloud SDK isn’t installed, credential files have to be specified using the kwargs json_credentials_path and client_id.
Parameters:
  • max_retries (integer) – Argument specified with each API call to natively handle retryable errors.
  • chunksize (integer) – Chunk size used for uploads and downloads.
  • client_secret_path – File path for client secret JSON file. Only required if credentials are invalid or unavailable.
  • json_credentials_path – File path for automatically generated credentials.
  • client_id – Credentials are stored as a key-value pair per client_id to facilitate multiple clients using the same credentials file. For simplicity, using one’s email address is sufficient.
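The idea of keeping credentials as one key-value pair per client_id can be pictured with a short sketch. This is not gwrappy's code: the JSON layout and the helper names save_credentials and load_credentials are assumptions made only for illustration.

```python
import json
import os


def save_credentials(path, client_id, credentials):
    """Store credentials under client_id, keeping other clients' entries.

    Hypothetical helper: the single-JSON-object layout is an assumption.
    """
    store = {}
    if os.path.exists(path):
        with open(path) as f:
            store = json.load(f)
    store[client_id] = credentials
    with open(path, "w") as f:
        json.dump(store, f)


def load_credentials(path, client_id):
    """Return the credentials stored for client_id, or None if absent."""
    if not os.path.exists(path):
        return None
    with open(path) as f:
        return json.load(f).get(client_id)
```

With a layout like this, several clients can point json_credentials_path at the same file and each reads only its own entry.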
list_buckets(project_id, max_results=None, filter_exp=None)[source]

Abstraction of buckets().list() method with inbuilt iteration functionality. [https://cloud.google.com/storage/docs/json_api/v1/buckets/list]

Parameters:
  • project_id (string) – Unique project identifier.
  • max_results (integer) – If None, all results are iterated over and returned.
  • filter_exp (function) – Function that filters entries if filter_exp evaluates to True.
Returns:

List of dictionary objects representing bucket resources.
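
The "inbuilt iteration" can be sketched as a loop over result pages. The JSON API list responses carry an "items" list and, when more results exist, a "nextPageToken". The sketch below is not gwrappy's implementation: fetch_page and list_all are hypothetical names, and it assumes that filter_exp keeps entries for which it returns True and that max_results counts entries after filtering.

```python
def list_all(fetch_page, max_results=None, filter_exp=None):
    """Collect items across pages.

    fetch_page(page_token) is a hypothetical callable returning one
    list response as a dict with "items" and optional "nextPageToken".
    """
    results = []
    page_token = None
    while True:
        resp = fetch_page(page_token)
        for item in resp.get("items", []):
            # Assumption: entries are kept when filter_exp is truthy.
            if filter_exp is None or filter_exp(item):
                results.append(item)
                if max_results is not None and len(results) >= max_results:
                    return results
        page_token = resp.get("nextPageToken")
        if page_token is None:
            return results


# Two canned pages standing in for real API responses.
PAGES = {
    None: {"items": [{"name": "a"}, {"name": "b"}], "nextPageToken": "t1"},
    "t1": {"items": [{"name": "c"}]},
}


def fake_fetch_page(page_token):
    return PAGES[page_token]
```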

list_objects(bucket_name, max_results=None, prefix=None, filter_exp=None)[source]

Abstraction of objects().list() method with inbuilt iteration functionality. [https://cloud.google.com/storage/docs/json_api/v1/objects/list]

Parameters:
  • bucket_name (string) – Bucket identifier.
  • max_results (integer) – If None, all results are iterated over and returned.
  • prefix (string) – Pre-filter (on API call) results to objects whose names begin with this prefix.
  • filter_exp (function) – Function that filters entries if filter_exp evaluates to True.
Returns:

List of dictionary objects representing object resources.
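
A possible use of prefix together with filter_exp is sketched below. The helper total_csv_bytes is hypothetical, gcs stands for an already constructed GcsUtility instance, and the sketch assumes filter_exp keeps entries for which it returns True. The "name" and "size" keys are fields of the JSON API object resource.

```python
def total_csv_bytes(gcs, bucket_name, prefix):
    """Sum the sizes of .csv objects whose names begin with prefix."""
    objects = gcs.list_objects(
        bucket_name,
        prefix=prefix,  # narrows results on the API side
        filter_exp=lambda obj: obj["name"].endswith(".csv"),  # client side
    )
    # The JSON API reports "size" as a string, so convert before summing.
    return sum(int(obj["size"]) for obj in objects)
```

Using prefix where possible keeps the number of entries that filter_exp has to inspect small.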

get_object(bucket_name, object_name)[source]

Abstraction of objects().get() method. [https://cloud.google.com/storage/docs/json_api/v1/objects/get]

Parameters:
  • bucket_name (string) – Bucket identifier.
  • object_name (list or string) – Can take string representation of object resource or list denoting path to object on GCS.
Returns:

Dictionary object representing object resource.
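
How a list-form object_name might map to a single object name can be sketched as follows. This is an assumption about the behaviour, not the library's code: the helper to_object_name is hypothetical and it assumes list elements are joined with "/".

```python
def to_object_name(object_name):
    """Return a single GCS object name from a string or a list of parts."""
    if isinstance(object_name, (list, tuple)):
        # Assumption: path components are joined with forward slashes.
        return "/".join(object_name)
    return object_name
```

Under that assumption, ["exports", "2016", "data.csv"] and "exports/2016/data.csv" would refer to the same object.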

delete_object(bucket_name, object_name)[source]

Abstraction of objects().delete() method. [https://cloud.google.com/storage/docs/json_api/v1/objects/delete]

Parameters:
  • bucket_name (string) – Bucket identifier.
  • object_name (list or string) – Can take string representation of object resource or list denoting path to object on GCS.
Raises:

AssertionError if unsuccessful. The response is expected to be an empty string on success.
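
Because failure is signalled with an AssertionError, a caller deleting many objects may want to catch it per object. A minimal sketch, assuming gcs is a GcsUtility instance; the helper delete_prefix is hypothetical.

```python
def delete_prefix(gcs, bucket_name, prefix):
    """Delete every object under prefix.

    Returns (deleted, failed) lists of object names.
    """
    deleted, failed = [], []
    for obj in gcs.list_objects(bucket_name, prefix=prefix):
        name = obj["name"]
        try:
            gcs.delete_object(bucket_name, name)
            deleted.append(name)
        except AssertionError:
            # Raised by delete_object when the deletion was unsuccessful.
            failed.append(name)
    return deleted, failed
```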

download_object(bucket_name, object_name, write_path)[source]

Downloads object in chunks.

Parameters:
  • bucket_name (string) – Bucket identifier.
  • object_name (list or string) – Can take string representation of object resource or list denoting path to object on GCS.
  • write_path (string) – Local path to write object to.
Returns:

GcsResponse object.

Raises:

HttpError if non-retryable errors are encountered.
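
The interplay of chunksize and max_retries can be illustrated with a self-contained sketch. It does not reproduce gwrappy or the underlying API client: RetryableError, copy_in_chunks and make_flaky_reader are hypothetical names, and the retry policy (immediate retry, no backoff) is a simplification.

```python
class RetryableError(Exception):
    """Stand-in for a transient failure such as an HTTP 5xx response."""


def copy_in_chunks(read_chunk, sink, chunksize, max_retries=3):
    """Copy data chunk by chunk, retrying each chunk on RetryableError.

    read_chunk(offset, size) returns bytes; empty bytes signals the end.
    Returns the number of bytes written to sink.
    """
    offset = 0
    while True:
        for attempt in range(max_retries + 1):
            try:
                chunk = read_chunk(offset, chunksize)
                break
            except RetryableError:
                if attempt == max_retries:
                    raise  # retries exhausted for this chunk
        if not chunk:
            return offset
        sink.write(chunk)
        offset += len(chunk)


def make_flaky_reader(data):
    """Reader that fails once per offset before succeeding."""
    failed = set()

    def read_chunk(offset, size):
        if offset not in failed:
            failed.add(offset)
            raise RetryableError("transient")
        return data[offset:offset + size]

    return read_chunk
```

Only the chunk that failed is requested again, so a transient error late in a large transfer does not restart it from the beginning.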

upload_object(bucket_name, object_name, read_path)[source]

Uploads object in chunks.

Parameters:
  • bucket_name (string) – Bucket identifier.
  • object_name (list or string) – Can take string representation of object resource or list denoting path to object on GCS.
  • read_path (string) – Local path of object to upload.
Returns:

GcsResponse object.

Raises:

HttpError if non-retryable errors are encountered.
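
As a usage sketch, a local directory tree could be mirrored to a bucket by calling upload_object once per file. The helper upload_directory and its prefix argument are hypothetical; gcs stands for a GcsUtility instance.

```python
import os


def upload_directory(gcs, bucket_name, local_dir, prefix=""):
    """Upload every file under local_dir; return the object names used."""
    uploaded = []
    for root, _dirs, files in os.walk(local_dir):
        for filename in files:
            read_path = os.path.join(root, filename)
            # Build the object name from the path relative to local_dir,
            # always using "/" regardless of the local path separator.
            parts = os.path.relpath(read_path, local_dir).split(os.sep)
            if prefix.strip("/"):
                parts.insert(0, prefix.strip("/"))
            object_name = "/".join(parts)
            gcs.upload_object(bucket_name, object_name, read_path)
            uploaded.append(object_name)
    return sorted(uploaded)
```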

Misc Classes/Functions

class gwrappy.storage.utils.GcsResponse(description)[source]

Wrapper for GCS upload and download responses, mainly for calculating/parsing job statistics into human-readable formats for logging.

Parameters:
  • description – String descriptor for the specific function of the job.
load_resp(resp, is_download)[source]

Loads json response from API.

Parameters:
  • resp (dictionary) – Response from API
  • is_download (boolean) – Whether the response comes from a download. For uploads, time taken is calculated from the ‘updated’ field in the response; for downloads, it is calculated from the stop time.
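
The two ways of measuring elapsed time can be sketched as below. This is an illustration, not GcsResponse's code: the function names, the explicit start and stop arguments, and the fixed fractional-seconds timestamp format are assumptions. The ‘updated’ field of an object resource is an RFC 3339 timestamp.

```python
from datetime import datetime, timezone


def parse_rfc3339(ts):
    """Parse a UTC timestamp such as '2016-05-01T12:00:05.500Z'."""
    parsed = datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def time_taken(start, resp, is_download, stop=None):
    """Return elapsed seconds for a transfer that began at start.

    Uploads end at the server-side 'updated' timestamp; downloads end
    at the locally recorded stop time.
    """
    end = stop if is_download else parse_rfc3339(resp["updated"])
    return (end - start).total_seconds()
```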