mrjob.local - run locally for testing

class mrjob.local.LocalMRJobRunner(**kwargs)

Runs an MRJob locally, for testing purposes.

This is the default way of running jobs; we assume you’ll spend some time debugging your job before you’re ready to run it on EMR or Hadoop.

It’s rare to need to instantiate this class directly (see __init__() for details).

LocalMRJobRunner adds the current working directory to the subprocesses’ PYTHONPATH, so if you’re using it to test an EMR job locally, be aware that it may see more Python modules than will actually be uploaded. This behavior may change in the future.
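The effect of putting the working directory on a subprocess’s PYTHONPATH can be illustrated with the standard library alone. The sketch below is not mrjob’s implementation; the helper names env_with_cwd_on_pythonpath and child_sys_path are hypothetical and exist only for this illustration. It shows why a child interpreter launched this way can import any module sitting in the current directory, whether or not that module would be uploaded with the job.

```python
import os
import subprocess
import sys


def env_with_cwd_on_pythonpath(base_env=None):
    """Return a copy of the environment with the cwd prepended to PYTHONPATH.

    Hypothetical helper for illustration; not part of mrjob.
    """
    env = dict(os.environ if base_env is None else base_env)
    cwd = os.getcwd()
    existing = env.get("PYTHONPATH")
    # Keep any existing entries, but put the cwd first
    env["PYTHONPATH"] = (cwd + os.pathsep + existing) if existing else cwd
    return env


def child_sys_path(env):
    """Ask a child interpreter for its sys.path under the given environment."""
    out = subprocess.check_output(
        [sys.executable, "-c", "import sys; print('\\n'.join(sys.path))"],
        env=env,
    )
    return out.decode().splitlines()


if __name__ == "__main__":
    env = env_with_cwd_on_pythonpath()
    # The cwd shows up in the child's import path, so local modules resolve
    print(os.getcwd() in child_sys_path(env))
```

If a job passes locally but fails with an ImportError on EMR, a module that was only reachable through the working directory is a likely cause.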

LocalMRJobRunner.__init__(**kwargs)

Arguments to this constructor may also appear in mrjob.conf under runners/local.

LocalMRJobRunner’s constructor takes the same keyword args as MRJobRunner. However, please note:

  • cmdenv is combined with combine_local_envs()
  • python_bin defaults to sys.executable (the current python interpreter)
  • hadoop_extra_args, hadoop_input_format, hadoop_output_format, hadoop_streaming_jar, and partitioner are ignored because they require Java. If you need to test these, consider starting up a standalone Hadoop instance and running your job with -r hadoop.
