If you are familiar with Python and the Linux environment, you should have no difficulty installing Mrs and running it with the example wordcount program in three minutes or less.
If you encounter any problems not addressed in this guide, please post a question to the Mrs Mailing List.
Mrs is supported on Linux and Mac OS. It is not currently supported on Windows, although this may change in the future.
The only mandatory prerequisite for Mrs is Python, which is already installed on most Linux and Mac OS systems. Mrs does not rely on any libraries outside the Python Standard Library. However, the following optional utilities can be very helpful when running Mrs in parallel:
Many users may find it easiest to use pip or easy_install to install Mrs.
If you have pip, you can use it to install Mrs automatically:
$ pip install mrs-mapreduce --user
If you want pip and don’t have it check out their installation page.
If you don’t have pip, you can use easy_install to install Mrs automatically:
$ easy_install mrs-mapreduce
If you don’t want to use an automatic tool, you can install Mrs manually.
You may either download a tar file or clone with Git. To use the tar file, download and unpack it from the Mrs Downloads page. Alternatively, you may use Git to clone the repository:
$ git clone https://code.google.com/p/mrs-mapreduce/ && cd mrs-mapreduce
The mrs module must be on the Python path to be usable. You may either install Mrs system-wide or use it in place without installing.
If you choose to install Mrs system-wide, run the following command as root:
$ sudo python setup.py install
If you choose not to install Mrs system-wide with setup.py, then you must add the mrs-mapreduce directory to your Python path to ensure that Python can find the mrs module. Set and export the PYTHONPATH environment variable in your .bashrc file (or the equivalent for other shells). It’s best to make sure you understand how the .bashrc file works, but you might get away with running the following command echo export PYTHONPATH=$PYTHONPATH`pwd`>>~/.bashrc and then . ~/.bashrc to reload your bash config.
Verify correct installation by running the example WordCount program, passing it an input text file and an output directory name. For example, if you wanted to run WordCount on this tutorial, you would run
$ python examples/wordcount.py ./docs/installation.txt tutorial_outdir
If all went well you should have a file called source_something_something.mtxt in your output directory containing a list of all the words in the text file with their respective counts, sorted alphabetically.
You may find the following resources helpful: