These are some general guidelines for programming with jug.
Almost any named function can be a task.
The trade-off is between tasks that are too small (you have too many of them, and the overhead of jug will overwhelm your process) and tasks that are too big (you then have too few tasks per processor).
As a rule of thumb, each task should take at least a few seconds, but you should have enough tasks that your processors are not idle.
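The rule of thumb can be turned into a quick back-of-the-envelope calculation. The sketch below is not part of jug: suggest_items_per_task is a hypothetical helper, and the 2-second floor is an assumed reading of "a few seconds". It picks how many work items to bundle into one task so that each task runs long enough, without leaving fewer tasks than processors.

```python
import math


def suggest_items_per_task(n_items, seconds_per_item, n_processors,
                           min_task_seconds=2.0):
    """Hypothetical helper (not a jug function).

    Returns how many items to bundle into a single task.
    """
    # Smallest bundle that makes one task last at least min_task_seconds
    smallest = max(1, math.ceil(min_task_seconds / seconds_per_item))
    # Largest bundle that still leaves at least one task per processor
    largest = max(1, n_items // n_processors)
    # If the two goals conflict, keeping the processors busy wins
    return min(smallest, largest)


# 1000 items of about 0.5 s each, on 8 processors
print(suggest_items_per_task(1000, 0.5, 8))
```

When there are very few items, the helper falls back to one item per task, since idle processors cost more than jug's per-task overhead in that situation.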
Certain mechanisms in jug, for example jug.mapreduce.map and jug.mapreduce.mapreduce, allow the user to tweak the task breakup with a couple of parameters.
In map, for example, jug does not, by default, issue a task for each element in the sequence. Instead, it issues one task for every 4 elements. This assumes that individual elements do not take long to process, so grouping them gives a better trade-off between throughput and latency. You might quibble with the default, but the principle is sound, and it is only a default: the setting is there to give you more control.
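The grouping idea can be illustrated in plain Python. This is a sketch of the principle only, not jug's implementation: chunks, run_group, and grouped_map are hypothetical names, and the step parameter is a stand-in for jug's own setting. Each group corresponds to what would be a single task.

```python
def chunks(sequence, step=4):
    """Split a sequence into consecutive groups of at most `step` elements."""
    items = list(sequence)
    return [items[i:i + step] for i in range(0, len(items), step)]


def run_group(mapper, group):
    # One unit of work: apply the mapper to every element of the group
    return [mapper(x) for x in group]


def grouped_map(mapper, sequence, step=4):
    results = []
    for group in chunks(sequence, step):
        # In a task-based system, each call here would be one task
        results.extend(run_group(mapper, group))
    return results


print(len(chunks(range(10), step=4)))   # number of units of work with grouping
print(len(chunks(range(10), step=1)))   # number of units of work without grouping
```

A larger step means fewer, longer units of work (less overhead, coarser parallelism); a step of 1 gives one unit per element.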
In the module jug.hash, jug attempts to construct a unique identifier, called a hash, for each of your tasks. To do so, it uses the name of the function invoked in the task together with the parameters that the function receives. This makes jug easy to use but has some drawbacks:
The solution is to provide a __jug_hash__ keyword argument to the Task constructor, specifying an object that should be used to build the hash instead of the function arguments. Alternatively, give the problematic arguments a __jug_hash__ method; this method should return a string to feed to the hash function.
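The second option, a __jug_hash__ method on the argument, might look like the sketch below. Dataset and toy_hash are hypothetical: toy_hash is a simplified stand-in written for this example and is not jug's hashing code, and encoding the returned string to bytes before hashing is an assumption of the sketch. The point is only that the object supplies a short identifying string, so its (possibly large) contents never need to be hashed.

```python
import hashlib


class Dataset:
    """Hypothetical argument type whose contents are expensive to hash."""

    def __init__(self, path, values):
        self.path = path
        self.values = values  # potentially very large

    def __jug_hash__(self):
        # Return a short string that identifies this object
        return 'Dataset:' + self.path


def toy_hash(func_name, args):
    # Simplified stand-in for a task-hashing routine (NOT jug's implementation)
    h = hashlib.sha1()
    h.update(func_name.encode('utf-8'))
    for arg in args:
        if hasattr(arg, '__jug_hash__'):
            # Use the object's own identifier instead of its contents
            h.update(arg.__jug_hash__().encode('utf-8'))
        else:
            h.update(repr(arg).encode('utf-8'))
    return h.hexdigest()


a = Dataset('measurements.txt', [1, 2, 3])
b = Dataset('measurements.txt', [1, 2, 3])
print(toy_hash('process', [a]) == toy_hash('process', [b]))
```

Whatever string the method returns must change whenever the object changes in a way that affects the result; otherwise two different inputs would be treated as the same task.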