Lesson 1: Multi-threading programming¶

Scenario¶

Suppose you want to write a code to find files larger than 100MB in multiple location.

'''
Multi-thread disk scanner for large files

.. codeauthor:: Juti Noppornpitak <juti_n@yahoo.co.jp>
'''

from math import pow
from yotsuba.core import base, fs

threshold = pow(1024,2) * 100 # 100 MB
inputs = []
entries = []
locations = ["/Users/jnopporn/Downloads", "/Users/jnopporn/Documents", "/Applications"]

def worker(i):
    size = fs.size(i)
    if size > threshold:
        entries.append((i, fs.size(i)))

print "Threshold: %16s" % base.convert_to_readable_size(threshold)

for location in locations:
    all_entries = fs.browse(location, True)
    inputs.extend(all_entries['directories'])
    inputs.extend(all_entries['files'])

for i in inputs:
    worker(i)

for e in entries:
    print "%16s\t%s" % (base.convert_to_readable_size(e[1]), e[0])

Although this is very easy, if you are looking in a large number of locations, multiple threads might speed up the code.

Solution¶

First of all, let’s import yotsuba.core.mt.

Then, replace the for loop with the following:

mtfw = mt.MultiThreadingFramework()
mtfw.run(
    inputs,
    worker
)

In the end, you will have something like below.

'''
Multi-thread disk scanner for large files

.. codeauthor:: Juti Noppornpitak <juti_n@yahoo.co.jp>
'''

from math import pow
from yotsuba.core import base, fs, mt

threshold = pow(1024,2) * 100 # 100 MB
inputs = []
entries = []
locations = ["/Users/jnopporn/Downloads", "/Users/jnopporn/Documents", "/Applications"]

def worker(i):
    size = fs.size(i)
    if size > threshold:
        entries.append((i, fs.size(i)))

print "Threshold: %16s" % base.convert_to_readable_size(threshold)

for location in locations:
    all_entries = fs.browse(location, True)
    inputs.extend(all_entries['directories'])
    inputs.extend(all_entries['files'])

mtfw = mt.MultiThreadingFramework()
mtfw.run(inputs, worker, tuple())

for e in entries:
    print "%16s\t%s" % (base.convert_to_readable_size(e[1]), e[0])

What is the difference?¶

When you use for loop, each call to worker will pause the caller thread until worker finishes execution. In the meanwhile, yotsuba.core.mt.MultiThreadingFramework scans multiple location simutaneously in multiple threads.

Note

After execution, yotsuba.core.mt.MultiThreadingFramework will kill all child threads created during the process.

Next, Lesson 2: XML Parsing

Lesson 1: Multi-threading programming¶

Scenario¶

Solution¶

What is the difference?¶

Table Of Contents

Previous topic

Next topic

This Page

Navigation

Lesson 1: Multi-threading programming¶

Scenario¶

Solution¶

What is the difference?¶

Table Of Contents

Previous topic

Next topic

This Page

Quick search

Navigation