Package pyworker :: Module httpworker :: Class HTTPWorker

Class HTTPWorker

     object --+    
              |    
pyworker.Worker --+
                  |
                 HTTPWorker

Gets a local copy of the resource at the URL in the JSON msg ('url') and simply
prints the first "line".

Maintains an HTTP cache, so that multiple HTTPWorkers performing different tasks
on the same resource do not each download it again, unless the ETag header has
changed.

If the ETag header has changed, the updated resource is downloaded and passed
to endtask instead.
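
The caching rule above can be illustrated with a minimal sketch. This is not
pyworker's actual cache: ETagCache, FakeServer and the fetch(url, etag)
signature are hypothetical names invented for the illustration, and the fake
server stands in for a real HTTP request that would send an If-None-Match
header.

```python
class ETagCache(object):
    """Hypothetical in-memory cache keyed by URL (not pyworker's own code)."""

    def __init__(self, fetch):
        # fetch(url, etag) -> (status, etag, body). It is assumed to send
        # "If-None-Match: <etag>" whenever etag is not None.
        self._fetch = fetch
        self._entries = {}  # url -> (etag, body)

    def get(self, url):
        cached = self._entries.get(url)
        known_etag = cached[0] if cached else None
        status, etag, body = self._fetch(url, known_etag)
        if status == 304 and cached:
            # The server confirmed the cached copy is still current.
            return cached[1]
        # New or changed resource: store it and return the fresh body.
        self._entries[url] = (etag, body)
        return body


class FakeServer(object):
    """Stand-in for a real HTTP server so the sketch runs offline."""

    def __init__(self):
        self.etag = "v1"
        self.body = b"first\n"
        self.downloads = 0

    def fetch(self, url, etag):
        if etag == self.etag:
            return 304, self.etag, None  # not modified, no body sent
        self.downloads += 1
        return 200, self.etag, self.body


server = FakeServer()
cache = ETagCache(server.fetch)
cache.get("http://example.org/doc")           # first request: downloaded
cache.get("http://example.org/doc")           # same ETag: served from cache
server.etag, server.body = "v2", b"second\n"  # the resource changes
cache.get("http://example.org/doc")           # new ETag: downloaded again
```

Two workers sharing such a cache would trigger only one download per ETag
value, which is the behaviour the description above asks for.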

It is expected that self.endtask will be overridden.

If the tempfile option is set, remember to delete the temporary file
as well as ack the msg! For example:
------------------------------------------------------------
import os

class SolrFeeder(HTTPWorker):
    def endtask(self, msg, response):
        try:
            # do stuff with response.context['fd'], the file handle for the resource
            pass  # placeholder so that the try block is valid Python
        finally:
            response.context['fd'].close()
            if self.context.get('tempfile', False):
                os.remove(response.context['tempfile'])
            # ack the msg
            self.queue_stdin.task_done()

s = SolrFeeder(queue_stdin, queue_stdout=None, tempfile=True)
------------------------------------------------------------    
If 'id' is passed in the message instead of 'url', it is inserted into a template,
set by instantiating this worker with the parameter 'http_template'. Normal Python
string formatting applies (template % id).

Requires configuration parameters:
    http_template = template for the URL to GET
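
As a rough illustration of the 'url' / 'id' rule, the sketch below resolves
the URL for a parsed message. resolve_url is a hypothetical helper written for
this example, not a method of HTTPWorker, and the example.org template is made
up.

```python
import json


def resolve_url(msg, http_template=None):
    """Return the URL to GET for a parsed JSON message (hypothetical helper)."""
    if "url" in msg:
        return msg["url"]
    if "id" in msg and http_template is not None:
        # Plain %-formatting, as described above: template % id
        return http_template % msg["id"]
    raise ValueError("message needs 'url', or 'id' plus an http_template")


msg = json.loads('{"id": "item-42"}')
url = resolve_url(msg, http_template="http://example.org/objects/%s")
```

A message that carries 'url' is returned unchanged, so the template only
matters for messages that carry 'id'.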
    



Instance Methods
 
_get_tempfile(self)
 
_get_ramfile(self)
 
httpsetup(self)
 
starttask(self, msg)
GETs the URL supplied and passes the temp/ramfile to endtask.
 
endtask(self, msg, response)
Demo method to be overridden.

Inherited from pyworker.Worker: __init__, parse_json_msg, run

Inherited from object: __delattr__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__, __str__

Properties

Inherited from object: __class__

Method Details

starttask(self, msg)

GETs the URL supplied and passes the temp/ramfile to endtask.
Overrides: pyworker.Worker.starttask

endtask(self, msg, response)

Demo method to be overridden. It simply reads the first 100 characters from response.context['fd'] (the file handle) and deletes the file.
Overrides: pyworker.Worker.endtask
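
The behaviour described for the demo endtask can be approximated by the
standalone sketch below. It is an assumption-based illustration, not the
module's source: preview_and_cleanup is a hypothetical function, and the
temporary file is created here only so that the example runs on its own.

```python
import os
import tempfile


def preview_and_cleanup(fd, tempfile_path=None, limit=100):
    """Read up to `limit` characters, then close and remove the file."""
    try:
        return fd.read(limit)
    finally:
        fd.close()
        if tempfile_path is not None:
            # Only on-disk temporary files need removing; a ramfile does not.
            os.remove(tempfile_path)


# Build a throwaway file to stand in for a downloaded resource.
handle, path = tempfile.mkstemp()
with os.fdopen(handle, "w") as out:
    out.write("x" * 250)

preview = preview_and_cleanup(open(path), tempfile_path=path)
```

A real override would also ack the message once the cleanup has run.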