Package concurrent_tree_crawler :: Package html_multipage_navigator :: Module sample_page_analyzer :: Class ArticlePageAnalyzer

Class ArticlePageAnalyzer

A class that downloads article pages.

Instance Methods
 
__init__(self, dst_dir_path)
 
process(self, tree_path, page_file)
Process the node (normally, this method is called once for every node).
 
__download_page(self, page_file, dst_file)

Inherited from abstract_page_analyzer.AbstractPageAnalyzer: get_links

Method Details

process(self, tree_path, page_file)

Process the node (normally, this method is called once for every node).

Parameters:
  • tree_path - path to the tree node the navigator is currently in, i.e. the sequence of node names from the tree root to the current node. For example, ["root"] is the path to the root node, and ["root", "magazine-2011-09-18", "article_23"] is the path to a node deeper in the tree hierarchy.
  • page_file - file-like object to be processed
Overrides: abstract_page_analyzer.AbstractPageAnalyzer.process
(inherited documentation)
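
The sketch below illustrates how the two parameters of process might be used together. It is not the library's implementation: the class SketchArticleSaver, the rule of skipping the root node, and the file-naming scheme are all assumptions made for illustration. Only the constructor argument dst_dir_path and the process(self, tree_path, page_file) signature come from the documentation above.

```python
import io
import os
import tempfile


class SketchArticleSaver:
    """Hypothetical stand-in, NOT the real ArticlePageAnalyzer."""

    def __init__(self, dst_dir_path):
        self._dst_dir_path = dst_dir_path

    def process(self, tree_path, page_file):
        # Assumption: the root node is not saved, only nodes below it.
        if len(tree_path) < 2:
            return None
        # Assumption: the file name is built from the node names after the root.
        file_name = "-".join(tree_path[1:]) + ".html"
        dst_path = os.path.join(self._dst_dir_path, file_name)
        # page_file is file-like, so read() is enough to copy its content.
        with open(dst_path, "wb") as dst_file:
            dst_file.write(page_file.read())
        return dst_path


def demo():
    # Uses in-memory files in place of downloaded pages.
    with tempfile.TemporaryDirectory() as dst_dir:
        saver = SketchArticleSaver(dst_dir)
        root_result = saver.process(["root"], io.BytesIO(b"<html>index</html>"))
        path = saver.process(
            ["root", "magazine-2011-09-18", "article_23"],
            io.BytesIO(b"<html>article</html>"),
        )
        with open(path, "rb") as saved:
            content = saved.read()
        return root_result, os.path.basename(path), content
```

Calling demo() runs process once for the root path and once for a nested path. Because page_file only needs a read() method, an io.BytesIO object can stand in for a real page during testing.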