Package concurrent_tree_crawler :: Package html_multipage_navigator :: Module sample_page_analyzer :: Class MagazinePageAnalyzer

Class MagazinePageAnalyzer

A class that parses magazine-level pages.

Instance Methods
get_links(self, page_file, child_links_retrieved_so_far)
Returns: PageLinks - information about links on the given page.

Inherited from abstract_page_analyzer.AbstractPageAnalyzer: process

Static Methods
__convert_date(text)

Method Details

get_links(self, page_file, child_links_retrieved_so_far)

Parameters:
  • page_file - file-like object to be analyzed
  • child_links_retrieved_so_far - number of child links retrieved so far in the current node (from previous pages)
Returns: PageLinks
Information about links on the given page. The default implementation in AbstractPageAnalyzer, from which this documentation is inherited, is written for a leaf node (a page with no children).
Overrides: abstract_page_analyzer.AbstractPageAnalyzer.get_links
(inherited documentation)
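
The get_links contract described above can be illustrated with a minimal sketch. It is not the library's implementation: PageLinksSketch and MagazinePageAnalyzerSketch are hypothetical stand-ins, and the page layout is assumed (child pages are plain <a> links, and the link to the next page carries class="next"). The sketch also assumes that child_links_retrieved_so_far is used to continue numbering children across pages of the same node.

```python
import io
from dataclasses import dataclass, field
from html.parser import HTMLParser
from typing import List, Optional, Tuple


@dataclass
class PageLinksSketch:
    """Hypothetical stand-in for the PageLinks return type."""
    # (index, href) pairs for child pages found on this page.
    children: List[Tuple[int, str]] = field(default_factory=list)
    # Link to the next page of the same node, or None on the last page.
    next_page_link: Optional[str] = None


class _AnchorCollector(HTMLParser):
    """Collects (href, class) pairs from every <a> tag that has an href."""

    def __init__(self):
        super().__init__()
        self.anchors = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            attr_map = dict(attrs)
            href = attr_map.get("href")
            if href is not None:
                self.anchors.append((href, attr_map.get("class")))


class MagazinePageAnalyzerSketch:
    """Hypothetical analyzer following the documented get_links signature."""

    def get_links(self, page_file, child_links_retrieved_so_far):
        collector = _AnchorCollector()
        collector.feed(page_file.read())

        child_hrefs = []
        next_page = None
        for href, css_class in collector.anchors:
            # Assumed convention: the "next page" link is marked class="next".
            if css_class == "next":
                next_page = href
            else:
                child_hrefs.append(href)

        # Continue numbering from the children seen on previous pages.
        children = list(enumerate(child_hrefs, start=child_links_retrieved_so_far))
        return PageLinksSketch(children=children, next_page_link=next_page)
```

Passing an io.StringIO that wraps the page's HTML as page_file, together with the number of children already collected from earlier pages, returns a PageLinksSketch whose child indices pick up where the previous page left off.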