Package tipy :: Module minr :: Class CorpusMiner

Class CorpusMiner


object --+        
         |        
     Miner --+    
             |    
     TextMiner --+
                 |
                CorpusMiner

The miner for a text corpus.

This miner is essentially a wrapper around minr.TextMiner. It implements the mine() method, which loops over every file of the corpus and calls minr.TextMiner.update_db on each one to perform the actual mining.
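The loop described above can be sketched roughly as follows. This is not the tipy source: the classes `SketchTextMiner` and `SketchCorpusMiner`, the `corpus_files` list, and the way the progress value is computed are all assumptions made for illustration. Only the names `mine`, `update_db`, and the optional `callback` come from this page.

```python
class SketchTextMiner(object):
    """Hypothetical stand-in for minr.TextMiner (not the real class)."""

    def __init__(self):
        self.mined = []  # records the paths passed to update_db

    def update_db(self, textPath):
        # The real method mines the file and updates the database;
        # this stand-in only records the path.
        self.mined.append(textPath)


class SketchCorpusMiner(SketchTextMiner):
    """Hypothetical wrapper whose mine() loops over the corpus files."""

    def __init__(self, corpus_files, callback=None):
        super(SketchCorpusMiner, self).__init__()
        self.corpus_files = list(corpus_files)  # assumed: a list of file paths
        self.callback = callback

    def mine(self):
        total = len(self.corpus_files)
        for index, path in enumerate(self.corpus_files, start=1):
            self.update_db(path)
            if self.callback is not None:
                # Assumed convention: percentage of files processed so far.
                self.callback(100.0 * index / total)


progress = []
miner = SketchCorpusMiner(["a.txt", "b.txt"], callback=progress.append)
miner.mine()
```

The real constructor takes `config` and `minerName` rather than a file list; how the corpus files are obtained from the configuration is not shown on this page.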


Nested Classes
    Inherited from Miner
  __metaclass__
Metaclass for defining Abstract Base Classes (ABCs).
Instance Methods
 
__init__(self, config, minerName, callback=None)
Constructor of the CorpusMiner class.
 
mine(self)
Perform the mining operation.

Inherited from object: __delattr__, __format__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__, __sizeof__, __str__, __subclasshook__

    Inherited from TextMiner
 
add_to_db(self, ngramMap, n, append=False)
Add n-grams of an n-gram dictionary to the database.
 
crt_new_db(self, textPath)
Mine a text file.
crt_ngram_map(self, textPath, n)
Create an n-gram dictionary from a file. Returns a dict.
 
update_db(self, textPath)
Mine a text file, updating the database.
    Inherited from Miner
 
rm_db(self)
Remove the database file (calls os.system).
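
The inherited crt_ngram_map method is described only as creating an n-gram dictionary from a file. As a rough illustration of what such a dictionary might look like, the sketch below counts n-grams in a list of tokens. The function name `sketch_ngram_map`, the use of token tuples as keys, and counts as values are assumptions; the actual key and value layout used by tipy is not documented on this page, and the real method reads its input from `textPath` rather than taking a token list.

```python
from collections import defaultdict


def sketch_ngram_map(tokens, n):
    """Count every run of n consecutive tokens (hypothetical helper)."""
    counts = defaultdict(int)
    # Slide a window of size n over the token list.
    for i in range(len(tokens) - n + 1):
        counts[tuple(tokens[i:i + n])] += 1
    return dict(counts)


bigrams = sketch_ngram_map("the cat sat on the cat".split(), 2)
```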
Class Variables
  __abstractmethods__ = frozenset([])
    Inherited from Miner
  _abc_cache = <_weakrefset.WeakSet object at 0x7f2a42131ad0>
  _abc_negative_cache = <_weakrefset.WeakSet object at 0x7f2a421...
  _abc_negative_cache_version = 44
  _abc_registry = <_weakrefset.WeakSet object at 0x7f2a42131a90>
Properties

Inherited from object: __class__

Method Details

__init__(self, config, minerName, callback=None)
(Constructor)


Constructor of the CorpusMiner class.

Parameters:
  • config (drvr.Configuration) - The configuration file. It is used to retrieve the miner parameters.
  • minerName (str) - The name of the miner.
  • callback (fun(float, ...)) - Optional callback used to report the progress percentage. In the GUI, a callback method is implemented to update a progress bar showing the n-gram insertion progress (cf. py).
Overrides: object.__init__
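
A callback matching the fun(float, ...) shape of the callback parameter might look like the sketch below. The names `render_bar` and `progress_callback` are hypothetical, and treating the float as a value between 0 and 100 is an assumption, since this page only says it is a progress percentage.

```python
def render_bar(percentage, width=20):
    """Return a text progress bar for a percentage assumed to be in 0..100."""
    percentage = max(0.0, min(100.0, float(percentage)))
    filled = int(width * percentage / 100.0)
    return "[" + "#" * filled + "-" * (width - filled) + "]"


def progress_callback(percentage, *extra):
    # Extra positional arguments are accepted and ignored, to match the
    # documented fun(float, ...) shape.
    print(render_bar(percentage))


# Assumed usage, following the documented constructor signature:
#   miner = CorpusMiner(config, minerName, callback=progress_callback)
```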

mine(self)


Perform the mining operation.

Overrides: Miner.mine