.. caspo documentation master file, created by
   sphinx-quickstart on Fri Mar 1 14:07:14 2013.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

Welcome to caspo's documentation!
=================================

**caspo** combines PyASP_ and CellNOpt_ (through cellnopt.wrapper_) to provide easy-to-use software for learning Boolean logic models that describe the immediate-early response of protein signaling networks.
Given a Prior Knowledge Network (PKN) describing causal interactions (SIF_) and a phospho-proteomics dataset (MIDAS_), **caspo** searches for optimal Boolean logic models derived from the PKN, such that the fitness between model predictions and experimental observations is maximized.

For more information, please visit `caspo's website`_.

.. _`caspo's website`: http://caspo.genouest.org
.. _PyASP: http://pypi.python.org/pypi/pyasp
.. _CellNOpt: http://www.ebi.ac.uk/saezrodriguez/cno/
.. _cellnopt.wrapper: http://pypi.python.org/pypi/cellnopt.wrapper
.. _SIF: http://wiki.cytoscape.org/Cytoscape_User_Manual/Network_Formats
.. _MIDAS: http://www.ebi.ac.uk/saezrodriguez/cno/doc/cnodocs/midas.html

Command line usage
------------------

The typical way to use **caspo** is to run the *caspo.py* script, which is available in your PATH after installation::

    $ caspo.py pkn.sif midas.csv

To list all available options, ask for help as follows::

    $ caspo.py --help
    usage: caspo.py [-h] [--version] [--fit F] [--size S] [--discrete D] [--gtts]
                    [--cross N K] [--out O]
                    pkn midas

    positional arguments:
      pkn           Prior knowledge network in SIF format
      midas         Experimental dataset in MIDAS file

    optional arguments:
      -h, --help    show this help message and exit
      --version     show program's version number and exit
      --fit F       suboptimal enumeration tolerance (Default to 0)
      --size S      suboptimal size enumeration tolerance (Default to 0).
                    Combined with --fit could lead to a huge number of models
      --discrete D  discretization over [0,D] (Default to 100)
      --gtts        compute Global Truth Tables (Default to False). This could
                    take some time for many models.
      --cross N K   compute N random K-fold cross validation
      --out O       output directory path (Default to current directory)

For example, to compute all logic models and their Global Truth Tables (input-output behaviors) within a 2% tolerance of the best fit, run::

    $ caspo.py pkn.sif midas.csv --fit 0.02 --gtts

    Reading input files... done.
    Learning Boolean logic models and their Global Truth Tables with ASP... done in 6.63 sec.
    192 Boolean logic models and 4 Global Truth Tables have been learned.
    Wrote ./models.csv
    Wrote ./frequencies.csv
    Wrote ./exclusive.csv
    Wrote ./inclusive.csv
    Wrote ./gtt-[0, 1, 2, 3].csv
    Wrote ./gtts_stats.csv

The output files are:

- models.csv: Matrix representation of the logic models.
- frequencies.csv: Frequencies of logic conjunctions among the family of models.
- exclusive.csv: Mutually exclusive pairs of conjunctions.
- inclusive.csv: Mutually inclusive pairs of conjunctions.
- gtt-i.csv: Matrix representation of the complete input-output behavior of the i-th GTT.
- gtts_stats.csv: Basic GTT statistics.

You can validate the learning process with N random K-fold cross-validations by simply running::

    $ caspo.py pkn.sif midas.csv --cross 1 10

    Reading input files... done.
    Learning Boolean logic models with ASP... done in 0.41 sec.
    16 Boolean logic models have been learned.
    Wrote ./models.csv
    Wrote ./frequencies.csv
    Wrote ./exclusive.csv
    Wrote ./inclusive.csv
    Running 1 random 10-fold cross validation...
    Wrote ./cross_validation_1.csv
    done.

For each cross-validation round i, you will get:

- cross_validation_i.csv: GTTs, MSE and number of models for each fold in the cross-validation.

API usage
---------

To facilitate the integration of **caspo** with other software tools, all of its functionality is accessible through a comprehensive API.
Here we show a very simple example; see :ref:`caspo-api` for the full documentation::

    # some imports
    from __caspo__ import Network, Dataset, Learner, cno

    midas_file = 'path-to-your-midas.csv'
    sif_file = 'path-to-your-sif.sif'

    # compress the PKN using CellNOpt
    compressed_sif = cno.compress(sif_file, midas_file)

    # load the dataset and the compressed network
    dataset = Dataset.from_midas(midas_file)
    network = Network(compressed_sif)

    # create a Learner object for our network and dataset
    learner = Learner(network, dataset)

    # learn logic models and their GTTs satisfying:
    # - 2% fitness tolerance
    # - 0 size tolerance
    # - 100-valued discretization
    family = learner.learn(0.02, 0, 100, True)

    # print the number of models and the family's weighted MSE
    print len(family)
    print family.weighted_mse(dataset)

    # print the MSE and the number of models gathered for each GTT in the family
    for gtt in family.gtts:
        print gtt.mse(dataset), len(gtt)

Modules API
===========

.. toctree::
   :maxdepth: 2

   src

Indices and tables
==================

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`