Welcome to caspo’s documentation!

caspo combines PyASP and CellNOpt (through cellnopt.wrapper) to provide easy-to-use software for learning Boolean logic models that describe the immediate-early response of protein signaling networks.

Given a Prior Knowledge Network (PKN) describing causal interactions (SIF format) and a phospho-proteomics dataset (MIDAS format), caspo searches for optimal Boolean logic models derived from the PKN, such that the fit between model predictions and experimental observations is maximized. For more information, please visit caspo’s website.

Command line usage

Typically, you use caspo by running the caspo.py script, which is available on your PATH after installation:

$ caspo.py pkn.sif midas.csv

For more options you can ask for help as follows:

$ caspo.py --help
usage: caspo.py [-h] [--version] [--fit F] [--size S] [--discrete D] [--gtts]
                [--cross N K] [--out O]
                pkn midas

positional arguments:
  pkn           Prior knowledge network in SIF format
  midas         Experimental dataset in MIDAS file

optional arguments:
  -h, --help    show this help message and exit
  --version     show program's version number and exit
  --fit F       suboptimal enumeration tolerance (Default to 0)
  --size S      suboptimal size enumeration tolerance (Default to 0). Combined
                with --fit could lead to a huge number of models
  --discrete D  discretization over [0,D] (Default to 100)
  --gtts        compute Global Truth Tables (Default to False). This could
                take some time for many models.
  --cross N K   compute N random K-fold cross validation
  --out O       output directory path (Default to current directory)
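
If you drive caspo from another script, the options listed above can be assembled programmatically. The following is a minimal sketch, not part of caspo itself: build_caspo_command and run_caspo are hypothetical helper names, and only the flags shown in the help output are used.

```python
import subprocess


def build_caspo_command(pkn, midas, fit=None, size=None, discrete=None,
                        gtts=False, cross=None, out=None):
    """Return the argument list for a caspo.py run (hypothetical helper).

    Options left at None are omitted, so caspo.py applies its own defaults.
    """
    cmd = ["caspo.py", pkn, midas]
    if fit is not None:
        cmd += ["--fit", str(fit)]
    if size is not None:
        cmd += ["--size", str(size)]
    if discrete is not None:
        cmd += ["--discrete", str(discrete)]
    if gtts:
        cmd.append("--gtts")
    if cross is not None:
        n, k = cross  # N repetitions of K-fold cross-validation
        cmd += ["--cross", str(n), str(k)]
    if out is not None:
        cmd += ["--out", out]
    return cmd


def run_caspo(*args, **kwargs):
    """Run caspo.py and return its exit code (caspo.py must be on PATH)."""
    return subprocess.call(build_caspo_command(*args, **kwargs))
```

For instance, build_caspo_command("pkn.sif", "midas.csv", fit=0.02, gtts=True) yields the argument list for a run with a 2% fit tolerance and Global Truth Tables enabled.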

For example, to compute all logic models and their Global Truth Tables (input-output behaviors) within a 2% tolerance over the best fit, run:

$ caspo.py pkn.sif midas.csv --fit 0.02 --gtts

Reading input files... done.

Learning Boolean logic models and their Global Truth Tables with ASP... done in 6.63 sec.
192 Boolean logic models and 4 Global Truth Tables have been learned.

Wrote ./models.csv
Wrote ./frequencies.csv
Wrote ./exclusive.csv
Wrote ./inclusive.csv
Wrote ./gtt-[0, 1, 2, 3].csv
Wrote ./gtts_stats.csv

Output files are:

- models.csv: Matrix representation of the logic models
- frequencies.csv: Frequency of each logic conjunction among the family of models
- exclusive.csv: Mutually exclusive pairs of conjunctions
- inclusive.csv: Mutually inclusive pairs of conjunctions
- gtt-i.csv: Matrix representation of the complete input-output behaviors
- gtts_stats.csv: Basic GTT statistics
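
To make the idea of conjunction frequencies concrete, here is a small sketch that computes them from a binary model matrix. The layout is an assumption, not taken from caspo: one row per model, one column per conjunction, 1 when the conjunction is present and 0 otherwise. The column names and sample values are invented for illustration, so check the actual header of models.csv before adapting it.

```python
import csv
import io


def conjunction_frequencies(csv_text):
    """Return {column name: fraction of rows with a 1 in that column}.

    Assumes a header row followed by rows of 0/1 values (assumed layout).
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    if not rows:
        return {}
    total = float(len(rows))
    return {
        name: sum(int(row[name]) for row in rows) / total
        for name in reader.fieldnames
    }


# Hypothetical data: four models over three conjunctions.
SAMPLE = "conj_1,conj_2,conj_3\n1,0,0\n1,1,0\n0,0,1\n1,0,0\n"

if __name__ == "__main__":
    for name, freq in sorted(conjunction_frequencies(SAMPLE).items()):
        print(name, freq)
```

A conjunction with frequency 1.0 appears in every model of the family, while one with frequency 0.0 appears in none.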

You can validate the learning process with N random K-fold cross-validation rounds by running:

$ caspo.py pkn.sif midas.csv --cross 1 10

Reading input files... done.

Learning Boolean logic models with ASP... done in 0.41 sec.
16 Boolean logic models have been learned.

Wrote ./models.csv
Wrote ./frequencies.csv
Wrote ./exclusive.csv
Wrote ./inclusive.csv

Running 1 random 10-fold cross validation...
Wrote ./cross_validation_1.csv
done.

For each cross-validation round i, you will get:

- cross_validation_i.csv: GTTs, MSE and number of models for each fold in the cross-validation
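
As a reminder of what a random K-fold split does, the sketch below shuffles item indices and partitions them into K folds, each used once as the held-out test set. It illustrates the general technique only and is not caspo's implementation; the function names are hypothetical, and it assumes the items being split are the experimental conditions of the dataset.

```python
import random


def random_k_folds(n_items, k, seed=None):
    """Shuffle indices 0..n_items-1 and split them into k near-equal folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    # Striding by k keeps fold sizes within one item of each other.
    return [indices[i::k] for i in range(k)]


def train_test_splits(n_items, k, seed=None):
    """Yield one (train, test) pair of index lists per fold."""
    folds = random_k_folds(n_items, k, seed)
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


if __name__ == "__main__":
    # One round of 10-fold cross-validation over 20 hypothetical items.
    for train, test in train_test_splits(20, 10, seed=0):
        print(len(train), len(test))
```

Running N rounds amounts to repeating the split N times with different random seeds.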

API usage

To facilitate the integration of caspo with other software tools, you can access all of its functionality through a comprehensive API. Here we show a very simple example; see the __caspo__ Package section for the full documentation:

#some imports
from __caspo__ import Network, Dataset, Learner, cno

midas_file = 'path-to-your-midas.csv'
sif_file = 'path-to-your-sif.sif'

#compress PKN using CellNOpt
compressed_sif = cno.compress(sif_file, midas_file)

#load dataset and compressed network
dataset = Dataset.from_midas(midas_file)
network = Network(compressed_sif)

#create a Learner object for our network and dataset
learner = Learner(network, dataset)

#learn logic models and their GTTs satisfying:
#  - 2% fitness tolerance
#  - 0 size tolerance
#  - 100-valued discretization
family = learner.learn(0.02, 0, 100, True)

#print the number of models and the family's weighted MSE
print len(family)
print family.weighted_mse(dataset)

#print MSE and models gathered for each GTT in the family
for gtt in family.gtts:
    print gtt.mse(dataset), len(gtt)
