Project Summary

Description: This is a very simple protocol that demonstrates Python/R integration. It plots R's cars dataset together with a linear regression computed with R's lm, and also draws a boxplot of the data. The [F] marks in the protocol are flags indicating that the flagged node (drawn as a rectangle in the graphical representation) writes a file to disk. Such nodes must always return the name of the file they create (as a str), or the names of all the files if they produce more than one (as a list of str).
Number of nodes: 4
Number of [F]-nodes: 2
Total number of output files: 3
Total size of output files: 29.09K
Total CPU time required: 00:00:0.30
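
The return convention for [F]-nodes and the file totals above can be illustrated with a small sketch. The helper names normalize_outputs and total_size_k are hypothetical and not part of the protocol tool, and treating "K" as 1024 bytes is an assumption.

```python
import os


def normalize_outputs(result):
    """Return the file names produced by an [F]-node as a list of str.

    Hypothetical helper: accepts either a single name or a sequence of names.
    """
    if isinstance(result, str):
        return [result]
    names = list(result)
    if not all(isinstance(name, str) for name in names):
        raise TypeError("an [F]-node must return a str or a list of str")
    return names


def total_size_k(file_names):
    """Sum the on-disk sizes of the files, assuming 1K = 1024 bytes."""
    return sum(os.path.getsize(name) for name in file_names) / 1024.0
```

With this sketch, normalize_outputs('carsdata.csv') gives ['carsdata.csv'], and a sequence of names is returned as a list in the same order.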

Protocol map

Nodes details

plots

Plots the dataset with a regression line and a boxplot using R.
Output files: car_regress.pdf, car_hist.pdf.

Last build was: Sun Jan 8 15:51:57 2012.
Required CPU time: 00:00:0.15.
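def plots(getData_o, regression_o):
    """Plots the dataset with a regression line and a boxplot using R."""
    # `r` is assumed to be the rpy interface object (e.g. `from rpy import r`),
    # which maps Python names such as dev_off to R's dev.off.
    fname1 = 'car_regress.pdf'
    r.pdf(fname1)
    r.plot(getData_o, ylab='dist', xlab='speed')
    # Draw the fitted line from its intercept and slope.
    r.abline(regression_o['(Intercept)'], regression_o['y'], col='red')
    r.dev_off()

    fname2 = 'car_hist.pdf'
    r.pdf(fname2)
    # The columns of cars are ordered (speed, dist), so label them in that order.
    r.boxplot(getData_o, names=['speed', 'dist'])
    r.dev_off()

    return fname1, fname2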

exportCSV

Exports the cars data to a CSV file.
Output file: carsdata.csv

Last build was: Sun Jan 8 15:51:57 2012.
Required CPU time: 00:00:0.03.
def exportCSV(data):
    """Exports the cars data to a CSV file."""
    fname = 'carsdata.csv'
    # The with-block closes the file even if a write fails.
    with open(fname, 'w') as f:
        for point in data:
            f.write(str(point[0]) + ', ' +
                    str(point[1]) + '\n')
    return fname
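
As a possible alternative to building each line by hand, the same export could be sketched with Python's standard csv module. The function name export_csv and its fname parameter are illustrative only. Note that csv.writer separates fields with "," (no trailing space) and ends rows with "\r\n" by default, so the file is not byte-identical to the one written above.

```python
import csv


def export_csv(data, fname='carsdata.csv'):
    """Write each (speed, dist) pair as one CSV row and return the file name."""
    # newline='' lets the csv module control line endings itself.
    with open(fname, 'w', newline='') as f:
        writer = csv.writer(f)
        for point in data:
            writer.writerow([point[0], point[1]])
    return fname
```

The csv module also takes care of quoting, which matters little for purely numeric data but keeps the sketch safe if text columns are added later.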

getData

Calls R to get and normalize the cars dataset.

Last build was: Sun Jan 8 15:51:56 2012.
Required CPU time: 00:00:0.07.
def getData():
    """Calls R to get and normalize the cars dataset."""
    return r("scale(cars)")
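
For readers unfamiliar with R's scale, the following is a minimal pure-Python sketch of what it does with its default arguments: each column is centered on its mean and divided by its sample standard deviation (n - 1 in the denominator). The function name scale_columns is hypothetical, and the sketch assumes at least two rows and no constant columns.

```python
import math


def scale_columns(rows):
    """Center each column on its mean and divide by its sample standard deviation."""
    n = len(rows)
    columns = list(zip(*rows))
    means = [sum(col) / n for col in columns]
    # Sample standard deviation, using n - 1 as R's sd does.
    sds = [
        math.sqrt(sum((v - m) ** 2 for v in col) / (n - 1))
        for col, m in zip(columns, means)
    ]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]
```

After this transformation every column has mean 0 and standard deviation 1, which is why the two variables end up on comparable axes in the plots.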

regression

Calls R's lm to fit a linear regression to its input and returns the coefficients.

Last build was: Sun Jan 8 15:51:57 2012.
Required CPU time: 00:00:0.05.
def regression(data):
    """Calls R's lm to fit a linear regression to its input and returns the coefficients."""
    # Column 0 is the response x and column 1 the predictor y, so the slope
    # is returned under the key 'y'.
    reg = r.lm(r('x ~ y'),
               data=r.data_frame(x=data[:, 0], y=data[:, 1])
               )['coefficients']

    return reg
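
With a single predictor, the least-squares coefficients that lm computes have a simple closed form. The sketch below, with the hypothetical name fit_line, shows that calculation in plain Python and returns the intercept and slope of response ~ predictor. It is not how R implements lm internally.

```python
def fit_line(response, predictor):
    """Return (intercept, slope) of the least-squares line response ~ predictor."""
    n = len(response)
    mean_r = sum(response) / n
    mean_p = sum(predictor) / n
    # Slope = covariance(predictor, response) / variance(predictor).
    sxy = sum((p - mean_p) * (v - mean_r) for p, v in zip(predictor, response))
    sxx = sum((p - mean_p) ** 2 for p in predictor)
    slope = sxy / sxx
    intercept = mean_r - slope * mean_p
    return intercept, slope
```

If both columns have been standardized first (as scale does), the intercept is 0 and the slope equals the correlation coefficient whichever variable is taken as the response, so the fitted line is the same in either direction.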