EuroSciPy 2013

Brussels, Belgium - August 21-24 2013

Parallel computation in Systems Biology with Python

Johann Rohwer

Abstract

Computer modelling has become an integral tool in the analysis and understanding of the reaction networks that underlie cellular processes, which has led to the development and rapid growth of the field of computational systems biology. The need to adapt modelling software to our specific needs prompted the development of PySCeS [1], the Python Simulator for Cellular Systems, which makes extensive use of the highly successful IPython, NumPy, SciPy and Matplotlib stack.

In this talk I shall illustrate the application of PySCeS and other Python-based tools in parallel computational tasks to solve large problems in computational systems biology. Three use cases will be discussed.

  1. Performing large multi-dimensional parameter scans with PySCeS using its built-in parallel functionality, which makes use of the multiprocessing module for single multi-core machines or IPython's ipcluster framework on multi-machine clusters.

  2. Showing how multiple-parameter rate characteristics can be used to investigate regulatory patterns in a metabolic pathway and how regulatory metabolites can be computationally identified from a generalised supply-demand analysis [2], again using the built-in parallel functionality of PySCeS.

  3. Performing repeated runs of a partial differential equation (PDE)-based model of sugar metabolism in sugarcane, which is simulated with FiPy [3]. The model is subject to parameter sensitivity analysis using FAST [4] and Morris sampling [5], both of which require multiple runs. The computation was performed on an ipcluster with 110 engines, yielding data sets in excess of 5 GB. Problems encountered with the storage, movement and analysis of such large data sets, as well as strategies for their solution, will be discussed.
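The parameter-scan pattern in the first use case can be sketched in plain Python. This is a minimal illustration only, not the PySCeS API: the toy Michaelis-Menten rate function, the fixed substrate concentration `S`, and the helper names `rate`, `make_grid` and `scan` are all hypothetical stand-ins for a real model evaluation. The scan is written against a generic map-like callable so that the serial built-in `map` can be swapped for a `multiprocessing.Pool.map` on a single multi-core machine.

```python
import itertools
from multiprocessing import Pool

# Hypothetical fixed substrate concentration (arbitrary units).
S = 1.0


def rate(point):
    """Toy model evaluation: Michaelis-Menten rate for one (Vmax, Km) pair."""
    vmax, km = point
    return vmax * S / (km + S)


def make_grid(vmax_values, km_values):
    """Build the full two-dimensional grid of parameter combinations."""
    return list(itertools.product(vmax_values, km_values))


def scan(grid, map_func=map):
    """Evaluate the model at every grid point using any map-like callable."""
    return list(zip(grid, map_func(rate, grid)))


if __name__ == "__main__":
    grid = make_grid([1.0, 2.0, 4.0], [0.5, 1.0, 2.0])
    # Distribute the grid points over two worker processes.
    with Pool(processes=2) as pool:
        results = scan(grid, map_func=pool.map)
    for (vmax, km), v in results:
        print(f"Vmax={vmax:4.1f}  Km={km:4.1f}  v={v:.3f}")
```

Because `scan` only assumes a map-like interface, the same function could in principle be handed the map method of an IPython parallel view to spread the work over a multi-machine cluster; that variant is not shown here and would need a running cluster.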

In summary, the open, extensible platform offered by Python and SciPy provides an extremely powerful framework for building dedicated software packages for specialised scientific computing needs, such as those required in computational systems biology. The parallel functionality provided by IPython in the ipcluster framework is relatively easy to use and interfaces readily with your own Python code.

References

  1. Olivier, B. G., Rohwer, J. M. & Hofmeyr, J.-H. S. Modelling cellular systems with PySCeS. Bioinformatics 21, 560–561 (2005).
  2. Rohwer, J. M. & Hofmeyr, J.-H. S. Identifying and characterising regulatory metabolites with generalised supply-demand analysis. J. Theor. Biol. 252, 546–554 (2008).
  3. Guyer, J. E., Wheeler, D. & Warren, J. A. FiPy: Partial differential equations with Python. Comput. Sci. Eng. 11, 6–15 (2009).
  4. Cukier, R. I., Levine, H. B. & Shuler, K. E. Nonlinear sensitivity analysis of multiparameter model systems. J. Comp. Phys. 26, 1–42 (1978).
  5. Morris, M. D. Factorial sampling plans for preliminary computational experiments. Technometrics 33, 161–174 (1991).