Saturday 1:30 p.m.–2 p.m.

Reverse Engineering Animal Vision with Virtual Reality, Genetics and Python

Santiago Villalba

Audience level:


We present Python tools for analyzing large collections of stimulus-response time series. The PyData ecosystem proves capable at every stage of the data processing pipeline, from collection to discriminative analysis and interpretation. We introduce two new libraries: "whatami", for easy tracking of computation provenance, and "pyopy", for seamlessly calling useful Octave/MATLAB libraries.


In Andrew Straw's lab we like to look at flies. By recording how genetically modified Drosophila melanogaster fly and react when exposed to visual stimuli presented in virtual reality, we hope to understand the role of certain neurons in visual information processing and behavior. Each day we collect thousands of trials consisting of stimulus/response time series. In this talk we show how the scientific Python ecosystem enables us to derive insight from the data these experiments produce, proving a solid ally when facing the many challenges that our technically advanced research poses. For us, Python also shines from a social perspective, by helping to bring together hackers and less technical users, and by reducing the burden of interdisciplinary collaboration.

Key pieces of the puzzle are: numpy, scipy, pandas, bcolz, seaborn, scikit-learn and statsmodels.

Talk outline:


  • Introduction to Drosophila neuroscience, our free-flight assays and the genetic tools we use.

  • Particularities of our data: a large number of regular, multivariate stimulus/response time series of differing lengths.

Driving example

Using a common data-processing pipeline - collection, filtering, time series feature generation, subset selection, discriminative analysis, interpretation - we illustrate how we use:

  • bcolz to conveniently and efficiently store and retrieve these data

  • whatami to better deal with computation provenance, aesthetics and naming consistency in reports

  • pyopy (quick demo) to generate features while collaborating with MATLAB-minded scientists

  • scikit-learn to build interpretable models and highlight promising research avenues

  • pandas and seaborn to enable non-programmers to analyze their data in flexible yet approachable ways, leading to beautiful visualisations
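
To make the shape of such a pipeline concrete, here is a minimal sketch that runs on synthetic data only. It is not the lab's actual code: the trial simulator, the genotype labels, the three summary features and the choice of logistic regression are all assumptions made for illustration. It covers the feature generation, discriminative analysis and interpretation steps using numpy, pandas and scikit-learn, and leaves out bcolz, whatami and pyopy.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score


def make_trials(n_trials=200, seed=0):
    """Simulate variable-length stimulus/response trials (synthetic, hypothetical)."""
    rng = np.random.default_rng(seed)
    trials = []
    for trial_id in range(n_trials):
        genotype = trial_id % 2  # 0 = control, 1 = modified (made-up labels)
        length = int(rng.integers(50, 150))  # trials differ in length
        stimulus = np.sin(np.linspace(0, 4 * np.pi, length))
        # Assumption for the demo: modified flies follow the stimulus more strongly.
        gain = 1.0 if genotype == 0 else 2.0
        response = gain * stimulus + rng.normal(0, 0.5, length)
        trials.append((trial_id, genotype, stimulus, response))
    return trials


def featurize(trials):
    """Reduce each variable-length trial to a fixed-length row of features."""
    rows = []
    for trial_id, genotype, stimulus, response in trials:
        rows.append({
            "trial_id": trial_id,
            "genotype": genotype,
            "resp_mean": response.mean(),
            "resp_std": response.std(),
            "stim_resp_corr": np.corrcoef(stimulus, response)[0, 1],
        })
    return pd.DataFrame(rows).set_index("trial_id")


trials = make_trials()
features = featurize(trials)

X = features.drop(columns="genotype")
y = features["genotype"]

# A linear model keeps interpretation simple: one coefficient per feature.
model = LogisticRegression(max_iter=1000)
scores = cross_val_score(model, X, y, cv=5)

model.fit(X, y)
coefs = pd.Series(model.coef_[0], index=X.columns)

print("cross-validated accuracy:", scores.mean())
print(coefs.sort_values())
```

Summarising each trial into a fixed-length feature row is one simple way to handle time series of different lengths; the resulting table can then go to any standard classifier, and the fitted coefficients hint at which features separate the groups.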