Saturday 2 p.m.–2:15 p.m.

Want drugs? Use Python.

Michal Nowotka


We describe how Python streamlines the modelling of drug discovery data and the development of tools for the scientific community. Using examples that include chemistry toolkits, machine-learning applications and web frameworks, we show how Python can glue them together into efficient data science pipelines.


ChEMBL is the largest open access database resource in the fields of computational drug discovery, chemoinformatics and chemical biology. Contrary to the common perception that these fields run on Perl, Python is the language predominantly used in them. In this presentation, we describe how Python serves as a cornerstone inside and outside the ChEMBL group, supporting and streamlining many facets of our work, tools and resources. In particular, we cover the following topics:

  • Data modelling using Object Relational Mapping
  • Chemistry-specific computations using cheminformatics toolkits
  • Biological target prediction using machine learning tools
  • Improving accessibility by creating user-friendly applications
  • Supporting developers with RESTful APIs
  • Creating workflows using Python
  • Facilitating learning by providing collections of tutorials as IPython Notebooks
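
To give a flavour of the first topic, here is a minimal sketch of object-relational mapping using only the Python standard library. It is not ChEMBL's schema or code: the Compound class, the table and column names, and the EXAMPLE identifiers are all hypothetical, and a real project would more likely use an established ORM (for example the Django ORM or SQLAlchemy) than the hand-rolled helpers shown here.

```python
import sqlite3
from dataclasses import dataclass, fields


@dataclass
class Compound:
    # Hypothetical columns, chosen for illustration only.
    compound_id: str
    smiles: str
    mol_weight: float


def create_table(conn):
    """Create the table that Compound objects are mapped to."""
    conn.execute(
        "CREATE TABLE compound ("
        "compound_id TEXT PRIMARY KEY, smiles TEXT, mol_weight REAL)"
    )


def save(conn, compound):
    """Map one Compound object to one row, using its dataclass fields."""
    names = [f.name for f in fields(compound)]
    placeholders = ", ".join("?" for _ in names)
    values = [getattr(compound, name) for name in names]
    conn.execute(
        f"INSERT INTO compound ({', '.join(names)}) VALUES ({placeholders})",
        values,
    )


def find_lighter_than(conn, max_weight):
    """Map matching rows back to Compound objects, lightest first."""
    rows = conn.execute(
        "SELECT compound_id, smiles, mol_weight FROM compound "
        "WHERE mol_weight < ? ORDER BY mol_weight",
        (max_weight,),
    )
    return [Compound(*row) for row in rows]


conn = sqlite3.connect(":memory:")  # throwaway in-memory database
create_table(conn)
save(conn, Compound("EXAMPLE1", "CCO", 46.07))       # ethanol
save(conn, Compound("EXAMPLE2", "c1ccccc1", 78.11))  # benzene
light = find_lighter_than(conn, 50.0)
```

Application code here works with Compound objects and never touches raw rows, which is the property that makes an ORM convenient for modelling a large relational resource.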

We provide specific examples of web applications developed at ChEMBL, ranging from simple single-page apps to complex, multilayered rich Internet applications (RIAs) built on top of RESTful interfaces: