- Audience level: Novice

The `numpy` package takes a central role in
Python data science code. This is mainly
because `numpy` has been designed with
high performance in mind. This tutorial
will provide the most essential concepts
needed to become confident with `numpy` and `ndarray`s.
Some concrete examples of applications
where `numpy` takes a central role will
be presented as well.

It is very hard to be a scientist without knowing how to write code,
and nowadays **Python** is probably the language of choice in many research fields.
This is mainly because the Python ecosystem includes a lot of tools and libraries
for many research tasks: `pandas` for *data analysis*,
`networkx` for *social network analysis*,
`scikit-learn` for *machine learning*, and so on.

Most of these libraries rely on (or are built on top of) `numpy`.
Therefore, `numpy` is a crucial component of the common Python
stack used for numerical analysis and data science.

NumPy code tends to be much cleaner (and faster) than "straight" Python code that tries to accomplish the same task. Moreover, the underlying algorithms have been designed with high performance in mind.
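As a minimal sketch of the "cleaner" claim, the snippet below computes a sum of squares twice: once with an explicit Python loop and once with a single vectorized NumPy expression. The function names and the sample data are illustrative assumptions, not part of the training material.

```python
import numpy as np


def sum_of_squares_python(values):
    """Plain Python: loop over the items and accumulate."""
    total = 0.0
    for v in values:
        total += v * v
    return total


def sum_of_squares_numpy(values):
    """NumPy: one vectorized expression over an ndarray."""
    arr = np.asarray(values, dtype=float)
    return float(np.sum(arr * arr))


# Illustrative sample data (assumption, not from the training).
data = [1.0, 2.0, 3.0, 4.0]

print(sum_of_squares_python(data))  # 30.0
print(sum_of_squares_numpy(data))   # 30.0
```

To compare speed on your own machine, both functions can be timed on a larger input with the standard-library `timeit` module; the actual gap depends on array size and hardware.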

This training will be organised in two parts: the first part is intended to provide most of the essential concepts needed to become confident with NumPy data structures and functions.

In the second part, some examples of data analysis libraries and code will be presented, where NumPy takes a central role.

Here is a list of software used to develop and test the code examples presented during the training:

- Python 3.x (2.x would work as well)
- IPython 2.3+ (with **notebook support**): `pip install ipython[notebook]`
- numpy 1.9+
- scipy 0.14+
- scikit-learn 0.15+
- pandas 0.8+
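
One possible way to check which of the packages listed above are already installed is sketched below. It assumes Python 3.8 or later (for the standard-library `importlib.metadata` module) and that the packages are registered under the distribution names shown; the helper name `installed_versions` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as assumed to be registered with pip.
PACKAGES = ["ipython", "numpy", "scipy", "scikit-learn", "pandas"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(PACKAGES).items():
    print(f"{name}: {ver or 'not installed'}")
```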

The training is meant to be mostly introductory, so it is well suited
for **beginners**. However, a good working knowledge of Python programming
is required.