Advanced Python Arrays - Introducing NumPy
Written by Alex Armstrong   
Sunday, 21 April 2013

The NumPy Array

Arrays in Python work reasonably well, but compared to Matlab or Octave a lot of features are missing. There is an array module that provides something more suited to numerical arrays, but why stop there when NumPy provides a much better array object? Put simply, if you are going to use something other than the basic Python list as an array you might as well install NumPy, which is available for Python 3 (older releases also supported Python 2).

Assuming you have NumPy installed, all you need to do to use it is add 

import numpy as np

to the start of any program.

The main thing that NumPy brings to the environment is the NumPy array.

This is an object, complete with methods, that wraps a fixed-size array whose elements all have the same data type, chosen from a range of available types.

Notice that the NumPy array is a completely separate data type from the Python list, and this means you can have two types of array-like entity within your program. The good news is that it is very easy to convert Python data types that are "array-like" to NumPy arrays.

It is also good that NumPy arrays behave a lot like Python arrays, with two exceptions: the elements of a NumPy array all have the same fixed and very specific data type, and once created you can't change the size of a NumPy array.

A Python array is dynamic and you can append new elements and delete existing ones. A NumPy array is more like an object-oriented version of a traditional C or C++ array. 

You can create NumPy arrays using a large range of data types, from int8, uint8 and bool through float64 to complex128. Check the documentation for what is available. There is also a range of type conversion functions available.
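As a rough illustration of the single element type and the fixed size, the following sketch may help. The variable names are invented for the example, and astype and np.append are just two of the conversion and helper functions NumPy offers.

```python
import numpy as np

# A list mixing ints and a float is converted to one common element type.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)                 # float64

# The element type can also be requested explicitly at creation time.
small = np.array([1, 2, 3], dtype=np.int8)
print(small.dtype)                 # int8

# astype returns a new array of the requested type; small is unchanged.
as_float = small.astype(np.float64)
print(as_float.dtype, small.dtype)

# np.append does not grow the array in place; it builds a new, longer array.
longer = np.append(small, 4)
print(small.shape, longer.shape)   # (3,) (4,)
```

Because every element has the same type and the size never changes, "adding" an element always means creating a new array.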

To create a NumPy array you can use the low level constructor ndarray. You can pass this a range of arguments to control the type of array you create but the simplest is to pass just the shape of the array. For example:

myArray=np.ndarray((3,3))

creates a 3 by 3 array of floats. The array is created in memory and uninitialized. This means that if you try to make use of any of the elements of myArray you will find some random garbage. 
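A minimal sketch of what this means in practice is shown here. The shape and type can be relied on, but the contents cannot until you write to them. The name scratch is invented for the example, and np.empty is mentioned only as the more commonly seen way of asking for the same kind of uninitialized array.

```python
import numpy as np

myArray = np.ndarray((3, 3))
print(myArray.shape)   # (3, 3)
print(myArray.dtype)   # float64

# np.empty also returns an uninitialized array of the given shape.
scratch = np.empty((3, 3))

# Give every element a known value before reading any of them.
myArray.fill(0.0)
scratch.fill(0.0)
print(myArray.sum())   # 0.0
```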

A more usual way of creating an array is to use either np.zeros(shape) or np.ones(shape), which create an array of the shape specified initialized to zeros or ones respectively. Similarly

np.arange(start,end,increment)

will create a one dimensional array initialized to values from start up to, but not including, end, spaced by increment. There is also the linspace function, which will create an array of a specified size with evenly spaced values.

There are lots of other array creation methods including random, identity and so on. 
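The creation functions mentioned above can be tried out with a short sketch like the following. The shapes and ranges are arbitrary choices made for the example.

```python
import numpy as np

zeros = np.zeros((2, 3))          # 2 by 3 array of 0.0
ones = np.ones((2, 3))            # 2 by 3 array of 1.0

# arange stops before the end value, so 10 itself is not included.
steps = np.arange(0, 10, 2)
print(steps)                      # [0 2 4 6 8]

# linspace takes the number of points and includes both end values.
grid = np.linspace(0.0, 1.0, 5)   # values 0.0, 0.25, 0.5, 0.75, 1.0
print(grid.shape)                 # (5,)

# identity gives a square array with ones on the diagonal.
eye = np.identity(3)
print(eye.shape)                  # (3, 3)
```

The difference between arange and linspace is worth noticing: the first is controlled by the step size, the second by the number of elements.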

You can also use the array function to convert a Python array object, or any other array-like object, into a NumPy array. For example:

myArray=np.array([[1,2,3],[4,5,6],[7,8,9]])
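A small sketch of the conversion, assuming a nested list called rows (a name invented for the example), shows that the result is a separate object with its own copy of the data, and that tolist converts back again.

```python
import numpy as np

rows = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
myArray = np.array(rows)

print(myArray.ndim)     # 2
print(myArray.shape)    # (3, 3)

# The array holds its own copy of the data, so the list is not affected.
myArray[0][0] = 100
print(rows[0][0])       # 1

# tolist converts the array back into nested Python lists.
back = myArray.tolist()
print(back[0])          # [100, 2, 3]
```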

NumPy Array Slicing

Now that we can create a NumPy array it's time to find out how to use it.

You can index a NumPy array just like a Python array. So for example after

myArray=np.array([[1,2,3],[4,5,6],[7,8,9]])

you can write

myArray[1][2]

to get the element in row 1, column 2, i.e. 6 (remember NumPy arrays are indexed starting from 0). You can use more complex slicing and it all works exactly as for a Python array.

For example:

myArray[0:2]

is 

array([[1, 2, 3],[4, 5, 6]])

and more to the point our original example of our failed attempt at two dimensional slicing still fails:

myArray[0:2][0:2]

is still 

array([[1, 2, 3],[4, 5, 6]])

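A slice is always a view of the NumPy array, i.e. it isn't a copy, and assigning to a slice changes the array, as is the case with a Python array. Where the two differ is that a slice stored in a variable still refers to the original NumPy array, so changing it changes the array, whereas a slice of a Python list is a new list.

The other real difference is that the array has a fixed size and cannot be extended or reduced.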
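The behavior of a slice as a view can be checked with a short sketch like this one. The names view, part and independent are invented for the example, and copy is used only to show how to get an independent array when you need one.

```python
import numpy as np

myArray = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
myList = [1, 2, 3, 4]

# A NumPy slice is a view: writing to it writes to myArray.
view = myArray[0:2]
view[0] = 0
print(myArray[0])                         # [0 0 0]
print(np.shares_memory(view, myArray))    # True

# A list slice is a new list: writing to it leaves myList alone.
part = myList[0:2]
part[0] = 99
print(myList)                             # [1, 2, 3, 4]

# An explicit copy breaks the link to the original array.
independent = myArray[0:2].copy()
independent[1] = -1
print(myArray[1])                         # [4 5 6]
```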

Multidimensional Slicing

The NumPy array goes well beyond what a standard Python array supports in terms of indexing.

The first big difference is that you can use a tuple as an indexing object.  

The simplest case of this is to use a tuple of integers. For example

myArray=np.array([[1,2,3],[4,5,6],[7,8,9]])
myArray[1,2]

is 6.

If myArray were a simple Python array this would raise a TypeError and you would have to write:

myArray[1][2]

Both index methods work with NumPy arrays.
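The two forms of indexing, and the failure you get with a plain list, can be compared in a sketch such as the following. The names nested and list_failed are invented for the example.

```python
import numpy as np

nested = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
myArray = np.array(nested)

print(myArray[1, 2])      # 6
print(myArray[(1, 2)])    # 6, the comma form is just a tuple
print(myArray[1][2])      # 6

# A plain list rejects a tuple index.
list_failed = False
try:
    nested[1, 2]
except TypeError as err:
    list_failed = True
    print("list indexing failed:", err)
```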

Being able to use a tuple of integers is a simplification of notation but you can go one step further and use a tuple of slicers.

The rule is that each slicer operates on its corresponding dimension. That is, unlike the Python array, where multiple slicers operate on the result of previous slicers, the NumPy array implements things as you might want them to work, i.e. slicing each dimension in turn. For example, if you now try:

myArray[0:2,0:2]

you will discover that it does return the 2x2 sub-matrix in the top left hand corner of the original array, i.e.

array([[1, 2],[4, 5]])

This works with any number of dimensions and each slicer is applied to the corresponding dimension to extract a sub-matrix. You can even use a step size to extract, say, every other row and column.
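A short sketch comparing the tuple of slicers with chained slicers, and adding a step size, might look like this. The variable names are invented for the example.

```python
import numpy as np

myArray = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# One slicer per dimension: rows 0 to 1 and columns 0 to 1.
corner = myArray[0:2, 0:2]
print(corner.tolist())      # [[1, 2], [4, 5]]

# Chained slicers both act on rows, so the columns are untouched.
chained = myArray[0:2][0:2]
print(chained.tolist())     # [[1, 2, 3], [4, 5, 6]]

# A step of 2 in each dimension picks every other row and column.
corners = myArray[::2, ::2]
print(corners.tolist())     # [[1, 3], [7, 9]]
```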

All you have to remember is to specify the slicers as part of a tuple and not as individual index terms. Notice that if you don't specify one slicer for each dimension, the missing slicers are assumed to be :, i.e. the entire dimension. For example

myArray[0:2]

is taken to mean

myArray[0:2,:]

You can also use the ellipsis object to add : slicers if you want to specify slicers from the other end of the dimensions. For example if bigArray has five dimensions:

bigArray[...,0] 

specifies all of the rows, columns and so on, but with the final dimension set to index 0, i.e. it is equivalent to

bigArray[:,:,:,:,0]
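The equivalences described above can be checked with a sketch like the following. The five dimensional bigArray is built here with arange and reshape purely for the example; any five dimensional array would do.

```python
import numpy as np

# A five dimensional array with two elements along each dimension.
bigArray = np.arange(32).reshape(2, 2, 2, 2, 2)

withEllipsis = bigArray[..., 0]
spelledOut = bigArray[:, :, :, :, 0]
print(withEllipsis.shape)                          # (2, 2, 2, 2)
print(np.array_equal(withEllipsis, spelledOut))    # True

# Missing trailing slicers are treated as ":" in the same way.
myArray = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(np.array_equal(myArray[0:2], myArray[0:2, :]))   # True
```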

Also notice that

myArray[0,:]

is 

array([1, 2, 3])

i.e. all of the column entries of row zero as a one dimensional array, but

myArray[0:1,:]

is

array([[1, 2, 3]])

i.e. a two dimensional array consisting of just row zero. In general using an integer i returns an array with one less dimension than using the slicer [i:i+1] which returns the same elements. 
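The effect on the number of dimensions is easiest to see by looking at the shape attribute, as in this sketch. The variable names are invented for the example, and the same rule is shown for a column as well as a row.

```python
import numpy as np

myArray = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# An integer index removes the dimension it is applied to.
row = myArray[0, :]
print(row.shape)        # (3,)

# A one-element slice keeps the dimension, with length 1.
row2d = myArray[0:1, :]
print(row2d.shape)      # (1, 3)

# The same applies to columns.
col = myArray[:, 1]
col2d = myArray[:, 1:2]
print(col.shape, col2d.shape)   # (3,) (3, 1)
print(col.tolist())             # [2, 5, 8]
```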



Last Updated ( Wednesday, 27 February 2019 )