Thursday, April 23, 2015

Get the maximum and minimum values from multiple 2-dimensional NumPy arrays


A classic GIS question you might ask is: "What is the maximum or minimum cell value from a set of multiple overlapping rasters?". When your raster data is in the form of two 2-dimensional NumPy arrays, this is very straightforward as you can just use the numpy.maximum or numpy.minimum functions to get a new array with the minimum and maximum values. However, if you have more than two arrays you need another solution.

To start, lets create the arrays shown in the picture at the top of this post:

import numpy as np
# Create some arrays to stack
a = np.array([[0,1,2],[3,4,5],[6,7,8],[4,6,7]])
b = np.array([[5,4,8],[6,3,9],[1,5,9],[0,0,0]])
c = np.array([[5,1,9],[5,9,7],[3,1,5],[3,9,1]])

The next step is to use the numpy.dstack function to stack the arrays depth wise - which in a GIS context is to say on top of each other:

# Stack the arrays
abc = np.dstack((a,b,c))

It is at this point that things get a little more complicated. What we can do is use the numpy.argmax or numpy.argmin functions to pull out not the minimum and maximum values themselves, but rather an index to the location of the minimum and maximum values within the array stack. The axis argument is set to 2 so that the index is for minimum and maximum values going depth wise into the stack (the first two axes would relate to looking at the stack in terms of top to bottom or left to right):

# Get the index values for the minimum and maximum values
maxIndex = np.argmax(abc, axis=2)
minIndex = np.argmin(abc, axis=2)

Having got the index values for the minimum and maximum values in terms of their depth into the array stack, we just need to create two other arrays that can act as the matching row and column index positions:

# Create column and row position arrays
nRow, nCol = np.shape(a)
col, row = np.meshgrid(range(nCol), range(nRow))

And then its a simple process of indexing out the maximum and minimum values within the array stack to create new arrays containing the minimum and maximum values as shown in the picture at the top of this post:

# Index out the maximum and minimum values from the stacked array based on row 
# and column position and the maximum value
maxValue = abc[row, col, maxIndex]
minValue = abc[row, col, minIndex]

Putting the whole process together we get:

import numpy as np

# Create some arrays to stack
a = np.array([[0,1,2],[3,4,5],[6,7,8],[4,6,7]])
b = np.array([[5,4,8],[6,3,9],[1,5,9],[0,0,0]])
c = np.array([[5,1,9],[5,9,7],[3,1,5],[3,9,1]])
# Stack the arrays
abc = np.dstack((a,b,c))

# Get the index values for the minimum and maximum values
maxIndex = np.argmax(abc, axis=2)
minIndex = np.argmin(abc, axis=2)

# Create column and row position arrays
nRow, nCol = np.shape(a)
col, row = np.meshgrid(range(nCol), range(nRow))

# Index out the maximum and minimum values from the stacked array based on row 
# and column position and the maximum value
maxValue = abc[row, col, maxIndex]
minValue = abc[row, col, minIndex]

Saturday, April 18, 2015

ArcGIS integrates with the Python SciPy software stack


I was excited to read in the latest ArcUser Magazine that the Python SciPy stack will be integrated from version 10.3 of ArcGIS Pro and version 10.3.1 of ArcGIS Desktop. I think this is great news for making the power of Python for scientific computing more accessible to users of ArcGIS.

I initially started Python programming with the arcgisscripting module of version 9.2 of ArcGIS, and initially did research using only this approach (Etherington 2011). However, from those early beginnings I quickly learnt that Python has massive potential for scientific analysis and visualisation, and I've been making use of the NumPy, SciPy, and Matplotlib packages for many years. As I discovered more and more about what these packages could do I even discovered that I could take a purely Python approach for some of my research (Etherington et al. 2015).

So I would urge users of ArcGIS who haven’t programmed with these newly available Python packages to seriously consider spending some time learning how to make use of them.

Note to self - must make time to learn how to make better use of Pandas...

References

Etherington TR (2011) Python based GIS tools for landscape genetics: visualising genetic relatedness and measuring landscape connectivity. Methods in Ecology and Evolution 2(1): 52-55.

Etherington TR, Holland EP, O'Sullivan D (2015) NLMpy: a Python software package for the creation of neutral landscape models within a general numerical framework. Methods in Ecology and Evolution 6(2): 164-168.