Creating arrays in python

| categories: python | tags:

Often, we will have a set of 1-D arrays, and we would like to construct a 2D array with those vectors as either the rows or columns of the array. This may happen because we have data from different sources we want to combine, or because we organize the code with variables that are easy to read, and then want to combine the variables. Here are examples of doing that to get the vectors as the columns.

import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

print np.column_stack([a, b])

# this means stack the arrays vertically, e.g. on top of each other
print np.vstack([a, b]).T
[[1 4]
 [2 5]
 [3 6]]
[[1 4]
 [2 5]
 [3 6]]

Or rows:

import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

print np.row_stack([a, b])

# this means stack the arrays vertically, e.g. on top of each other
print np.vstack([a, b])
[[1 2 3]
 [4 5 6]]
[[1 2 3]
 [4 5 6]]

The opposite operation is to extract the rows or columns of a 2D array into smaller arrays. We might want to do that to extract a row or column from a calculation for further analysis, or plotting for example. There are splitting functions in numpy. They are somewhat confusing, so we examine some examples. The numpy.hsplit command splits an array “horizontally”. The best way to think about it is that the “splits” move horizontally across the array. In other words, you draw a vertical split, move over horizontally, draw another vertical split, etc… You must specify the number of splits that you want, and the array must be evenly divisible by the number of splits.

import numpy as np

A = np.array([[1, 2, 3, 5], 
              [4, 5, 6, 9]])

# split into two parts
p1, p2 = np.hsplit(A, 2)
print p1
print p2

#split into 4 parts
p1, p2, p3, p4 = np.hsplit(A, 4)
print p1
print p2
print p3
print p4
[[1 2]
 [4 5]]
[[3 5]
 [6 9]]
[[1]
 [4]]
[[2]
 [5]]
[[3]
 [6]]
[[5]
 [9]]

In the numpy.vsplit command the “splits” go “vertically” down the array. Note that the split commands return 2D arrays.

import numpy as np

A = np.array([[1, 2, 3, 5], 
              [4, 5, 6, 9]])

# split into two parts
p1, p2 = np.vsplit(A, 2)
print p1
print p2
print p2.shape
[[1 2 3 5]]
[[4 5 6 9]]
(1, 4)

An alternative approach is array unpacking. In this example, we unpack the array into two variables. The array unpacks by row.

import numpy as np

A = np.array([[1, 2, 3, 5], 
              [4, 5, 6, 9]])

# split into two parts
p1, p2 = A
print p1
print p2
[1 2 3 5]
[4 5 6 9]

To get the columns, just transpose the array.

import numpy as np

A = np.array([[1, 2, 3, 5], 
              [4, 5, 6, 9]])

# split into two parts
p1, p2, p3, p4 = A.T
print p1
print p2
print p3
print p4
print p4.shape
[1 4]
[2 5]
[3 6]
[5 9]
(2,)

Note that now, we have 1D arrays.

You can also access rows and columns by indexing. We index an array by [row, column]. To get a row, we specify the row number, and all the columns in that row like this [row, :]. Similarly, to get a column, we specify that we want all rows in that column like this: [:, column]. This approach is useful when you only want a few columns or rows.

import numpy as np

A = np.array([[1, 2, 3, 5], 
              [4, 5, 6, 9]])

# get row 1
print A[1]
print A[1, :]  # row 1, all columns

print A[:, 2]  # get third column 
print A[:, 2].shape
[4 5 6 9]
[4 5 6 9]
[3 6]
(2,)

Note that even when we specify a column, it is returned as a 1D array.

Copyright (C) 2013 by John Kitchin. See the License for information about copying.

org-mode source

Discuss on Twitter