Numpy Data Functionality
This lesson focuses on some of the most important techniques available in numpy. It covers functions for data creation, data transformation, and data combination.
Data Creation: Ones and Zeros
Very often you will need a large array of 0s or 1s. This may be for some sort of mathematical procedure, or you may simply want placeholders for real numbers you will set later. Either way, by passing either a number for the length or a tuple for the shape, you can set up these types of arrays in numpy.
A few examples are below.
import numpy as np
#Create an array of 10 0s
print(np.zeros(10))
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
#Create an array of 10 1s
print(np.ones(10))
[1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
#Create a 3x3 array of 1s
print(np.ones((3,3)))
[[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]
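To illustrate the placeholder use case mentioned above, here is a minimal sketch. The array names (counts, sevens) are hypothetical, and np.full is not covered in this lesson; it is included only as a related numpy function for constant values other than 0 or 1.

```python
import numpy as np

# Placeholder grid of zeros; dtype=int is passed explicitly
# because np.zeros produces floats by default
counts = np.zeros((2, 3), dtype=int)

# Set the real values later
counts[0, 1] = 5
counts[1, :] = [7, 8, 9]
print(counts)

# np.full builds a constant array with any fill value
sevens = np.full(4, 7.0)
print(sevens)
```

Note that the default floating point type is why the earlier outputs print as 0. and 1. rather than 0 and 1.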
Numpy Ranges
There are two range features in numpy that we will discuss. These will make your life easier compared to the classic range function in Python.
Linspace
The first is the linspace function, which returns n linearly spaced points between a starting and ending point. The format is to first pass in the starting point, then the ending point, then the number of points you want. For example, if we pass in 0, 1, and then 3, we are going to get back 3 points which span from 0 to 1. Keep in mind that both endpoints, 0 and 1, are included when counting the number of points.
#Create the linspace
print(np.linspace(0,1,3))
[0. 0.5 1. ]
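Because both endpoints are included, the spacing between n points is (end - start) / (n - 1). The sketch below checks this using the retstep and endpoint keyword arguments of np.linspace; the variable names are only illustrative.

```python
import numpy as np

# retstep=True also returns the spacing between consecutive points
points, step = np.linspace(0, 1, 5, retstep=True)
print(points)  # 5 points from 0 to 1, both endpoints included
print(step)    # (1 - 0) / (5 - 1)

# endpoint=False leaves out the ending point, so the spacing
# becomes (end - start) / n instead
half_open = np.linspace(0, 1, 4, endpoint=False)
print(half_open)
```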
One example would be if you are trying to plot a function on the interval between 0 and 1. Let's say you have $x^3$ and want to see with some level of detail what the function looks like. The first step would be to create the x array. We can grab points between 0 and 1, and give the number of points as 101 so that the spacing comes out to .01 per step.
#Create the linspace
x = np.linspace(0, 1, 101)
print(x)
[0. 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.1 0.11 0.12 0.13
0.14 0.15 0.16 0.17 0.18 0.19 0.2 0.21 0.22 0.23 0.24 0.25 0.26 0.27
0.28 0.29 0.3 0.31 0.32 0.33 0.34 0.35 0.36 0.37 0.38 0.39 0.4 0.41
0.42 0.43 0.44 0.45 0.46 0.47 0.48 0.49 0.5 0.51 0.52 0.53 0.54 0.55
0.56 0.57 0.58 0.59 0.6 0.61 0.62 0.63 0.64 0.65 0.66 0.67 0.68 0.69
0.7 0.71 0.72 0.73 0.74 0.75 0.76 0.77 0.78 0.79 0.8 0.81 0.82 0.83
0.84 0.85 0.86 0.87 0.88 0.89 0.9 0.91 0.92 0.93 0.94 0.95 0.96 0.97
0.98 0.99 1. ]
Next, we calculate y by applying our function to every element of x.
#Find the function value
y = x ** 3
print(y)
[0.00000e+00 1.00000e-06 8.00000e-06 2.70000e-05 6.40000e-05 1.25000e-04
2.16000e-04 3.43000e-04 5.12000e-04 7.29000e-04 1.00000e-03 1.33100e-03
1.72800e-03 2.19700e-03 2.74400e-03 3.37500e-03 4.09600e-03 4.91300e-03
5.83200e-03 6.85900e-03 8.00000e-03 9.26100e-03 1.06480e-02 1.21670e-02
1.38240e-02 1.56250e-02 1.75760e-02 1.96830e-02 2.19520e-02 2.43890e-02
2.70000e-02 2.97910e-02 3.27680e-02 3.59370e-02 3.93040e-02 4.28750e-02
4.66560e-02 5.06530e-02 5.48720e-02 5.93190e-02 6.40000e-02 6.89210e-02
7.40880e-02 7.95070e-02 8.51840e-02 9.11250e-02 9.73360e-02 1.03823e-01
1.10592e-01 1.17649e-01 1.25000e-01 1.32651e-01 1.40608e-01 1.48877e-01
1.57464e-01 1.66375e-01 1.75616e-01 1.85193e-01 1.95112e-01 2.05379e-01
2.16000e-01 2.26981e-01 2.38328e-01 2.50047e-01 2.62144e-01 2.74625e-01
2.87496e-01 3.00763e-01 3.14432e-01 3.28509e-01 3.43000e-01 3.57911e-01
3.73248e-01 3.89017e-01 4.05224e-01 4.21875e-01 4.38976e-01 4.56533e-01
4.74552e-01 4.93039e-01 5.12000e-01 5.31441e-01 5.51368e-01 5.71787e-01
5.92704e-01 6.14125e-01 6.36056e-01 6.58503e-01 6.81472e-01 7.04969e-01
7.29000e-01 7.53571e-01 7.78688e-01 8.04357e-01 8.30584e-01 8.57375e-01
8.84736e-01 9.12673e-01 9.41192e-01 9.70299e-01 1.00000e+00]
Finally, we can plot the results.
import matplotlib.pyplot as plt
#Plot the function
plt.plot(x, y)
plt.title("Dummy Function Test")
plt.xlabel("x")
plt.ylabel("y")
plt.show()
Using arange
While linspace gives us the ability to specify a number of points, the function arange instead builds an array from a start, an end, and a given step size. For example, in the previous version, we looked for numbers between 0 and 1 with a step of .01 in between. We could do the following with arange instead. Note that, like a regular range, it does not include the end point, so we use 1.01 as the end instead of 1.
#Find the arange
x = np.arange(0, 1.01, .01)
print(x)
[0. 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.1 0.11 0.12 0.13
0.14 0.15 0.16 0.17 0.18 0.19 0.2 0.21 0.22 0.23 0.24 0.25 0.26 0.27
0.28 0.29 0.3 0.31 0.32 0.33 0.34 0.35 0.36 0.37 0.38 0.39 0.4 0.41
0.42 0.43 0.44 0.45 0.46 0.47 0.48 0.49 0.5 0.51 0.52 0.53 0.54 0.55
0.56 0.57 0.58 0.59 0.6 0.61 0.62 0.63 0.64 0.65 0.66 0.67 0.68 0.69
0.7 0.71 0.72 0.73 0.74 0.75 0.76 0.77 0.78 0.79 0.8 0.81 0.82 0.83
0.84 0.85 0.86 0.87 0.88 0.89 0.9 0.91 0.92 0.93 0.94 0.95 0.96 0.97
0.98 0.99 1. ]
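The difference between the two functions is easiest to see with whole numbers. The sketch below (with illustrative variable names) builds the same five points both ways: arange takes a step and stops before the end, while linspace takes a count and includes the end. With non-integer steps, rounding can make the length of an arange result hard to predict, so linspace is often the safer choice in that case.

```python
import numpy as np

# arange: start, end (excluded), step
evens = np.arange(0, 10, 2)
print(evens)

# linspace: start, end (included), number of points
same_points = np.linspace(0, 8, 5)
print(same_points)

# The values match, although arange with integer arguments returns
# integers and linspace returns floats
print(np.array_equal(evens, same_points))
```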
Stacking
Numpy also has functions for stacking data either horizontally or vertically. For example, look at stacking these arrays with either hstack or vstack. The argument is a list of arrays to stack together.
#Create data
a = np.array([[1, 2],
              [3, 4]])
b = np.array([[5, 6],
              [7, 8]])
#Show vstack and hstack
print("A:")
print(a)
print()
print("B:")
print(b)
print()
print("vstack:")
print(np.vstack([a,b]))
print()
print("hstack:")
print(np.hstack([a,b]))
A:
[[1 2]
 [3 4]]

B:
[[5 6]
 [7 8]]

vstack:
[[1 2]
 [3 4]
 [5 6]
 [7 8]]

hstack:
[[1 2 5 6]
 [3 4 7 8]]
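As a supplementary sketch, the stacking results can be related to np.concatenate, which is not covered in this lesson: for 2-D arrays, vstack joins along axis 0 (rows) and hstack joins along axis 1 (columns). The 1-D arrays u and v are hypothetical examples added to show how the two functions treat flat arrays.

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])
b = np.array([[5, 6],
              [7, 8]])

# For 2-D inputs, vstack and hstack match concatenate on axis 0 and 1
rows = np.concatenate([a, b], axis=0)
cols = np.concatenate([a, b], axis=1)
print(np.array_equal(rows, np.vstack([a, b])))
print(np.array_equal(cols, np.hstack([a, b])))

# For 1-D inputs, vstack makes each array a row of a 2-D result,
# while hstack joins them end to end into one longer 1-D array
u = np.array([1, 2, 3])
v = np.array([4, 5, 6])
stacked_rows = np.vstack([u, v])
joined = np.hstack([u, v])
print(stacked_rows.shape)
print(joined.shape)
```

The arrays being stacked need compatible shapes: vstack requires the same number of columns, and hstack of 2-D arrays requires the same number of rows.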