Plotting
Solution
print(dist.cdf(500)-dist.cdf(300))
print(dist.cdf(600)-dist.cdf(200))
print(dist.cdf(700)-dist.cdf(100))
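These three differences are the probabilities of landing within one, two, and three standard deviations of the mean. The following is a minimal sketch, assuming dist is a normal distribution with mean 400 and standard deviation 100 (the values implied by the z-score conversion later in this section). The standard library's statistics.NormalDist stands in for dist here, and the helper name prob_within is hypothetical.

```python
from statistics import NormalDist

# Assumption: dist has mean 400 and standard deviation 100.
dist = NormalDist(mu=400, sigma=100)

def prob_within(k):
    """Probability of falling within k standard deviations of the mean."""
    lower = dist.mean - k * dist.stdev
    upper = dist.mean + k * dist.stdev
    return dist.cdf(upper) - dist.cdf(lower)

# Same three ranges as above: 300-500, 200-600, 100-700.
for k in (1, 2, 3):
    print(k, round(prob_within(k), 4))
```

Under these assumptions the results come out near 68%, 95%, and 99.7%, which is the familiar rule of thumb for normal distributions.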
Let’s now plot the normal distribution itself. What does the area under the curve that corresponds to CDF(350) look like?
First, set up two pairs of arrays: one for the full curve, and one covering only the points less than or equal to 350.
xVals = list(range(800))
yVals = [dist.pdf(x) for x in xVals]
xVals2 = list(range(351))
yVals2 = [dist.pdf(x) for x in xVals2]
We will use the plt.fill_between function, which shades the area between the curve and the x-axis for us.
plt.plot(xVals,yVals)
plt.xlabel("Value")
plt.ylabel("Density")
plt.fill_between(xVals2, yVals2)
plt.show()
This code draws the classic bell curve of the normal distribution, shaded only over the region that CDF(350) measures.
Let’s also mark where we are on the CDF itself.
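To check that the shaded region really matches the CDF, the area can be approximated numerically from the same integer points used for the plot. This is only a sketch: it assumes dist has mean 400 and standard deviation 100, uses statistics.NormalDist as a stand-in for dist, and the helper name shaded_area is hypothetical.

```python
from statistics import NormalDist

# Assumption: dist has mean 400 and standard deviation 100.
dist = NormalDist(mu=400, sigma=100)

def shaded_area(upper, lower=0):
    """Trapezoid-rule area under dist.pdf over the integers lower..upper."""
    xs = list(range(lower, upper + 1))
    ys = [dist.pdf(x) for x in xs]
    # Each trapezoid has width 1, so its area is the mean of its two heights.
    return sum((ys[i] + ys[i + 1]) / 2 for i in range(len(ys) - 1))

print(shaded_area(350))             # approximate area of the shaded region
print(dist.cdf(350) - dist.cdf(0))  # exact area between 0 and 350
print(dist.cdf(350))                # full CDF, including the tail below 0
```

The first two values should agree closely. The third is slightly larger because the plot starts at 0 and so leaves out the small tail to the left of it.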
xVals = list(range(800))
yVals = [dist.cdf(x) for x in xVals]
plt.plot(xVals,yVals)
plt.xlabel("Value")
plt.ylabel("Cumulative Percentage")
plt.plot(350,dist.cdf(350),"ro")
plt.show()
Finally, let’s repeat this process for the z-scores.
Unsurprisingly, the graphs look the same as before; only the units on the x-axis change. The z-score is just a transformation that allows us to make more general statements about distributions.
xVals = list(range(800))
yVals = [dist.pdf(x) for x in xVals]
xVals2 = list(range(351))
yVals2 = [dist.pdf(x) for x in xVals2]
# Translate x values to z-scores
xVals = [(x-400)/100 for x in xVals]
xVals2 = [(x-400)/100 for x in xVals2]
plt.plot(xVals,yVals)
plt.xlabel("Z Score")
plt.ylabel("Density")
plt.fill_between(xVals2, yVals2)
plt.show()
xVals = list(range(800))
yVals = [dist.cdf(x) for x in xVals]
xVals = [(x-400)/100 for x in xVals]
plt.plot(xVals,yVals)
plt.xlabel("Z Score")
plt.ylabel("Cumulative Percentage")
plt.plot((350-400)/100,dist.cdf(350),"ro")
plt.show()
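The z-score conversion can also be checked numerically: the CDF of the original distribution at a value should equal the CDF of the standard normal at that value's z-score. The sketch below assumes dist has mean 400 and standard deviation 100, uses statistics.NormalDist as a stand-in for dist, and introduces the hypothetical names standard and to_z.

```python
from statistics import NormalDist

# Assumption: dist has mean 400 and standard deviation 100.
dist = NormalDist(mu=400, sigma=100)
standard = NormalDist(mu=0, sigma=1)  # standard normal, in z-score units

def to_z(x):
    """Convert a raw value to its z-score under dist."""
    return (x - dist.mean) / dist.stdev

z = to_z(350)
print(z)

# Cumulative probabilities agree between the two scales.
print(dist.cdf(350), standard.cdf(z))

# Densities differ by a factor of the standard deviation.
print(dist.pdf(350), standard.pdf(z) / dist.stdev)
```

One caveat: the plots above relabel only the x-axis, so the density values are still those of the original distribution. A true standard normal density is taller by a factor of the standard deviation, while the cumulative values are unchanged.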
Challenge