Expected Values
Solution
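# Allocate an array with the same shape as the observed table
expected = np.empty(table.shape)
for index, x in np.ndenumerate(table):
    pA = row[index[0]]  # probability of this cell's row
    pB = col[index[1]]  # probability of this cell's column
    expected[index] = pA*pB*n  # expected count if the variables are independent
print(expected)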
The first line allocates an uninitialized array with the same shape as our table. As we enumerate the table, we look up the row and column probabilities for the current index, then store their product multiplied by n as the expected value at that index in our expected array.
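The loop above can also be written without explicit iteration. The following is a minimal sketch, assuming a made-up 3x2 table of counts (not the data from this lesson) and assuming that row and col hold the marginal probabilities of each row and column; np.outer builds every pairwise product of those probabilities in one step.

```python
import numpy as np

# Hypothetical 3x2 contingency table of observed counts (illustrative only).
table = np.array([[20, 30],
                  [25, 25],
                  [15, 35]])

n = table.sum()                 # total number of observations
row = table.sum(axis=1) / n     # marginal probability of each row category
col = table.sum(axis=0) / n     # marginal probability of each column category

# Under independence, P(row i and column j) = P(row i) * P(column j),
# so the expected count in each cell is that product times n.
expected = np.outer(row, col) * n
print(expected)
```

A useful sanity check is that the expected counts keep the same row and column totals as the observed table.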
Now, to get the chi-squared test statistic, we use the following equation.
Equation
$$\chi^2 = \sum_{i}\sum_{j}\frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$
$$O_{ij} = \text{Observed Value for row i, column j}$$
$$E_{ij} = \text{Expected Value for row i, column j}$$
Once we have the chi-squared statistic, we can compare it against a critical value determined by the significance level and the degrees of freedom. The statistic itself is computed cell by cell and then summed:
(((table-expected)**2)/expected).sum()
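To see where the statistic comes from, it can help to look at each cell's contribution before summing. This is a small sketch on a made-up 3x2 table (the counts are hypothetical, not this lesson's data); the name contributions is introduced here for illustration only.

```python
import numpy as np

# Hypothetical observed counts (illustrative only).
table = np.array([[20, 30],
                  [25, 25],
                  [15, 35]])
n = table.sum()

# Expected counts under independence: (row total * column total) / n.
expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / n

# Per-cell contribution (O - E)^2 / E, then the sum over all cells.
contributions = (table - expected) ** 2 / expected
chi_squared = contributions.sum()
print(contributions)
print(chi_squared)
```

Cells where the observed count is far from the expected count contribute the most, so the contributions array shows which cells drive the result.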
The degrees of freedom = (# of rows - 1)*(# of columns - 1) in the table, so for this problem we have (2)(1) = 2 degrees of freedom. Next we pick a significance level, which determines the critical value. If our chi-squared statistic is greater than the critical value, then we reject the null hypothesis that the variables are independent. Let's find the critical value now in Python.
from scipy.stats import chi2
# Critical value at the 0.05 significance level with 2 degrees of freedom
chi = chi2.isf(q=0.05, df=2)
print(chi)
Our statistic is below this threshold, so we fail to reject the null hypothesis: we do not have enough evidence to say the variables are not independent.
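The whole decision can be gathered into one helper. The sketch below is an illustration, not the lesson's own code: the function name chi_squared_decision and the statistic 4.17 are hypothetical. It uses chi2.isf for the critical value and chi2.sf for the p-value, which is an equivalent way to reach the same decision.

```python
from scipy.stats import chi2

def chi_squared_decision(statistic, n_rows, n_cols, alpha=0.05):
    """Return (df, critical value, p-value, reject?) for a test of independence."""
    df = (n_rows - 1) * (n_cols - 1)
    critical = chi2.isf(alpha, df)      # value exceeded with probability alpha
    p_value = chi2.sf(statistic, df)    # chance of a statistic at least this large
    return df, critical, p_value, statistic > critical

# Hypothetical statistic for a 3x2 table (illustrative only).
df, critical, p_value, reject = chi_squared_decision(4.17, n_rows=3, n_cols=2)
print(df, critical, p_value, reject)
```

Comparing the statistic with the critical value and comparing the p-value with the significance level always lead to the same conclusion, so either form can be reported.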