Expected Values

Solution

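# Allocate an array to hold the expected count for each cell
expected = np.empty(table.shape)
for index, _ in np.ndenumerate(table):
    pA = row[index[0]]  # probability of this cell's row
    pB = col[index[1]]  # probability of this cell's column
    expected[index] = pA * pB * n  # expected count if the variables are independent
print(expected)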

The first line creates an uninitialized array with the same shape as our table. Then, as we enumerate the table, we look up the row and column probabilities for the current index, multiply them together with n, and store the result as the expected value for that index in our expected array.
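The same calculation can also be done without a loop. Below is a minimal sketch, assuming that row and col hold the marginal row and column probabilities and that n is the total number of observations. The table values here are hypothetical and are not the data from this problem.

```python
import numpy as np

# Hypothetical 3x2 table of observed counts (illustration only)
table = np.array([[20, 30],
                  [25, 25],
                  [15, 35]])

n = table.sum()               # total number of observations
row = table.sum(axis=1) / n   # marginal probability of each row
col = table.sum(axis=0) / n   # marginal probability of each column

# The outer product gives pA * pB for every cell; scaling by n gives expected counts
expected = np.outer(row, col) * n
print(expected)
```

A quick sanity check is that the expected counts keep the same row and column totals as the observed table.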

Now, to get the chi-square test statistic, the equation is…

Equation

$$\chi^2 = \sum_{i}\sum_{j} \frac{(O_{ij}-E_{ij})^2}{E_{ij}} $$
$$O_{ij} = \text{Observed Value for row i, column j}$$
$$E_{ij} = \text{Expected Value for row i, column j}$$

With the observed and expected arrays in hand, the chi-squared statistic is a one-line computation. Once we have it, we can find the critical value for our chosen significance level and degrees of freedom.

(((table-expected)**2)/expected).sum()
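If you want to double-check the manual calculation, scipy can compute the same statistic. This is a sketch on a hypothetical table (not the data from this problem). It passes correction=False so that scipy never applies Yates' continuity correction, which it would otherwise use only when there is a single degree of freedom.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical 3x2 table of observed counts (illustration only)
table = np.array([[20, 30],
                  [25, 25],
                  [15, 35]])

# Expected counts under independence: (row total * column total) / n
n = table.sum()
expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / n

# Manual statistic, same expression as above
manual_stat = (((table - expected) ** 2) / expected).sum()

# scipy returns the statistic, p-value, degrees of freedom and expected counts
scipy_stat, p_value, dof, scipy_expected = chi2_contingency(table, correction=False)

print(manual_stat, scipy_stat)
```

The two statistics should agree up to floating point error, and scipy_expected should match our expected array.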

The degrees of freedom = (# of categories of x - 1)*(# of categories of y - 1), so for this problem we have (2)(1) = 2 degrees of freedom. Now we need to pick a significance level (0.05 is a common choice), and we can get back the critical value. If our chi-squared statistic is greater than the critical value, then we reject the null hypothesis that the variables are independent. Let’s find the critical value now in Python.

from scipy.stats import chi2
chi = chi2.isf(q=0.05, df=2)
print(chi)

Our value is below this threshold, so we fail to reject the null hypothesis. In other words, we don’t have enough evidence to say that the variables are dependent.
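Comparing the statistic to the critical value is equivalent to comparing a p-value to the significance level. Here is a rough sketch of both decisions side by side, again using a hypothetical table rather than the data from this problem. The names alpha, dof and reject are introduced here for illustration.

```python
import numpy as np
from scipy.stats import chi2

# Hypothetical 3x2 table of observed counts (illustration only)
table = np.array([[20, 30],
                  [25, 25],
                  [15, 35]])

n = table.sum()
expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / n
stat = (((table - expected) ** 2) / expected).sum()

# Degrees of freedom from the table shape: (rows - 1) * (columns - 1)
dof = (table.shape[0] - 1) * (table.shape[1] - 1)

alpha = 0.05                        # significance level
critical = chi2.isf(alpha, dof)     # critical value, as computed above
p_value = chi2.sf(stat, dof)        # probability of a statistic at least this large

reject = stat > critical            # same decision as p_value < alpha
print(stat, critical, p_value, reject)
```

Reporting the p-value is often more informative than the reject or fail-to-reject decision alone, since it shows how close the statistic came to the threshold.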
