Income Level Plotting Part 2
Solution
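def plotIL(dfPlot):
    # Hexbin joint plot of unemployment against GDP rate for one income level
    # (recent seaborn versions require x and y to be passed as keywords)
    sns.jointplot(x=dfPlot["Unemployment"], y=dfPlot["GDP Rate"], kind="hex")
    plt.title(dfPlot["Income Level"].iloc[0])
    plt.show()

df.groupby("Income Level").apply(plotIL)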
We give plt.title() the argument dfPlot["Income Level"].iloc[0] so that the graph is titled with the first entry in the Income Level column. Every row in a group has the same income level, so the first entry is enough. iloc indexes by integer position, as with arrays, so iloc[0] returns the first element regardless of the row labels.
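To see why iloc[0] is a safe way to pull the group label, here is a minimal sketch on a made-up DataFrame. The data and the helper name first_income_level are hypothetical; only the column names follow the lesson. The index labels deliberately do not start at 0, to show that iloc works by position and not by label.

```python
import pandas as pd

# Hypothetical toy data; the column names mirror the lesson's DataFrame.
toy = pd.DataFrame(
    {
        "Income Level": ["High", "Low", "High", "Low"],
        "Unemployment": [4.1, 9.3, 5.0, 8.7],
    },
    index=[10, 11, 12, 13],  # row labels deliberately do not start at 0
)

def first_income_level(group):
    # iloc[0] is positional: the first row of the group, whatever its label.
    return group["Income Level"].iloc[0]

# Iterate over the groups and collect the title each plot would receive.
titles = {
    name: first_income_level(group)
    for name, group in toy.groupby("Income Level")
}
print(titles)
```

Each group yields its own income level as the title, which is what plotIL relies on.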
Finally, let's apply the same filter we used earlier, which removes anything more than three standard deviations from the mean. We will do this inside the function so that the bounds are computed separately for each income level.
def plotIL(dfPlot):
    # Get the income level
    IL = dfPlot["Income Level"].iloc[0]
    # Drop the income level column: we do not want to filter on it and no longer need it
    dfPlot = dfPlot[["GDP Rate", "Unemployment"]]
    # Get the lower and upper bounds (three standard deviations from the mean)
    stats = dfPlot.describe()
    lower = stats.loc["mean"] - 3 * stats.loc["std"]
    upper = stats.loc["mean"] + 3 * stats.loc["std"]
    # Filter: keep only rows where both columns fall within the bounds
    dfPlot = dfPlot[(dfPlot > lower) & (dfPlot < upper)].dropna()
    # Plot (recent seaborn versions require x and y to be passed as keywords)
    sns.jointplot(x=dfPlot["Unemployment"], y=dfPlot["GDP Rate"], kind="hex")
    # Put plot title
    plt.title(IL)
    plt.show()

df.groupby("Income Level").apply(plotIL)
We store the income level as IL, then drop that column by setting dfPlot equal to only the GDP Rate and Unemployment columns. This might be new to you, but in pandas you can select a subset of columns with df[cols], where cols is a list of the column names you want. The rest of the function is code we have already seen: compute the bounds, filter, and plot.
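The column selection and the three-standard-deviation filter can be tried on their own, without any plotting. The sketch below uses made-up numbers with one planted outlier; the variable names (toy, numeric, mask, filtered) are illustrative and not from the lesson. It uses mean() and std() directly, which give the same values as the "mean" and "std" rows of describe().

```python
import pandas as pd

# Hypothetical toy data for a single income level, with one planted outlier
# (GDP Rate of 100.0 in the last row).
toy = pd.DataFrame(
    {
        "Income Level": ["High"] * 20,
        "GDP Rate": [1.0, 2.0, 3.0] * 6 + [2.0, 100.0],
        "Unemployment": [5.0, 6.0, 7.0] * 6 + [6.0, 6.0],
    }
)

# Select only the numeric columns by passing a list of column names.
numeric = toy[["GDP Rate", "Unemployment"]]

# Per-column bounds: three standard deviations either side of the mean.
lower = numeric.mean() - 3 * numeric.std()
upper = numeric.mean() + 3 * numeric.std()

# Boolean mask with the same shape as the data; comparisons align on column names.
mask = (numeric > lower) & (numeric < upper)

# Indexing with the mask turns out-of-bounds cells into NaN;
# dropna() then removes every row that contains one.
filtered = numeric[mask].dropna()
print(len(numeric), "rows before,", len(filtered), "rows after")
```

Note that a row is dropped if either column is out of bounds, so an extreme GDP Rate also removes that row's Unemployment value.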