Plotting Unemployment vs. GDP Part 2
Solution
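stats = df.describe()  # compute the summary statistics once and reuse them
# Bounds at three standard deviations either side of the mean, per column
lower = stats.loc["mean"] - 3 * stats.loc["std"]
upper = stats.loc["mean"] + 3 * stats.loc["std"]
print(lower)
print(upper)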
Now, if you haven’t seen truth series (also called Boolean masks) yet, here is a quick primer. We can index a pandas DataFrame with a Boolean mask so that we keep only the values that satisfy a condition. The first step is to build an array of True/False values from the condition. Let’s get an array where True means the point is within three standard deviations of the mean and False means it is outside.
(df > lower) & (df < upper)
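We can use this truth array to keep only the rows that satisfy the conditions. Indexing with it replaces every value that fails the condition with NaN, and dropna() then removes any row that contains such a value.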
df = df[(df > lower) & (df < upper)].dropna()
df.describe()
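For readers who want to watch the mask work step by step, here is a small self-contained sketch. The toy DataFrame and its values are made up for illustration and are not the World Bank data used above; only the column names are borrowed from the text.

```python
import pandas as pd

# Made-up data: 19 ordinary rows plus one extreme unemployment value in row 19.
toy = pd.DataFrame({
    "Unemployment": [4.0, 5.0, 6.0, 5.0] * 4 + [4.0, 5.0, 6.0, 100.0],
    "GDP Rate": [1.0, 2.0, 3.0, 2.0] * 5,
})

# Per-column bounds at three standard deviations from the mean.
lower = toy.mean() - 3 * toy.std()
upper = toy.mean() + 3 * toy.std()

# Boolean DataFrame: True where a value lies strictly inside the bounds.
mask = (toy > lower) & (toy < upper)
print(mask.tail(3))

# Indexing with the mask turns out-of-range values into NaN;
# dropna() then removes every row that contains a NaN.
filtered = toy[mask].dropna()
print(len(toy), "rows before,", len(filtered), "rows after")
```

Note that a row is dropped as soon as one of its columns falls outside the bounds, even if the other column looks perfectly ordinary.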
Let’s see if plotting looks better now.
# Recent seaborn releases require x and y to be passed as keyword arguments
sns.jointplot(x=df["Unemployment"], y=df["GDP Rate"], kind="hex")
plt.show()
It looks better now!
Let’s try this analysis with different income levels. Calling wbdata.get_incomelevel() shows the income level options that are available.
wbdata.get_incomelevel()
Let’s look at HIC, MIC and LIC (high-, middle- and low-income countries). We can find the countries in each level like this:
levels = ["HIC","MIC","LIC"]
for level in levels:
    countries = [i["id"] for i in wbdata.get_country(incomelevel=level, display=False)]
    print(countries)
    print()
wbdata.get_country(incomelevel=level, display=False) returns the country dictionaries for the specified income level. As before, we use i["id"] to get each country ID.
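To try the comprehension on its own without calling the API, here is a minimal sketch. The records below are hypothetical stand-ins for what wbdata.get_country returns: only the "id" key comes from the text, and the other field and all values are assumptions for illustration.

```python
# Hypothetical stand-in for the list returned by wbdata.get_country(...).
# Real records carry more fields; these values are illustrative only.
fake_countries = [
    {"id": "AAA", "name": "Country A"},
    {"id": "BBB", "name": "Country B"},
    {"id": "CCC", "name": "Country C"},
]

# Same pattern as in the loop above: pull the "id" out of every dictionary.
country_ids = [c["id"] for c in fake_countries]
print(country_ids)  # ['AAA', 'BBB', 'CCC']
```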
Challenge