Income Level Plotting
Solution
# Our list will hold the dataframe for each income level
dataArray = []
levels = ["HIC", "MIC", "LIC"]
for level in levels:
    # Collect the IDs of the countries in this income level
    countries = [i["id"] for i in wbdata.get_country(incomelevel=level, display=False)]
    # Pull data for the income level countries
    data = wbdata.get_dataframe(indicators, country=countries)
    # Create an Income Level column to label the rows of each smaller dataframe
    data["Income Level"] = level
    dataArray.append(data)
# Put all the dataframes together
df = pd.concat(dataArray)
df.dropna(inplace=True)
df
What we do for this solution is first initialize a list to hold the dataframes for each income level. Then we iterate through the income levels, pulling the data for each one and creating an “Income Level” column that labels its rows. Finally, we put it all together by feeding pd.concat() our list of dataframes and drop any rows with missing values.
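To see the pattern without calling the World Bank API, here is a minimal sketch on made-up data. The names toy_data, frames and combined, and all of the values, are assumptions for illustration only and are not part of the real dataset.

```python
import pandas as pd

# Hypothetical stand-in for the World Bank data: one small frame per income level
toy_data = {
    "HIC": pd.DataFrame({"GDP": [50.0, 60.0]}, index=["A", "B"]),
    "MIC": pd.DataFrame({"GDP": [10.0, 12.0]}, index=["C", "D"]),
    "LIC": pd.DataFrame({"GDP": [1.0, None]}, index=["E", "F"]),
}

frames = []
for level, frame in toy_data.items():
    frame = frame.copy()
    # Label every row of this smaller dataframe with its income level
    frame["Income Level"] = level
    frames.append(frame)

# Stack the labeled frames on top of each other, then drop rows with missing values
combined = pd.concat(frames)
combined = combined.dropna()
print(combined)
```

Because each smaller dataframe carries its own label before the concat, the combined dataframe still knows which income level every row came from.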
With the data in place, we are going to learn some new ways to work with pandas. We could loop through the income levels, find which rows match each income level and then plot them, but there is a better way. Instead, we are going to use a pandas function which breaks our dataframe into groups so that a function can be applied to each group. Run the code below.
df.groupby("Income Level")
You might be wondering, what did I just do? Well, you did nothing. Yet. On its own, groupby() only sets up the groups; nothing is computed until you call a function to be applied to each group. Check out a few of the functions below.
print(df.groupby("Income Level").sum())
print()
print(df.groupby("Income Level").mean())
print()
print(df.groupby("Income Level").max())
Notice how these functions work: each one is applied to every column within every group, so you get one row per income level and one value per column.
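The same behavior is easier to check by hand on a tiny dataframe. The sketch below uses made-up numbers; toy, grouped and means are hypothetical names chosen for this example.

```python
import pandas as pd

# Made-up data: two rows per income level, two numeric columns
toy = pd.DataFrame({
    "Income Level": ["HIC", "HIC", "LIC", "LIC"],
    "GDP": [50.0, 60.0, 1.0, 3.0],
    "Population": [10.0, 20.0, 30.0, 50.0],
})

grouped = toy.groupby("Income Level")
# Nothing has been computed yet: this is a groupby object, not a dataframe
print(type(grouped).__name__)

# Calling a function produces one row per group and one value per column
means = grouped.mean()
print(means)
```

Here the HIC row of means holds the average of 50 and 60 for GDP and the average of 10 and 20 for Population, which you can verify by hand.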
You can also apply your own custom functions. For example, let’s create a function that finds the mean divided by the standard deviation.
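def customFunc(x):
    # numeric_only=True skips text columns such as "Income Level",
    # which some pandas versions pass along with each group
    return x.mean(numeric_only=True) / x.std(numeric_only=True)
print(df.groupby("Income Level").apply(customFunc))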
You use .apply() to apply a function to a dataframe, passing the function you want to apply as the argument. The apply() function works with a regular pandas dataframe as well, not just one that is grouped.
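As a quick illustration of the ungrouped case, here is a small sketch on made-up data. The function name mean_over_std and the dataframe toy are assumptions for this example. On a regular dataframe, apply() by default calls the function once per column.

```python
import pandas as pd

def mean_over_std(x):
    # x is a single column (a pandas Series) when called through DataFrame.apply
    return x.mean() / x.std()

# Made-up numeric data with no grouping column
toy = pd.DataFrame({"GDP": [1.0, 2.0, 3.0], "Population": [2.0, 4.0, 6.0]})

# One result per column
result = toy.apply(mean_over_std)
print(result)
```

The result is a Series indexed by column name, so individual values can be read with result["GDP"] or result["Population"].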
Challenge