KDE Plot
First, as usual, we will get the data. This time we want just two measures: GDP per capita and population.
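import wbdata
import pandas as pd
import datetime
# Written against the older wbdata API; newer releases may use different
# names (for example get_countries and date=), so adjust if these calls fail.
dates = (datetime.datetime(2016, 1, 1), datetime.datetime(2017, 1, 1))  # date range to request
levels = ["HIC", "MIC", "LIC"]  # high, middle, and low income
indicators = {"SP.POP.TOTL": "Population", "NY.GDP.PCAP.CD": "GDP per Capita"}
dataArray = []
for level in levels:
    # Look up the country codes for this income level, then fetch both indicators.
    countries = [i["id"] for i in wbdata.get_country(incomelevel=level, display=False)]
    data = wbdata.get_dataframe(indicators, country=countries, data_date=dates)
    data["Income Level"] = level  # label every row with its income group
    dataArray.append(data)
df = pd.concat(dataArray)
df.dropna(inplace=True)  # drop rows where either measure is missing
df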
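If you want to try the loop-and-concatenate pattern without calling the World Bank API, here is a minimal offline sketch. The frames in fake_results are hypothetical stand-ins with made-up values, not real World Bank data; only the tagging, pd.concat, and dropna steps mirror the code above.

```python
import pandas as pd

# Hypothetical stand-ins for the per-level query results (all values are made up).
fake_results = {
    "HIC": pd.DataFrame(
        {"Population": [5.0e6, float("nan")], "GDP per Capita": [45000.0, 52000.0]},
        index=["Country A", "Country B"],
    ),
    "LIC": pd.DataFrame(
        {"Population": [2.0e7], "GDP per Capita": [800.0]},
        index=["Country C"],
    ),
}

frames = []
for level, data in fake_results.items():
    data = data.copy()
    data["Income Level"] = level  # tag every row with its income group
    frames.append(data)

demo = pd.concat(frames)  # stack the per-level frames into one table
demo = demo.dropna()      # drop rows that are missing a measure
print(demo)
```

Country B is dropped because its population is missing, which is what dropna() does to incomplete rows in the real data too.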
Let’s apply the log transformations again as well.
import numpy as np
df["GDP per Capita"] = df["GDP per Capita"].apply(np.log)
df["Population"] = df["Population"].apply(np.log)
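To see why the log transformation helps, here is a small sketch on synthetic data. The lognormal sample and the skewness helper are assumptions for illustration only; they stand in for heavily right-skewed measures such as population and are not taken from the World Bank data.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic, heavily right-skewed values standing in for population counts.
values = rng.lognormal(mean=15, sigma=1.5, size=1000)
logged = np.log(values)

def skewness(x):
    """Sample skewness: the third standardized moment."""
    centered = x - x.mean()
    return (centered ** 3).mean() / x.std() ** 3

print("skewness before log:", skewness(values))
print("skewness after log: ", skewness(logged))
```

The raw values have a long right tail, while the logged values are roughly symmetric, which makes the density contours much easier to read.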
Bring up the pair plot of these two measures as a reminder.
import seaborn as sns
import matplotlib.pyplot as plt
sns.pairplot(df, hue="Income Level")
plt.show()
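Now, let’s call kdeplot() from seaborn, passing the two columns as the x and y arguments.
# Keyword x= and y= need seaborn 0.11 or later; older versions took the two columns positionally.
sns.kdeplot(x=df["Population"], y=df["GDP per Capita"])
plt.show()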
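To get a feel for what the contours represent, here is a rough sketch of a two-dimensional Gaussian kernel density estimate written by hand with NumPy. It is not seaborn's implementation: seaborn chooses the bandwidth for you, while this sketch assumes a single fixed bandwidth, and the function name gaussian_kde_2d and the synthetic sample are hypothetical.

```python
import numpy as np

def gaussian_kde_2d(x, y, grid_x, grid_y, bandwidth):
    """Evaluate a 2D Gaussian KDE with one shared bandwidth on a grid."""
    gx, gy = np.meshgrid(grid_x, grid_y)  # shape (len(grid_y), len(grid_x))
    # Scaled squared distance from every grid point to every sample point.
    dx = gx[..., None] - x
    dy = gy[..., None] - y
    sq_dist = (dx ** 2 + dy ** 2) / bandwidth ** 2
    # One Gaussian "bump" per sample, then average the bumps.
    kernels = np.exp(-0.5 * sq_dist) / (2 * np.pi * bandwidth ** 2)
    return kernels.mean(axis=-1)

rng = np.random.default_rng(1)
x = rng.normal(16, 1.5, size=200)  # synthetic stand-in for log population
y = rng.normal(9, 1.0, size=200)   # synthetic stand-in for log GDP per capita

grid_x = np.linspace(8, 24, 81)
grid_y = np.linspace(3, 15, 61)
density = gaussian_kde_2d(x, y, grid_x, grid_y, bandwidth=0.5)

# A density should integrate to roughly 1 over a grid that covers the data.
cell_area = (grid_x[1] - grid_x[0]) * (grid_y[1] - grid_y[0])
print("approximate total mass:", density.sum() * cell_area)
```

Each contour line in the kdeplot output joins grid points where a surface like density has the same height.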
These lines look a little ugly, so let’s add some shading. Setting fill=True will do this (the parameter was called shade before seaborn 0.11).
sns.kdeplot(x=df["Population"], y=df["GDP per Capita"], fill=True)
plt.show()
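Another change we might want to make is to get rid of the outermost, lowest-density shading; the plot looks better without it. In older seaborn this was done with shade_lowest=False. Since seaborn 0.11 the thresh parameter controls it, and its default of 0.05 already leaves the lowest-density region unshaded.
sns.kdeplot(x=df["Population"], y=df["GDP per Capita"], fill=True, thresh=0.05)
plt.show()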
This is the full distribution. Instead of plotting everything together, let’s plot each income level on its own.
Challenge