Time Series Plot
Let’s get our initial settings. We are going to look at the difference in HIV rates between high-income, middle-income and low-income countries.
import wbdata
import pandas as pd
import datetime
dates = (datetime.datetime(1990, 1, 1), datetime.datetime(2017, 1, 1))
levels = ["HIC","MIC","LIC"]
indicators = {"SH.HIV.1524.MA.ZS":"HIV Rate"}
Let’s cycle through each income level, fetch its data, and concatenate the results at the end.
dataArray = []
for level in levels:
    countries = [i["id"] for i in wbdata.get_country(incomelevel=level, display=False)]
    data = wbdata.get_dataframe(indicators, country=countries, data_date=dates)
    data["Income Level"] = level
    dataArray.append(data)
df = pd.concat(dataArray)
df
Now reset the index, so that country and date become regular columns, and drop the rows with missing values.
df.reset_index(inplace=True)
df.dropna(inplace=True)
df
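To make the effect of these two calls concrete, here is a minimal sketch on a small made-up frame. It assumes the result is indexed by a (country, date) MultiIndex, which is what the later use of the "country" and "date" columns implies. The country names and values are hypothetical, not World Bank data.

```python
import pandas as pd

# Hypothetical stand-in for one fetched frame: a (country, date) MultiIndex.
index = pd.MultiIndex.from_product(
    [["Country A", "Country B"], ["2016", "2015"]], names=["country", "date"]
)
sample = pd.DataFrame({"HIV Rate": [0.2, None, 0.5, 0.4]}, index=index)
sample["Income Level"] = "LIC"

# reset_index turns the index levels into ordinary columns;
# dropna then removes the row whose rate is missing.
tidy = sample.reset_index().dropna()

print(tidy.columns.tolist())
print(len(tidy))
```

Having country and date as plain columns is what lets the plotting call refer to them by name.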
Seaborn’s tsplot is what we use to create the time series graph. The arguments that matter are: data, the DataFrame to plot; time, the column that holds the dates; unit, the column that identifies the individual entities (in our case, countries); condition, the column used to group the units (in our case, the income level); and value, the column with the values we are trying to plot. Note that tsplot was deprecated in seaborn 0.9 in favor of lineplot and is absent from later releases, so this code needs an older seaborn version. Run the code below.
import seaborn as sns
import matplotlib.pyplot as plt
sns.tsplot(data=df, time="date", unit="country",
           condition="Income Level", value="HIV Rate")
plt.show()
As you can see, a confidence band is drawn around each time series. It shows the uncertainty in each income group's average rate across countries, rather than the full range of individual country values.
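The band comes from resampling the countries at each point in time. Here is a minimal sketch of that idea, a percentile bootstrap of the mean, using only the standard library. The function name bootstrap_ci, the 95% level and the sample values are assumptions for illustration; seaborn's own implementation and default level may differ, so check the ci argument of your version.

```python
import random
import statistics


def bootstrap_ci(values, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap interval for the mean of `values`."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_boot)
    )
    # Cut equal tails off both ends of the sorted bootstrap means.
    tail = (1.0 - level) / 2.0
    low = means[int(tail * n_boot)]
    high = means[min(n_boot - 1, int((1.0 - tail) * n_boot))]
    return low, high


# Hypothetical rates for five countries in a single year (not real data).
rates_one_year = [0.1, 0.3, 0.2, 1.4, 0.6]
low, high = bootstrap_ci(rates_one_year)
print(f"mean={statistics.fmean(rates_one_year):.2f}, band=({low:.2f}, {high:.2f})")
```

Repeating this for every year and joining the lower and upper bounds gives the shaded region. Groups with few countries or widely spread rates get a wider band.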