Data Processing
In [1]:
import pandas as pd
# Read the data
CPI = pd.read_csv("Data/CPI.csv", index_col=0)
# Rename the columns
CPI.columns = ["CPI"]
# Convert the index to a datetime index
CPI.index = pd.to_datetime(CPI.index)
print(CPI)
                CPI
DATE
1947-01-01   21.480
1947-02-01   21.620
1947-03-01   22.000
1947-04-01   22.000
1947-05-01   21.950
...             ...
2019-10-01  257.229
2019-11-01  257.824
2019-12-01  258.444
2020-01-01  258.820
2020-02-01  259.050

[878 rows x 1 columns]
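After converting the index to datetimes, a quick sanity check can confirm that the dates are sorted, unique, and evenly spaced. The cell below is only a minimal sketch: it rebuilds the first five printed rows by hand as a stand-in for Data/CPI.csv so that it runs without the file, and the names sample and freq are illustrative rather than part of the original notebook. For monthly data like this, pd.infer_freq should report a month-start frequency.

```python
import pandas as pd

# Stand-in for Data/CPI.csv: the first five rows printed above, typed in by hand
sample = pd.DataFrame(
    {"CPI": [21.480, 21.620, 22.000, 22.000, 21.950]},
    index=pd.to_datetime(
        ["1947-01-01", "1947-02-01", "1947-03-01", "1947-04-01", "1947-05-01"]
    ),
)
sample.index.name = "DATE"

# The index should be sorted and free of duplicate dates
print(sample.index.is_monotonic_increasing, sample.index.is_unique)

# Ask pandas to guess the sampling frequency from the dates
freq = pd.infer_freq(sample.index)
print(freq)

# Count missing values in each column
print(sample.isna().sum())
```

The same three checks can be run on the full CPI frame by replacing sample with CPI.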
Plotting the series gives a quick sense of the data we are working with.
In [2]:
import matplotlib.pyplot as plt
# Plot
CPI.plot(kind="line")
# Add labels
plt.xlabel("Date")
plt.ylabel("Index Value")
plt.title("CPI Time Series")
plt.show()
We also have an index of real estate prices, which is sampled quarterly rather than monthly like the CPI.
In [3]:
# Read the data in
real_estate = pd.read_csv("Data/Real Estate.csv", index_col=0)
# Set the index to be datetime
real_estate.index = pd.to_datetime(real_estate.index)
# Rename the columns
real_estate.columns = ["Real Estate"]
print(real_estate)
            Real Estate
DATE
1975-01-01        59.77
1975-04-01        60.97
1975-07-01        61.18
1975-10-01        62.22
1976-01-01        62.90
...                 ...
2018-10-01       429.86
2019-01-01       434.58
2019-04-01       442.71
2019-07-01       447.87
2019-10-01       451.65

[180 rows x 1 columns]
When we plot the real estate index, we can see the drop during the financial crisis.
In [4]:
# Plot the data
real_estate.plot(kind="line")
# Add labels
plt.xlabel("Date")
plt.ylabel("Index Value")
plt.title("Real Estate Time Series")
plt.show()
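The printed indexes show that the CPI has one row per month while the real estate index has one row per quarter, so the two frames do not line up row for row. If they need to be compared directly, one possible approach (an assumption here, not a step taken in the notebook so far) is an inner join on the date index, which keeps only the dates present in both. The sketch below uses just the last five printed rows of each series as stand-ins for the full frames; cpi_tail, real_estate_tail and combined are illustrative names.

```python
import pandas as pd

# Stand-ins built from the last rows printed above, so the sketch runs
# without the CSV files. In the notebook, CPI and real_estate already exist.
cpi_tail = pd.DataFrame(
    {"CPI": [257.229, 257.824, 258.444, 258.820, 259.050]},
    index=pd.to_datetime(
        ["2019-10-01", "2019-11-01", "2019-12-01", "2020-01-01", "2020-02-01"]
    ),
)
real_estate_tail = pd.DataFrame(
    {"Real Estate": [429.86, 434.58, 442.71, 447.87, 451.65]},
    index=pd.to_datetime(
        ["2018-10-01", "2019-01-01", "2019-04-01", "2019-07-01", "2019-10-01"]
    ),
)

# An inner join keeps only the dates that appear in both indexes
combined = cpi_tail.join(real_estate_tail, how="inner")
print(combined)
```

With the full frames, the equivalent call would be CPI.join(real_estate, how="inner"), which drops the CPI months that have no matching real estate observation.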