Regression Intro
We are going to use the statsmodels library for our regressions. A formula is written as “Dependent Variable ~ Variable1 + Variable2 + Variable3”, with the dependent variable to the left of the tilde and the explanatory variables to the right, separated by plus signs. These variables are columns of our dataframe, which we pass in with “data=df”. So, to create a model, we would do this:
import pandas as pd
import statsmodels.formula.api as sm
# Load the data; the first column of the file becomes the dataframe's index
df = pd.read_csv("StockData.csv", encoding="UTF-8", index_col=0, parse_dates=True)
model = sm.ols(formula="PXD ~ Index + Oil + Gold + NaturalGas", data=df)
This creates our model object, but we need to actually fit the equation, which we do by using the fit() function.
result = model.fit()
From here, we can get a summary of the results.
print(result.summary())
Towards the bottom of the summary, the column “P>|t|” holds the p-value for each coefficient. Notice that the two variables with p-values under 5% are the market index’s daily returns and oil’s daily returns. This is unsurprising, since PXD does oil exploration and production.
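The same p-values can be read off the fitted result directly rather than from the printed table. The sketch below is only an illustration: StockData.csv is not reproduced here, so it builds a synthetic dataframe with the same column names, and the coefficients 1.2 and 0.4 used to generate it are made-up assumptions, not the values from the real data.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as sm

# Synthetic stand-in for StockData.csv (assumption: daily returns as decimals)
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "Index": rng.normal(0, 0.01, n),
    "Oil": rng.normal(0, 0.02, n),
    "Gold": rng.normal(0, 0.01, n),
    "NaturalGas": rng.normal(0, 0.03, n),
})
# Made-up relationship: the stock depends only on the index and oil, plus noise
df["PXD"] = 1.2 * df["Index"] + 0.4 * df["Oil"] + rng.normal(0, 0.01, n)

result = sm.ols(formula="PXD ~ Index + Oil + Gold + NaturalGas", data=df).fit()

# result.pvalues is a pandas Series indexed by term name (the "P>|t|" column)
pvalues = result.pvalues
significant = pvalues[pvalues < 0.05]
print(significant)
```

Filtering the Series this way is handy once there are more variables than are comfortable to scan by eye in the summary table.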
You’ll also notice a beta over 1 for the market index. This means the stock tends to move more than one-for-one with the market. Oil has a beta of 0.38, meaning that for every 1% increase in oil prices we expect this stock to rise by about 0.38%. This might not seem as important as the market index, but it is important to realize that the beta on oil already takes the market index into account: it assumes we are holding the market index constant while the oil price rises. Rising oil prices might also cause the market to rise, which would lead to an even larger increase in the stock, since the beta on the market is also statistically significant.
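To make the “holding the market constant” idea concrete, here is a rough sketch that compares predictions for a few hypothetical scenarios. It again uses synthetic data with made-up coefficients in place of StockData.csv, and the scenario values (a 1% move in oil, a 0.5% move in the index) are chosen purely for illustration.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as sm

# Synthetic stand-in for StockData.csv (assumption: daily returns as decimals)
rng = np.random.default_rng(1)
n = 500
df = pd.DataFrame({
    "Index": rng.normal(0, 0.01, n),
    "Oil": rng.normal(0, 0.02, n),
    "Gold": rng.normal(0, 0.01, n),
    "NaturalGas": rng.normal(0, 0.03, n),
})
df["PXD"] = 1.2 * df["Index"] + 0.4 * df["Oil"] + rng.normal(0, 0.01, n)

result = sm.ols(formula="PXD ~ Index + Oil + Gold + NaturalGas", data=df).fit()

# Row 0: baseline. Row 1: oil up 1%, index held constant.
# Row 2: oil up 1% and the index up 0.5% at the same time.
scenarios = pd.DataFrame({
    "Index": [0.0, 0.0, 0.005],
    "Oil": [0.0, 0.01, 0.01],
    "Gold": [0.0, 0.0, 0.0],
    "NaturalGas": [0.0, 0.0, 0.0],
})
predicted = np.asarray(result.predict(scenarios))

oil_only = predicted[1] - predicted[0]        # equals beta_oil * 0.01
oil_and_index = predicted[2] - predicted[0]   # adds beta_index * 0.005 on top

print("Oil up, index constant:", oil_only)
print("Oil and index both up: ", oil_and_index)
```

The first difference is exactly the oil beta times the oil move, while the second also picks up the index beta times the index move, which is the “even larger increase” described above.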
Challenge