This project is broad rather than deep. The main goal is to illustrate how web scraping, regression, machine learning, and data visualization can come together to cluster stocks. The project pulls daily return data for a set of stocks and regresses those returns against daily commodity returns. The betas from these regressions are then used as the inputs to our machine learning model, and we’ll get surprisingly strong results given how little sophistication this project has (future projects will go deeper into these topics).
Prerequisites: Any Introductory Course
Python Difficulty: Intermediate
Python Libraries Used: pandas-datareader, pandas, lxml, requests, statsmodels, sklearn.cluster
Finance/Economics Difficulty: Intermediate