-
Basic Functionality 6
-
Lecture1.1
-
Lecture1.2
-
Lecture1.3
-
Lecture1.4
-
Lecture1.5
-
Lecture1.6
-
-
Pandas & Indices 4
-
Lecture2.1
-
Lecture2.2
-
Lecture2.3
-
Lecture2.4
-
-
Joins, Unions & Order By 3
-
Lecture3.1
-
Lecture3.2
-
Lecture3.3
-
-
Group By & Views 3
-
Lecture4.1
-
Lecture4.2
-
Lecture4.3
-
Views
In SQL, a view is a virtual table. It acts exactly like a regular table in many regards, but can be created from actual data for the purpose of things like automatic summaries of data. As we work through the following pieces of code you will begin to see the benefit of using a view. Let’s begin with a basic query that we can use to find the average return by sector over all of the days.
#Let's start with a basic query
query = "SELECT Sector, AVG(Return) as Return FROM Returns GROUP BY Sector"
df = pd.read_sql(query, conn)
print(df)
The way to create a view is CREATE VIEW [Name] AS followed by a query that we want to turn into a view. Replace Name with the name of the virtual table. The below example will create a view from the prior query.
#Turn it into a view
cur = conn.cursor()
query = "CREATE VIEW [Sector Average] AS SELECT Sector, AVG(Return) as Return FROM Returns GROUP BY Sector"
cur.execute(query)
Now that the view has been created, it can be queried similar to a normal table.
#Then you are able to pull from it
query = "SELECT * FROM [Sector Average]"
df = pd.read_sql(query, conn)
print(df)
An important feature of views is that when we update the data which feeds into the view, the view as well is updated. Watch what happens when we delete all negative returns in the data.
#Notice if we change the data the view also changes
query = "DELETE FROM Returns WHERE Return < 0"
cur.execute(query)
query = "SELECT * FROM [Sector Average]"
df = pd.read_sql(query, conn)
print(df)
Reset the data back to what it was.
#Reset it....
data.to_sql("Returns", conn, index=False, if_exists='replace')
Check the view has changed back to what it started as.
#Back to what it was before
query = "SELECT * FROM [Sector Average]"
df = pd.read_sql(query, conn)
print(df)
Let's move on to another important concept. When we use IN, we can filter results to be only those where a column is within a list of elements. For example, if we wanted the average return for consumer discretionary and consumer staples stocks only, we would apply the following at the end "WHERE Sector IN ('Consumer Discretionary', 'Consumer Staples')". What this achieves is removing all entries that don't have a sector equal to either of those.
#The IN keyword can be used to find if a column contains one of any values given
#Below will find the average return for the consumer sectors
query = "SELECT AVG(Return) FROM [Sector Average] WHERE Sector IN ('Consumer Discretionary', 'Consumer Staples')"
df = pd.read_sql(query, conn)
print(df)