Location
The Location column needs some work before it is usable. First, let's see what format it is in. The end goal is to extract a latitude and a longitude from it.
#Print a sample location
print(r1['Location'].values[0])
(42.309030000| -71.050840000)
Because this is a string, we need to do two things to get it into a format we can use. First, we have to get rid of the parentheses. Then we need to split the string in two to separate the latitude from the longitude. Splitting on "| " gives us exactly those two pieces. Below is an example that replaces the parentheses with nothing and then performs this split.
print(r1['Location'].values[0].replace("(", "").replace(")", "").split("| "))
['42.309030000', '-71.050840000']
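The same cleanup can be wrapped in a small function, which makes it easier to test on a single value before touching the whole column. This is only a sketch: the name parse_location is hypothetical and not part of the original notebook, and it assumes every value follows the "(lat| lon)" pattern shown above.

```python
def parse_location(raw):
    """Turn a string like '(42.309030000| -71.050840000)' into (lat, lon) floats."""
    # Strip the surrounding parentheses, then split on the pipe separator
    lat_text, lon_text = raw.strip().strip("()").split("|")
    # float() ignores the leading space left in front of the longitude
    return float(lat_text), float(lon_text)


sample = "(42.309030000| -71.050840000)"
print(parse_location(sample))  # (42.30903, -71.05084)
```

A value that does not contain exactly one "|" raises a ValueError here, which can be a useful early warning about unexpected formats.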
Let's now take this logic and apply it to each row. Two more things are needed here. First, we index into the split list to pick the element for the respective column (position 0 for the latitude, position 1 for the longitude). Second, we convert the resulting strings to floats.
# Get rid of any null values; copy so the new columns are not assigned on a slice of the original frame
r1 = r1[~pd.isnull(r1['Location'])].copy()
# Get the latitude (first element of the split)
r1['Latitude'] = r1['Location'].apply(lambda x: x.replace("(", "").replace(")", "").split("| ")[0]).astype(float)
# Get the longitude (second element of the split)
r1['Longitude'] = r1['Location'].apply(lambda x: x.replace("(", "").replace(")", "").split("| ")[1]).astype(float)
# Get rid of any rows that have 0 for the latitude (not a valid location)
r1 = r1[r1['Latitude'] != 0]
print(r1[['Latitude', 'Longitude']])
Latitude Longitude
41 42.30903 -71.05084
42 42.30890 -71.05096
60 42.30888 -71.05145
77 42.30973 -71.05124
85 42.30932 -71.05208
... ... ...
168101 42.30886 -71.04791
168102 42.30879 -71.04813
168104 42.30862 -71.04868
168105 42.30847 -71.04862
168106 42.30856 -71.04887
[30373 rows x 2 columns]
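As an alternative to calling apply twice, the same steps can be written with the pandas string accessor, which splits each value once and fills both columns from the result. The sketch below runs on a tiny made-up frame called demo (a stand-in for r1, with illustrative values only), so it can be tried on its own.

```python
import pandas as pd

# Hypothetical sample data standing in for r1; not taken from the real dataset
demo = pd.DataFrame({
    "Location": [
        "(42.309030000| -71.050840000)",
        None,
        "(0.000000000| 0.000000000)",
    ]
})

# Drop missing locations and work on a copy
demo = demo.dropna(subset=["Location"]).copy()

# Strip the parentheses, then split once on the single-character pipe
parts = demo["Location"].str.strip("()").str.split("|", expand=True)

# Column 0 holds the latitude, column 1 the longitude (with a leading space)
demo["Latitude"] = parts[0].str.strip().astype(float)
demo["Longitude"] = parts[1].str.strip().astype(float)

# Remove rows with a latitude of 0 (not a valid location)
demo = demo[demo["Latitude"] != 0]
print(demo[["Latitude", "Longitude"]])
```

Splitting on the single character "|" rather than "| " is a deliberate choice here: pandas treats a one-character pattern as a literal string, whereas a longer pattern may be interpreted as a regular expression, in which "|" has a special meaning.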