Location
The Location column needs some processing before it is usable. First, let's look at its format. The end goal is to extract a latitude and a longitude from it.
#Print a sample location
print(r1['Location'].values[0])
Because this is a string, we need to do two things to get it into a format we can use. First, we have to get rid of the parentheses. Then we need to split the string in two to separate the latitude and longitude. If we split on "| ", we should get exactly two pieces. Below is an example of replacing the parentheses with nothing, then doing this split.
print(r1['Location'].values[0].replace("(", "").replace(")", "").split("| "))
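To see the split in isolation, here is a minimal sketch on a made-up string. The value of `sample` is an assumption for illustration only (the real entries in r1['Location'] may look different), and the length check is an extra sanity test that is not part of the original steps.

```python
# Hypothetical sample value; it is NOT taken from the real data.
sample = "(40.7128| -74.0060)"

# Remove both parentheses, then split on the "| " separator.
parts = sample.replace("(", "").replace(")", "").split("| ")

# A well-formed value should give exactly two pieces: latitude, then longitude.
is_well_formed = len(parts) == 2

print(parts)
print(is_well_formed)
```

If the separator were missing from a value, `parts` would have only one element, which is worth checking before indexing position 1.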
Let's now take this logic and apply it to each row. Two more things are needed here. First, we want to convert the strings in the list to floats. Second, we need to index the right position in the list for each column: 0 for the latitude and 1 for the longitude.
#Get rid of any null values (copy so the new columns are added to an independent DataFrame)
r1 = r1[~pd.isnull(r1['Location'])].copy()
#Get the latitude
r1['Latitude'] = r1['Location'].apply(lambda x: x.replace("(", "").replace(")", "").split("| ")[0]).astype(float)
#Get the longitude
r1['Longitude'] = r1['Location'].apply(lambda x: x.replace("(", "").replace(")", "").split("| ")[1]).astype(float)
#Get rid of any that have 0 for the latitude (not a valid location)
r1 = r1[r1['Latitude'] != 0]
print(r1[['Latitude', 'Longitude']])
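As an alternative to the two apply calls, the same cleanup can be sketched with pandas string methods so each value is only parsed once. This is a sketch under assumptions, not the method used above: `demo` is a made-up stand-in for r1, `add_lat_lon` is a hypothetical helper name, and values that fail to parse are dropped rather than raising an error.

```python
import pandas as pd

# Hypothetical stand-in for r1; these values are made up for illustration.
demo = pd.DataFrame(
    {"Location": ["(40.7128| -74.0060)", None, "(0.0| 0.0)", "(34.0522| -118.2437)"]}
)

def add_lat_lon(df, column="Location"):
    """Return a copy of df with float Latitude/Longitude columns parsed from `column`."""
    # Drop null locations and copy so new columns go on an independent frame.
    out = df[df[column].notna()].copy()
    # Strip the parentheses, then split once on the "|" character.
    parts = out[column].str.strip("()").str.split("|", n=1, expand=True)
    # Trim leftover whitespace and convert; unparseable values become NaN.
    out["Latitude"] = pd.to_numeric(parts[0].str.strip(), errors="coerce")
    out["Longitude"] = pd.to_numeric(parts[1].str.strip(), errors="coerce")
    out = out.dropna(subset=["Latitude", "Longitude"])
    # A latitude of 0 is treated as an invalid location, as in the cell above.
    return out[out["Latitude"] != 0]

result = add_lat_lon(demo)
print(result[["Latitude", "Longitude"]])
```

Splitting on the single character "|" and stripping whitespace afterwards avoids any question of whether a two-character separator is interpreted as a regular expression.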