Null Values
Working with Null Values
When you come across null values, you have a few options. One is simply to drop any row that contains a null value. The function dropna() does this: it removes every row that has at least one missing value. It does not modify the dataframe in place but instead returns a new dataframe.
#The dropna() function gets rid of any rows with missing values
print(df_final.dropna())
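To make the "returns a new dataframe" behavior concrete, here is a minimal sketch. The dataframe df_demo and its player names and values are hypothetical stand-ins, not the df_final used in this tutorial.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, made up for illustration only.
df_demo = pd.DataFrame({
    "Name": ["Player A", "Player B", "Player C"],
    "Height": [71.0, np.nan, 74.0],
    "Type": ["RB", "WR", None],
})

# dropna() returns a new dataframe; df_demo itself is not modified.
cleaned = df_demo.dropna()

print(len(df_demo))  # row count of the original is unchanged
print(len(cleaned))  # only rows with no missing values remain
print(cleaned)
```

Both NaN and None count as missing here, so only the row with every value present survives.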
If only some columns matter when deciding which rows to drop, pass the keyword argument subset. Only rows with null values in the listed columns are dropped; null values in other columns are ignored.
#If given an argument subset, we can drop only rows with missing values from the subset
print(df_final.dropna(subset=["Height","Type"]))
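The difference between dropping on all columns and dropping on a subset can be seen side by side in a small sketch. The data below is hypothetical and only mirrors the column names used in this tutorial.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, made up for illustration only.
df_demo = pd.DataFrame({
    "Name": ["Player A", "Player B", "Player C"],
    "Height": [71.0, np.nan, 74.0],
    "Weight": [np.nan, 200.0, 215.0],
    "Type": ["RB", "WR", "QB"],
})

# Considering every column: Player A (missing Weight) and
# Player B (missing Height) are both dropped.
all_cols = df_demo.dropna()

# Considering only Height and Type: Player A is kept because
# the missing Weight is outside the subset.
subset_only = df_demo.dropna(subset=["Height", "Type"])

print(all_cols["Name"].tolist())
print(subset_only["Name"].tolist())
```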
If you would rather modify the existing dataframe than get a new one back, pass the argument inplace=True. The rows are then dropped from the dataframe you call the function on, and the call itself returns None, so do not assign its result back to a variable.
#If we use inplace=True then the dropna function happens in place
df_final.dropna(inplace=True,subset=["Height","Type"])
print(df_final)
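A common slip with inplace=True is to assign the result back to the variable, which replaces the dataframe with None. The sketch below, on a hypothetical two-row dataframe, shows what the call returns and what happens to the original.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, made up for illustration only.
df_demo = pd.DataFrame({
    "Height": [71.0, np.nan],
    "Type": ["RB", "WR"],
})

# With inplace=True the rows are removed from df_demo directly
# and the call returns None instead of a new dataframe.
result = df_demo.dropna(inplace=True, subset=["Height", "Type"])

print(result)        # None
print(len(df_demo))  # the row with the missing Height is gone
```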
We may also prefer a more descriptive index, such as the player's name, to refer to the rows. Calling set_index returns a new dataframe whose index is the column you passed. That column no longer appears as a regular column, and the original dataframe is unchanged unless you reassign the result.
#Setting the index to name changes it from a column to index
df_final = df_final.set_index("Name")
print(df_final)
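As a quick check of what set_index changes, the following sketch uses a hypothetical two-row dataframe and confirms that "Name" moves from the columns to the index, after which rows can be looked up by name.

```python
import pandas as pd

# Hypothetical sample data, made up for illustration only.
df_demo = pd.DataFrame({
    "Name": ["Player A", "Player B"],
    "Weight": [200, 215],
})

# set_index returns a new dataframe; df_demo keeps its default index.
indexed = df_demo.set_index("Name")

print("Name" in indexed.columns)  # the column has moved to the index
print(list(indexed.index))

# Rows can now be looked up by player name.
print(indexed.loc["Player B", "Weight"])
```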
If new data comes to light, you can overwrite existing values using loc, which takes a row label followed by a column label. For example, the code below overwrites the Weight value in the row for Christian McCaffrey.
#Using loc we can overwrite data
df_final.loc["Christian McCaffrey","Weight"] = 205
print(df_final)
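One thing to watch for when assigning through loc: if the row label is not already in the index, pandas adds a new row rather than raising an error. A minimal sketch of a guarded overwrite, using hypothetical player labels, might look like this.

```python
import pandas as pd

# Hypothetical sample data, made up for illustration only.
df_demo = pd.DataFrame(
    {"Weight": [200, 215]},
    index=pd.Index(["Player A", "Player B"], name="Name"),
)

label = "Player A"

# Check the label first so a typo does not silently append a new row.
if label in df_demo.index:
    df_demo.loc[label, "Weight"] = 205
else:
    print(f"{label} not found; nothing overwritten")

print(df_demo)
```

Whether to guard the assignment this way is a design choice; for a one-off correction in a notebook, the direct assignment is usually enough.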