Function Minimization Basics
Function Minimization Basics¶
If we have a function we would like to minimize, we can use the minimize function from scipy.optimize. While it is very powerful, you need to be careful to understand that it has limitations and works best with smooth functions. For the purpose that we are going to use it for, it does a great job, but be aware that it may not always be the best fit. If there are multiple local minimum points then it can get caught on one of these as opposed to the global minimum, for example.
The minimize function has many different solvers that could be used. Because this is not a course in optimization we will not go through them and instead use the default. The main idea is similar between all the solvers…. to iteratively reduce the error by shifting parameters until we can’t reduce it anymore. Let’s begin with defining a basic function that we want to try minimizing.
#Define a basic parabola function
def parabola(x):
y = (x-5) ** 2 + 100
return y
#Space out 21 points between -15 and 15
X = np.linspace(-15,15,31)
#Get the Y values based on the parabola function
Y = [parabola(x) for x in X]
#Plot
plt.plot(X, Y)
plt.xlabel("X")
plt.ylabel("Y")
plt.title("Parabola Function")
plt.ylim([0,550])
plt.show()
#Start with a naive guess of the x point that minimizes the function of 0
x0 = 0
y0 = parabola(x0)
print(x0, y0)
0 125
#Notice that one to the right decreases the function, while one to the left increases it
#When we run the optimization, it will move to the right
print(parabola(x0+1))
print(parabola(x0-1))
116
136
The minimzie function¶
The minimize function will take two arguments:
- The function which we want to minimize
- The starting value for x, the value that needs to be changed to find the minimum
The output from the function will be a dictionary full of a lot of different information including the number of iterations run as well as the value of x that is the minimum.
#Run the function
from scipy.optimize import minimize
print(minimize(parabola, x0))
fun: 100.00000000000003
hess_inv: array([[0.49999995]])
jac: array([0.])
message: 'Optimization terminated successfully.'
nfev: 12
nit: 3
njev: 4
status: 0
success: True
x: array([4.99999982])
The key 'x' gives us back what we are looking for, the optimal value for x. Notice it comes back as a list, we will need to grab the first index to get the actual value. It comes back as a list because our input does not necessarily have to be a single variable.
print(minimize(parabola, x0)['x'])
[4.99999982]
Plotting the point confirms that we have found the minimum.
#Get the minimum point
x_opt = minimize(parabola, x0)['x'][0]
y_opt = parabola(x_opt)
#Add it to plot
plt.plot(X, Y)
plt.plot(x_opt, y_opt, 'ro')
plt.xlabel("X")
plt.ylabel("Y")
plt.title("Parabola Function")
plt.ylim([0,550])
plt.legend(['Function', 'Minimum'])
plt.show()
Extending to Multiple Dimensions¶
The minimization function can also handle multiple dimensions. We just will need to define the function in a way that it takes an argument of a list of these x variables. Let's define a 2D function below. The variable x will be a list with two values. Let's define the function, then after that we need to quickly cover something else that will aid us in solving the problem.
#Define the function
def func2D(x):
return (x[0] - 5) ** 2 * (x[1] - 5) ** 2 + 500
The product function from itertools¶
A very handy function from itertools is the product function. By using this, we can find all combinations of lists. For example, if we wanted all combinations of ['A', 'B', 'C'] and ['1', '2', '3'], the following code below would produce it. The format for this is to call product(*[l1, l2]) where l1 and l2 are the two lists to get the product of.
from itertools import product
#Define the lists
l1 = ['A', 'B', 'C']
l2 = ['1', '2', '3']
#Get the iterator
combos = product(*[l1,l2])
print(combos)
<itertools.product object at 0x7fbbb1a36040>
Iterators can be used by calling __next__() to get out the next value. For example, below we get the first combination followed by the second combination.
#To get the next combo, you can call __next__()
print(combos.__next__())
('A', '1')
#If you call again, it will be the one after it
print(combos.__next__())
('A', '2')
It could also be simply converted to a list.
#Or just transform it into a list
combos = product(*[l1,l2])
combos = list(combos)
print(combos)
[('A', '1'), ('A', '2'), ('A', '3'), ('B', '1'), ('B', '2'), ('B', '3'), ('C', '1'), ('C', '2'), ('C', '3')]
Now that we know how to get all the possible combinations, what if we find all combinations of the X values we defined before. We will use the same values for the X1 and X2 variables. Also, we will transform it into a pandas dataframe.
#Get combos
combos = product(*[X,X])
combos = list(combos)
#Convert to a dataframe
df = pd.DataFrame(combos, columns = ['X1', 'X2'])
print(df)
X1 X2
0 -15.0 -15.0
1 -15.0 -14.0
2 -15.0 -13.0
3 -15.0 -12.0
4 -15.0 -11.0
.. ... ...
956 15.0 11.0
957 15.0 12.0
958 15.0 13.0
959 15.0 14.0
960 15.0 15.0
[961 rows x 2 columns]
#Use the apply function to find the values at each combination
df["Y"] = df.apply(func2D, axis=1)
print(df)
X1 X2 Y
0 -15.0 -15.0 160500.0
1 -15.0 -14.0 144900.0
2 -15.0 -13.0 130100.0
3 -15.0 -12.0 116100.0
4 -15.0 -11.0 102900.0
.. ... ... ...
956 15.0 11.0 4100.0
957 15.0 12.0 5400.0
958 15.0 13.0 6900.0
959 15.0 14.0 8600.0
960 15.0 15.0 10500.0
[961 rows x 3 columns]
Now that we have a general outline of what the function looks like in three dimensions, we can plot it below.
#Create a 3d plot figure
fig = plt.figure(figsize=(12,12))
ax = fig.add_subplot(111, projection='3d')
#Add a scatter plot of all the 3d points
ax.scatter(df['X1'], df['X2'], df['Y'])
plt.show()
With our optimization, we are going to pass in two starting values to begin. It will simply be the origin, (0, 0) and what we will get back is the most optimal point.
#Now we can give an input of two starting values
x0 = [0,0]
print(minimize(func2D, x0))
print()
print()
#The output is going to be two optimal points
x1_opt, x2_opt = minimize(func2D, x0)['x']
y_opt = func2D([x1_opt, x2_opt])
print(x1_opt)
print(x2_opt)
print(y_opt)
fun: 500.00000006039045
hess_inv: array([[135.48482389, 134.48482389],
[134.48482389, 135.48482389]])
jac: array([-7.62939453e-06, -7.62939453e-06])
message: 'Optimization terminated successfully.'
nfev: 88
nit: 21
njev: 22
status: 0
success: True
x: array([4.98432376, 4.98432376])
4.984323755729247
4.984323755729236
500.00000006039045
Adding this point into the graph we see visually that it is in fact the minimum.
fig = plt.figure(figsize=(12,12))
ax = fig.add_subplot(111, projection='3d')
#Add a scatter plot of all the 3d points
ax.scatter(df['X1'], df['X2'], df['Y'])
ax.scatter(x1_opt, x2_opt, y_opt, color='red', s=100)
plt.show()