Root finding, minimization, and solving equations
Paris Dauphine University-PSL
High-level toolbox for numerical optimization: root finding, minimization, least squares, linear programming, and more.
Find x such that \(f(x)=0\) using a robust bracketing method (Brent). Requires an interval \([a,b]\) where \(f(a)\) and \(f(b)\) have opposite signs.
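A minimal sketch with scipy.optimize.brentq (the function and the bracket [0, 3] are illustrative):
from scipy import optimize as opt
f = lambda x: x**2 - 2          # f(0) < 0 and f(3) > 0, so [0, 3] brackets a root
root = opt.brentq(f, 0, 3)      # ~1.4142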
Solve vector equations \(F(v)=0\) using hybrid methods (Jacobian-free by default; can supply Jacobian for speed/robustness).
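A minimal sketch with scipy.optimize.root (the two coupled equations below are illustrative):
import numpy as np
from scipy import optimize as opt
def F(v):
    x, y = v
    return [x + 2*y - 2, x**2 + 4*y**2 - 4]   # F(v) = 0: two equations, two unknowns
sol = opt.root(F, x0=[1.0, 1.0])              # hybrid (MINPACK 'hybr') method by default
sol.x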
Minimize a scalar objective without constraints. SciPy picks gradient-based or derivative-free methods depending on options.
\[ \min_{x \in \mathbb{R}} \; (x-3)^2 \]
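A minimal sketch for the problem above (the starting point x0 is arbitrary):
from scipy import optimize as opt
res = opt.minimize(lambda x: (x - 3)**2, x0=0.0)   # gradient-based (BFGS) by default
res.x   # ~3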
Min-variance with \(\sum_i w_i = 1\) and \(w \ge 0\) using historical covariance.
\[ \begin{aligned} \min_{w \in \mathbb{R}^N} \quad & w^\top \Sigma \, w \\ \text{s.t.}\quad & \mathbf{1}^\top w = 1, \\ & w \ge 0 \end{aligned} \]
import pandas as pd, numpy as np
from scipy import optimize as opt
# use returns from prices-top20.csv to estimate mu and Sigma
px = pd.read_csv('data/prices-top20.csv', parse_dates=['date']).sort_values(['symbol','date'])
ret = px.groupby('symbol')['close'].pct_change()
R = px.assign(ret=ret).pivot(index='date', columns='symbol', values='ret').dropna()
mu = R.mean().values
Sigma = np.cov(R.values, rowvar=False)
# min variance portfolio with sum(w)=1 and w>=0
N = len(mu)
var = lambda w: w @ Sigma @ w
cons = ({'type': 'eq', 'fun': lambda w: np.sum(w)-1}, {'type':'ineq','fun': lambda w: w})
res = opt.minimize(var, x0=np.ones(N)/N, constraints=cons, bounds=[(0,1)]*N)
weights = pd.Series(res.x, index=R.columns).sort_values(ascending=False)
weights.head()
Optimize with bounds and/or constraints. Use methods that support them (e.g., SLSQP, trust-constr; COBYLA is derivative-free).
\[ \min_{x \in \mathbb{R}} \; (x-1)^2 \quad \text{s.t. } x \ge 0 \]
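A minimal sketch for the bounded problem above (SLSQP chosen for illustration; an inequality constraint would work as well):
from scipy import optimize as opt
res = opt.minimize(lambda x: (x - 1)**2, x0=[2.0],
                   bounds=[(0, None)], method='SLSQP')
res.x   # ~1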
Solve linear least squares \(Ax \approx y\) (QR/SVD under the hood). Standard tool for linear regression under Gaussian noise.
\[ \min_{x} \; \|Ax - y\|_2^2 \]
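A minimal sketch on synthetic data (A, x_true, and the noise level are illustrative):
import numpy as np
rng = np.random.default_rng(0)
A = rng.standard_normal((100, 3))
x_true = np.array([1.0, -2.0, 0.5])
y = A @ x_true + 0.1*rng.standard_normal(100)
x_hat, res_ss, rank, sv = np.linalg.lstsq(A, y, rcond=None)   # least-squares solution
x_hat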
Providing analytic gradients (and Hessians where applicable) improves speed and accuracy; quasi-Newton methods like BFGS approximate the Hessian from gradients.
\[ \min_{x} \; f(x) \quad \text{with } \nabla f,\; \nabla^2 f \text{ when available} \]
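For example, SciPy ships the Rosenbrock function with its analytic derivatives; a short sketch comparing a quasi-Newton call (gradient only) with a Newton-type call (gradient and Hessian):
import numpy as np
from scipy import optimize as opt
x0 = np.array([-1.2, 1.0])
opt.minimize(opt.rosen, x0, jac=opt.rosen_der, method='BFGS')                            # gradient only
opt.minimize(opt.rosen, x0, jac=opt.rosen_der, hess=opt.rosen_hess, method='Newton-CG')  # gradient + Hessian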
Use Bounds for box constraints and LinearConstraint for affine equalities/inequalities; pair with SLSQP or trust-constr.
\[ \min_{x} \; f(x) \quad \text{s.t. } A x = b,\; \ell \le x \le u \]
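A minimal sketch combining Bounds and an affine equality (the objective and numbers are illustrative):
import numpy as np
from scipy import optimize as opt
obj = lambda x: (x[0] - 1)**2 + (x[1] - 2)**2
lin = opt.LinearConstraint(np.array([[1.0, 1.0]]), lb=1.0, ub=1.0)   # x0 + x1 = 1
box = opt.Bounds(lb=[0.0, 0.0], ub=[1.0, 1.0])                       # 0 <= x <= 1
res = opt.minimize(obj, x0=[0.5, 0.5], bounds=box, constraints=[lin], method='trust-constr')
res.x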
Track progress and customize termination with callbacks and options like max iterations or tolerances.
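A minimal sketch that records iterates via a callback and caps iterations (the gtol/maxiter values are illustrative):
import numpy as np
from scipy import optimize as opt
path = []                                   # iterates collected by the callback
res = opt.minimize(opt.rosen, x0=[-1.2, 1.0], method='BFGS',
                   callback=lambda xk: path.append(np.copy(xk)),
                   options={'maxiter': 200, 'gtol': 1e-8})
len(path), res.nit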
Solve linear objectives with linear constraints to global optimality using the HiGHS solvers (simplex/IPM), deterministic and scalable.
\[ \begin{aligned} \min_{x} \quad & c^\top x \\ \text{s.t.}\quad & A_{\mathrm{ub}} x \le b_{\mathrm{ub}}, \\ & A_{\mathrm{eq}} x = b_{\mathrm{eq}}, \\ & \ell \le x \le u \end{aligned} \]
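A minimal linprog sketch on a tiny illustrative LP, solved with the HiGHS backend:
from scipy.optimize import linprog
c = [-1, -2]                              # maximize x0 + 2*x1 by minimizing the negative
A_ub = [[1, 1], [1, 0]]; b_ub = [4, 3]    # x0 + x1 <= 4 and x0 <= 3
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)], method='highs')
res.x, res.fun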
Stochastic global search: random perturbations + local minimization to escape local minima in nonconvex landscapes.
\[ \min_{x} \; f(x) \]
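One such method in SciPy is basinhopping; a minimal sketch on an illustrative multimodal 1-D function:
import numpy as np
from scipy.optimize import basinhopping
f = lambda x: np.cos(14.5*x - 0.3) + (x + 0.2)*x    # many local minima
res = basinhopping(f, x0=[1.0], niter=200)          # random perturbations + local minimization
res.x, res.fun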
- Bounds, LinearConstraint, and NonlinearConstraint can be combined in one framework (trust-constr).
- Use initial_tr_radius to control step sizes.
- If a solve fails, inspect res.message and try different initial guesses.
Generate synthetic data and fit a logistic curve using nonlinear least squares.
import numpy as np
from scipy.optimize import curve_fit
def logistic(x, L, k, x0):
    return L / (1 + np.exp(-k*(x - x0)))
rng = np.random.default_rng(42)
x = np.linspace(-6, 6, 120)
y_true = logistic(x, L=1.8, k=1.1, x0=0.2)
y_obs = y_true + 0.05*rng.standard_normal(x.size)
popt, pcov = curve_fit(logistic, x, y_obs, p0=[1.0, 1.0, 0.0])
L_hat, k_hat, x0_hat = popt
print({'L': L_hat, 'k': k_hat, 'x0': x0_hat})
Estimate daily returns from prices-top20.csv, compute covariance, and solve min‑variance portfolio with \(\sum_i w_i = 1\) and \(w \ge 0\).
For a minimal, self‑contained dataset (if you don’t have prices locally), you can first write a tiny two‑ticker CSV to data/prices-top2.csv, then point the pd.read_csv path in the solution below to that file:
# Create a tiny two‑ticker daily prices CSV for demonstration
import os, numpy as np, pandas as pd
os.makedirs('data', exist_ok=True)
dates = pd.date_range('2023-01-02', periods=10, freq='B')
a_close = 100 + np.arange(len(dates))*0.5 # upward drift
b_close = 50 + np.arange(len(dates))*0.2 # gentler drift
px_demo = pd.DataFrame({
    'date': list(dates) + list(dates),
    'symbol': ['A-USD']*len(dates) + ['B-USD']*len(dates),
    'close': list(a_close) + list(b_close),
}).sort_values(['symbol','date'])
px_demo.to_csv('data/prices-top2.csv', index=False)
print('Wrote data/prices-top2.csv with', len(px_demo), 'rows.')
import pandas as pd, numpy as np
from scipy import optimize as opt
# Load prices and compute returns per symbol
px = pd.read_csv('data/prices-top20.csv', parse_dates=['date']).sort_values(['symbol','date'])
ret = px.groupby('symbol')['close'].pct_change()
R = px.assign(ret=ret).pivot(index='date', columns='symbol', values='ret').dropna()
Sigma = np.cov(R.values, rowvar=False)
tickers = R.columns
N = len(tickers)
var = lambda w: w @ Sigma @ w
cons = (
    {'type':'eq',  'fun': lambda w: np.sum(w) - 1},   # sum to 1
    {'type':'ineq','fun': lambda w: w},               # w >= 0 (also enforced by bounds)
)
bounds = [(0,1)]*N
res = opt.minimize(var, x0=np.ones(N)/N, bounds=bounds, constraints=cons, method='SLSQP')
weights = pd.Series(res.x, index=tickers).sort_values(ascending=False)
print('Variance:', var(res.x))
weights.head(10)
Examples of choosing different algorithms and options:
import numpy as np
from scipy import optimize as opt
rosen = opt.rosen # classic Rosenbrock function
x0 = np.array([-1.2, 1.0])
opt.minimize(rosen, x0, method='BFGS', options={'gtol':1e-8,'maxiter':500})
opt.minimize(rosen, x0, method='L-BFGS-B', bounds=[(-2,2),(-1,3)])
opt.minimize(rosen, x0, method='Nelder-Mead')  # derivative-free
Handle nonlinear (in)equality constraints via NonlinearConstraint. trust-constr uses a constrained trust-region approach; analytic Jacobians help a lot. Example: minimize \(\|x\|^2\) s.t. \(x_0 x_1 = 1\), \(x \ge 0\).
\[ \begin{aligned} \min_{x \ge 0} \quad & \|x\|_2^2 \\ \text{s.t.}\quad & x_0\, x_1 = 1 \end{aligned} \]
import numpy as np
from scipy import optimize as opt
from scipy.optimize import NonlinearConstraint
# minimize ||x||^2 s.t. x0 * x1 = 1 and x >= 0
obj = lambda x: x @ x
grad = lambda x: 2*x
con_fun = lambda x: x[0]*x[1]
con_jac = lambda x: np.array([x[1], x[0]])
nlin = NonlinearConstraint(con_fun, lb=1.0, ub=1.0, jac=con_jac)
res = opt.minimize(obj, x0=[0.5, 3.0], jac=grad,
                   constraints=[nlin], bounds=[(0, None), (0, None)],
                   method='trust-constr')
Fit parameters by minimizing \(\sum_i r_i(\theta)^2\) using Gauss-Newton/Levenberg–Marquardt (or TRF). Robust loss reduces outlier influence; e.g., fit \(y = a e^{b x} + c\).
\[ \min_{\theta} \; \sum_i r_i(\theta)^2 \]
import numpy as np
from scipy.optimize import least_squares
# Fit y = a * exp(bx) + c with robust loss
rng = np.random.default_rng(0)
x = np.linspace(0, 3, 50)
true = np.array([2.0, 0.8, 0.5])
y = true[0]*np.exp(true[1]*x) + true[2] + 0.2*rng.standard_normal(x.size)
def residuals(theta):
    a, b, c = theta
    return a*np.exp(b*x) + c - y
res = least_squares(residuals, x0=[1, 1, 0], loss='huber', f_scale=0.5)
res.x  # estimated parameters
Convenient wrapper around least_squares for parametric models; returns best-fit parameters and covariance.
\[ \min_{a,b,c} \; \sum_i \big(a e^{b x_i} + c - y_i\big)^2 \]
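A minimal curve_fit sketch for the exponential model above (synthetic data; the true parameter values are illustrative):
import numpy as np
from scipy.optimize import curve_fit
model = lambda x, a, b, c: a*np.exp(b*x) + c
rng = np.random.default_rng(1)
x = np.linspace(0, 3, 50)
y = model(x, 2.0, 0.8, 0.5) + 0.2*rng.standard_normal(x.size)
popt, pcov = curve_fit(model, x, y, p0=[1.0, 1.0, 0.0])
popt, np.sqrt(np.diag(pcov))   # estimates and approximate standard errors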
Explore nonconvex landscapes with population/stochastic methods:
- Differential Evolution: evolutionary search over bounded boxes
- Dual Annealing: simulated annealing variant with local refinements
- SHGO: deterministic global optimizer for low dimensions within bounds
\[ \min_{v} \; f(v) \quad \text{subject to bounds} \]
import numpy as np
from scipy.optimize import differential_evolution, dual_annealing, shgo
# Differential Evolution (evolutionary, box-bounded)
f = lambda v: (v[0]-3)**2 + 10*np.sin(v[1])**2
bounds = [(-5,5), (-5,5)]
de = differential_evolution(f, bounds, tol=1e-6)
# Simulated annealing variant
da = dual_annealing(f, bounds, maxiter=2000)
# SHGO (global for low-dim, needs bounds)
sg = shgo(f, bounds)
Model convex problems declaratively and solve to global optimality with convex solvers (OSQP/SCS). Great for QPs, L1/L2 regularization, portfolios.
\[ \begin{aligned} \min_{w} \quad & w^\top \Sigma \, w \\ \text{s.t.}\quad & \mathbf{1}^\top w = 1, \\ & w \ge 0, \\ & \mu^\top w \ge r_{\text{target}} \end{aligned} \]
# pip install cvxpy (once)
import numpy as np, cvxpy as cp
np.random.seed(0)
N = 10
mu = np.random.rand(N)/10
Sigma = np.random.randn(N, N); Sigma = Sigma.T@Sigma + 1e-3*np.eye(N)
w = cp.Variable(N)
ret_target = 0.05
prob = cp.Problem(cp.Minimize(cp.quad_form(w, Sigma)),
                  [cp.sum(w) == 1,
                   w >= 0,
                   mu @ w >= ret_target])
prob.solve(solver=cp.OSQP)
w.value
Solve discrete/boolean decision problems with CP-SAT (MILP/CP). Global search leverages cutting planes and constraint propagation.
\[ \begin{aligned} \max_{x \in \{0,1\}^n} \quad & \sum_{i=1}^n v_i x_i \\ \text{s.t.}\quad & \sum_{i=1}^n w_i x_i \le C \end{aligned} \]
# pip install ortools (once)
from ortools.sat.python import cp_model
values = [6, 10, 12]; weights = [1, 2, 3]; capacity = 5
model = cp_model.CpModel()
x = [model.NewBoolVar(f"x{i}") for i in range(len(values))]
model.Add(sum(w*x_i for w, x_i in zip(weights, x)) <= capacity)
model.Maximize(sum(v*x_i for v, x_i in zip(values, x)))
solver = cp_model.CpSolver(); solver.Solve(model)
[solver.Value(x_i) for x_i in x]
Use reverse-mode automatic differentiation to compute exact gradients on NumPy-like code for efficient first-order optimization.
\[ \min_{x \in \mathbb{R}^n} \; \sum_{i=1}^{n-1}\left[100\,(x_{i+1}-x_i^2)^2 + (1-x_i)^2\right] \]
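A minimal sketch of reverse-mode AD on the Rosenbrock sum above, assuming torch is available (JAX would look similar):
import torch
def rosen_nd(x):
    return torch.sum(100*(x[1:] - x[:-1]**2)**2 + (1 - x[:-1])**2)
x = torch.tensor([-1.2, 1.0, 1.0], requires_grad=True)
rosen_nd(x).backward()   # reverse-mode autodiff
x.grad                   # exact gradient at x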
Optimize model parameters with first-order methods (SGD/Adam) using autograd; standard in ML for large-scale problems.
\[ \min_{w} \; \frac{1}{n} \sum_{i=1}^n \operatorname{BCE}\big(\sigma(x_i^\top w),\; y_i\big) \]
# pip install torch (CPU)
import torch
# Logistic regression on synthetic data
torch.manual_seed(0)
n, d = 200, 3
X = torch.randn(n, d)
w_true = torch.tensor([1.0, -2.0, 0.5])
y = (torch.sigmoid(X@w_true + 0.1*torch.randn(n)) > 0.5).float()
w = torch.zeros(d, requires_grad=True)
opt_t = torch.optim.Adam([w], lr=0.05)
for _ in range(400):
    opt_t.zero_grad()
    p = torch.sigmoid(X@w)
    loss = torch.nn.functional.binary_cross_entropy(p, y)
    loss.backward(); opt_t.step()
w.detach()
- Use scipy.optimize.check_grad(f, grad, x0) to verify gradients.
- Supply jac/hess when possible; set options={'maxiter': ..., 'gtol': ...}.
Collect iterates during optimization to analyze convergence behavior or visualize paths after the run.
Allocate a long-only portfolio with a minimum return, total weight = 1, and sector exposure caps.
\[ \begin{aligned} \min_{w} \quad & w^\top \Sigma \, w \\ \text{s.t.}\quad & \mathbf{1}^\top w = 1,\; w \ge 0, \\ & \mu^\top w \ge r_{\min}, \\ & S w \le \text{cap} \end{aligned} \]
import numpy as np, pandas as pd
from scipy import optimize as opt
# Inputs: returns R (T x N), expected mu (N,), covariance Sigma (N x N), sector map
px = pd.read_csv('data/prices-top20.csv', parse_dates=['date']).sort_values(['symbol','date'])
ret = px.groupby('symbol')['close'].pct_change()
R = px.assign(ret=ret).pivot(index='date', columns='symbol', values='ret').dropna()
mu = R.mean().values
Sigma = np.cov(R.values, rowvar=False)
tickers = R.columns
sectors = pd.Series({t: t.split('-')[0] for t in tickers}) # example sector parse
N = len(mu)
var = lambda w: w @ Sigma @ w
ret_target = mu.mean() # target: average expected return
def sector_matrix(labels):
    cats = sorted(labels.unique())
    S = np.zeros((len(cats), len(labels)))
    for i, c in enumerate(cats):
        S[i, np.where(labels.values == c)[0]] = 1.0   # positional indices of assets in sector c
    return S, cats
S, cats = sector_matrix(sectors)
cap = 0.35 # max 35% per sector
cons = [
{'type':'eq', 'fun': lambda w: np.sum(w) - 1},
{'type':'ineq','fun': lambda w: w @ mu - ret_target},
]
# sector caps as inequality constraints S w <= cap
for i in range(S.shape[0]):
    cons.append({'type':'ineq','fun': (lambda i: lambda w: cap - S[i]@w)(i)})
bounds = [(0,1)]*N
res = opt.minimize(var, x0=np.ones(N)/N, bounds=bounds, constraints=cons, method='SLSQP')
weights = pd.Series(res.x, index=tickers).sort_values(ascending=False)
weights.head(10)
Minimize tracking error to a benchmark with L1 turnover penalty to control changes from current weights.
\[ \min_{w} \; \lambda_{\mathrm{te}}\, (w-w_b)^\top \Sigma (w-w_b) + \lambda_{\mathrm{to}}\, \|w-w_{\text{prev}}\|_1 \quad \text{s.t. } \mathbf{1}^\top w = 1,\; w \ge 0 \]
import numpy as np, pandas as pd, cvxpy as cp
# R, Sigma from historical returns; w_bench benchmark weights; w_prev current weights
px = pd.read_csv('data/prices-top20.csv', parse_dates=['date']).sort_values(['symbol','date'])
ret = px.groupby('symbol')['close'].pct_change()
R = px.assign(ret=ret).pivot(index='date', columns='symbol', values='ret').dropna()
Sigma = np.cov(R.values, rowvar=False)
tickers = R.columns; N = len(tickers)
w_bench = np.ones(N)/N
w_prev = np.ones(N)/N
w = cp.Variable(N)
lam_te = 1.0
lam_to = 0.01
obj = cp.quad_form(w - w_bench, Sigma) * lam_te + lam_to * cp.norm1(w - w_prev)
prob = cp.Problem(cp.Minimize(obj), [cp.sum(w)==1, w>=0])
prob.solve(solver=cp.OSQP)
pd.Series(w.value, index=tickers).sort_values(ascending=False).head(10)
Find weights so each asset contributes equally to portfolio risk. Solve with nonlinear constraints.
Let \(\mathrm{RC}_i = w_i\,(\Sigma w)_i\). Then
\[ \min_{w} \; \sum_i (\mathrm{RC}_i - \bar{\mathrm{RC}})^2 \quad \text{s.t. } \mathbf{1}^\top w = 1,\; w \ge 0 \]
import numpy as np
from scipy import optimize as opt
def risk_contrib(w, Sigma):
    # RC_i = w_i (Sigma w)_i
    Sw = Sigma @ w
    return w * Sw
def rp_objective(w, Sigma):
    RC = risk_contrib(w, Sigma)
    target = np.mean(RC)
    return np.sum((RC - target)**2)
Sigma = np.eye(5) * 0.05
Sigma[0,1]=Sigma[1,0]=0.03 # some correlation
N = Sigma.shape[0]
bounds = [(0,1)]*N
cons = ({'type':'eq','fun': lambda w: np.sum(w)-1},)
res = opt.minimize(lambda w: rp_objective(w, Sigma), x0=np.ones(N)/N,
                   bounds=bounds, constraints=cons, method='SLSQP')
res.x
Hedge estimation error by minimizing worst‑case variance in an ellipsoidal uncertainty set. Use the constraint \(\|Uw\|_2 \le t\) and minimize \(t - \lambda \, (\mu^\top w)\) subject to \(\mathbf{1}^\top w = 1\) and \(w \ge 0\).
\[ \begin{aligned} \min_{w,\,t} \quad & t - \lambda\, \mu^\top w \\ \text{s.t.}\quad & \mathbf{1}^\top w = 1,\; w \ge 0, \\ & \|U w\|_2 \le \sqrt{t} + \rho \end{aligned} \]
import numpy as np, cvxpy as cp
np.random.seed(0)
N = 10
mu = np.random.rand(N)/10
Sigma_hat = np.random.randn(N,N); Sigma_hat = Sigma_hat.T@Sigma_hat + 1e-3*np.eye(N)
U = np.linalg.cholesky(Sigma_hat) # uncertainty shaping
rho = 0.05
w = cp.Variable(N)
# worst-case variance upper bound: || U w ||_2 <= t, minimize t + lambda * (-mu @ w)
t = cp.Variable()
lam = 1.0
prob = cp.Problem(cp.Minimize(t - lam*(mu@w)), [
    cp.sum(w) == 1,
    w >= 0,
    cp.norm(U@w, 2) <= cp.sqrt(t) + rho   # conservative form
])
prob.solve(solver=cp.SCS)
w.value
Fit a simple model parameter (e.g., volatility) to market prices by minimizing squared pricing errors.
\[ \min_{\sigma} \; \sum_j \big( C_{\text{model}}(S,K_j,t_j,r,\sigma) - C^{\text{mkt}}_j \big)^2 \]
import numpy as np
from math import log, sqrt, exp, erf
from scipy.optimize import least_squares
def bs_call(s, k, t, r, sigma):
    # Black–Scholes call price; Phi is the standard normal CDF via erf
    d1 = (log(s/k) + (r + 0.5*sigma**2)*t) / (sigma*sqrt(t))
    d2 = d1 - sigma*sqrt(t)
    Phi = lambda z: 0.5*(1 + erf(z/sqrt(2)))
    return s*Phi(d1) - k*exp(-r*t)*Phi(d2)
S = 100; r = 0.01
strikes = np.array([90, 95, 100, 105, 110])
tts = np.array([0.25, 0.25, 0.25, 0.25, 0.25])
market = np.array([11.5, 7.9, 5.8, 3.9, 2.5])
def resid(params):
    sigma = params[0]
    model = np.array([bs_call(S, k, t, r, sigma) for k, t in zip(strikes, tts)])
    return model - market
res = least_squares(resid, x0=[0.2], bounds=(1e-4, 2.0), loss='soft_l1')
res.x  # calibrated volatility
Hint: least_squares with a robust loss (Cauchy/Huber) reduces the influence of outlier quotes.