Optimizers
Critical to using the insights generated from subspace-based dimension reduction is the ability to use these low-dimensional representations for optimization.
Low-Level Optimization

Gauss-Newton
psdr.gauss_newton(f, F, x0, tol=1e-10, tol_normdx=1e-12, maxiter=100, linesearch=None, verbose=0, trajectory=None, gnsolver=None)

A Gauss-Newton solver for unconstrained nonlinear least squares problems.
Given a vector valued function \(\mathbf{f}:\mathbb{R}^m \to \mathbb{R}^M\) and its Jacobian \(\mathbf{F}:\mathbb{R}^m\to \mathbb{R}^{M\times m}\), solve the nonlinear least squares problem:
\[\min_{\mathbf{x}\in \mathbb{R}^m} \| \mathbf{f}(\mathbf{x})\|_2^2.\]

The standard Gauss-Newton method computes a search direction \(\mathbf{p}_k\in \mathbb{R}^m\) at each iterate by solving the least squares problem

\[\mathbf{p}_k \leftarrow -\mathbf{F}(\mathbf{x}_k)^+ \mathbf{f}(\mathbf{x}_k)\]

and then computes a new iterate via a line search for a step length \(\alpha\) satisfying the Armijo condition:

\[\mathbf{x}_{k+1} \leftarrow \mathbf{x}_k + \alpha \mathbf{p}_k.\]

This implementation offers several features that modify this basic outline.
First, the user can specify a nonlinear trajectory along which candidate points move; i.e.,

\[\mathbf{x}_{k+1} \leftarrow T(\mathbf{x}_k, \mathbf{p}_k, \alpha).\]

Second, the user can specify a custom solver for computing the search direction \(\mathbf{p}_k\).
Parameters:
- f (callable) – residual, \(\mathbf{f}: \mathbb{R}^m \to \mathbb{R}^M\)
- F (callable) – Jacobian of the residual \(\mathbf{f}\); \(\mathbf{F}: \mathbb{R}^m \to \mathbb{R}^{M \times m}\)
- x0 (np.array((m,))) – starting point for optimization
- tol (float [optional] default = 1e-10) – gradient norm stopping criterion
- tol_normdx (float [optional] default = 1e-12) – norm of control update stopping criterion
- maxiter (int [optional] default = 100) – maximum number of iterations of Gauss-Newton
- linesearch (callable [optional], returns new x) – called with the residual f (callable), the gradient g, the descent direction p, and the current iterate x0
- trajectory (callable [optional]) – function T(x, p, alpha) returning the next candidate point; defaults to the linear trajectory x + alpha*p
- gnsolver (callable [optional], returns search direction) – called with the current Jacobian F and the current residual f; returns the search step p and the singular values s of the Jacobian
- verbose (int [optional] default = 0) – if >= 1, print convergence history diagnostics
Returns:
- numpy.array((m,)) – x^*, the optimizer
- int – info flag: 0 if converged with the norm of the gradient below tol; 1 if the norm of the gradient did not converge but ||dx|| fell below tolerance; 2 if the maximum number of iterations was exceeded
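The iteration described above can be illustrated with a minimal NumPy-only sketch (this is not the psdr implementation, which additionally supports custom trajectories, line searches, and Gauss-Newton solvers):

```python
import numpy as np

def gauss_newton_sketch(f, F, x0, tol=1e-10, maxiter=100):
    """Minimize ||f(x)||^2 given residual f and its Jacobian F."""
    x = np.asarray(x0, dtype=float)
    for _ in range(maxiter):
        fx = f(x)
        Fx = F(x)
        # Gradient of 0.5 * ||f(x)||^2 is F(x)^T f(x); stop when it is small
        g = Fx.T @ fx
        if np.linalg.norm(g) < tol:
            break
        # Search direction p = -F(x)^+ f(x) via a least squares solve
        p = -np.linalg.lstsq(Fx, fx, rcond=None)[0]
        # Backtracking line search for sufficient (Armijo-like) decrease
        # along the linear trajectory x + alpha * p
        alpha = 1.0
        while (np.linalg.norm(f(x + alpha * p)) ** 2
               > np.linalg.norm(fx) ** 2 + 1e-4 * alpha * (g @ p)
               and alpha > 1e-12):
            alpha *= 0.5
        x = x + alpha * p
    return x

# Example: residual whose squared norm is the Rosenbrock function,
# minimized at (1, 1)
f = lambda x: np.array([10 * (x[1] - x[0] ** 2), 1 - x[0]])
F = lambda x: np.array([[-20 * x[0], 10.0], [-1.0, 0.0]])
x_star = gauss_newton_sketch(f, F, np.array([-1.2, 1.0]))
```

Because the residual here has as many components as unknowns, the Gauss-Newton step coincides with a damped Newton step for f(x) = 0.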
Line Search
psdr.linesearch_armijo(f, g, p, x0, bt_factor=0.5, ftol=0.0001, maxiter=40, trajectory=<function trajectory_linear>, fx0=None)

Back-tracking line search to satisfy the Armijo condition

f(x0 + alpha*p) < f(x0) + alpha * ftol * <g, p>

Parameters:
- f (callable) – objective function, f: R^n -> R
- g (np.array((n,))) – gradient
- p (np.array((n,))) – descent direction
- x0 (np.array((n,))) – current location
- bt_factor (float [optional] default = 0.5) – backtracking factor
- ftol (float [optional] default = 1e-4) – coefficient in (0,1); see Armijo description in Nocedal & Wright
- maxiter (int [optional] default = 40) – maximum number of backtracking iterations
- trajectory (function(x0, p, alpha) [optional]) – function that returns the next iterate; defaults to the linear trajectory x0 + alpha*p
- fx0 (float [optional]) – precomputed value of f(x0), to avoid an extra function evaluation
Returns: alpha – backtracking coefficient (alpha = 1 implies no backtracking)
Return type: float
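The backtracking loop is simple enough to sketch directly (a NumPy-only illustration of the condition above, not the psdr implementation):

```python
import numpy as np

def armijo_backtrack(f, g, p, x0, bt_factor=0.5, ftol=1e-4, maxiter=40,
                     trajectory=lambda x0, p, alpha: x0 + alpha * p):
    """Shrink alpha until f(T(x0, p, alpha)) < f(x0) + alpha*ftol*<g, p>."""
    fx0 = f(x0)
    gp = g @ p  # directional derivative; negative for a descent direction
    alpha = 1.0
    for _ in range(maxiter):
        if f(trajectory(x0, p, alpha)) < fx0 + alpha * ftol * gp:
            return alpha
        alpha *= bt_factor
    return alpha

# Steepest descent on f(x) = ||x||^2 from (2, 0): the full step
# overshoots the minimum, so one halving is needed
f = lambda x: float(x @ x)
x0 = np.array([2.0, 0.0])
grad = 2 * x0
alpha = armijo_backtrack(f, grad, -grad, x0)  # alpha = 0.5
```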
Minimax
psdr.minimax(f, x0, domain=None, trajectory=<function trajectory_linear>, c_armijo=0.01, maxiter=100, trust_region=True, search_constraints=None, tol_dx=1e-10, tol_df=1e-10, verbose=False, bt_maxiter=30, **kwargs)

Solves a minimax optimization problem via sequential linearization and line search.
Given a vector-valued function \(f: \mathcal{D}\subset\mathbb{R}^m\to \mathbb{R}^q\), solve the minimax optimization problem:
\[\min_{\mathbf{x}\in \mathcal{D}} \max_{i=1,\ldots, q} f_i(\mathbf{x}).\]

This implements a minor modification of the algorithm described by Osborne and Watson [OW69]. The basic premise is to introduce a slack variable \(t_k\) representing the maximum value of \(f_i(\mathbf{x}_k)\) at the current iterate \(\mathbf{x}_k\). Linearizing the functions \(f_i\) about this point yields a linear program for the search direction:

(1)
\[\begin{split}\min_{\mathbf{p}_\mathbf{x} \in \mathbb{R}^m, p_t} & \ p_t \\ \text{such that} & \ t_k + p_t \ge f_i(\mathbf{x}_k) + \mathbf{p}_\mathbf{x}^\top \nabla f_i(\mathbf{x}_k) \quad \forall i=1,\ldots,q\end{split}\]

This yields a search direction \(\mathbf{p}_\mathbf{x}\), along which we perform a line search to find a step length \(\alpha\) satisfying the constraint:

(2)
\[\min_{\alpha \ge 0} \max_{i=1,\ldots,q} f_i(T(\mathbf{x}_k, \mathbf{p}_\mathbf{x}, \alpha)) \le t_k + c\alpha p_t.\]

Here \(T\) denotes the trajectory, which defaults to a linear trajectory

\[T(\mathbf{x}, \mathbf{p}, \alpha) = \mathbf{x} + \alpha\mathbf{p}\]

but, if provided, can be more sophisticated. The substantial difference from [OW69] is that an inexact backtracking line search is used to find an \(\alpha\) satisfying the Armijo-like condition (2); as originally proposed, Osborne and Watson use an exact line search.
Parameters: - f (BaseFunction-like) – provides a callable interface f(x) that evaluates the functions at x and access to the gradient via f.grad(x)
- x0 (array-like (m,)) – Starting point for optimization
- domain (Domain, optional) – Add constraints to each step enforcing that steps remain in the domain (assumes a linear trajectory for steps)
- trajectory (callable, optional) – Function taking current location x, direction p, and step length alpha and returning a new point x.
- c_armijo (float, optional) – Coefficient used in the Armijo backtracking linesearch
- maxiter (int, optional) – Maximum number of iterations
- trust_region (bool, optional) – If true, enforce a spherical trust region on each step
- search_constraints (callable) – Function taking the current iterate x and cvxpy.Variable representing the search direction and returning a list of cvxpy constraints.
- tol_dx (float) – convergence tolerance in terms of movement of x
- tol_df (float) – convergence tolerance in terms of the maximum
- verbose (bool) – If true, display convergence history
- bt_maxiter (int) – maximum number of backtracking line search steps to take
- **kwargs (dict) – Additional parameters to pass to cvxpy.Problem.solve()
Returns: x – Solution to minimax problem.
Return type: np.array (m,)
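To make the linearized subproblem (1) concrete, here is one iteration of it for the toy problem min_x max(x, -x), whose solution is x = 0. This sketch uses scipy.optimize.linprog rather than cvxpy (which the psdr implementation uses), and imposes the trust-region bound |p_x| <= 1 so the linear program is bounded:

```python
import numpy as np
from scipy.optimize import linprog

x_k = 1.0                            # current iterate
fs = np.array([x_k, -x_k])           # f_1(x) = x, f_2(x) = -x at x_k
grads = np.array([1.0, -1.0])        # their gradients
t_k = fs.max()                       # slack variable value, t_k = 1

# Variables z = [p_x, p_t]; minimize p_t subject to
#   f_i(x_k) + p_x * grad_i <= t_k + p_t  for each i,
# with the trust region |p_x| <= 1 and p_t unbounded
c = np.array([0.0, 1.0])
A_ub = np.column_stack([grads, -np.ones(2)])
b_ub = t_k - fs
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(-1, 1), (None, None)])

p_x, p_t = res.x  # optimal step p_x = -1 with predicted decrease p_t = -1
```

Taking the full step along the linear trajectory gives x = x_k + p_x = 0, and max_i f_i(0) = 0 satisfies condition (2) since 0 <= t_k + c_armijo * p_t = 0.99, so this step would be accepted immediately.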
References
[OW69] Osborne and Watson, “An algorithm for minimax approximation in the nonlinear case”, The Computer Journal, 12(1) 1969 pp. 63–68 https://doi.org/10.1093/comjnl/12.1.63