Optimizers
Critical to using the insights generated from subspace-based dimension reduction is the ability to use these low-dimensional representations for optimization.
Low-Level Optimization

Gauss-Newton
psdr.gauss_newton(f, F, x0, tol=1e-10, tol_normdx=1e-12, maxiter=100, linesearch=None, verbose=0, trajectory=None, gnsolver=None)

A Gauss-Newton solver for unconstrained nonlinear least squares problems.
Given a vector valued function \(\mathbf{f}:\mathbb{R}^m \to \mathbb{R}^M\) and its Jacobian \(\mathbf{F}:\mathbb{R}^m\to \mathbb{R}^{M\times m}\), solve the nonlinear least squares problem:
\[\min_{\mathbf{x}\in \mathbb{R}^m} \| \mathbf{f}(\mathbf{x})\|_2^2.\]

The standard Gauss-Newton method computes a search direction \(\mathbf{p}_k\in \mathbb{R}^m\) at each iterate by solving the least squares problem

\[\mathbf{p}_k \leftarrow -\mathbf{F}(\mathbf{x}_k)^+ \mathbf{f}(\mathbf{x}_k)\]

and then computes a new iterate via a line search for a step length \(\alpha\) satisfying the Armijo condition:

\[\mathbf{x}_{k+1} \leftarrow \mathbf{x}_k + \alpha \mathbf{p}_k.\]

This implementation offers several features that modify this basic outline.
First, the user can specify a nonlinear trajectory along which candidate points move; i.e.,

\[\mathbf{x}_{k+1} \leftarrow T(\mathbf{x}_k, \mathbf{p}_k, \alpha).\]

Second, the user can specify a custom solver for computing the search direction \(\mathbf{p}_k\).
Parameters:
- f (callable) – residual, \(\mathbf{f}: \mathbb{R}^m \to \mathbb{R}^M\)
- F (callable) – Jacobian of the residual \(\mathbf{f}\); \(\mathbf{F}: \mathbb{R}^m \to \mathbb{R}^{M \times m}\)
- x0 (np.array((m,))) – starting point for optimization
- tol (float [optional] default = 1e-10) – gradient norm stopping criterion
- tol_normdx (float [optional] default = 1e-12) – norm of control update stopping criterion
- maxiter (int [optional] default = 100) – maximum number of iterations of Gauss-Newton
- linesearch (callable [optional], returns new x) – called with the residual f (callable), the gradient g, the descent direction p, and the current iterate x0
- trajectory (callable [optional]) – function T(x, p, alpha) returning the next candidate point; defaults to the linear trajectory x + alpha*p
- gnsolver (callable [optional], returns search direction) – called with the current Jacobian F and the current residual f; returns the search step p and the singular values s of the Jacobian
- verbose (int [optional] default = 0) – if >= 1, print convergence history diagnostics
Returns:
- numpy.array((m,)) – x^*, the optimizer
- int – info flag: 0 if converged with the norm of the gradient below tol; 1 if the norm of the gradient did not converge but ||dx|| fell below tolerance; 2 if the maximum number of iterations was exceeded
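The iteration described above can be illustrated with a minimal NumPy-only sketch (this is not the psdr implementation, which additionally supports custom trajectories, line searches, and Gauss-Newton solvers):

```python
import numpy as np

def gauss_newton_sketch(f, F, x0, tol=1e-10, maxiter=100):
    """Minimize ||f(x)||^2 given residual f and its Jacobian F."""
    x = np.asarray(x0, dtype=float)
    for _ in range(maxiter):
        fx = f(x)
        Fx = F(x)
        # Gradient of 0.5 * ||f(x)||^2 is F(x)^T f(x); stop when it is small
        g = Fx.T @ fx
        if np.linalg.norm(g) < tol:
            break
        # Search direction p = -F(x)^+ f(x) via a least squares solve
        p = -np.linalg.lstsq(Fx, fx, rcond=None)[0]
        # Backtracking line search for sufficient (Armijo-like) decrease
        # along the linear trajectory x + alpha * p
        alpha = 1.0
        while (np.linalg.norm(f(x + alpha * p)) ** 2
               > np.linalg.norm(fx) ** 2 + 1e-4 * alpha * (g @ p)
               and alpha > 1e-12):
            alpha *= 0.5
        x = x + alpha * p
    return x

# Example: residual whose squared norm is the Rosenbrock function,
# minimized at (1, 1)
f = lambda x: np.array([10 * (x[1] - x[0] ** 2), 1 - x[0]])
F = lambda x: np.array([[-20 * x[0], 10.0], [-1.0, 0.0]])
x_star = gauss_newton_sketch(f, F, np.array([-1.2, 1.0]))
```

Because the residual here has as many components as unknowns, the Gauss-Newton step coincides with a damped Newton step for f(x) = 0.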
Line Search
psdr.linesearch_armijo(f, g, p, x0, bt_factor=0.5, ftol=0.0001, maxiter=40, trajectory=<function trajectory_linear>, fx0=None)

Back-tracking line search to satisfy the Armijo condition

f(x0 + alpha*p) < f(x0) + alpha * ftol * <g, p>

Parameters:
- f (callable) – objective function, f: R^n -> R
- g (np.array((n,))) – gradient
- p (np.array((n,))) – descent direction
- x0 (np.array((n,))) – current location
- bt_factor (float [optional] default = 0.5) – backtracking factor
- ftol (float [optional] default = 1e-4) – coefficient in (0,1); see Armijo description in Nocedal & Wright
- maxiter (int [optional] default = 40) – maximum number of backtracking iterations
- trajectory (function(x0, p, alpha) [optional]) – function that returns the next iterate; defaults to the linear trajectory x0 + alpha*p
- fx0 (float [optional]) – precomputed value of f(x0), to avoid an extra function evaluation
Returns: alpha – backtracking coefficient (alpha = 1 implies no backtracking)
Return type: float
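The backtracking loop is simple enough to sketch directly (a NumPy-only illustration of the condition above, not the psdr implementation):

```python
import numpy as np

def armijo_backtrack(f, g, p, x0, bt_factor=0.5, ftol=1e-4, maxiter=40,
                     trajectory=lambda x0, p, alpha: x0 + alpha * p):
    """Shrink alpha until f(T(x0, p, alpha)) < f(x0) + alpha*ftol*<g, p>."""
    fx0 = f(x0)
    gp = g @ p  # directional derivative; negative for a descent direction
    alpha = 1.0
    for _ in range(maxiter):
        if f(trajectory(x0, p, alpha)) < fx0 + alpha * ftol * gp:
            return alpha
        alpha *= bt_factor
    return alpha

# Steepest descent on f(x) = ||x||^2 from (2, 0): the full step
# overshoots the minimum, so one halving is needed
f = lambda x: float(x @ x)
x0 = np.array([2.0, 0.0])
grad = 2 * x0
alpha = armijo_backtrack(f, grad, -grad, x0)  # alpha = 0.5
```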
Minimax
psdr.minimax(f, x0, domain=None, trajectory=<function trajectory_linear>, c_armijo=0.01, maxiter=100, trust_region=True, search_constraints=None, tol_dx=1e-10, tol_df=1e-10, verbose=False, bt_maxiter=30, **kwargs)

Solves a minimax optimization problem via sequential linearization and line search.
Given a vector-valued function \(f: \mathcal{D}\subset\mathbb{R}^m\to \mathbb{R}^q\), solve the minimax optimization problem:
\[\min_{\mathbf{x}\in \mathcal{D}} \max_{i=1,\ldots, q} f_i(\mathbf{x}).\]

This implements a minor modification of the algorithm described by Osborne and Watson [OW69]. The basic premise is to introduce a slack variable \(t_k\) representing the maximum value of \(f_i(\mathbf{x}_k)\) at the current iterate \(\mathbf{x}_k\). Linearizing the functions \(f_i\) about this point yields a linear program for the search direction:

(1)
\[\begin{split}\min_{\mathbf{p}_\mathbf{x} \in \mathbb{R}^m, p_t} & \ p_t \\ \text{such that} & \ t_k + p_t \ge f_i(\mathbf{x}_k) + \mathbf{p}_\mathbf{x}^\top \nabla f_i(\mathbf{x}_k) \quad \forall i=1,\ldots,q\end{split}\]

This yields a search direction \(\mathbf{p}_\mathbf{x}\), along which we perform a line search to find a step length \(\alpha\) satisfying the constraint:

(2)
\[\min_{\alpha \ge 0} \max_{i=1,\ldots,q} f_i(T(\mathbf{x}_k, \mathbf{p}_\mathbf{x}, \alpha)) \le t_k + c\alpha p_t.\]

Here \(T\) denotes the trajectory, which defaults to a linear trajectory

\[T(\mathbf{x}, \mathbf{p}, \alpha) = \mathbf{x} + \alpha\mathbf{p}\]

but, if provided, can be more sophisticated. The substantial difference from [OW69] is that an inexact backtracking line search is used to find an \(\alpha\) satisfying the Armijo-like condition (2); as originally proposed, Osborne and Watson use an exact line search.
Parameters: - f (BaseFunction-like) – provides a callable interface f(x) that evaluates the functions at x and access to the gradient via f.grad(x)
- x0 (array-like (m,)) – Starting point for optimization
- domain (Domain, optional) – Add constraints to each step enforcing that steps remain in the domain (assumes a linear trajectory for steps)
- trajectory (callable, optional) – Function taking current location x, direction p, and step length alpha and returning a new point x.
- c_armijo (float, optional) – Coefficient used in the Armijo backtracking linesearch
- maxiter (int, optional) – Maximum number of iterations
- trust_region (bool, optional) – If true, enforce a spherical trust region on each step
- search_constraints (callable) – Function taking the current iterate x and cvxpy.Variable representing the search direction and returning a list of cvxpy constraints.
- tol_dx (float) – convergence tolerance in terms of movement of x
- tol_df (float) – convergence tolerance in terms of the maximum
- verbose (bool) – If true, display convergence history
- bt_maxiter (int) – maximum number of backtracking line search steps to take
- **kwargs (dict) – Additional parameters to pass to cvxpy.Problem.solve()
Returns: x – Solution to minimax problem.
Return type: np.array (m,)
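To make the linearized subproblem (1) concrete, here is one iteration of it for the toy problem min_x max(x, -x), whose solution is x = 0. This sketch uses scipy.optimize.linprog rather than cvxpy (which the psdr implementation uses), and imposes the trust-region bound |p_x| <= 1 so the linear program is bounded:

```python
import numpy as np
from scipy.optimize import linprog

x_k = 1.0                            # current iterate
fs = np.array([x_k, -x_k])           # f_1(x) = x, f_2(x) = -x at x_k
grads = np.array([1.0, -1.0])        # their gradients
t_k = fs.max()                       # slack variable value, t_k = 1

# Variables z = [p_x, p_t]; minimize p_t subject to
#   f_i(x_k) + p_x * grad_i <= t_k + p_t  for each i,
# with the trust region |p_x| <= 1 and p_t unbounded
c = np.array([0.0, 1.0])
A_ub = np.column_stack([grads, -np.ones(2)])
b_ub = t_k - fs
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(-1, 1), (None, None)])

p_x, p_t = res.x  # optimal step p_x = -1 with predicted decrease p_t = -1
```

Taking the full step along the linear trajectory gives x = x_k + p_x = 0, and max_i f_i(0) = 0 satisfies condition (2) since 0 <= t_k + c_armijo * p_t = 0.99, so this step would be accepted immediately.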
References
[OW69] Osborne and Watson, “An algorithm for minimax approximation in the nonlinear case”, The Computer Journal, 12(1) 1969 pp. 63–68 https://doi.org/10.1093/comjnl/12.1.63