
\section{Local Optimality \label{sec:local-opt}}
Theorem~\ref{thm:main} establishes that HoLAL, as a first-order algorithm, achieves first-order stationarity for problem~\eqref{prob:01}, but it remains silent about local optimality. Indeed, as shown in~\cite{nouiehed2018convergence}, finding approximate second-order stationary points of convex-constrained problems is in general NP-hard. For this reason, we focus in this section on the special case of problem~\eqref{prob:01} with $g=0$.
\paragraph{\textbf{\emph{Special case.}}} As an important special case of problem~\eqref{prob:01}, if $f$ is strongly convex and the manifold $\{x: A(x)=0\}$ is sufficiently smooth, then any first-order stationary point of problem~\eqref{prob:01} is also a local minimum. Intuitively, this happens because, near such a point, the second-order terms of the Lagrangian are dominated by those of $f$. A concrete example is the factorized SDP in \eqref{prob:nc} when $C$ is positive definite.
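To make this concrete, the following minimal sketch (in Python with NumPy; the quadratic $f$, the unit-sphere constraint, and the dual value $y$ below are illustrative choices rather than the exact setting of \eqref{prob:nc}) numerically verifies that, for a strongly convex $f$ and a moderate dual variable, the Hessian of the Lagrangian stays positive definite:
\begin{verbatim}
import numpy as np

# Toy instance of the special case: f(x) = x' C x with C positive
# definite (hence f strongly convex) and one smooth constraint
# A(x) = ||x||^2 - 1.  C and the dual value y are illustrative.
rng = np.random.default_rng(0)
d = 10
M = rng.standard_normal((d, d))
C = M @ M.T + d * np.eye(d)      # positive definite by construction

y = 0.5                          # a dual variable of moderate size
hess_f = 2 * C                   # Hessian of f(x) = x' C x
hess_A = 2 * np.eye(d)           # Hessian of A(x) = ||x||^2 - 1
hess_L = hess_f + y * hess_A     # Hessian of the Lagrangian

# The second-order term y * hess_A is dominated by hess_f:
print(np.linalg.eigvalsh(hess_L).min() > 0)   # prints True
\end{verbatim}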
More formally, suppose that $f$ is strongly convex and that both $f$ and $A$ are twice differentiable. For a feasible pair $(x,y)$ in \eqref{prob:01}, recall from \eqref{eq:local-min} that
\begin{align}
\nabla^2_{xx} \L_{0}(x,y) & = \nabla^2 f(x) + \sum_{i=1}^m y_i \nabla^2 A_i(x) \nonumber\\
& \succcurlyeq \nabla^2 f(x) - \Big\| \sum_{i=1}^m y_i \nabla^2 A_i(x) \Big\| \cdot I
\qquad \text{(Weyl's inequality)} \nonumber\\
& \succcurlyeq \nabla^2 f(x) - \|y\|_1 \cdot \max_i \| \nabla^2 A_i(x) \| \cdot I
\qquad \text{(triangle inequality)} \nonumber\\
& \succcurlyeq \nabla^2 f(x) - \sqrt{m}\,\|y\|_2 \cdot \max_i \| \nabla^2 A_i(x) \| \cdot I
\qquad \text{($\|y\|_1 \le \sqrt{m}\|y\|_2$)},
\end{align}
where $I$ denotes the identity matrix. Therefore, if the last line above is positive definite, then $x$ is a local minimum of problem~\eqref{prob:01}. In particular, the proof of Theorem~\ref{thm:main} establishes that the output sequence $(x_k,y_k)$ of Algorithm~\ref{Algo:2} satisfies $\|y_k\|\le y_{\max}$, where $y_{\max}$ is specified in \eqref{eq:dual growth}. We therefore conclude that the output sequence of Algorithm~\ref{Algo:2} reaches second-order optimality provided that
\begin{align}
\min_{\|x\| \le \rho'} \eta_{\min}\big( \nabla^2 f(x) \big) > \sqrt{m}\, y_{\max} \cdot \max_i \max_{\|x\|\le \rho'} \| \nabla^2 A_i(x) \|,
\end{align}
namely, if $f$ is sufficiently strongly convex and $\{A_i\}_i$ are sufficiently smooth.
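As a sanity check (not part of the formal analysis), the norm bounds behind the derivation above are easy to verify numerically; in the sketch below, randomly generated symmetric matrices stand in for the Hessians $\nabla^2 A_i(x)$ at a fixed point $x$:
\begin{verbatim}
import numpy as np

# Check, on random data, the bounds used in the derivation:
#   ||sum_i y_i H_i|| <= ||y||_1 max_i ||H_i||
#                     <= sqrt(m) ||y||_2 max_i ||H_i||,
# where H_i stands in for the Hessian of A_i at a fixed x.
rng = np.random.default_rng(1)
d, m = 8, 5
Hs = [(B + B.T) / 2 for B in rng.standard_normal((m, d, d))]
y = rng.standard_normal(m)

def spec(S):                      # spectral norm
    return np.linalg.norm(S, 2)

lhs = spec(sum(yi * Hi for yi, Hi in zip(y, Hs)))
mid = np.linalg.norm(y, 1) * max(spec(H) for H in Hs)
rhs = np.sqrt(m) * np.linalg.norm(y) * max(spec(H) for H in Hs)
print(lhs <= mid <= rhs)          # prints True
\end{verbatim}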
\paragraph{\textbf{\emph{General case.}}} More generally, suppose that every saddle point of \eqref{prob:01} is strict, in the sense that the Hessian of the Lagrangian has a direction of strictly negative curvature there. Then it is plausible that the analysis of \cite{o2017behavior} extends to show that Algorithm~\ref{Algo:2} almost surely avoids converging to a saddle point; a rigorous verification of this claim is left for future work.
We also note in passing that, for a number of structured nonconvex problems, second-order local optimality implies global optimality; since such guarantees are not the focus of this paper, we do not pursue this direction further.
