
\section{Local Optimality \label{sec:local-opt}}
Theorem~\ref{thm:main} establishes that HoLAL, as a first-order algorithm, achieves first-order stationarity for problem~\eqref{prob:01}, but it remains silent about local optimality. Indeed, as shown in~\cite{nouiehed2018convergence}, finding approximate second-order stationary points of convex-constrained problems is in general NP-hard. For this reason, we focus in this section on the special case of problem~\eqref{prob:01} with $g=0$.
\paragraph{\textbf{\emph{Special case.}}} As an important special case of problem~\eqref{prob:01}, if $f$ is strongly convex and the manifold $\{x: A(x)=0\}$ is sufficiently smooth, then any first-order stationary point of problem~\eqref{prob:01} is also a local minimum. Intuitively, this happens because, near such a point, the second-order terms of the Lagrangian are dominated by those of $f$. A concrete example is the factorized SDP in \eqref{prob:nc} when $C$ is positive definite.
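To make this concrete, the following minimal sketch (in Python with NumPy; the quadratic $f$, the unit-sphere constraint, and the dual value $y$ below are illustrative choices rather than the exact setting of \eqref{prob:nc}) numerically verifies that, for a strongly convex $f$ and a moderate dual variable, the Hessian of the Lagrangian stays positive definite:
\begin{verbatim}
import numpy as np

# Toy instance of the special case: f(x) = x' C x with C positive
# definite (hence f strongly convex) and one smooth constraint
# A(x) = ||x||^2 - 1.  C and the dual value y are illustrative.
rng = np.random.default_rng(0)
d = 10
M = rng.standard_normal((d, d))
C = M @ M.T + d * np.eye(d)      # positive definite by construction

y = 0.5                          # a dual variable of moderate size
hess_f = 2 * C                   # Hessian of f(x) = x' C x
hess_A = 2 * np.eye(d)           # Hessian of A(x) = ||x||^2 - 1
hess_L = hess_f + y * hess_A     # Hessian of the Lagrangian

# The second-order term y * hess_A is dominated by hess_f:
print(np.linalg.eigvalsh(hess_L).min() > 0)   # prints True
\end{verbatim}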
More formally, suppose that $f$ is strongly convex and that both $f$ and $A$ are twice differentiable. For a feasible pair $(x,y)$ in \eqref{prob:01}, recall from \eqref{eq:local-min} that
\begin{align}
\nabla^2_{xx} \L_{0}(x,y) & = \nabla^2 f(x) + \sum_{i=1}^m y_i \nabla^2 A_i(x) \nonumber\\
& \succcurlyeq \nabla^2 f(x) - \Big\| \sum_{i=1}^m y_i \nabla^2 A_i(x) \Big\| \cdot I
\qquad \text{(Weyl's inequality)} \nonumber\\
& \succcurlyeq \nabla^2 f(x) - \|y\|_1 \cdot \max_i \| \nabla^2 A_i(x) \| \cdot I
\qquad \text{(triangle inequality)} \nonumber\\
& \succcurlyeq \nabla^2 f(x) - \sqrt{m}\,\|y\|_2 \cdot \max_i \| \nabla^2 A_i(x) \| \cdot I
\qquad \text{($\|y\|_1 \le \sqrt{m}\|y\|_2$)},
\end{align}
where $I$ denotes the identity matrix. Therefore, if the last line above is positive definite, then $x$ is a local minimum of problem~\eqref{prob:01}. In particular, the proof of Theorem~\ref{thm:main} establishes that the output sequence $(x_k,y_k)$ of Algorithm~\ref{Algo:2} satisfies $\|y_k\|\le y_{\max}$, where $y_{\max}$ is specified in \eqref{eq:dual growth}. We therefore conclude that the output sequence of Algorithm~\ref{Algo:2} reaches second-order optimality provided that
\begin{align}
\min_{\|x\| \le \rho'} \eta_{\min}\big( \nabla^2 f(x) \big) > \sqrt{m}\, y_{\max} \cdot \max_i \max_{\|x\|\le \rho'} \| \nabla^2 A_i(x) \|,
\end{align}
namely, if $f$ is sufficiently strongly convex and $\{A_i\}_i$ are sufficiently smooth.
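As a sanity check (not part of the formal analysis), the norm bounds behind the derivation above are easy to verify numerically; in the sketch below, randomly generated symmetric matrices stand in for the Hessians $\nabla^2 A_i(x)$ at a fixed point $x$:
\begin{verbatim}
import numpy as np

# Check, on random data, the bounds used in the derivation:
#   ||sum_i y_i H_i|| <= ||y||_1 max_i ||H_i||
#                     <= sqrt(m) ||y||_2 max_i ||H_i||,
# where H_i stands in for the Hessian of A_i at a fixed x.
rng = np.random.default_rng(1)
d, m = 8, 5
Hs = [(B + B.T) / 2 for B in rng.standard_normal((m, d, d))]
y = rng.standard_normal(m)

def spec(S):                      # spectral norm
    return np.linalg.norm(S, 2)

lhs = spec(sum(yi * Hi for yi, Hi in zip(y, Hs)))
mid = np.linalg.norm(y, 1) * max(spec(H) for H in Hs)
rhs = np.sqrt(m) * np.linalg.norm(y) * max(spec(H) for H in Hs)
print(lhs <= mid <= rhs)          # prints True
\end{verbatim}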
\paragraph{\textbf{\emph{General case.}}} More generally, suppose that every saddle point of \eqref{prob:01} is strict, in the sense that the Hessian of the Lagrangian has a direction of strictly negative curvature there. Then it is plausible that the analysis of \cite{o2017behavior} extends to show that Algorithm~\ref{Algo:2} almost surely avoids converging to a saddle point; a rigorous verification of this claim is left for future work.
We also note in passing that, for a number of structured nonconvex problems, second-order local optimality implies global optimality; since such guarantees are not the focus of this paper, we do not pursue this direction further.
