Theorem~\ref{thm:main} establishes that HoLAL, as a first-order algorithm that uses no second-order information, achieves first-order stationarity for problem~\eqref{prob:01}, but it is silent about local optimality. As shown in~\cite{nouiehed2018convergence}, finding approximate second-order stationary points of convex-constrained problems is in general NP-hard. For this reason, we focus in this section on the special case of problem~\eqref{prob:01} with $g=0$.
\paragraph{\textbf{\emph{Special case.}}} As an important special case of problem~\eqref{prob:01}, if $f$ is strongly convex and the manifold $\{x: A(x)=0\}$ is smooth enough, then any first-order stationary point of problem~\eqref{prob:01} is also a local minimum. Intuitively, this happens because the second-order terms of the Lagrangian are locally dominated by those of $f$. A concrete example is the factorized SDP in \eqref{prob:nc}, when $C$ is positive definite.
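To make this intuition concrete, the following numerical sketch (a hypothetical toy instance, not the factorized SDP of \eqref{prob:nc}) minimizes the strongly convex $f(x)=\|x-b\|^2$ over the unit sphere $\{x:\|x\|^2=1\}$ and checks that the Hessian of the Lagrangian is positive definite at the first-order stationary point:

```python
import numpy as np

# Hypothetical toy instance: minimize f(x) = ||x - b||^2 subject to
# A(x) = ||x||^2 - 1 = 0 (the unit sphere), with b chosen arbitrarily.
b = np.array([2.0, 1.0, 2.0])  # ||b|| = 3

# First-order stationarity of the Lagrangian L(x, y) = f(x) + y * A(x):
#   2(x - b) + 2 y x = 0  =>  x = b / (1 + y),
# and feasibility ||x|| = 1 then forces y = ||b|| - 1.
y = np.linalg.norm(b) - 1.0
x = b / (1.0 + y)

# Hessian of the Lagrangian in x: grad^2 f + y * grad^2 A = 2 I + 2 y I.
# The second-order term 2 y I is dominated by the strong convexity of f
# whenever y > -1, i.e. whenever ||b|| > 0.
H = 2.0 * np.eye(3) + 2.0 * y * np.eye(3)

print(np.linalg.norm(x))              # feasibility check
print(np.min(np.linalg.eigvalsh(H)))  # positive => x is a local minimum
```

Here the smallest Hessian eigenvalue equals $2\|b\|>0$, so the stationary point is indeed a local minimum.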
More formally, suppose that $f$ is strongly convex with modulus $\nu_f$, namely, $\nabla^2 f \succeq \nu_f I$, and that both $f$ and $A$ are twice differentiable. For a feasible pair $(x,y)$ in \eqref{prob:01}, recall from \eqref{eq:local-min} that
\begin{align}
\nabla_{xx}^2 \mathcal{L}(x,y) & = \nabla^2 f(x) + \sum_{i=1}^m y_i \nabla^2 A_i(x) \nonumber\\
& \succeq \left( \nu_f - \|y\| \left( \sum_{i=1}^m \left\| \nabla^2 A_i(x) \right\|^2 \right)^{\frac{1}{2}} \right) I.
\end{align}
Therefore, if the last line above is positive definite, then $x$ is a local minimum of problem~\eqref{prob:01}. In particular, the proof of Theorem~\ref{thm:main} establishes that the output sequence $(x_k,y_k)$ of Algorithm~\ref{Algo:2} satisfies $\|y_k\|\le y_{\max}$, where $y_{\max}$ is specified in \eqref{eq:dual growth}. We therefore conclude that the output sequence of Algorithm~\ref{Algo:2} reaches second-order optimality if
\begin{align}
\nu_f > y_{\max} \cdot \sup_{x} \left( \sum_{i=1}^m \left\| \nabla^2 A_i(x) \right\|^2 \right)^{\frac{1}{2}},
\end{align}
namely, if $f$ is sufficiently strongly convex and the constraints $\{A_i\}_i$ are sufficiently smooth.
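As a numerical sanity check of this dominance argument, the following sketch (all values hypothetical; $\nu_f$ denotes a strong-convexity modulus of $f$, i.e., $\nabla^2 f \succeq \nu_f I$) draws random constraint Hessians and verifies the Cauchy--Schwarz lower bound $\lambda_{\min}(\nabla^2 f + \sum_i y_i \nabla^2 A_i) \ge \nu_f - \|y\|\,(\sum_i \|\nabla^2 A_i\|^2)^{1/2}$:

```python
import numpy as np

rng = np.random.default_rng(0)
d, m = 5, 3

nu_f = 10.0            # hypothetical strong-convexity modulus of f
Hf = nu_f * np.eye(d)  # simplest Hessian satisfying grad^2 f >= nu_f I

# Hypothetical symmetric constraint Hessians grad^2 A_i; the small scale
# plays the role of "sufficiently smooth" constraints.
HA = []
for _ in range(m):
    M = rng.standard_normal((d, d))
    HA.append(0.05 * (M + M.T))

y = rng.standard_normal(m)  # a dual vector (||y|| plays the role of y_max)

H_lag = Hf + sum(yi * Hi for yi, Hi in zip(y, HA))

# Cauchy-Schwarz lower bound on the smallest eigenvalue:
#   lambda_min(H_lag) >= nu_f - ||y|| * sqrt(sum_i ||grad^2 A_i||^2)
bound = nu_f - np.linalg.norm(y) * np.sqrt(
    sum(np.linalg.norm(Hi, 2) ** 2 for Hi in HA)
)
min_eig = np.min(np.linalg.eigvalsh(H_lag))

print(min_eig, bound)  # bound holds; here it is positive, so H_lag is PD
```

Because $\nu_f$ dominates the dual-weighted constraint curvature, the bound is positive and the Hessian of the Lagrangian is positive definite.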
\paragraph{\textbf{\emph{General case.}}} More generally, if every saddle point of \eqref{prob:01} is strict, then it is plausible that the analysis of \cite{o2017behavior} extends to show that Algorithm~\ref{Algo:2} almost surely does not converge to a saddle point; a rigorous verification of this extension is left for future work.
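Although we do not verify such an extension here, the mechanism behind strict-saddle avoidance can be illustrated on an unconstrained toy problem (hypothetical; plain gradient descent, not Algorithm~\ref{Algo:2}):

```python
import numpy as np

# Toy illustration of strict-saddle avoidance: gradient descent on
# f(u, v) = u^2 - v^2, whose only stationary point (0, 0) is a strict
# saddle (Hessian eigenvalues 2 and -2).
rng = np.random.default_rng(1)
eta = 0.1
z = rng.standard_normal(2) * 1e-3  # random initialization near the saddle

for _ in range(200):
    grad = np.array([2.0 * z[0], -2.0 * z[1]])
    z = z - eta * grad

# The component along the negative-curvature direction is multiplied by
# (1 + 2*eta) each step, so a random initialization escapes the saddle
# almost surely, while the positive-curvature component contracts to zero.
print(np.linalg.norm(z))
```

The iterates move away from the saddle geometrically fast along the negative-curvature direction, which is the intuition formalized in \cite{o2017behavior}.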
We also note in passing that, for many structured nonconvex problems, second-order local optimality in fact implies global optimality; a detailed treatment of this phenomenon is beyond the scope of this paper.