Page MenuHomec4science

slater.tex
No OneTemporary

File Metadata

Created
Mon, Nov 4, 12:48

slater.tex

\section{Nonconvex Slater's Condition \label{sec:slater}}
The (convex) Slater's condition (CSC) plays a key role in convex optimization as a sufficient condition for strong duality. As a result, CSC guarantees the success of a variety of primal-dual algorithms for convex and constrained programming. As a visual example, in Program~\eqref{prob:01}, when $f=0$, $g=1_C$ is the indicator function of a convex set $C$, and $A$ is an affine operator, CSC removes any pathological cases by ensuring that the affine subspace is not tangent to $C$, see Figure~\ref{fig:convex_slater}. \textbf{We should add a figure here with circle and line.}
Likewise, to successfully solve Program~\eqref{prob:01} with nonlinear constraints, we require the following condition which, loosely speaking, extends CSC to the nonconvex setting, as clarified shortly afterwards.
\begin{definition}\label{defn:nonconvex slater} \textbf{\emph{(Nonconvex Slater's condition)}}
For $\rho,\rho'>0$ and subspace $S\subset \RR^{d}$, let
\begin{align}
\nu(g,A,S) :=
\begin{cases}
\underset{v,x}{\min} \, \frac{\left\| \left( \operatorname{id} - P_S P_{\partial g(x)/\b} \right) ( DA(x)^\top v) \right\|}{\|v\|} \\
\|v\|\le \rho\\
\|x\|\le \rho',
\end{cases}
\label{eq:new slater defn}
\end{align}
where $\operatorname{id}$ is the identity operator, $P_{\partial g(x)/\b}$ projects onto the set $\partial g(x)/\b$, and $DA(x)$ is the Jacobian of $A$. We say that Program (\ref{prob:01}) satisfies the nonconvex Slater's condition if $\nu(g,A,S) >0$.
\end{definition}
A few remarks about the nonconvex Slater's condition (NSC) is in order.
\paragraph{Jacobian $DA$.}
As we will see later, $DA(x)^\top \overset{\operatorname{QR}}{=} Q(x) R(x)$ in NSC might be replaced with its orthonormal basis, namely, $Q(x)$. For simplicity, we will avoid this minor change and instead, whenever needed, assume that $DA(x)$ is nonsingular; otherwise a simple projection can remove any redundancy from $A(x)=0$ in Program~\eqref{prob:01}.
\paragraph{Subspace $S$.} The choice of subspace $S$ helps broaden the generality of NSC. In particular, when $S = \RR^{d}$, \eqref{eq:new slater defn} takes the simpler form of
\begin{align}
\nu(g,A,S) :=
\begin{cases}
\underset{v,x}{\min} \, \frac{ \dist( DA(x)^\top v , \partial g(x)/\b) }{\|v\|} \\
\|v\|\le \rho\\
\|x\|\le \rho',
\end{cases}
\label{eq:defn_nu_A}
\end{align}
where $\dist(\cdot,\partial g(x)/\b)$ returns the Euclidean distance to the set $\partial g(x)/\b$.
\paragraph{Convex case.}
To better parse Definition \ref{defn:nonconvex slater}, let us assume that $f:\RR^d\rightarrow\RR$ is convex, $g = 1_C:\RR^d\rightarrow\RR$ is the indicator function for a convex and bounded set $C$, and $A$ is a nonsingular linear operator represented with the full-rank matrix $A\in \RR^{m\times d}$.
%We also let $T_C(x)$ denote the tangent cone to $C$ at $x$, and reserve $P_{T_C(x)}:\RR^d\rightarrow\RR^d$ for the orthogonal projection onto this cone.
%We also set $S_K = \RR^d$ to simplify the discussion.
We can now study the geometric interpretation of $\nu()$.
Assuming that $S=\RR^d$ and using the well-known Moreau decomposition, it is not difficult to rewrite \eqref{eq:new slater defn} as
\begin{align}
\nu(g,A,S) & :=
\begin{cases}
\underset{v,x}{\min} \, \, \frac{\left\| P_{T_C(x)} A^\top v \right\|}{\|v\|} \\
\|v\|\le \rho\\
\|x\|\le \rho'
\end{cases} \nonumber\\
& = \begin{cases}
\underset{x}{\min} \,\, \eta_{m}\left( P_{T_C(x)} A^\top \right) \\
\|x\|\le \rho',
\end{cases}
\label{eq:nu for cvx}
\end{align}
where $P_{T_C(x)}$ is the projection onto the tangent cone of $C$ at $x$, and $\eta_{m}(\cdot)$ returns the $m$th largest singular value of its input matrix. Intuitively then, NSC ensures that the the row span of $A$ is not tangent to $C$, similar to CSC. This close relationship between NSC and CSC is formalized next and proved in Appendix~\ref{sec:proof of prop}.
\begin{proposition}\label{prop:convex}
In Program (\ref{prob:01}), suppose that
\begin{itemize}
\item $f:\RR^d\rightarrow\RR$ is convex,
\item $g=1_C$ is the indicator on a convex and bounded set $C\subset\RR^d$,
\item $A:\RR^d\rightarrow\RR^m$ is a nonsingular linear operator, represented with the full-rank matrix $A\in \RR^{m\times d}$.\footnote{As mentioned earlier, it is easy to remove the full-rank assumption by replacing $DA(x)$ in \eqref{eq:new slater defn} with its orthonormal basis. We assume $A$ to be full-rank for clarity at the cost of a simple projection to remove any ``redundant measurements'' from Program~\eqref{prob:01}.}
\end{itemize}
Assume also that Program (\ref{prob:01}) is feasible, namely, there exists $x\in C$ such that $Ax=0$. Then CSC holds if NSC holds.
Moreover, suppose that $S$ is the affine hull of $C$. Then CSC holds if and only if NSC holds.
\end{proposition}
\paragraph{Boundedness of $C$.}
Let us add that the boundedness of $C$ in Proposition~\ref{prop:convex} is necessary. For example, suppose that $A\in \RR^{1\times d}$ is the vector of all ones and, for a small perturbation vector $\epsilon\in \RR^d$, let $C=\{x\in \RR^d: (A+\epsilon) x\ge 0\}$ be a half space. Then CSC holds but $\nu(g,A,S)$ (with $S=\RR^d$) can be made arbitrarily small by making $\epsilon$ small.
\section{Proof of Proposition \ref{prop:convex} \label{sec:proof of prop}}
To prove the first claim of the proposition, suppose that CSC does not hold, namely, that
\begin{equation}
\relint(\Null(A) \cap C) = \Null(A)\cap \relint(C) = \emptyset,
\label{eq:no feas}
\end{equation}
where $\Null(A)$ and $\relint(C)$ denote the null space of the matrix $A$ and the relative interior of $C$, respectively.
By assumption, there exists $x_0\in C$ such that $Ax_0=0$. It follows from \eqref{eq:no feas} that $x_0\in \text{boundary}(C)$ and that $\Null(A)$ supports $C$ at $x_0$, namely,
$A x\ge 0$, for every $x\in C$. (The inequality applies to each entry of the vector $Ax$.)
Consequently, $\Null(A) \cap T_C(x_0) \ne \{0\}$, where $T_C(x_0)$ is the tangent cone of the set $C$ at $x_0$.
Equivalently, it holds that
$\row(A)\cap N_C(x_0) \ne \{0\}$, where $\row(A)$ is the row space of the matrix $A$. That is, there exists a unit-norm vector $v$ such that
$P_{T_C(x_0)}A^\top v=0$
and, consequently, $P_S P_{T_C(x_0)}A^\top v=0$. Let us take $\rho'=\|x_0\|$ in \eqref{eq:nu for cvx}. We then conclude that $\nu(g,A,S)=0$, namely, NSC also does not hold, which proves the first claim in Proposition \ref{prop:convex}.
For the converse, suppose that NSC does not hold with $\rho'=0$, namely, there exists $x\in \RR^d$ such that
\begin{align*}
\eta_m(P_S P_{T_C(x)} A^\top) & = \eta_m( P_{T_C(x)} A^\top) =0,
\end{align*}
\begin{align*}
\quad A(x) = 0, \quad x\in C,
\end{align*}
where the first equality above holds because $S$ is the affine hull of $C$.
Then, thanks to the boundedness of $C$, it must be the case that $x\in \text{boundary}(C)$. Indeed, if $x\in \text{relint}(C)$, we have that $T_C(x)=S$ and thus
\begin{equation*}
\eta_m(P_{S}P_{T_C(x)} A^\top)= \eta_m( P_S A^\top)>0,
\end{equation*}
where the inequality holds because, by assumption, $A$ is full-rank.
Since $x\in\text{boundary}(C)$, it follows that $\text{dim}(T_C(x)) \ge m-1$. That is, $\text{row}(A)$ contains a unique direction orthogonal to $T_C(x)$. In particular, it follows that $\text{null}(A) \subset T_C(x)$ and, consequently, $\text{int}(\text{null}(A)\cap C) =\emptyset $, namely, CSC does not hold. This proves the second (and last) claim in Proposition \ref{prop:convex}.

Event Timeline