\section{Spectral Representation of Random Signals}
Obtaining a meaningful frequency-domain representation of a stochastic process is perhaps a bit less straightforward than expected. From a practical point of view, given a finite set of samples from a random process, we can obviously always compute a DFT; the question, however, is whether the result is indicative at all of the \textit{general} spectral properties of the underlying process. Let's illustrate the problem with a simple example using the coin-tossing machine described in~(\ref{eq:rn:rademacher}).
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{figure}[t]
\center\small
\begin{tabular}{cc}
$\breve{x}_i[n]$ & $|\breve{X}_i[k]|^2$ \\
\begin{dspPlot}[width=0.5\dspWidth,height=0.5\dspHeight,xout=true]{0, 32}{-1.3, 1.3}
\dspTapsAt[linecolor=ColorDT]{0}{1 -1 -1 -1 -1 1 1 1 -1 1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 1 1 -1 -1 -1 1 -1 -1 1 -1 -1}
\end{dspPlot}
&
\begin{dspPlot}[width=0.5\dspWidth,height=0.5\dspHeight,yticks=50]{0, 32}{0, 150}
\dspTapsAt[linecolor=ColorDF]{0}{100.0000 22.9177 65.9209 24.1467 20.0000 2.2848 65.1071 0.5586 52.0000 8.1548 61.5203 39.4376 20.0000 59.3857 15.4517 3.1141 4.0000 3.1141 15.4517 59.3857 20.0000 39.4376 61.5203 8.1548 52.0000 0.5586 65.1071 2.2848 20.0000 24.1467 65.9209 22.9177}
\end{dspPlot}
\\
\begin{dspPlot}[width=0.5\dspWidth,height=0.5\dspHeight,xout=true]{0, 32}{-1.3, 1.3}
\dspTapsAt[linecolor=ColorDT]{0}{-1 -1 -1 1 1 1 -1 -1 -1 1 -1 -1 -1 1 1 1 -1 1 -1 1 -1 1 -1 -1 1 1 -1 1 1 -1 -1 -1}
\end{dspPlot}
&
\begin{dspPlot}[width=0.5\dspWidth,height=0.5\dspHeight,yticks=50]{0, 32}{0, 150}
\dspTapsAt[linecolor=ColorDF]{0}{16.0000 15.4013 0.9065 101.4653 24.0000 8.5766 55.7526 44.4877 32.0000 37.8811 26.1885 11.0772 24.0000 6.8809 13.1524 30.2299 144.0000 30.2299 13.1524 6.8809 24.0000 11.0772 26.1885 37.8811 32.0000 44.4877 55.7526 8.5766 24.0000 101.4653 0.9065 15.4013}
\end{dspPlot}
\\
\begin{dspPlot}[width=0.5\dspWidth,height=0.5\dspHeight,xout=true]{0, 32}{-1.3, 1.3}
\dspTapsAt[linecolor=ColorDT]{0}{-1 1 1 1 1 1 -1 1 1 -1 1 1 1 1 1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 1 -1 -1 1 1 -1}
\end{dspPlot}
&
\begin{dspPlot}[width=0.5\dspWidth,height=0.5\dspHeight,yticks=50]{0, 32}{0, 150}
\dspTapsAt[linecolor=ColorDF]{0}{0 144.6809 8.2474 10.1725 62.6274 4.0518 50.8476 11.7632 16.0000 61.8046 63.0935 17.2296 17.3726 6.0365 37.8115 0.2610 0 0.2610 37.8115 6.0365 17.3726 17.2296 63.0935 61.8046 16.0000 11.7632 50.8476 4.0518 62.6274 10.1725 8.2474 144.6809}
\end{dspPlot}
\end{tabular}
\caption{Signal and DFT (squared magnitude) for 3 different partial realizations of the coin-toss process.}\label{fig:rn:dft1}
\end{figure}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Figure~\ref{fig:rn:dft1} shows three 32-point data sets generated by the system, together with the squared magnitude of their DFT. Each realization is clearly different and so are the associated transforms, making it hard to discern a pattern in the spectral plots. We could try to increase the length of the data sets, as in Figure~\ref{fig:rn:dft2}, but the result is the same: no clear information emerges from the (now longer) DFTs.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{figure}[t]
\center\small
\begin{dspPlot}[height=0.5\dspHeight,yticks=100]{0, 64}{0, 300}
\dspTapsAt[linecolor=ColorDF]{0}{4.0000 73.3423 74.2921 5.2269 7.5227 158.0052 209.3729 28.5502 24.6863 37.6667 33.4842 163.0398 42.2540 19.5449 6.7859 73.3613 52.0000 54.8200 74.9140 12.3180 113.6282 49.8991 82.8052 220.2623 47.3137 28.6746 91.4748 28.1135 12.5951 17.3429 34.8709 117.8323 100.0000 117.8323 34.8709 17.3429 12.5951 28.1135 91.4748 28.6746 47.3137 220.2623 82.8052 49.8991 113.6282 12.3180 74.9140 54.8200 52.0000 73.3613 6.7859 19.5449 42.2540 163.0398 33.4842 37.6667 24.6863 28.5502 209.3729 158.0052 7.5227 5.2269 74.2921 73.3423}
\end{dspPlot}
\vspace{2em}
\begin{dspPlot}[height=0.5\dspHeight,yticks=200]{0, 128}{0, 600}
\dspTapsAt[linecolor=ColorDF]{0}{ 16.0000 26.7116 134.5859 118.7696 276.5874 215.8617 94.8197 172.8854 105.5255 46.8213 151.3091 99.6768 86.7526 92.5495 43.7254 107.8840 143.5980 151.6277 496.0279 42.4477
133.3438 160.3788 22.5627 133.7638 218.4027 531.2848 161.4694 54.1406 53.0671 22.6007 10.3571 6.9683 136.0000 15.0837 35.3089 118.5829 260.0175 7.5644 42.8459 46.9704 523.7538 66.7780 146.8260 145.9124 509.1014 255.1025 66.1601 134.0361 64.4020 35.6198 7.4152 79.8580 151.9393 25.0217 3.7836 152.8543 48.3181 204.2682 0.8546 42.0079 65.1910 130.5055 373.9485 139.4619 64.0000 139.4619 373.9485 130.5055 65.1910 42.0079 0.8546 204.2682 48.3181 152.8543 3.7836 25.0217 151.9393 79.8580 7.4152 35.6198 64.4020 134.0361 66.1601 255.1025 509.1014 145.9124 146.8260 66.7780 523.7538 46.9704 42.8459 7.5644 260.0175 118.5829 35.3089 15.0837 136.0000 6.9683 10.3571 22.6007 53.0671 54.1406 161.4694 531.2848 218.4027 133.7638 22.5627 160.3788 133.3438 42.4477 496.0279 151.6277 143.5980 107.8840 43.7254 92.5495 86.7526 99.6768 151.3091 46.8213 105.5255 172.8854 94.8197 215.8617 276.5874 118.7696 134.5859 26.7116}
\end{dspPlot}
\caption{DFT (squared magnitude) for partial realizations of increasing length.}\label{fig:rn:dft2}
\end{figure}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Given the random nature of the signal, we may be tempted to take the expected value of the transform (in this case via ensemble averages, since we can easily generate realizations); however, both the DFT and the expectation are linear operators, and the expected value of the signal is zero for all $n$, so we have:
\[
\expt{\displaystyle\sum_{n=0}^{N-1}x[n]e^{-j\frac{2\pi}{N}nk}} = \sum_{n=0}^{N-1}\expt{x[n]}e^{-j\frac{2\pi}{N}nk} = 0.
\]
This is intuitively unsatisfying: the signal clearly ``moves'' up and down in time, so the averaged spectrum should show some energy content. Since energy is related to the squared norm of the data, let's try an approach based on the squared magnitude of the DFT: given $N$ points of a realization of the process, its \textit{periodogram} is defined as the squared magnitude of the DFT, normalized by the length of the data set:
\begin{equation}\label{eq:rn:periodogram}\index{periodogram}
P_N[k] = \frac{1}{N}\, \displaystyle \bigg|\sum_{n = 0}^{N-1}\breve{x}[n]e^{-j\frac{2\pi}{N}nk} \bigg|^2;
\end{equation}
the normalization factor will keep the result independent of the number of data points. If we now average the periodogram over $L$ realizations of the process, we can see from Figure~\ref{fig:rn:sqdftavg} that, as $L$ grows, $\expt{P_N[k]} \approx 1$; we can interpret this by saying that this particular random process contains ``energy'' at all frequencies in equal amounts. Intuitively this is actually quite reasonable since a realization of this process could very well turn out as a sequence of all identical values (which would have energy only at $k = 0$), or as a sequence of strictly alternating values (which would have energy only at $k = N/2$), or as any other combination of values in between.
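The averaging experiment of Figure~\ref{fig:rn:sqdftavg} is easy to reproduce numerically; the following short Python sketch (purely illustrative, with NumPy assumed as the numerical library) averages the periodogram in~(\ref{eq:rn:periodogram}) over $L$ realizations of the coin-toss process:
\begin{verbatim}
import numpy as np

N, L = 32, 5000                            # data length, number of realizations
rng = np.random.default_rng(0)

P = np.zeros(N)
for _ in range(L):
    x = rng.choice([-1.0, 1.0], size=N)    # one realization of the coin-toss process
    P += np.abs(np.fft.fft(x))**2 / N      # periodogram of this realization
P /= L                                     # ensemble average over the L realizations

print(P.round(2))                          # all bins hover around 1
\end{verbatim}
As $L$ increases, the averaged values cluster ever more tightly around one, in accordance with the figure.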
In the remainder of this section we will formalize this intuition and derive the expression for the preferred spectral representation of a general (wide-sense) stationary random process.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{figure}[t]
\center\small
\begin{tabular}{cc}
\begin{dspPlot}[width=0.5\dspWidth,height=0.5\dspHeight]{0,32}{0,2.6}
\dspText(28,2){$L=1$}
\dspTapsAt[linecolor=ColorDF]{0}{2.0000 1.2263 1.3266 0.0859 0.4393 0.0259 0.8647 1.4264 1.0000 0.1274 1.1353 1.0376 2.5607 0.4364 0.6734 1.6340 2.0000 1.6340 0.6734 0.4364 2.5607 1.0376 1.1353 0.1274 1.0000 1.4264 0.8647 0.0259 0.4393 0.0859 1.3266 1.2263}
\end{dspPlot}
&
\begin{dspPlot}[width=0.5\dspWidth,height=0.5\dspHeight]{0,32}{0,2.6}
\dspText(28,2){$L=10$}
\dspTapsAt[linecolor=ColorDF]{0}{0.8250 1.6433 1.1168 0.9961 0.5452 0.9240 0.8540 1.4178 0.5500 1.0745 0.5399 1.3725 1.0048 1.1538 0.4892 1.0179 1.7750 1.0179 0.4892 1.1538 1.0048 1.3725 0.5399 1.0745 0.5500 1.4178 0.8540 0.9240 0.5452 0.9961 1.1168 1.6433}
\end{dspPlot}
\\
\begin{dspPlot}[width=0.5\dspWidth,height=0.5\dspHeight]{0,32}{0,2.6}
\dspText(28,2){$L=1000$}
\dspTapsAt[linecolor=ColorDF]{0}{0.9394 0.9929 0.9606 0.9819 0.9769 0.9358 0.9567 1.0096 1.0049 0.9951 1.0083 0.9810 1.0639 1.0595 1.0539 1.0512 0.9964 1.0512 1.0539 1.0595 1.0639 0.9810 1.0083 0.9951 1.0049 1.0096 0.9567 0.9358 0.9769 0.9819 0.9606 0.9929}
\end{dspPlot}
&
\begin{dspPlot}[width=0.5\dspWidth,height=0.5\dspHeight]{0,32}{0,2.6}
\dspText(28,2){$L=5000$}
\dspTapsAt[linecolor=ColorDF]{0}{0.9673 1.0021 0.9745 1.0184 0.9908 0.9739 0.9863 0.9771 0.9960 0.9855 1.0135 1.0083 1.0203 1.0277 1.0172 1.0165 1.0166 1.0165 1.0172 1.0277 1.0203 1.0083 1.0135 0.9855 0.9960 0.9771 0.9863 0.9739 0.9908 1.0184 0.9745 1.0021}
\end{dspPlot}
\end{tabular}
\caption{Ensemble averages of the normalized squared DFT magnitude; $L$ is the number of realizations used in the averaging.}\label{fig:rn:sqdftavg}
\end{figure}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Power Spectral Density}
\index{random process!power spectral density|(}
In Section~\ref{ErgPowSec} we discussed the difference between \textit{energy} and \emph{power} signals. An energy signal $\mathbf{x}$ is a square-summable sequence and its total energy is given by its squared norm; an energy signal has a well-defined Fourier transform and, because of Parseval's theorem, its norm in $\ell_2(\mathbb{Z})$ is equal to its norm in $L_2([-\pi,\pi])$:
\[
\|\mathbf{x}\|^2 = \|X(e^{j\omega})\|^2
\]
or, explicitly,
\[
\sum_{n=-\infty}^{\infty} |x[n]|^2 = \frac{1}{2\pi}\int_{-\pi}^{\pi}|X(e^{j\omega})|^2 d\omega.
\]
From this, the squared magnitude of the DTFT can be interpreted as the \textit{energy spectral density} of the signal, expressed in units of energy over units of frequency; in practical terms, if we were to filter an energy sequence with an ideal bandpass filter, the total energy in the output would be proportional to the integral of $|X(e^{j\omega})|^2$ over the support of the filter's passband. Interestingly, the energy spectral density is related to $\mathbf{r_x}$, the deterministic autocorrelation of an energy signal, which is the convolution of the signal with a time-reversed version of itself:
\[
r_x[k] = \sum_{n=-\infty}^{\infty} x[n]x[n+k].
\]
It is easy to show that
\[
|X(e^{j\omega})|^2 = \DTFT{\mathbf{r_x}}
\]
and this link between autocorrelation and energy distribution will reappear in slightly different form when we define a suitable spectral representation for random signals.
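Both Parseval's relation and the identity $|X(e^{j\omega})|^2 = \DTFT{\mathbf{r_x}}$ are easy to check numerically for a finite-support sequence, since in that case the DTFT can be sampled exactly via the DFT; here is a minimal Python sketch (illustrative only, with NumPy assumed):
\begin{verbatim}
import numpy as np

x = np.random.randn(16)                       # a finite-support (hence finite-energy) signal
N = len(x)

# Parseval: time-domain energy equals the average of |X[k]|^2 over the DFT bins
X = np.fft.fft(x)
print(np.allclose(np.sum(x**2), np.sum(np.abs(X)**2) / N))     # True

# |X(e^{jw})|^2 equals the DTFT of the deterministic autocorrelation,
# here sampled on a grid of 2N-1 frequencies
r = np.correlate(x, x, mode='full')           # r_x[k] for k = -(N-1), ..., N-1
esd = np.abs(np.fft.fft(x, 2 * N - 1))**2     # energy spectral density samples
R = np.fft.fft(np.roll(r, -(N - 1)))          # DTFT of r_x (lag 0 moved to index 0)
print(np.allclose(esd, R.real))               # True
\end{verbatim}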
Signals that are not square-summable are generally problematic (think for instance of the exponentially growing sequences generated by unstable LTI systems); one exception is the class of \textit{power signals}, for which the amount of energy per unit of time remains finite. In that case the limit
\[
P_{x} = \lim_{M\rightarrow\infty} \frac{1}{2M + 1}\sum_{n = -M}^{M} \bigl|x[n] \bigr|^2
\]
exists and represents the signal's average power.\footnote{
%
In the case of $M$-periodic sequences the power is simply \textit{defined} as $(1/M)\sum_{n=0}^{M-1}|x[n]|^2$.}
%
In Section~\ref{sec:fa:dirac} we showed how the DTFT formalism can be extended to handle power signals via the use of Dirac delta functionals (indeed, the presence of a Dirac in a spectrum is \textit{the} tell-tale sign of infinite energy). The concept of energy spectral density, however, cannot be applied directly to power signals: intuitively this makes sense since the energy is infinite and, from a mathematical point of view, it is consistent with the fact that we simply cannot take the square of a Dirac delta. The solution is to move to a spectral representation that describes the signal's \textit{power} distribution in frequency instead; to do so, we can start with the formula for a truncated DTFT over $2M+1$ points around the origin:
\begin{equation}\label{eq:rn:truncDTFT}
X_M ( e^{j\omega}) = \sum_{n = -M}^{M} x[n]\, e^{-j\omega n}
\end{equation}
which always exists and whose squared magnitude is the spectral energy distribution of the signal for the time interval $[-M, M]$. The \textit{power spectral density} (PSD) is defined as
\begin{equation}\label{eq:rn:PSDdef}
P(e^{j\omega}) = \lim_{M\rightarrow\infty}\left\{\frac{1}{2M+1} \, \bigr|X_M(e^{j\omega}) \bigr|^2\right\}
\end{equation}
and its physical dimension is expressed in units of energy over units of time over units of frequency. Obviously, the PSD is a $2\pi$-periodic, real, and non-negative function of $\omega$, and it is equal to zero for energy signals; in physical terms, if we put a power sequence through an ideal bandpass filter, the total power available at the output is going to be proportional to the integral of the PSD over the support of the filter's passband.
Looking back at the power signals we have encountered so far, it can be shown for instance (see also Exercise~\ref{ex:rn:psd1}) that the PSD of the constant signal $\mathbf{x} : x[n] = \alpha$ is
\begin{equation}
P(e^{j\omega}) = \alpha^2\tilde{\delta}(\omega)
\end{equation}
whereas for the scaled unit step $\alpha \mathbf{u}$ we have
\begin{equation}
P(e^{j\omega}) = (\alpha^2 / 2)\tilde{\delta}(\omega).
\end{equation}
Finally, for an $N$-periodic signal $\tilde{\mathbf{x}}$, the PSD is given by:
\begin{equation}
P(e^{j\omega}) = \sum_{k=0}^{N-1} \bigl| \tilde{X}[k] \bigr|^2 \, \tilde{\delta} \left( \omega - \frac{2\pi}{N}\, k \right),
\end{equation}
where $\tilde{\mathbf{X}}$ is the DFS of $\tilde{\mathbf{x}}$; this shows that, as we would expect, the power of a periodic signal is concentrated over the multiples of the fundamental frequency.
\subsection{PSD of a WSS Random Process}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{figure}[t]
\center\small
\def\sp{ }
\def\btick#1{\psline[linewidth=0.6pt](#1,-.1)(#1,.1)}
\def\qtick#1#2{\btick{#1}\uput{10pt}[-90](#1,0){#2}}
\def\qseg#1#2{%
\color{ColorTwo}
\psline[linecolor=ColorTwo]{|-|}(#1,0.2)(! #1\sp #2\sp add 0.2)%
\uput{10pt}[90](#1,0.2){$n$}\uput{10pt}[90](! #1\sp #2\sp add 0.2){$m$}}%
%
\psset{xunit=2.4cm,yunit=2cm}
\begin{pspicture}(0.5,-.4)(5.5,.5)
\psline[linewidth=2pt,tickwidth=2pt,](1,0)(5,0)%
\qtick{1}{$-M$}\btick{2}\btick{3}\btick{4}\qtick{5}{$M$}%
\qseg{1}{2}%
\end{pspicture}
\begin{pspicture}(0.5,-.4)(5.5,.5)
\psline[linewidth=2pt,tickwidth=2pt,](1,0)(5,0)%
\qtick{1}{$-M$}\btick{2}\btick{3}\btick{4}\qtick{5}{$M$}%
\qseg{2}{2}%
\end{pspicture}
\begin{pspicture}(0.5,-.4)(5.5,.5)
\psline[linewidth=2pt,tickwidth=2pt,](1,0)(5,0)%
\qtick{1}{$-M$}\btick{2}\btick{3}\btick{4}\qtick{5}{$M$}%
\qseg{3}{2}%
\end{pspicture}
\caption{Computing the value of $c_{2M}(k)$ in~(\ref{eq:rn:ck}) as the number of ways a segment of length $k$ can fit into the $[-M, M]$ interval (here, $k=2$ and $M = 2$).}\label{fig:rn:segfit}
\end{figure}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Wide-sense stationary processes behave in expectation just like power signals; indeed (assuming without loss of generality a zero-mean, real-valued process), we have that
\[
\expt{\displaystyle \sum_{n =-M}^{M} x^2[n] } = \sum_{n = -M}^{M} \expt{ x^2[n] } = (2M+1)\sigma_x^2
\]
and
\[
\expt{\displaystyle \frac{1}{2M + 1}\sum_{n = -M}^{M} x^2[n] } = \sigma_x^2
\]
so that, as $M$ goes to infinity, the energy grows unbounded but the average power remains finite as long as the variance is finite. The truncated DTFT in~(\ref{eq:rn:truncDTFT}), when applied to the process, yields a random variable, parameterized by $\omega$, whose expected squared magnitude is:
\begin{align*}
\expt{\bigl|X_M(e^{j\omega}) \bigr|^2} & = \textrm{E} \bigl[ X^*_M(e^{j\omega}) X_M(e^{j\omega}) \bigr] \\
& = \textrm{E} \left[ \sum_{n = -M}^{M} x[n]\, e^{j\omega n} \, \sum_{m = -M}^{M} x[m] \, e^{-j\omega m}\right] \\
& = \sum_{n = -M}^{M}\sum_{m = -M}^{M} \textrm{E} \bigl[ x[n] x[m] \bigr]\, e^{-j\omega (m-n)} \\
& = \sum_{n = -M}^{M} \sum_{m = -M}^{M} r_x[m-n]\, e^{-j\omega (m-n)}.
\end{align*}
In order to reduce the double summation to a single summation, set $k = m-n$; as $n$ and $m$ vary from $-M$ to $M$, $k$ ranges from $-2M$ to $2M$, so that we can write
\[
\textrm{E} \left[ \bigl|X_M(e^{j\omega}) \bigr|^2 \right] = \sum_{k = -2M}^{2M} c_{2M}(k)\, r_x[k] \, e^{-j\omega k}
\]
where $c_{2M}(k)$ is the number of times $m-n = k$ when $n$ and $m$ range over all values in the $[-M, M]$ interval. One easy way to find $c_{2M}(k)$ is to realize that it represents the number of distinct ways in which a segment of length $|k|$ can fit into an interval of length $2M$ (see Figure~\ref{fig:rn:segfit}); after some simple geometrical considerations, this number turns out to be
\begin{equation} \label{eq:rn:ck}
c_{2M}(k) = 2M + 1 - |k|.
\end{equation}
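The counting argument leading to~(\ref{eq:rn:ck}) can also be verified by brute force; a tiny Python check (illustrative only):
\begin{verbatim}
from collections import Counter

M = 5
c = Counter(m - n for n in range(-M, M + 1) for m in range(-M, M + 1))
print(all(c[k] == 2 * M + 1 - abs(k) for k in range(-2 * M, 2 * M + 1)))   # True
\end{verbatim}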
To obtain the power spectral density we need to normalize the expected squared value by the length of the data set, and then take the limit as the size of the interval grows to infinity:
\begin{align}
P_x(e^{j\omega}) & = \lim_{M\rightarrow\infty} \left\{ \frac{1}{2M+1} \, \textrm{E} \left[ \bigl| X_M(e^{j\omega}) \bigr|^2 \right] \right\} \nonumber \\
& = \lim_{M\rightarrow\infty} \sum_{k = -2M}^{2M} \left(1 - \frac{|k|}{2M + 1} \right) \left( r_x[k]\, e^{-j\omega k} \right) \nonumber \\
& = \lim_{M\rightarrow\infty} \sum_{k = -\infty}^{\infty} w_M(k)\, r_x[k] \, e^{-j\omega k} \label{eq:rn:PSDSum}
\end{align}
where we have set
\begin{equation} \label{eq:rn:weighFunPSDEq}
w_M(k) = \begin{cases}
\displaystyle 1 - \frac{|k|}{2M + 1} & \ |k| \leq 2M \\
0 & \ |k| > 2M
\end{cases}
\end{equation}
Since $\bigl| w_M(k) r_x [k]\, e^{-j\omega k} \bigr| \leq \bigl| r_x[k] \bigr|$, if the autocorrelation is absolutely summable then the sum in~(\ref{eq:rn:PSDSum}) converges uniformly to a continuous function of $\omega$ and we can therefore move the limiting operation inside the sum. The key observation now is that the weighting term $w_M(k)$, considered as a function of $k$, converges pointwise to the constant one as $M$ grows large, as shown graphically in Figure~\ref{fig:rn:weighFun},
\[
\lim_{M \rightarrow \infty} w_M(k) = 1
\]
so that in the end we obtain \index{power spectral density|mie}
\begin{equation}
P_x(e^{j\omega}) = \sum_{k = -\infty}^{\infty} r_x[k]\, e^{-j\omega k} = \DTFT{\mathbf{r}_x}.
\end{equation}
This fundamental result states that the power spectral density of a WSS random process is the \emph{discrete-time Fourier transform of the autocorrelation of the process}.
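As a quick worked example (a standard computation, not tied to any specific process discussed so far), suppose the autocorrelation decays geometrically, $r_x[k] = \sigma_x^2\,\rho^{|k|}$ with $|\rho| < 1$; summing the two geometric series yields
\[
P_x(e^{j\omega}) = \sigma_x^2 \sum_{k=-\infty}^{\infty} \rho^{|k|}\, e^{-j\omega k} = \sigma_x^2\, \frac{1 - \rho^2}{1 - 2\rho\cos\omega + \rho^2},
\]
a real, non-negative, $2\pi$-periodic function that peaks at $\omega = 0$ for $\rho > 0$ (a lowpass-like power distribution) and at $\omega = \pi$ for $\rho < 0$.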
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{figure}[t]
\center
\begin{dspPlot}[yticks=1,sidegap=0,height=5cm]{-40,40}{0,1.2}
\psclip{\psframe[linestyle=none](-40,0)(40,1.2)}
\dspFunc[linecolor=ColorTwo,linewidth=2pt]{1 x abs 2 5 mul 1 add div sub }%
\uput[0](10,0.2){$M=5$}%
\dspFunc[linecolor=ColorThree,linewidth=2pt]{1 x abs 2 20 mul 1 add div sub }%
\uput[0](22,0.5){$M=20$}%
\dspFunc[linecolor=ColorFour,linewidth=2pt]{1 x abs 2 500 mul 1 add div sub }%
\uput[0](22,0.85){$M=500$}%
\endpsclip
\end{dspPlot}
\caption{Weighting sequence $w_M(k)$ in~(\ref{eq:rn:weighFunPSDEq}) for growing values of $M$.}\label{fig:rn:weighFun}
\end{figure}\index{random process!power spectral density|)}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{White Noise} \label{sec:rn:whitenoise}
An important (if very specific) class of WSS random processes is that for which the underlying random variables are all zero-mean and statistically uncorrelated (that is, $\expt{x[n]x[m]} = \expt{x[n]}\expt{x[m]}$ for $n \neq m$). In this case the autocorrelation simplifies to
\begin{equation} \label{eq:rn:whitenoisecorr}
\mathbf{r}_x = \sigma^2_x \, \boldsymbol{\delta}
\end{equation}
where $\sigma^2_x$ is the variance (or the average power) of the process. A process with these properties is called \textit{white noise} and it is a very useful model for additive noise sources. While not a necessary requirement, it is common to assume that the samples of a white noise process are identically distributed; according to the underlying pdf, we can therefore have Gaussian white noise, uniform white noise, or any other flavor of white noise.
Using~(\ref{eq:rn:whitenoisecorr}), we can see that the PSD of white noise is simply
\begin{equation} \label{eq:rn:whitenoisepsd}
P_x(e^{j\omega}) = \sigma^2_x,
\end{equation}
that is, the power of a white noise process is uniformly distributed in frequency; this is in fact the reason behind the name: in optics, white light possesses approximately equal power over the visible frequency band.
Note that the coin-tossing signal in~(\ref{eq:rn:rademacher}) is in fact a white noise process, since its samples are statistically independent (which implies they are uncorrelated) and identically distributed with zero mean. Indeed, the experimental results described at the beginning of this section lead to a power spectral density consistent with~(\ref{eq:rn:whitenoisepsd}).
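A quick numerical illustration of~(\ref{eq:rn:whitenoisecorr}), here with a Gaussian pdf and NumPy assumed (illustrative only): the sample autocorrelation of a long white noise realization is close to $\sigma^2_x$ at lag zero and close to zero at all other lags.
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)
sigma = 2.0
x = sigma * rng.standard_normal(100_000)      # Gaussian white noise, variance sigma^2 = 4

r = [np.mean(x[:len(x) - k] * x[k:]) for k in range(6)]
print(np.round(r, 2))                         # approximately [4.0, 0.0, 0.0, 0.0, 0.0, 0.0]
\end{verbatim}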
\subsection{Filtering a Stochastic Process}
\label{sec:rn:filtering}
Given a WSS random process $\mathbf{x}$ and a deterministic \textit{stable} filter with impulse response $\mathbf{h}$, the filtered process
\[
y[n] = \left(\mathbf{h \ast x}\right)[n] = \sum_{k=-\infty}^{\infty} h[k]x[n-k]
\]
is itself a random process, since at each output index $n$ we are computing a linear combination of random variables. We will now show that $\mathbf{y}$ is itself WSS and we will derive the equivalent of the convolution theorem for WSS random processes.
\itempar{Time-Domain Analysis.}
The expected value of the filter's output at $n$ is
\begin{align}
\expt{y[n]} &= \expt{\displaystyle \sum_{k = -\infty}^{\infty} h[k] x[n - k] } \nonumber\\
&= \sum_{k = -\infty}^{\infty} h[k] \expt{x[n - k]} \nonumber \\
&= m_x \sum_{k = -\infty}^{\infty} h[k] \label{eq:rn:filtmean}
\end{align}
which shows that the mean of the output is time-invariant. The autocorrelation is
\begin{align*}
\expt{y[n]y[m]} &= \expt{\displaystyle \sum_{k = -\infty}^{\infty} h[k] x[n - k] \, \sum_{i = -\infty}^{\infty} h[i] x[m - i] }\\
&= \sum_{k = -\infty}^{\infty}\sum_{i = -\infty}^{\infty} h[k]h[i] \expt{x[n - k]x[m - i]} \\
&= \sum_{k = -\infty}^{\infty}\sum_{i = -\infty}^{\infty} h[k]h[i] r_x[(n - m) - k + i];
\end{align*}
since the result depends only on the difference $n-m$, we have proven that a filtering operation preserves the WSS nature of the input. With a simple change of variable we can rewrite the expression for the output autocorrelation as
\begin{equation}\label{eq:rn:outputcorr}
\mathbf{r}_y = \mathbf{h} \ast \mathcal{R}\{\mathbf{h}\} \ast \mathbf{r}_x
\end{equation}
where $\mathcal{R}$ is the time-reversal operator. In other words, the autocorrelation of the output can be obtained by convolving the autocorrelation of the input with a sequence that is the convolution of the filter's impulse response with its time-reversed version. Note that $\mathbf{h} \ast \mathcal{R}\{\mathbf{h}\}$ is symmetric and therefore zero-phase.
In certain applications, we may be interested in the cross-correlation between input and output; it is easy to show that we have
\begin{equation}\label{eq:rn:filtcrosscorr}
\mathbf{r}_{xy} = \mathbf{h} \ast \mathbf{r}_x.
\end{equation}
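Relation~(\ref{eq:rn:outputcorr}) lends itself to a simple Monte-Carlo check when the input is white noise with unit variance, in which case $\mathbf{r}_y = \mathbf{h} \ast \mathcal{R}\{\mathbf{h}\}$; a short Python sketch (illustrative, with NumPy and an arbitrary FIR impulse response assumed):
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)
h = np.array([1.0, 0.5, -0.25])                  # arbitrary FIR impulse response
x = rng.choice([-1.0, 1.0], size=200_000)        # white noise input, sigma_x^2 = 1
y = np.convolve(x, h, mode='valid')              # filtered process

r_theory = np.convolve(h, h[::-1])[2:]           # h * R{h} at lags 0, 1, 2
r_est = [np.mean(y[:len(y) - k] * y[k:]) for k in range(3)]
print(np.round(r_theory, 3))                     # theoretical autocorrelation values
print(np.round(r_est, 3))                        # estimates, close to the values above
\end{verbatim}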
\itempar{Frequency-Domain Analysis.}
Using the result in~(\ref{eq:rn:outputcorr}), it is straightforward to obtain the relationship between input and output in the frequency domain:
\begin{equation}\label{eq:rn:filtpsd}
P_y(e^{j\omega}) = \bigl| H(e^{j\omega}) \bigr|^2 P_x(e^{j\omega}).
\end{equation}
The above result is the stochastic equivalent of~(\ref{eq:ls:convtheo}) and represents a fundamental result of stochastic signal processing: the frequency-domain characteristics of LTI systems carry over (in magnitude) to the domain of random processes. In other words, a frequency-selective filter designed for deterministic signals can be used in a stochastic context, where it will shape the \textit{power} distribution of the input process in the same way it would shape the energy distribution of a deterministic input.
The frequency response of the filter appears in~(\ref{eq:rn:filtpsd}) in magnitude only; an intuitive interpretation of this fact starts with recalling that, for a given magnitude spectrum, a signal's phase component determines its ``shape'' in time. When it comes to random processes, we can make no prediction about the exact shape of a given realization, only about the average distribution in frequency of the power of the process. Therefore, when filtering a random process, we cannot hope to control the phase of the output signal.
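The frequency-domain counterpart of the previous time-domain check: the averaged periodogram of filtered white noise should approach $|H(e^{j\omega})|^2$, as predicted by~(\ref{eq:rn:filtpsd}) (a Python sketch under the same illustrative assumptions):
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)
h = np.array([1.0, 0.5, -0.25])                   # same FIR filter as above
N, L = 256, 2000
P = np.zeros(N)
for _ in range(L):
    x = rng.standard_normal(N + len(h) - 1)       # white noise segment, P_x = 1
    y = np.convolve(x, h, mode='valid')           # length-N filtered segment
    P += np.abs(np.fft.fft(y))**2 / N             # periodogram of the output
P /= L

H2 = np.abs(np.fft.fft(h, N))**2                  # |H(e^{jw})|^2 on the same grid
print(np.max(np.abs(P - H2)))                     # small residual estimation error
\end{verbatim}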
\subsection{MA, AR, and ARMA Processes}
\label{sec:rn:arma}
The results in the previous section seem to indicate that, given a desired power spectral density, we could build the underlying discrete-time random process simply by running white noise through a filter with the appropriate magnitude response. In fact, a fundamental result in statistics, called the Wold representation theorem, states that \textit{any} stationary process $\mathbf{y}$ can be obtained from a white noise sequence $\mathbf{w}$ as
\[
y[n] = \sum_{k=0}^{\infty} b[k] w[n-k],
\]
that is, as the convolution between a white noise process and a \textit{causal} sequence of coefficients $\mathbf{b}$. The theorem is not constructive, in that it doesn't provide us with a way to compute the potentially infinite-support sequence $\mathbf{b}$; furthermore, there is no guarantee that $\mathbf{b}$ corresponds to the impulse response of a realizable filter. In practice, however, the representational power of rational transfer functions is sufficient to cover all cases of practical importance.
The ability to represent an arbitrary stationary process as white noise driving a realizable filter is fundamental in two respects:
\begin{itemize}
\item for \textit{synthesis} purposes, when we want to generate realizations of a random process with a given autocorrelation (or, equivalently, with a given PSD);
\item for \textit{analysis} purposes, when we want to compactly model the spectral properties of a random process and gain insight on the underlying generating system. This is particularly useful in applications such as system identification and data compression, as we will see later.
\end{itemize}
In both cases, the hypothesis is that the random process is obtained by filtering white noise with a realizable filter whose transfer function is rational, of the form
\[
H(z) = \frac{B(z)}{A(z)}
\]
and the problem is to determine the coefficients of numerator and denominator. Two special cases are particularly important:
\itempar{MA processes:} when $A(z)=1$, $H(z)$ is an FIR filter and the resulting process is called a Moving Average (MA) process. From~(\ref{eq:rn:outputcorr}) we can easily determine that if the FIR filter has $M+1$ coefficients, the output autocorrelation will have finite support as well, with only $2M+1$ nonzero values:
\[
r_y[k] = 0 \mbox{ for } |k| > M.
\]
MA models are therefore useful to characterize random processes with a short-range interdependence between samples.
\itempar{AR processes:} when $B(z)=1$, $H(z)$ is an IIR filter with poles only, and the resulting process is called an Auto-Regressive (AR) process. All-pole filters are particularly well suited to modeling physical systems whose spectral characteristics are determined by natural \textit{resonances} (think of musical instruments or the human voice). The autocorrelation in these cases is a symmetric, two-sided infinite sequence, as we can easily see from~(\ref{eq:rn:outputcorr}) when $\mathbf{h}$ is an infinite impulse response.
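As a simple synthesis illustration (a sketch with NumPy/SciPy assumed and an arbitrary choice of pole locations), an AR(2) process can be generated by filtering white noise with an all-pole filter whose poles lie close to the unit circle; the power of the resulting process concentrates around the pole angle, mimicking a resonance:
\begin{verbatim}
import numpy as np
from scipy.signal import lfilter

rng = np.random.default_rng(0)
rho, theta = 0.95, np.pi / 4                   # pole radius and angle (resonance frequency)
a = [1.0, -2 * rho * np.cos(theta), rho**2]    # A(z) with poles at rho * exp(+/- j theta)

w = rng.standard_normal(50_000)                # white noise input
y = lfilter([1.0], a, w)                       # AR(2) realization, H(z) = 1 / A(z)

# averaged periodogram over non-overlapping segments: power peaks near omega = theta
N = 512
segs = y[:len(y) // N * N].reshape(-1, N)
P = np.mean(np.abs(np.fft.fft(segs, axis=1))**2, axis=0) / N
print(np.argmax(P[:N // 2]) * 2 * np.pi / N)   # close to theta (about 0.79)
\end{verbatim}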
