As we have seen, a continuous-time signal can be converted to a discrete-time sequence via sampling. By changing the value of the sampling rate we can obtain an arbitrary number of discrete-time signals from the same original continuous-time source; the number of samples per second will increase or decrease linearly with the sampling rate and, according to whether the conditions of the sampling theorem are satisfied or not, the resulting discrete-time sequences will either be an exact representation of the original signal or will be affected by aliasing. Multirate theory studies the relationship between such sequences; in other words, it addresses the question of whether we can transform a sampled representation into another with a different underlying sampling frequency purely from within discrete time.

The primary application of multirate theory is digital sampling rate conversion, an operation that becomes necessary when the original sampling frequency of a stored signal is incompatible with the working rate of the available D/A device. But we will see that multirate also plays an important role in a variety of applied signal processing algorithms; in particular, the technique of \textit{oversampling} is often used in situations where an increase in the data rate is exploited to improve the performance of the analog elements in the processing chain. Since speeding up digital computations is in general much cheaper than using high-performance analog circuits, oversampling is commonly used in consumer-level devices. Other applications of multirate range from efficient filter design, to spectral shaping for communication systems, to data compression standards. And, finally, from a more theoretical point of view, multirate is a fundamental ingredient of the advanced analysis techniques which go under the name of time-frequency decompositions.



\section{Downsampling}\index{downsampling|(}
\label{sec:mr:down}

Downsampling by~$N$\index{downsampling|mie}\index{decimation} (also called subsampling or decimation\footnote{%
	Technically, decimation means keeping only $9$ out of every $10$ and refers to the Roman custom of killing every $10$th soldier of a defeated army\ldots})
produces a lower-rate sequence by keeping only one out of every $N$ samples in the original signal. If we denote by $\mathcal{S}_N$ the downsampling operator\footnote{
	We use the letter $\mathcal{S}$ rather than $\mathcal{D}$ since the latter indicates the delay operator.},
we can write
\[
	x_{N\downarrow}[n] = \mathcal{S}_N\bigl\{x[\cdot]\bigr\}[n] = x[nN].
\]
Downsampling, as shown in Figure~\ref{fig:mr:down}, effectively {\em discards} $N-1$ out of every $N$ samples and therefore a loss of information may be incurred. Indeed, in terms of the underlying sampling frequency, decimation produces the signal that would have been obtained by sampling $N$ times more slowly. The potential problems with this data reduction will take the form of aliasing, as we will see shortly.
The downsampling operator is linear but \emph{not} time-invariant; one of the consequences of this lack of time invariance is that complex sinusoids are not eigensequences for the downsampling operator. For instance, if $x[n] = e^{j \pi n} = (-1)^n$, we have
\begin{equation}
\label{eq:mr:12}
	x_{2\downarrow}[n] = x[2n] = e^{j 2\pi n} = 1;
\end{equation}
in other words, the highest digital frequency has been mapped to the lowest frequency. This looks very much like aliasing, as we will now explore in detail.
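As a quick numerical illustration (a minimal NumPy sketch, not part of the formal development), downsampling $x[n] = (-1)^n$ by $2$ indeed yields a constant sequence:
\begin{verbatim}
import numpy as np

n = np.arange(16)
x = (-1.0) ** n       # x[n] = e^{j pi n}, the highest digital frequency
x_down = x[::2]       # downsampling by 2: keep one sample out of two

print(x_down)         # all ones: the lowest digital frequency
\end{verbatim}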
%
% [Figure fig:mr:down: time-domain plots of downsampling by 4]
%
	\caption{Downsampling by $4$ in the time domain: original signal (top panel); samples ``killed'' by the downsampler (middle panel); downsampled signal (bottom panel). Note the difference in time indexes between top and bottom panels.}\label{fig:mr:down}
where $\xi_N[n]$ is a ``killer sequence'' defined as
\[
	\xi_N[n] =
	\begin{cases}
		1 & \mbox{ for $n$ multiple of $N$} \\
		0 & \mbox{ otherwise}
	\end{cases}
\]
and shown in Figure~\ref{fig:mr:xi}; indeed, $\xi_N[n]$ will ``kill off'' all the terms in the sum for which the index is not a multiple of $N$. The sequence $\xi_N[n]$ is $N$-periodic and one way to represent it is as the inverse DFS of size $N$ of a vector of all ones, that is,
\[
	\xi_N[n] = \frac{1}{N}\sum_{k=0}^{N-1} e^{j\frac{2\pi}{N}nk}.
\]
Substituting this representation into the sum, we obtain the scaled sum of $N$ superimposed copies of the original spectrum $X(e^{j\omega})$, where each copy is shifted in frequency by a multiple of $2\pi/N$. We are in a situation similar to that of equation~(\ref{eq:is:periodizedFT}), where sampling created a periodization of the underlying spectrum; here the spectra are already inherently $2\pi$-periodic, and downsampling creates $N-1$ additional interleaved copies. The final spectrum $X_{N\downarrow} (e^{j\omega})$ is simply a stretched version of $A(e^{j\omega})$, so that the interval $[-\pi/N, \pi/N]$ becomes $[-\pi, \pi]$.
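As a small numerical sanity check (a NumPy sketch using the standard inverse-DFS formula), the killer sequence can indeed be generated as the size-$N$ inverse DFS of a vector of all ones:
\begin{verbatim}
import numpy as np

N = 4
n = np.arange(3 * N)                   # a few periods
# inverse DFS (size N) of the all-ones vector, evaluated over n:
xi = sum(np.exp(2j * np.pi * k * n / N) for k in range(N)) / N

print(np.round(xi.real, 12))           # 1 at multiples of N, 0 elsewhere
\end{verbatim}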


Because of the superposition, aliasing\index{aliasing!in multirate} can occur; this is a consequence of the potential loss of information that occurs when samples are discarded. For baseband signals, it is easy to verify that in order for the spectral copies in~(\ref{eq:mr:dss}) not to overlap, the maximum (positive) frequency $\omega_M$ of the original spectrum\footnote{
	Here, for simplicity, we are imagining a lowpass real signal whose spectral magnitude is symmetric. More complex cases exist and some examples will be described next.}
must be less than $\pi/N$; this is the \emph{non-aliasing condition} for the downsampling operator. Conceptually, fulfillment of the non-aliasing condition indicates that the discrete-time representation of the original signal is intrinsically redundant; $(N-1)/N$ of the information can be safely discarded and this is mirrored by the fact that only $1/N$ of the spectral frequency support is nonzero. We will see shortly that, in this case, the original signal can be perfectly reconstructed with an upsampling and filtering operation.\index{downsampling|)}


\subsection{Examples}
In Figures~\ref{fig:mr:exA} to~\ref{fig:mr:exE} the top panel shows the original spectrum $X(e^{j\omega})$; the second panel shows the same spectrum but plotted over a wider interval so as to make its periodic nature explicit; the third panel shows in different colors the individual terms in the sum in~(\ref{eq:mr:nonscaled}); the fourth panel shows $A(e^{j\omega})$ \emph{before} scaling and stretching by $N$; finally, the last panel shows $X_{N\downarrow}(e^{j\omega})$ over the usual $[-\pi, \pi]$ interval.


\itempar{Downsampling by 2.} If the downsampling factor is $2$, the \ztrans\ and the Fourier transform of the output are simply
\begin{align}
	X_{2\downarrow}(z) &= \frac{1}{2}\left[ X\bigl(z^{1/2}\bigr) + X\bigl(-z^{1/2}\bigr) \right] \\
	X_{2\downarrow}(e^{j\omega}) &= \frac{1}{2}\left[ X\bigl(e^{j\omega/2}\bigr) + X\bigl(e^{j(\omega - 2\pi)/2}\bigr) \right].
\end{align}
Figure~\ref{fig:mr:exA} shows an example for a lowpass signal whose maximum frequency is $\omega_M = \pi/2$ (i.e.\ a half-band signal). The non-aliasing condition is fulfilled and, in the superposition, the two shifted versions of the spectrum do not overlap. As the frequency axis stretches by a factor of $2$, the original half-band signal becomes full band.

Figure~\ref{fig:mr:exB} shows an example in which the non-aliasing condition is violated. In this case, $\omega_M = 2\pi/3 > \pi/2$ and the spectral copies do overlap. We can see that, as a consequence, the downsampled signal loses its lowpass characteristics. Information is irretrievably lost and the original signal cannot be reconstructed.
	\caption{Downsampling by $3$; the highest frequency is larger than $\pi/3$ (here, $\omega_M = 2\pi/3$) and aliasing occurs. Notice how three spectral replicas contribute to the final spectrum.}\label{fig:mr:exC}
\itempar{Downsampling a Highpass Signal.} Figure~\ref{fig:mr:exD} shows an example of downsampling by $2$ applied to a half-band {\em highpass} signal. Since the signal occupies only the upper half of the $[0, \pi]$ frequency band (and, symmetrically, only the lower half of the $[-\pi, 0]$ interval), the interleaved copies do not overlap and, technically, there is no aliasing. The shape of the signal, however, is changed by the downsampling operation and what started out as a highpass signal is transformed into a lowpass signal. To make the details of the transformation clearer, in this example we have used a {\em complex-valued} highpass signal for which the positive and negative parts of the spectrum have different shapes; it is apparent how the original left and right spectral halves end up in reverse positions in the final result. The original signal can be exactly reconstructed (since there is no destructive overlap between spectral copies) but the required procedure is a bit more creative and will be left as an exercise.
	\caption{Downsampling by~$2$ of a {\em complex} highpass signal; the asymmetric spectrum helps to understand how non-destructive aliasing works.}\label{fig:mr:exD}
\subsection{Antialiasing Filters in Downsampling}
We have seen in Section~\ref{sec:is:antialias} that, in order to control the error when sampling a non-bandlimited signal, our best strategy is to bandlimit the signal using a lowpass filter. The same holds when downsampling by $N$ a signal whose spectral support extends beyond $\pi/N$: before downsampling we should apply a lowpass with cutoff $\omega_c = \pi/N$, as shown in Figure~\ref{fig:mr:downfilt}. While a loss of information is still unavoidable, the filtering operation allows us to control said loss and prevent the disrupting effects of aliasing.
An example of the processing chain is shown in Figure~\ref{fig:mr:exE} for a downsampling factor of~$2$; a half-band lowpass \index{half-band filter} filter is used to truncate the signal's spectrum outside of the $[-\pi/2, \pi/2]$ interval and then downsampling proceeds as usual with non-overlapping spectral copies.
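A possible discrete-time implementation of this filter-then-downsample chain is sketched below using NumPy and SciPy; the FIR length ($101$ taps) and the design method (\texttt{firwin}) are arbitrary illustrative choices, not prescribed by the text.
\begin{verbatim}
import numpy as np
from scipy.signal import firwin, lfilter

N = 2                        # downsampling factor
h = firwin(101, 1.0 / N)     # FIR lowpass, cutoff pi/N (cutoff relative to Nyquist)
x = np.random.randn(1000)    # some full-band input signal

y = lfilter(h, 1.0, x)       # antialiasing lowpass filter
y_down = y[::N]              # keep one sample out of N
# scipy.signal.decimate(x, N) bundles the same two steps
\end{verbatim}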
\section{Upsampling}\index{upsampling|(}

Upsampling by $N$ produces a higher-rate sequence by creating $N$ samples for each sample in the original signal. In its simplest form, an upsampler just inserts $N-1$ zeros after every input sample, as shown in Figure~\ref{fig:mr:up}. If we denote by $\mathcal{U}_N$ the upsampling operator, we have
\begin{equation} \label{eq:mr:up}
	x_{N\uparrow}[n] = \mathcal{U}_N\bigl\{x[\cdot]\bigr\}[n] =
	\begin{cases}
		x[n/N] & \mbox{ for $n$ multiple of $N$} \\
		0 & \mbox{ otherwise.}
	\end{cases}
\end{equation}
	\caption{Upsampling by $4$ in the time domain: original signal (top panel); upsampled signal, where 3 zeros have been appended to each original sample (bottom panel). Note the difference in time indexes between top and bottom panels. }\label{fig:mr:up}
Upsampling is a much ``nicer'' operation than downsampling since no information is lost; the original signal can always be exactly recovered by applying a congruent downsampling operation:
\[
	\mathcal{S}_N\bigl\{\mathcal{U}_N\bigl\{x[\cdot]\bigr\}\bigr\}[n] = x[n].
\]
In the frequency domain we have
\[
	X_{N\uparrow}(z) = X(z^N), \qquad\quad X_{N\uparrow}(e^{j\omega}) = X(e^{j\omega N}),
\]
so that upsampling is simply a contraction of the frequency axis by a factor of $N$. The inherent $2\pi$-periodicity of the spectrum must be taken into account so that, in this contraction, the periodic repetitions of the base spectrum are ``pulled into'' the $[-\pi, \pi]$ interval. The effects of upsampling on a signal's spectrum are shown graphically for a simple signal in Figures~\ref{fig:mr:upA} and~\ref{fig:mr:upB}; in all figures the top panel shows the original spectrum $X(e^{j\omega})$ over $[-\pi, \pi]$; the middle panel shows the same spectrum over a wider range to make the $2\pi$-periodicity explicit; the last panel shows the upsampled spectrum $X_{N\uparrow}(e^{j\omega})$, highlighting the rescaling of the $[-N\pi, N\pi]$ interval.\index{upsampling|)} As a rule of thumb, upsampling ``brings in'' exactly $N$ copies of the original spectrum over the $[-\pi, \pi]$ interval even if, in the case of an even upsampling factor, one copy is split between the negative and positive frequencies.
The upsampled signal in~(\ref{eq:mr:up}), with its $N-1$ zeros between original samples, exhibits two problems. In the time domain, the upsampled signal looks ``jumpy'' because of the periodic zero runs. This is the discrete-time equivalent of a lack of smoothness, since the signal keeps dropping back to zero, and it is apparent in the bottom panel of Figure~\ref{fig:mr:up}. In the frequency domain, simple upsampling has ``brought in'' copies of the original spectrum into the $[-\pi, \pi]$ interval, creating spurious high-frequency content. These two issues are actually one and the same and they can be solved, as we will see, by using an appropriate filter.
The problem of filling the gaps between nonzero samples in an upsampled sequence is, in many ways, similar to the discrete- to continuous-time interpolation problem of Section~\ref{sec:is:interp}, except that now we are operating entirely in discrete time. If we adapt the interpolation schemes that we have already studied, we have the following cases\index{interpolation!in multirate}:
\itempar{Zero-Order Hold.} In this discrete-time interpolation scheme, also known as \emph{piecewise-constant interpolation}, after upsampling by $N$ we use a filter with impulse response
\begin{equation}\label{eq:mr:zoh}
	h_0 [n] =
	\begin{cases}
		1 & \ n = 0,1, \ldots, N-1 \\
		0 & \mbox{ otherwise}
	\end{cases}
\end{equation}
which is shown in Figure~\ref{fig:mr:zoh}-(a). This interpolation filter simply repeats each original input sample $N$ times, giving the staircase approximation shown in the top panel of Figure~\ref{fig:mr:upinterp}.
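The effect of the zero-order hold kernel~(\ref{eq:mr:zoh}) on an upsampled sequence can be checked with a few lines of NumPy (an illustrative sketch with arbitrary data):
\begin{verbatim}
import numpy as np

N = 4
x = np.array([1.0, 3.0, 2.0, -1.0])

x_up = np.zeros(N * len(x))
x_up[::N] = x                           # upsampling: N-1 zeros after each sample

h0 = np.ones(N)                         # zero-order hold kernel
y = np.convolve(x_up, h0)[:N * len(x)]  # piecewise-constant interpolation

print(y)                                # each input sample repeated N times,
                                        # i.e. the same as np.repeat(x, N)
\end{verbatim}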
\itempar{First-Order Interpolator.} A smoother result is obtained by linearly interpolating between neighboring input samples (``connecting the dots''); this corresponds to filtering the upsampled signal with the triangular kernel
\begin{equation}
	h_1[n] =
	\begin{cases}
		1 - |n|/N & \ |n| < N \\
		0 & \mbox{ otherwise.}
	\end{cases}
\end{equation}

\itempar{Sinc Interpolator.} We know that, in continuous time, the smoothest interpolation is obtained by using a sinc function. This holds in discrete time as well, and the resulting interpolation filter is the discrete-time sinc:
\begin{equation}
	h[n] = \sinc\left(\frac{n}{N}\right).
\end{equation}
Note that the sinc above is equal to one for $n = 0$ and is equal to zero at all nonzero integer multiples of $N$, $n = kN$; this fulfills the interpolation condition, that is, after interpolation, the output equals the input at multiples of $N$: $(h \ast x_{N\uparrow})[kN] = x_{N\uparrow}[kN] = x[k]$.
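In practice the sinc must be truncated; the following NumPy sketch (with an arbitrary truncation length) shows the resulting interpolator at work and verifies the interpolation condition:
\begin{verbatim}
import numpy as np

N = 4                                 # upsampling factor
L = 32                                # one-sided truncation length (arbitrary)

n = np.arange(-L * N, L * N + 1)
h = np.sinc(n / N)                    # h[n] = sinc(n/N)

x = np.random.randn(100)
x_up = np.zeros(N * len(x))
x_up[::N] = x

y = np.convolve(x_up, h, mode='same')
print(np.allclose(y[::N], x))         # True: y[kN] = x[k], since sinc(k) = 0 for k != 0
\end{verbatim}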



The three impulse responses above are all lowpass filters; in particular, the sinc interpolator is an ideal lowpass with cutoff frequency $\omega_c = \pi/N$ while the others are approximations of the same. As a consequence, the effect of the interpolator in the frequency domain is the removal of the $N-1$ spectral copies ``drawn into'' the $[-\pi, \pi]$ interval by the upsampler. An example is shown in
Figure~\ref{fig:mr:upfilt} where the signal in Figure~\ref{upsamplingFigC} is filtered by an ideal lowpass filter with cutoff $\pi/4$.

It turns out that the smoothest possible interpolation in the time domain corresponds to the perfect removal of the spectral repetitions in the frequency domain. Interpolating with a zero-order or a first-order kernel, by contrast, only attenuates the replicas instead of performing a full removal, as we can readily see by considering their frequency responses. Since we are in discrete time, however, there are no difficulties associated with the design of a digital lowpass filter which closely approximates an ideal filter, so that alternative kernel designs (such as optimal FIRs) can be employed.
This is in contrast to the design of discrete- to continuous-time interpolators, which are analog designs; that is why sampling rate changes are much more attractive in the discrete-time domain.




\section{Fractional Sampling Rate Changes}

The conversion from one sampling rate to another can always take the ``obvious'' route of interpolating to continuous time and resampling the resulting signal at the desired rate; but this would require dedicated analog equipment and would introduce some quality loss because of the inevitable analog noise. We have just seen, however, that we can increase or decrease the implicit rate of a sequence by an integer factor entirely in the discrete-time domain, by using an upsampler or a downsampler together with a digital lowpass filter. For fractional sampling rate changes we simply need to cascade the two operators.

The order in which upsampling and downsampling are performed in a rate changer is crucial since, in general, the two operators do not commute: downsampling by $2$ followed by upsampling by $2$, for instance, sets all the odd-indexed samples to zero, whereas upsampling by $2$ followed by downsampling by $2$ leaves the signal unchanged. Interestingly enough, however, if the downsampling and upsampling factors $N$ and $M$ are coprime, the operators do commute.
Intuitively it is clear that, since no information is lost when using an upsampler, in a fractional sampling rate change the upsampling operation will come first. Typically, a rate change by $N/M$ is obtained by cascading an upsampler by~$N$, a lowpass filter, and a downsampler by~$M$. Since normally the upsampler is followed by a lowpass with cutoff $\pi/N$ while the downsampler is preceded by a lowpass with cutoff $\pi/M$, we can use a single lowpass whose cutoff frequency is the minimum of the two. A block diagram of this system is shown in Figure~\ref{fig:mr:frac}.\index{fractional sampling rate change} \index{rational sampling rate change}
As an example, suppose we want to increase the rate of a sequence originally sampled at 24~kHz up to 32~kHz. For this we need a fractional change of $32/24$ which, after simplification, corresponds to an upsampler by~$4$ followed by a downsampler by~$3$, as shown in the top panel of Figure~\ref{fig:mr:fracex}; the lowpass filter's cutoff frequency is $\pi/4$ and, in this case, the lowpass filter acts solely as an interpolator since the overall rate is increased. Conversely, if we want to convert a 32~kHz signal to a 24~kHz one, that is, apply a sampling rate change by $3/4$, we can use the cascade shown in the bottom panel of Figure~\ref{fig:mr:fracex}; the cutoff frequency of the filter does not change but the filter, in this case, acts as an anti-aliasing filter.
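In SciPy, for instance, this cascade of an upsampler, a lowpass filter and a downsampler is provided by \texttt{scipy.signal.resample\_poly} (shown here purely as an illustration; internally it defaults to a Kaiser-windowed FIR lowpass):
\begin{verbatim}
import numpy as np
from scipy.signal import resample_poly

x_24k = np.random.randn(24000)           # one second of signal at 24 kHz

x_32k = resample_poly(x_24k, 4, 3)       # upsample by 4, lowpass, downsample by 3
x_back = resample_poly(x_32k, 3, 4)      # the converse 3/4 rate change

print(len(x_24k), len(x_32k), len(x_back))   # 24000 32000 24000
\end{verbatim}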
If the ratio between the two sampling frequencies cannot be decomposed into a ratio of small coprime factors, the intermediate rates in a fractional rate changer can be prohibitively high. That was the rationale, for instance, behind an infamous engineering decision taken by the audio industry in the early '90s, when the first digital audio tape recorders (DAT) were introduced on the market. In order to make it impossible for users to create perfect copies of CDs on digital tapes, the sampling frequency of the recorders was set to 48~kHz. Since CDs are encoded at 44.1~kHz, this resulted in a required fractional rate change of $160/147$. At the time, a 160-fold upsampling was simply not practical to implement, so users would necessarily have to go through the analog route to copy CDs.

In practice, however, fractional resamplers can be implemented using local interpolation techniques. To understand the procedure, let's first analyze the practical version of a subsample interpolator and then apply the idea to a resampler.


\itempar{Subsample Interpolation.}
\index{subsample interpolation}
Consider an $F_s$-bandlimited continuous-time signal $x(t)$ and its sampled version $x[n] = x(nT_s)$, with $T_s \leq 1/F_s$. Given a fractional delay $\tau T_s$, with $|\tau|< 1/2$, we want to determine the sequence
\[
	x_\tau[n] = x(nT_s + \tau T_s)
\]
using only discrete-time processing; for simplicity, let's just assume $T_s=1$. We know from Section~\ref{sec:is:duality} that the theoretical way to obtain $x_\tau[n]$ from $x[n]$ is to use an ideal fractional delay filter, that is, a filter with frequency response $D_\tau(e^{j\omega}) = e^{j\omega\tau}$ and impulse response $d_\tau[n] = \sinc(n + \tau)$.
As we have seen in Section~\ref{sec:is:sincinterp}, the sinc interpolator originates as the limit of polynomial interpolation when the number of interpolation points goes to infinity. In this case we can work backwards, and replace the sinc with a low-degree, \textit{local} Lagrange interpolation as in Section~\ref{sec:is:lagrange}. Given $L$ samples to the left and to the right of $x[n]$, we can build the continuous-time signal\index{Lagrange interpolation}
\[
	x_L(n; t) = \sum_{k=-L}^{L} x[n+k]\, L^{(L)}_k(t),
\]
where $L_k^{(L)}(t)$ is the $k$-th Lagrange polynomial of degree $2L$ defined in~(\ref{eq:is:lagPoly}). With this, we can use the approximation
\begin{equation} \label{eq:mr:subApprox}
	x_\tau[n] = x(n + \tau) \approx x_L(n; \tau).
\end{equation}
As an example, consider setting $L=1$, that is, using a quadratic Lagrange interpolation; the corresponding three Lagrange polynomials are shown in Figure~\ref{fig:mr:lagPoly} while the interpolation and approximation procedures are shown graphically in Figure~\ref{fig:mr:approx}.
The FIR interpolator is expressed in noncausal form purely out of convenience; in practical implementations an additional delay would be used to make the whole processing chain causal.
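A minimal sketch of the resulting three-tap interpolator in NumPy (the coefficients are just the three quadratic Lagrange polynomials of~(\ref{eq:is:lagPoly}) evaluated at $\tau$; the test signal is an arbitrary choice):
\begin{verbatim}
import numpy as np

def lagrange_taps(tau):
    """Quadratic Lagrange coefficients for x[n-1], x[n], x[n+1] (|tau| < 1/2)."""
    return np.array([tau * (tau - 1) / 2,     # L_{-1}(tau)
                     (1 - tau) * (1 + tau),   # L_{0}(tau)
                     tau * (tau + 1) / 2])    # L_{+1}(tau)

n = np.arange(100)
x = np.cos(0.1 * np.pi * n)                   # a slowly varying test signal
tau = 0.3
c = lagrange_taps(tau)
x_tau = c[0] * np.roll(x, 1) + c[1] * x + c[2] * np.roll(x, -1)
# away from the borders, x_tau[n] is close to cos(0.1 * pi * (n + tau))
\end{verbatim}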


\itempar{Local resampling.} The principle of the subsample interpolator we just described can be used to perform fractional resampling. Again, consider an $F_s$-bandlimited continuous-time signal $x(t)$ and its sampled version $x_1[n] = x(n/F_1)$, with $F_1 \geq F_s$. Given a second sampling frequency $F_2 \geq F_s$, we want to estimate $x_2[n] = x(n/F_2)$ using only $x_1[n]$. We have
\begin{equation}%\label{eq:mr:FIRcoeffs}
	x_2[n] = x(n/F_2) = x\left(\frac{n\,F_1/F_2}{F_1}\right);
\end{equation}
writing $n F_1/F_2 = m_n + \tau_n$, with $m_n$ an integer and $|\tau_n| \leq 1/2$, we see that $x_2[n] = x\bigl((m_n + \tau_n)/F_1\bigr)$ can be estimated from $x_1[n]$ via a subsample interpolation around $n = m_n$ with fractional offset $\tau_n$.
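Putting the pieces together, a local resampler based on the same quadratic interpolation could be sketched as follows (illustrative NumPy code; the border handling is deliberately crude):
\begin{verbatim}
import numpy as np

def resample_local(x1, F1, F2):
    """Estimate x2[n] = x(n/F2) from x1[n] = x(n/F1), quadratic local interpolation."""
    n_out = int(len(x1) * F2 / F1)
    x2 = np.zeros(n_out)
    for n in range(n_out):
        t = n * F1 / F2                   # position on the original index axis
        m = int(round(t))                 # nearest available sample
        tau = t - m                       # fractional offset, |tau| <= 1/2
        if 1 <= m <= len(x1) - 2:
            c = np.array([tau * (tau - 1) / 2,
                          (1 - tau) * (1 + tau),
                          tau * (tau + 1) / 2])
            x2[n] = c @ x1[m - 1:m + 2]
        else:
            x2[n] = x1[min(max(m, 0), len(x1) - 1)]
    return x2
\end{verbatim}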
So far we have only encountered discrete-time signals that are ordered sequences of scalar values indexed by an integer variable $n \in \mathbb{Z}$. This type of signal models very well a large number of physical situations in which the data unfolds ``chronologically'' as a function of time. There are however many other situations where a data set is best represented by a function of a multidimensional index. The everyday example is provided by digital images, which can be modeled as a grayscale or color intensity signal over a set of points on a two-dimensional planar grid. Otherwise stated, an ``image signal'' is a set of values (the {\em pixels}), which are ordered in space rather than in time; the corresponding signal can be written as $x[n_1, n_2]$, with $n_1$ and $n_2$ denoting the horizontal and vertical discrete coordinates.
Note that, although in the rest of this chapter we will concentrate on images, i.e. on two-dimensional signals, higher-dimensional signals do exist and are indeed quite common. Video, for instance, is a sequence of still pictures; as such it can be modeled as a signal defined over a triplet of indices, two for the image coordinates in each video frame and one for the frame number. Another class of three-dimensional signals emerges from volume rendering, where the data represents units of volume (or {\em voxels}) in three-dimensional space. Four dimensions naturally appear in time-varying volume rendering applications common in biomedical imaging. Most of the elementary concepts developed for one-dimensional signals admit a ``natural'' extension to the multidimensional domain, in the sense that the extension performs according to expectations; a multidimensional Fourier transform, for instance, decomposes a multidimensional signal into a set of multidimensional oscillatory basis functions. As the number of dimensions grows, however, intuition begins to fail us and high-dimensional signal processing quickly becomes extremely abstract (and quite cumbersome notation-wise); as a consequence, in this introductory material, we will focus solely on two-dimensional signals in the form of images.
Finally, a word of caution. With respect to 1D signal processing, image processing often appears less formal and less self contained; the impression is not altogether unfounded and there are two main reasons for that. To begin with, images are very specialized signals, namely, they are signals ``designed'' for the unique processing unit that is the human visual system\footnote{For instance, if humans had bat-like ranging capabilities, 2D signals wouldn't be so special and 3D depth maps would be all the rage.}. With some rare exceptions such as barcodes or QR-codes, in order to make sense of an image one needs to deploy a level of semantic analysis that profoundly transcends the capabilities of the standard signal processing toolbox; although linear processing is an almost universal first step, it so happens that a handful of simple tools are all that's necessary in the vast majority of cases. The second reason, which is not at all unrelated to the first, is that Fourier analysis does not play a major role in image processing. Most of the information in an image is encoded by its edges (as we learn in infancy from coloring books), but edges represent signal discontinuities that, in the frequency domain, affect mostly the global phase response and are therefore hard to see. Also consider this: a spectral magnitude is an invaluable {\em visual} tool to quickly establish the properties of a 1D signal; but images are {\em already} visual entities, and their spectra do not represent their information in a more readable manner.
\section{Preliminaries}
An $N_1\times N_2$ digital picture is a collection of $M=N_1 N_2$ values, called {\em pixels}\footnote{From {\em pic}ture {\em el}ement. For an interesting historical account of the word see ``A Brief History of Pixel'', by Richard F. Lyon, 2006.}, usually arranged in a regular way to form a rectangular image. Scalar values for the pixels give rise to grayscale images, with each pixel's value normally constrained between zero (black) and some maximum value (white). The most common quantization scheme allocates 8~bits per pixel, which gives 256~levels of gray to choose from; this is adequate in most circumstances for standard viewing.
To encode color images, we first need to define a {\em color space}; the popular RGB~encoding, for instance, represents visible colors as a weighted sum of the three primary colors red, green and blue, so that each color pixel is a triple of (quantized) coordinates in RGB space. Many other color spaces exist, each with their pros and cons, but since colorimetry (the theory and practice of color perception and reproduction) is a very complete and rich discipline unto itself, we will not attempt even a brief summary here. For all practical purposes we will simply consider a color image as a collection of $C$ superimposed scalar images, where $C$ is the number of coordinates in the chosen color space. For 24-bit RGB images, for instance, the red, green and blue images will be handled as three independent 8-bit grayscale images.
\subsection{Pixel Arrangement}
From the point of view of its information content, a grayscale $N_1\times N_2$ image is just a collection of $M=N_1N_2$ scalar values; consequently, the image could be ``unrolled'' into a standard one-dimensional $M$-point signal in $\mathbb{R}^{M}$ and we could revert to the ``classical'' signal processing techniques of the previous chapters. And, as a matter of fact, there are several applications that do precisely that: the storage or the transmission of a raw image, for instance, involves a {\em scanning}\/ operation that serializes the set of image pixels into a 1-D signal. Normally, the serialization takes place in a row-by-row fashion, i.e., the image is converted to a {\em raster scan} format. Classic reproducing devices such as printers or cathode-ray tubes perform the inverse operation by stacking the 1-D data back into rectangular form one row at a time. Nevertheless, in ``true'' image processing applications, we will always use a 2-D representation indexed by two distinct coordinates; the fundamental concept here is that images possess \emph{local correlations} in the two-dimensional spatial domain and that those correlations would be lost in the scanning process. As a simple example, imagine an $N\times N$ black image with a single white line. A row-by-row raster scan would produce very different length-$N^2$ 1D signals according to the line's orientation: if the line is vertical, we would obtain an extremely sparse signal where only one out of every $N$ samples is nonzero; a horizontal line, on the other hand, would produce a signal in which only a block of $N$ contiguous pixels are nonzero; other orientations would produce intermediate results. It's clear that the spatial structure of this simple image is completely obfuscated by a one-dimensional representation and that's why fully two-dimensional operators are necessary.
Normally pixels are arranged on a regular two-dimensional array of points on the plane. Although different arrangements are possible (each corresponding to different {\em point lattices}\/ in the plane), here we will consider only the ``canonical'' grid associated with the regular lattice $\mathbb{Z}^2$, i.e. each pixel will be associated with a pair of integer-valued coordinates.
%Figure~\ref{} shows three such standard arrangements, of which the first one is doubtlessly the simplest and the most common; all three arrangements belong to the set of. In general, a planar point lattice is defined as such: take a $2\times 2$ nonsingular {\em generating matrix} $\mathbf{L}$ and consider all integer-valued linear combinations of its columns. Formally, the coordinates of all points in the lattice are the set
%where $\mathbf{n} = [n_1 \, n_2]^T$ is a generic coordinate pair on the plane.
%indexed by two spatial coordinates. Normally (and in the following we will stick to this model) the coordinates belong to the regular lattice $\mathbb{Z}^2$, i.e. they are drawn from a grid of regularly spaced points in the plane, each point corresponding to a pair of integer-valued coordinates. Although the regular integer grid is the most common arrangement of points in a two-dimensional signal, other configurations are possible
\subsection{Visualization}
Since images can be modeled as real-valued functions on the plane, a formal graphical representation would require a three-dimensional plot such as the one in Figure~\ref{Gauss3D}, which is a natural extension of the Cartesian plots of 1D signals as functions of discrete time. Of course this kind of representation is useful primarily for ``abstract'' 2D sequences, i.e. 2D signals that do not encode semantically intuitive visual information; examples of the latter include 2D impulse and frequency responses and signal transforms. Figure~\ref{Gauss3D}, for example, depicts a portion of a Gaussian impulse response used in image blurring (see Section~\ref{blurring}); since the impulse response is obtained from a 2D function sampled on the grid, it makes sense to represent the signal in such an abstract fashion.
``Normal'' images, on the other hand, are usually represented by plotting a graylevel dot at each pixel location, where the shade of grey encodes the pixel's value; if the dots are closely spaced, the eye automatically interpolates the underlying pointwise structure and the plot gives the illusion of a ``continuous'' image\footnote{Of course, Pop artists (and Roy Lichtenstein in particular) hit a gold mine when they realized that dots could be spaced far apart...}; perceived closeness is of course also dependent on viewing distance. A standard example of this ``pictorial'' representation of images is shown in Figure~\ref{TheImage}\footnote{For a nice and compact history of the ``Lena'' image please see {\tt http://www.lenna.org/}; in spite of potential controversy, the author's feeling is that Lena is really the Mona Lisa of image processing, that is to say, a universally recognized icon that completely transcends its earthly begetting. As such, Lena is indeed quite perfect for a short introduction to image processing.}. An important point to notice is that, since the range of gray levels available on print (or on a screen) is finite, most images will require a level adjustment in order to be readable. In fact, this is no different from rescaling a fixed-size axis in one-dimensional plots; in pictures, the representable range is normally standardized between zero, which is usually mapped to black, and one, which is mapped to white. Unless otherwise noted, all images in the rest of the chapter are scaled and shifted so that they occupy the full grayscale range. In other words, the represented signal is
\[
y[n_1, n_2] = (x[n_1, n_2] - m)/(M-m)
\]
where $m = \min\{x[n_1, n_2]\}$ and $M = \max\{x[n_1, n_2]\}$. In Matlab, this rescaling is performed automatically when using the {\tt imagesc} command.
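The same rescaling is a one-liner in most numerical environments; for instance, in Python/NumPy (a sketch, assuming a non-constant image so that the denominator is nonzero):
\begin{verbatim}
import numpy as np

x = np.random.rand(64, 64) * 37.0 - 5.0      # some arbitrary grayscale image
y = (x - x.min()) / (x.max() - x.min())      # rescaled to the full [0, 1] range
\end{verbatim}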
A third, intermediate graphical option is given by ``support plots''. In these plots, we revert to a Cartesian viewpoint but we look at the 3D space ``from the top'', so to speak, and we only represent the nonzero values of a signal. The result is a 2D plot where the nonzero pixels appear as dots on the plane; examples can be seen in Figure~\ref{basic2D}. This representation focuses on the support of a signal and is particularly useful in describing finite impulse responses and other algorithmic techniques.
Interpolation is the procedure with which we convert a discrete-time sequence $x[n]$ to a continuous-time function $x(t)$. At the core of the interpolation procedure, as we have mentioned, is the association of a {\em physical}\/ time duration $T_s$ to the intervals between the samples in the discrete-time sequence; next, we need to ``fill the gaps'' between sample values. To develop the intuition, let's consider a simple example where we want to interpolate a finite-length signal as in Figure~\ref{fig:is:diffint}-(a); Figures~\ref{fig:is:diffint}-(b), (c) and (d) show three possible interpolations of the given dataset to a continuous-time signal $x(t)$ using $T_s = 1$.
Although we haven't really formalized the requirements for the interpolation process, it is rather intuitive that none of the three proposed interpolations is entirely satisfactory:
\begin{itemize}
\item[(b)] the interpolation does not ``go through'' the data points; indeed, it seems intuitive that the interpolation should respect the original dataset;
\item[(c)] while the function goes through the data points, it does not look smooth; in general, we are right to be suspicious of non-smooth curves, since no natural phenomenon can create discontinuous jumps in amplitude (achievable only via infinite speed) or curvature (achievable only via infinite acceleration);
\item[(d)] while the function goes through the data points and is smooth, it seems to ``wander around'' too much, as if interpolating a more complicated dataset; indeed, we want our interpolation strategy to be ``minimalistic'' in using the information contained in the discrete-time signal.
\end{itemize}
In the following sections we will try to make our requirements more precise and see how we can arrive at a universal interpolation scheme that is both intuitively and mathematically sound. To formalize the problem, and without loss of generality, we will assume in the following that $T_s = 1$ and that the finite-length discrete-time signal is defined for $n = -N, \ldots, 0, \ldots, N$ for an odd length of $M = 2N+1$ points.
Given a discrete-time signal $x[n]$, once we have chosen an interpolation interval $T_s$, the first requirement for the interpolating function $x(t)$ is that its values at multiples of $T_s$ be equal to the corresponding points of the discrete-time sequence, i.e.
\[
x(t) \Bigr|_{t = nT_s} = x[n];
\]
with our choice of $T_s = 1$ the above condition becomes simply $x(n) = x[n]$. The second requirement is that we want the interpolating function to be smooth and, mathematically, the smoothness of a function increases with the number of its continuous derivatives. For maximal smoothness, therefore, we require $x(t)$ to be in $C^{\infty}$, where $C^{M}$ is the class of functions for which all derivatives up to order $M$ exist and are continuous.
One maximally differentiable curve through a set of $2N+1$ data points is the unique polynomial interpolator of degree $2N$
\begin{equation} \label{eq:is:straightPI}
	p(t) = a_0 + a_1 t + a_2 t^2 + \ldots + a_{2N}\, t^{2N}
\end{equation}
which, like all polynomials, belongs to $C^{\infty}$. Computation of the interpolator's $2N+1$ coefficients is a classic algebraic problem, first solved in the 17th century by Newton. Numerically, one way to arrive at the solution is to work out the system of $2N+1$ equations
\[
	p(n) = x[n], \qquad n = -N, \ldots, N,
\]
which can be carried out algorithmically using Pascal's triangle, for instance. A more interesting approach is to consider the space of finite-degree polynomials over the interval $[-N, N]$ and write $p(t)$ as a linear combination of basis vectors for the space. This is not the first time we encounter the need for a polynomial basis: in Chapter~\ref{ch:vs} we briefly encountered the Legendre polynomial basis in the context of approximation, while the interpolator in~(\ref{eq:is:straightPI}) is clearly expressed as a linear combination of the $2N+1$ {\em monomial}\/ basis vectors $\{1, t, t^2, \ldots t^{2N}\}$. For the task at hand, however, a more appropriate basis is the set of {\em Lagrange polynomials}.
The Lagrange polynomial basis for the interval $I=[-N, N]$ is the family of $2N+1$ polynomials
\begin{equation} \label{eq:is:lagPoly}
L^{(N)}_n(t) = \mathop{\prod_{k = -N}}_{k\neq n}^{N}\frac{t - k}{n - k}, \qquad n = -N, \ldots, N
\end{equation}
each of which has degree $2N$. As an example, the family of five polynomials for $N=2$ is shown in Figure~\ref{fig:is:LagBasis}. A key property of the Lagrangian basis vectors is that, for integer values of their argument, we have
\begin{equation} \label{eq:is:lagInterpProp}
L^{(N)}_n(m) = \left\{
\begin{array}{ll}
1 & \mbox{if $n=m$}\\
0 & \mbox{if $n \neq m$}
\end{array}\right. \qquad\qquad -N \leq n,m \leq N.
\end{equation}
Using the above result, it is easy to verify that the polynomial interpolator for a discrete-time signal of length $2N+1$ is simply
\begin{equation} \label{eq:is:lagInterp}
x(t) = \sum_{n = -N}^{N} x[n]L^{(N)}_n(t)
\end{equation}
that is, a linear combination of the Lagrangian basis vectors where the scalar coefficients are the discrete-time samples themselves. Indeed, since a polynomial of degree $M$ is uniquely determined by $M+1$ of its values and since~(\ref{eq:is:lagInterp}) is equal to $x[n]$ for the $2N+1$ integer values of $t$ in $[-N, N]$, the interpolator is indeed the unique solution to the problem.
\caption{Lagrange interpolation polynomials $L_n^{(2)}(t)$ for $n=-2,\ldots,2$. Note that $L_n^{(N)}(t)$ is zero for $t$ integer except for $t = n$, where it is $1$.}\label{fig:is:LagBasis}
As an example, Figure~\ref{fig:is:weightedLag} shows how the five polynomials $L_n^{(2)}(t)$ beautifully come together to interpolate a $5$-point discrete-time signal into the smooth curve in Figure~\ref{fig:is:finalLag}. A fundamental characteristic of polynomial interpolation is that it is {\it global}: for all values of $t$, \textit{all} the original discrete-time data points contribute to the instantaneous value of $x(t)$.
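The interpolation formula~(\ref{eq:is:lagInterp}) translates directly into code; here is a short NumPy sketch for $N = 2$ (the data values are arbitrary):
\begin{verbatim}
import numpy as np

def lagrange_basis(n, N, t):
    """L_n^{(N)}(t): product over k != n of (t - k)/(n - k)."""
    L = np.ones_like(t, dtype=float)
    for k in range(-N, N + 1):
        if k != n:
            L *= (t - k) / (n - k)
    return L

N = 2
x = np.array([0.5, -1.0, 2.0, 1.0, 0.0])     # x[-2], ..., x[2]
t = np.linspace(-N, N, 401)

x_t = sum(x[n + N] * lagrange_basis(n, N, t) for n in range(-N, N + 1))
# x_t is the unique degree-4 polynomial through the 5 data points
\end{verbatim}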
Although it elegantly solves the problem of finding the smoothest curve through a finite data set, polynomial interpolation suffers from two serious drawbacks. Firstly, although the Lagrangian basis provides a way to quickly compute the necessary coefficients, the set of basis vectors is completely different for every different length of the input data. From an engineering point of view it is obvious that we would like a more universal interpolation machine rather than having to redesign our interpolator as a function of problem size. The second difficulty, which is common to all types of polynomial fitting, is that the method becomes numerically ill-conditioned as the polynomial degree grows large.
A way out of these problems is to relax at least partially the smoothness constraint; this leads to the simpler interpolation schemes that we will look at in the next Section.
If we relax the smoothness constraint, we can design much simpler interpolation schemes. Consider for instance a device that builds a continuous-time signal simply by keeping its output constant and equal to the last discrete-time input data point between interpolation intervals:
\begin{equation*}
x_0(t) = x[\, \lfloor t + 0.5 \rfloor \,], \qquad -N \leq t \leq N;
\end{equation*}
the result of this interpolation method, known as Zero-Order Hold (ZOH), is shown in Figure~\ref{fig:is:zoh} for the same 5-point signal as in Figure~\ref{fig:is:diffint}-(a). The above expression can be rewritten as
\begin{equation} \label{eq:is:zoh}
x_0(t) = \sum_{n = -N}^{N}x[n]\rect(t - n),
\end{equation}
which shows a remarkable similarity to~(\ref{eq:is:lagInterp}). The differences, however, are extremely significant:
\begin{itemize}
	\item the continuous-time term in the sum (i.e.\ the rect function) is no longer dependent on the length of the original data set
	\item the dependence of the continuous-time term on the interpolation index (that is, on $n$) is only via a simple time shift
	\item the value of the output at any instant $t$ is dependent only on one input data point: the interpolation is \textit{local} rather than global.
\end{itemize}
In summary, the ZOH creates a continuous-time signal by stitching together delayed and scaled versions of the \textit{same} interpolation function, also known as the interpolation kernel, independently of the amount of data to interpolate. The resulting ``interpolation machine'' is both conceptually and practically a much simpler device. Of course the price to pay for this simplicity is a discontinuous signal, but we can easily extend the local interpolation paradigm to improve the result.
The kernel used in the ZOH is, as we have seen, the simple rect function shown in Figure~\ref{fig:is:interpKernels}-(a):
\begin{equation}
I_0(t) = \begin{cases}
1 & \mbox{ if } |t| < 1/2 \\
0 & \mbox{ otherwise}
\end{cases}
\end{equation}
The kernel is discontinuous, and so is the interpolated signal. Only one value of the input is used to produce the current value of the output.
The kernel used in first-order interpolation is the triangle function shown in Figure~\ref{fig:is:interpKernels}-(b):
\begin{equation}
I_1(t) = \begin{cases}
1- |t| & \mbox{ if } |t| < 1 \\
0 & \mbox{ otherwise}
\end{cases}
\end{equation}
The kernel is continuous (it belongs to $C^0$) and the same holds for the output signal. An example of first-order interpolation applied to the usual five-point data set is shown in Figure~\ref{fig:is:firstOrdInterp}. The interpolation strategy is akin to ``connecting the dots'' between discrete-time samples so that, at any time $t$, \textit{two} discrete-time samples contribute to the output value.
Although the first-order interpolator is only marginally more complex than the ZOH, the continuity of the output implies a much better quality of the result as can be appreciated visually for familiar signals as in Figure~\ref{fig:is:interpCompare}.
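For a quick comparison, the first-order interpolation of a finite data set can be computed directly from the kernel definition (a NumPy sketch; on the data interval the result coincides with \texttt{np.interp}):
\begin{verbatim}
import numpy as np

def i1(t):
    return np.where(np.abs(t) < 1, 1.0 - np.abs(t), 0.0)   # triangular kernel

N = 2
x = np.array([0.5, -1.0, 2.0, 1.0, 0.0])                   # x[-2], ..., x[2]
t = np.linspace(-N, N, 401)

x1_t = sum(x[n + N] * i1(t - n) for n in range(-N, N + 1))
# identical (over [-N, N]) to np.interp(t, np.arange(-N, N + 1), x)
\end{verbatim}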
In third-order interpolation\footnote{Even-order kernels are not used in practice since they are not symmetric around a central interpolation instant.} a commonly used kernel is the cubic piecewise function shown in Figure~\ref{fig:is:interpKernels}-(c):
The kernel belongs to $C^2$ (first and second derivatives are continuous) and the same holds for the output signal. The kernel's support is four and, as a consequence, the values of the continuous-time signal depend on four neighboring discrete-time samples. An example of third-order interpolation applied to the usual five-point data set is shown in Figure~\ref{fig:is:thirdOrdInterp}.
Local interpolation schemes can be extended to higher order kernels and, in general, a kernel of order $2N+1$ is composed of polynomials of degree $2N+1$ over a support of length $2N+2$. The resulting interpolation belongs to $C^{2N}$ and the lack of continuity in the derivative of order $2N+1$ ultimately will cause undesired high frequency content in the output, as we will see later.
The tradeoff seems however rather clear:
\begin{itemize}
\item global interpolation schemes such as polynomial interpolation provide maximum smoothness at the price of a complex procedure that depends on the length of the data set
\item local kernel-based methods are simple to implement but ultimately lack in smoothness.
\end{itemize}
Interestingly, however, as the size of the data set increases to infinity, the two methods become one and the same! This is perhaps one of the most remarkable mathematical results in discrete-time signal processing, and one that will lead naturally to the sampling theorem.
\subsection{Sinc Interpolation}
\label{sec:is:sincinterp}
Consider again the expression for the Lagrange polynomial (\ref{eq:is:lagPoly}) and let's try to determine what happens when the size of the data set to interpolate (and therefore the degree of the Lagrange basis vectors) goes to infinity:
\begin{equation}
	\lim_{N\rightarrow\infty} L^{(N)}_n(t) = \sinc\left(t - n \right).
\end{equation}
Remarkably, as $N$ goes to infinity, all Lagrange polynomials converge to simple time shifts of the same function and the function in question is the sinc; a graphical illustration of the convergence process for $L^{(N)}_0(t)$ is shown in Figure~\ref{fig:is:Lag2Sinc}. In other words, the maximally smooth interpolation for an infinite-length sequence can be obtained using a kernel-based method where the kernel is the sinc function:
\begin{equation} \label{eq:is:sincInterp}
	x(t) = \sum_{n = -\infty}^{\infty} x[n]\, \sinc\left(\frac{t - nT_s}{T_s}\right).
\end{equation}
Obviously, since the sinc is a two-sided, infinite support function, sinc interpolation falls in the same unattainable category as ideal filters: something that we can try to approximate but that we will never be able to exactly implement in practice. Nevertheless, the mathematical construction represents a fundamental cornerstone of the translation machinery between discrete and continuous time and will represent the starting point for the development of sampling theory.
Figure~\ref{fig:is:sincInterp} shows how the scaled and time-shifted copies of the sinc kernel come together to interpolate a discrete-time sequence (in the figure, $T_s = 1$). Note how the interpolation property of the sinc that we introduced in Section~\ref{sec:is:SincProp} guarantees that $x(nT_s) = x[n]$.
\caption{Sinc interpolation: discrete-time signal (top); first four scaled copies of the sinc kernel for $n = 0,1,2,3$ (center); final interpolation (bottom).}\label{fig:is:sincInterp}
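Numerically, the (necessarily truncated) sinc interpolation of a finite-length sequence is a direct transcription of the formula (a NumPy sketch with $T_s = 1$ and arbitrary data):
\begin{verbatim}
import numpy as np

x = np.array([0.5, -1.0, 2.0, 1.0, 0.0])          # x[0], ..., x[4], T_s = 1
t = np.linspace(-2.0, 6.0, 801)

x_t = sum(x[n] * np.sinc(t - n) for n in range(len(x)))
# x_t equals x[n] at t = n (interpolation property of the sinc)
\end{verbatim}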
\itempar{Spectral Properties of the Sinc Interpolation.}
The interpolation kernel $\sinc(t/T_s)$ is $F_s$-bandlimited, with $F_s = 1/T_s$. Therefore the sinc interpolation of a discrete-time sequence, being a linear combination of bandlimited functions, is itself bandlimited. If the DTFT $X(e^{j\omega})$ of the discrete-time sequence is well defined, the spectrum of the interpolated signal turns out to be
\[
	X(f) = \frac{1}{F_s}\, X\bigl(e^{j2\pi f/F_s}\bigr)\, \rect\!\left(\frac{f}{F_s}\right).
\]
In other words, the continuous-time spectrum is just a scaled and stretched version of the DTFT of the discrete-time sequence between $-\pi$ and $\pi$. The duration of the interpolation interval $T_s$ is inversely proportional to the resulting bandwidth of the interpolated signal. Intuitively, a slow interpolation ($T_s$ large) results in a spectrum concentrated around the low frequencies; conversely, a fast interpolation ($T_s$ small) results in a spread-out spectrum (more high frequencies are present).\footnote{To find a simple everyday (yesterday's?) analogy, think of a $45$~rpm vinyl\index{vinyl} record played at either $33$~rpm (slow interpolation) or at $78$~rpm (fast interpolation) and remember the acoustic effect on the sounds.} Figure~\ref{fig:is:interpSpeed} illustrates the concept graphically.
\caption{Spectral characteristics of a sinc-interpolated signal: DTFT of the original discrete-time signal $x[n]$ (top); Fourier transform of its sinc interpolation with interpolation interval $T_s = T_0$ (second panel); spectrum of a sinc interpolation using a slower interpolation rate ($T_s = 2T_0$): spectrum concentrates in the low frequencies around the origin (third panel); spectrum of a sinc interpolation using a faster interpolation rate ($T_s = T_0/2$): spectrum spreads out to higher frequencies (fourth panel); the spectral shape and total energy remain the same.}\label{fig:is:interpSpeed}
In the previous Section we derived the sinc interpolation scheme as the limit of polynomial interpolation applied to infinite-length discrete-time sequences. A consequence of the previous result is that any square-summable discrete-time signal can be interpolated into a continuous-time signal which is smooth in time and strictly bandlimited in frequency. This suggests that the class of bandlimited functions must play a special role in bridging the gap between discrete and continuous time and this deserves further investigation. In particular, since a discrete-time signal can be interpolated exactly into a
bandlimited function, we now ask ourselves whether the converse is true: can any bandlimited signal be transformed into a discrete-time signal with no loss of information? The answer, once again, will show us the power of representing signals in an appropriate vector space.
\itempar{The Space of Bandlimited Signals.}
The class of $F_s$-bandlimited functions with finite energy is a Hilbert space, with the inner product defined by~(\ref{eq:is:inner}). An orthogonal basis for this space can be obtained using the prototypical $F_s$-bandlimited function, that is, $\sinc(t/T_s)$; indeed, consider the family\index{basis!sinc}:
\begin{equation}
\varphi^{(n)}(t) = \sinc\left(\frac{t-nT_s}{T_s}\right), \qquad\quad n \in \mathbb{Z}
\end{equation}
where, once again, $T_s = 1/F_s$. Note that we have $\varphi^{(n)}(t) = \varphi^{(0)}(t -nT_s)$ so that each basis function is simply a shifted version of the prototype basis function $\varphi^{(0)}$. Orthogonality can easily be proved as follows: first of all, because of the symmetry of the sinc function and the time-invariance of the convolution, we can write
so that $\bigl\{\varphi^{(n)}(t) \bigr\}_{n\in \mathbb{Z}}$ is orthogonal with normalization factor $1/T_s$.
In order to show that the space of $F_s$-bandlimited functions is indeed a Hilbert space, we should also prove that the space is complete. This is a
more delicate notion to show\footnote{Completeness of the sinc basis can be proven as a consequence of the completeness of the Fourier series in the continuous-time domain.} and here it will simply be assumed.
\itempar{Sampling as a Basis Expansion.}
Now that we have an orthogonal basis, we can compute the coefficients in the basis expansion of an arbitrary $F_s$-bandlimited function $x(t)$. We have
\begin{align}
	\bigl\langle \varphi^{(n)}(t), x(t) \bigr\rangle
	& = \int_{-\infty}^{\infty} \sinc\left(\frac{t - nT_s}{T_s}\right) x(t)\, dt
	  = \left. \Bigl( \sinc(t/T_s) \ast x(t) \Bigr) \right|_{t = nT_s} \nonumber \\
	& = \int_{-\infty}^{\infty} \frac{1}{F_s}\, \rect\left(\frac{f}{F_s}\right) X(f)\, e^{j2\pi f n T_s}\, df\\
	& = \frac{1}{F_s} \int_{-F_s/2}^{F_s/2} X(f) \, e^{j2\pi f n T_s}\, df \label{NBLProj}\\
	& = T_s \, x(nT_s)
\end{align}
In the derivation, we have first rewritten the inner product as a convolution operation, then applied the convolution theorem, and finally recognized the penultimate line as the inverse Fourier transform of $X(f)$ evaluated at $t = nT_s$. We therefore have the remarkable result that the $n$-th basis expansion coefficient is \emph{equal to the sampled value of} $x(t)$ at $t = nT_s$ up to a scaling factor $T_s$. For this reason, the sinc basis expansion is also called \emph{sinc sampling}.
Reconstruction of $x(t)$ from its projections can now be achieved via the orthonormal basis reconstruction formula~(\ref{eq:vs:synthesis}); since the sinc basis is only orthogonal, rather than orthonormal, the formula needs to take into account the normalization factor and we have
which corresponds to the interpolation formula~(\ref{eq:is:sincInterp}).
\itempar{The Sampling Theorem.}
\index{sampling theorem|mie}%
\index{sampling!theorem}
We have now all the elements in hand to formally enunciate the sampling theorem:
\begin{quote}
	If $x(t)$ is an $F_s$-bandlimited continuous-time signal, a \emph{sufficient} representation of $x(t)$ is given by the discrete-time signal $x[n] = x(nT_s)$, with $T_s = 1/F_s$. The continuous-time signal $x(t)$ can be exactly reconstructed from the discrete-time signal $x[n]$ as
	\[
		x(t) = \sum_{n = -\infty}^{\infty} x[n]\, \sinc\left(\frac{t - nT_s}{T_s}\right).
	\]
\end{quote}
The proof of the theorem is inherent to the properties of the Hilbert space of bandlimited functions, and is trivial once the existence of an orthogonal
basis has been fully proven (but we skipped the completeness part).
The theorem gives us a sufficient condition for converting a continuous-time signal into a discrete-time sequence with no loss of information. If there exists a frequency $f_N$ so that $X(f) = 0$ for $|f|>f_N$, then $x(t)$ is $f$-bandlimited for all choices of $f > 2f_N$; this means that we can safely sample $x(t)$ with any sampling period smaller than $1/(2f_N)$ or, alternatively, with any sampling frequency larger than twice the maximum frequency
\[
F_s > 2f_N.
\]
The original signal $x(t)$ can in this case be perfectly reconstructed from the sequence of samples via sinc interpolation.
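The theorem is easy to verify numerically; in the following sketch a truncated sinc sum stands in for the ideal reconstruction formula, so the residual error is due to truncation only (all parameter values are arbitrary):
\begin{verbatim}
import numpy as np

Fs = 100.0                                # sampling frequency (Hz)
Ts = 1.0 / Fs
f0 = 12.5                                 # a frequency well below Fs/2
n = np.arange(-500, 501)
x_n = np.cos(2 * np.pi * f0 * n * Ts)     # the samples x[n] = x(nTs)

t = np.linspace(-1.0, 1.0, 2001)          # reconstruction instants (seconds)
x_rec = sum(x_n[i] * np.sinc((t - n[i] * Ts) / Ts) for i in range(len(n)))

print(np.max(np.abs(x_rec - np.cos(2 * np.pi * f0 * t))))   # small truncation error
\end{verbatim}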
In practice, if we know that the spectral content of a continuous-time signal is zero (or becomes \textit{de facto} negligible) above a certain maximum positive frequency $f_N$, then we know that we can safely sample it for all sampling frequencies larger than twice this maximum frequency. In the next Section we will study what happens when this bandlimitedness condition is not fully satisfied or satisfiable.
\section{Aliasing}\index{aliasing|(}
``Naive'' sampling, as we have mentioned, is associated to the very intuitive idea of measuring the value of a phenomenon of interest over time in order to record its evolution. We have all done it in our lives, whether to monitor the temperature of a cake we are cooking or to keep track of weather patterns. In the previous section we have formalized this intuition and shown the condition under which the sampling operation entails no loss of information and allows for perfect reconstruction.
The mathematical derivations showed that sampling should be interpreted as the decomposition of a continuous-time signal into a linear combination of sinc basis functions and that the samples are in fact the coefficients of this linear combination. Since the sinc basis is orthogonal, the process to determine the expansion coefficients is formally identical to what we could call ``raw sampling'', that is, the inner product between the continuous-time signal and the $n$-th basis vector is simply the scaled value of the signal at $t = nT_s$:
The equivalence between sinc sampling and raw sampling of course only holds if $x(t)$ is bandlimited to $F_s = 1/T_s$. If this is not the case but we still want to sample every $T_s$ seconds, then the approximation properties of orthogonal bases described in Section~\ref{sec:vs:approx} state that the minimum-MSE discrete-time representation of $x(t)$ is given by the samples \emph{of its projection over the space of $F_s$-bandlimited signals\/}. Indeed, we can easily reformulate~(\ref{eq:is:sincSamp}) as:
this shows that sinc sampling is equivalent to filtering $x(t)$ with a continuous-time, ideal lowpass filter with cutoff frequency $F_s/2$, followed by raw sampling, as shown in Figure~\ref{fig:is:sincSampling}. The filter truncates the spectrum of the signal outside of the $[-F_s/2, F_s/2]$ interval; for an already-bandlimited signal the filtering operation is obviously irrelevant (up to a scaling factor) and therefore sinc and raw sampling coincide. This interpretation also gives us the ``recipe'' for sampling real-world signals: given a sampling period of choice $T_s$, filter the input with a good lowpass filter with cutoff frequency $1/(2T_s)$ and then measure the output at regular intervals.
\caption{Sinc sampling interpreted as projection over the space of $F_s$-bandlimited functions (lowpass filtering) followed by raw sampling.}\label{fig:is:sincSampling}
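The recipe can be illustrated with a short sketch (Python with NumPy and SciPy; the dense grid emulating continuous time, the FIR length and all the rates are arbitrary choices, and a windowed-sinc FIR stands in for the unrealizable ideal analog lowpass):
\begin{verbatim}
import numpy as np
from scipy.signal import firwin, filtfilt

# Sketch of the sampling "recipe": lowpass at F_s/2, then measure every T_s seconds.
dense_rate = 1000.0                          # Hz, the "continuous time" grid
t = np.arange(0.0, 2.0, 1.0 / dense_rate)
x = np.sin(2 * np.pi * 5 * t) + 0.3 * np.sin(2 * np.pi * 300 * t)   # in-band + out-of-band

F_s = 50.0                                   # target sampling rate
h = firwin(501, F_s / 2, fs=dense_rate)      # lowpass with cutoff F_s/2 = 25 Hz
x_filtered = filtfilt(h, [1.0], x)           # zero-phase filtering of the dense signal

step = int(dense_rate / F_s)                 # "measure the output at regular intervals"
samples = x_filtered[::step]                 # 50 samples/s; the 300 Hz term is strongly attenuated
\end{verbatim}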
In signal processing practice, however, there are many situations in which this idealized setup does not hold:
\begin{itemize}
\item good approximations to a continuous-time ideal filter are difficult to build (continuous-time filters are made of resistors, capacitors and coils!)
\item we may think the signal is bandlimited but in reality it is not
\item even with a good filter, out of band noise will leak into the raw sampler
\end{itemize}
In these and in many other cases, the hypothesis of perfect bandlimitedness required by the sampling theorem is not fulfilled. We will therefore now study in detail what happens when we raw-sample a signal at a rate that is not adapted to its spectral support.
\subsection{Aliasing: Intuition}
In the rest of this section we will assume that we are able to produce a discrete-time sequence by measuring the instantaneous value of a continuous-time signal at multiples of a sampling period $T_s$, as in Figure~\ref{fig:is:rawSampling}.
The questions we are interested in concern the potential loss of information introduced by raw sampling, a phenomenon called ``aliasing''. We will start developing the intuition by considering the problem of raw-sampling a sinusoid.
\itempar{Sampling of Sinusoids.}\index{complex exponential!aliasing}
Consider the simple continuous-time signal
\begin{equation}\label{eq:is:CTsinusoid}
x(t) = e^{j2\pi f_0 t}
\end{equation}
that represents a complex oscillation at frequency $f_0$. We can visualize the signal as the position of a point that rotates around the unit circle on the complex plane with an angular speed of $2\pi f_0$ radians per second. Note that, in continuous time, a sinusoidal signal is always periodic with period $T=1/f_0$ (in discrete time this only happens for angular frequencies that are rational multiples of $2\pi$) and that all angular speeds are allowed (while in discrete time we are limited to a maximum forward speed of $\pi$ radians per sample).
Clearly, since $x(t)$ contains only one frequency, it is $f$-bandlimited for all $f > 2|f_0|$. The raw-sampled version of the continuous-time complex exponential is the discrete-time sequence
\begin{equation}
x[n] = e^{j2\pi (f_0/F_s)n} = e^{j\omega_0 n}
\end{equation}
where $F_s = 1/T_s$ is the sampling frequency of the raw sampler. If the frequency of the sinusoid satisfies $|f_0| < F_s/2$, then $\omega_0 \in (-\pi, \pi)$ and the frequency of the original sinusoid can be univocally determined from the sampled signal; in other words, we can reconstruct the original continuous-time signal from its raw samples. Now assume that $f_0 = F_s/2$; we have
\[
x[n] = e^{j\pi n} = e^{-j\pi n}.
\]
In other words, we encounter a first ambiguity with respect to the direction of rotation of the complex exponential: from the sampled signal we cannot determine whether the original frequency was $f_0 = F_s/2$ or $f_0 = -F_s/2$. If we increase the frequency of the oscillation even more, say $f_0 = (1+\alpha)(F_s/2)$, we have
\[
x[n] = e^{j(1+\alpha)\pi n} = e^{-j(1-\alpha)\pi n};
\]
if we try to infer the original frequency from the sampled sinusoid, we cannot discriminate between a counterclockwise rotation with $f_0 = (1+\alpha)(F_s/2)$ and a clockwise rotation with $f_0 = -(1-\alpha)(F_s/2)$. Two different frequencies are mapped to the same digital frequency (hence the term ``aliasing'', indicating two or more entities sharing the same name). Finally, if $f_0$ grows to be larger than $F_s$ we can see the full obliterating effects of aliasing: write $f_0 = kF_s + f_1$ with $0 \le f_1 < F_s$, that is, $f_1 = f_0 \bmod F_s$; we have
\begin{equation}\label{eq:is:twoSinAlias}
    x[n] = e^{j2\pi\frac{kF_s + f_1}{F_s}\,n} = e^{j2\pi k n}\, e^{j2\pi (f_1/F_s)\, n} = e^{j2\pi (f_1/F_s)\, n},
\end{equation}
so that the sampled sequence is indistinguishable from the samples of a complex exponential at the much lower frequency $f_1$.
\dspFunc[linecolor=dspCTColor]{x 3 mul 360 mul cos}
\sampCos{2.9}
\end{dspPlot}
\caption{Graphical illustration of aliasing. Top panel: 3Hz-sinusoidal signal $x(t) = \sin(6\pi t)$ (note that it covers 3 full periods in one second); second panel: raw samples at $F_s = 50$Hz (no aliasing); third panel: raw samples at $F_s = 2.9$Hz (full aliasing); bottom panel: wider view of the signal sampled at 2.9Hz, showing how the samples describe a sinusoid of frequency $f_1 = 0.1$Hz (one period in 10 seconds); indeed, we can write $f_0 = F_s + f_1$ as in the derivation of~(\ref{eq:is:twoSinAlias}).}\label{fig:is:aliasSinusoid}
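The figure's numbers are easy to verify with a few lines of Python/NumPy (a sketch; the sampling rate and the number of samples are the illustrative values used in the figure):
\begin{verbatim}
import numpy as np

# Sketch: the samples of a 3 Hz sinusoid taken at F_s = 2.9 Hz coincide with the
# samples of a 0.1 Hz sinusoid (f_0 = F_s + f_1, as in the figure).
F_s = 2.9
n = np.arange(30)
print(np.allclose(np.sin(2 * np.pi * 3.0 * n / F_s),
                  np.sin(2 * np.pi * 0.1 * n / F_s)))    # True
\end{verbatim}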
\itempar{Energy Folding of the Fourier Transform.}
Given a raw sampling frequency $F_s$, consider the continuous-time signal
\[
x(t) = Ae^{j2\pi f_a t} + Be^{j2\pi f_b t}
\]
where $f_b = f_a + F_s$. The sampled version of this signal is
\begin{align*}
x[n] &= A\, e ^{j2\pi(f_a/F_s)n} + B\, e ^{j2\pi(f_b/F_s) n}\\
&= A\, e ^{j2\pi(f_a/F_s)n} + B\, e ^{j2\pi(f_a/F_s + 1) n}\\
&= A\, e ^{j\omega_a n} + B\, e ^{j\omega_a n}e^{j2\pi n} \\
&= (A+B) \,e ^{j\omega_a n}.
\end{align*}
In other words, two continuous-time exponentials that are $F_s$~Hz apart are indistinguishable, once sampled at $F_s$~Hz, from a single discrete-time complex exponential whose amplitude is equal to the sum of the amplitudes of the original sinusoids.
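Again, this is straightforward to check numerically (a Python/NumPy sketch with arbitrary values for $A$, $B$, $f_a$ and $F_s$):
\begin{verbatim}
import numpy as np

# Sketch: two complex exponentials F_s Hz apart collapse, after sampling at F_s,
# onto a single discrete-time exponential whose amplitude is the sum A + B.
F_s, f_a, A, B = 100.0, 13.0, 1.0, 0.5
f_b = f_a + F_s
n = np.arange(64)
x = A * np.exp(2j * np.pi * (f_a / F_s) * n) + B * np.exp(2j * np.pi * (f_b / F_s) * n)
print(np.allclose(x, (A + B) * np.exp(2j * np.pi * (f_a / F_s) * n)))   # True
\end{verbatim}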
To understand what happens to a general signal after raw sampling, consider the interpretation of the inverse Fourier transform as a bank of (infinitely many) complex oscillators, each initialized with its own amplitude and phase, and each contributing to the energy content of the signal at its respective frequency. Sampling the original signal amounts to sampling its inverse FT, that is, to sampling all of the contributing complex exponentials. In the
sampled version, any two frequencies $F_s$ apart become indistinguishable and so the contributions of all sampled oscillations at multiples of $F_s$ add up to the same discrete-time oscillation in the spectrum of the sampled signal. This aliasing can be represented as a spectral \emph{superposition}: the continuous-time spectrum above $F_s/2$ is shifted back to $-F_s/2$, summed over $[-F_s/2, F_s/2]$, and the process is repeated again and again; the
same applies for the spectrum below $-F_s/2$, as illustrated in Figure~\ref{fig:is:periodization}. This process is the familiar periodization of a signal: copies of the original spectrum, shifted by all integer multiples of $F_s$, are summed together.
In the following, we consider the relationship between the DTFT of a raw-sampled signal $x[n]$ and the FT of the original continuous-time signal $x_c(t)$. For clarity, we add the subscript ``$c$'' to all continuous-time quantities, e.g.:
\[
x[n] = x_c(nT_s);
\]
also, all periodic functions will be denoted by the usual tilde notation.
Assume that $x_c(t)$ is absolutely integrable and sufficiently well behaved, so that the sampled sequence $x[n] = x_c(nT_s)$ is absolutely summable and therefore its DTFT $X(e^{j\omega})$ is well defined. Using the inversion formula for the Fourier transform we can write
\begin{equation}\label{eq:is:alpo}
    x[n] = x_c(nT_s) = \int_{-\infty}^{\infty} X_c(f)\, e^{j2\pi f T_s n}\, df;
\end{equation}
our goal is to manipulate this expression until it takes the form of an inverse DTFT,
\[
    x[n] = \frac{1}{2\pi}\int_{-\pi}^{\pi} g(\omega)\, e^{j\omega n}\, d\omega,
\]
at which point we will be able to establish that the DTFT of the raw-sampled sequence is the function $g(\omega)$.
First of all, we know from the discussion in the previous section that all frequencies $F_s$ apart will give indistinguishable contributions to the discrete-time spectrum; we can therefore split the integration in~(\ref{eq:is:alpo}) into a sum of integrals over contiguous, non-overlapping intervals of width equal to $F_s$:
\[
    x[n] = \sum_{k = -\infty}^{+\infty} \int_{-kF_s - F_s/2}^{-kF_s + F_s/2} X_c(f)\, e^{j2\pi f T_s n}\, df.
\]
Now we perform the change of variable $f \rightarrow f + kF_s$ so that the integration limits of each term become $\pm F_s/2$ and $e^{j2\pi(f - kF_s)T_s n} = e^{j2\pi f\, T_s n}$:
\[
    x[n] = \sum_{k = -\infty}^{+\infty} \int_{-F_s/2}^{F_s/2} X_c(f - kF_s)\, e^{j2\pi f T_s n}\, df.
\]
After interchanging the order of integration and summation (which can be safely done if $x_c(t)$ is absolutely integrable as we have assumed), we obtain
\[
    x[n] = \int_{-F_s/2}^{F_s/2} \Big[ \sum_{k = -\infty}^{+\infty} X_c(f - kF_s) \Big]\, e^{j2\pi f T_s n}\, df,
\]
and we can recognize the term in brackets as the periodization of the original spectrum $X_c(f)$. Define the $F_s$-periodic function
\begin{equation}\label{eq:is:periodizedFT}
    \tilde{X}_c(f) = \sum_{k = -\infty}^{+\infty} X_c(f - kF_s);
\end{equation}
with the further change of variable $\omega = 2\pi f/F_s$ (so that $f = \pm F_s/2$ maps to $\omega = \pm\pi$) we finally have
\[
    x[n] = \int_{-F_s/2}^{F_s/2} \tilde{X}_c(f)\, e^{j2\pi f T_s n}\, df = \frac{1}{2\pi}\int_{-\pi}^{\pi} \frac{1}{T_s}\,\tilde{X}_c\!\left(\frac{\omega}{2\pi}\,F_s\right) e^{j\omega n}\, d\omega,
\]
so that, by comparison with the DTFT inversion formula, $X(e^{j\omega}) = g(\omega) = (1/T_s)\,\tilde{X}_c\!\left(\frac{\omega}{2\pi}\,F_s\right)$.
In other words, the DTFT of a raw-sampled signal is the Fourier transform of the signal, periodized with period $F_s$ and rescaled to be $2\pi$-periodic. The result is known in Fourier theory under the name of \emph{Poisson sum formula}. Explicitly,
\[
    X(e^{j\omega}) = \frac{1}{T_s}\sum_{k = -\infty}^{+\infty} X_c\!\left(\frac{\omega}{2\pi}\,F_s - kF_s\right).
\]
When $x_c(t)$ is $F_s$-bandlimited, the copies in the periodized spectrum do not overlap and the $2\pi$-periodic discrete-time spectrum between $-\pi$ and $\pi$ is simply\index{aliasing|mie}
\begin{equation}\label{eq:is:noAliasingSpecEq}
    X(e^{j\omega}) = \frac{1}{T_s}\, X_c\!\left(\frac{\omega}{2\pi}\,F_s\right), \qquad \omega \in [-\pi, \pi].
\end{equation}
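As a quick numerical sanity check of this relation (a Python/NumPy sketch; the Gaussian test signal, the sampling period and the truncation limits are arbitrary choices):
\begin{verbatim}
import numpy as np

# Sketch: verify the Poisson sum formula for the Gaussian x_c(t) = exp(-pi t^2),
# whose Fourier transform is X_c(f) = exp(-pi f^2).
T_s = 0.5
F_s = 1.0 / T_s
omega = np.linspace(-np.pi, np.pi, 7)

n = np.arange(-60, 61)
dtft = np.array([np.sum(np.exp(-np.pi * (n * T_s) ** 2) * np.exp(-1j * w * n))
                 for w in omega])                          # DTFT of the samples

k = np.arange(-60, 61)
periodized = np.array([np.sum(np.exp(-np.pi * (w / (2 * np.pi) * F_s - k * F_s) ** 2))
                       for w in omega]) / T_s              # (1/T_s) * periodized FT

print(np.allclose(dtft, periodized))                       # True
\end{verbatim}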
Figures \ref{fig:is:alias1} to \ref{fig:is:alias4} illustrate four different prototypical examples of the relationship between the continuous-time spectrum and the discrete-time spectrum. For all figures, the top panel shows the continuous-time spectrum $X_c(f)$, with labels indicating the sampling frequency. The middle panel shows the periodic function $\tilde{X}_c(f)$ defined in~(\ref{eq:is:periodizedFT}); the repeated, shifted copies of the spectrum are plotted with a dashed line (but they are not visible if there is no overlap), with the main period highlighted. Finally, the last panel shows the DTFT of the sampled sequence over the customary $[-\pi,\pi]$ interval.
Figure~\ref{fig:is:alias1} shows the result of sampling a bandlimited signal with a sampling frequency in excess of the minimum (in this case, $F_s = 3f_N$, where $f_N$ indicates the maximum positive frequency in the original spectrum); in this case we say that the signal has been \emph{oversampled}. The result is that spectral copies do not overlap in the periodization so that the discrete-time spectrum is just a scaled version of the original spectrum,
with a narrower support than the full $[-\pi, \pi]$ range because of the oversampling (in this case $\omega_{\max} = 2\pi / 3$).
\BLcase{Oversampling a continuous-time signal with a sampling frequency larger than twice the maximum positive frequency; no aliasing.}{\dspQuad}{0.666 }{1}
Figure~\ref{fig:is:alias2} shows the result of sampling a bandlimited signal with a sampling frequency exactly equal to twice the maximum positive frequency; in this case we say that the signal has been \emph{critically sampled}. In the periodized spectrum, once again the copies do not overlap and the discrete-time spectrum is a scaled version of the original spectrum occupying the whole $[-\pi, \pi]$ range.
\BLcase{Critically sampling a continuous-time signal with a sampling frequency equal to twice the maximum positive frequency; no aliasing.}{\dspQuad}{1 }{2}
Figure~\ref{fig:is:alias3} shows the result of sampling a bandlimited signal with a sampling frequency less than twice the maximum frequency. In this case, copies do overlap in the periodized spectrum and the resulting discrete-time spectrum is an aliased version of the original; the continuous-time signal can\emph{not} be reconstructed from the sampled signal. Note, in particular, that the original lowpass spectral characteristic becomes a highpass shape in
the sampled domain (energy at $\omega=\pi$ is larger than at $\omega=0$). This example allows us to guess the type of distortion introduced by aliasing: if the source is an audio signal such as music (a naturally lowpass signal), aliasing would introduce extremely disruptive, spurious high-frequency content that would render the signal almost unintelligible. Because of the additive nature of the spectral periodization, no amount of post-processing in the digital domain can completely remove the aliased components, making the discrete-time data all but unusable.
\BLcase{Undersampling a continuous-time signal with a sampling frequency smaller than twice the maximum positive frequency; aliasing is incurred.}{\dspQuad}{1.5 }{3}
Figure~\ref{fig:is:alias4} shows the result of sampling a non-bandlimited signal with a sampling frequency chosen as a tradeoff between aliasing and the number of samples per second. The idea is to disregard the low-energy ``tails'' of the original spectrum so that their aliased contribution does not excessively impact the discrete-time spectrum. In the periodized spectrum, copies do overlap and the resulting discrete-time spectrum is an aliased version of the original, which is nevertheless similar to the original because of the low amplitude of the aliased tails. This scenario also occurs when the original signal is bandlimited but out-of-band noise leaks into the sampler.
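As a rough numerical illustration of this tradeoff (a Python/NumPy sketch; the test signal and the rate are arbitrary choices), consider $x_c(t) = e^{-|t|}$, whose Fourier transform $X_c(f) = 2/(1 + (2\pi f)^2)$ is not bandlimited but decays quickly:
\begin{verbatim}
import numpy as np

# Sketch: x_c(t) = exp(-|t|) has X_c(f) = 2 / (1 + (2 pi f)^2), which never vanishes;
# still, at F_s = 20 Hz the spectral value folded back from f = F_s/2 is tiny
# compared to the peak at f = 0.
X_c = lambda f: 2.0 / (1.0 + (2 * np.pi * f) ** 2)
F_s = 20.0
print(X_c(F_s / 2) / X_c(0))    # about 2.5e-4: the aliased tails are small
\end{verbatim}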
\itempar{Sinc Sampling of Non-Bandlimited Signals.}
Figure~\ref{fig:is:alias5} shows the preferred design when sampling non-bandlimited signals. An ideal continuous-time lowpass filter $H(f)$ with cutoff frequency $F_s/2$ is used to limit the spectral support of $x_c(t)$. The periodization of the spectrum after the filter is alias free and the only distortion introduced by the process is the \textit{controlled} loss of high-frequency content induced by the anti-alias filter $H(f)$. This procedure is the sinc sampling operation described at the beginning of this section and, as we showed, it is optimal with respect to the mean square error between the original and the reconstructed signal.
\caption{Sampling a non-bandlimited continuous-time signal with the use of an anti-alias filter $H(f)$ with cutoff frequency $F_s/2$.}\label{fig:is:alias5}
\section{Discrete-Time Processing of Analog Signals (and vice-versa)}
\label{sec:is:dtctproc}
Sampling and interpolation (or, more precisely, the A/D and D/A conversion strategies which we will see in the next chapter) represent the entry and exit points of the powerful processing paradigm for which DSP is ``famous''. Samplers and interpolators are the only interfaces with the physical world, while all the processing and analysis are performed in the abstract, dimensionless world of a general-purpose microprocessor. In this section we will see how to use digital filters to process continuous-time signals and then we will study how the continuous-time world can help us shed some light on the inner workings of some of the more ``esoteric'' digital filters in current use.
\subsection{Impulse Invariance}
Consider the hypothetical machine shown in Figure~\ref{fig:is:ctproc}; it reflects the general structure of a device designed to accept and output real-world (i.e.\ continuous-time) signals while performing the processing digitally. In most cases, the sampler's and the interpolator's frequencies are the same, and they are chosen as a function of the bandwidth of the class of signals for which the device is intended; let us assume that the input is $F_s$-bandlimited and that the sampling period is $T_s$. For the case in which the processing block is a linear filter $H(z)$, the overall processing chain implements the equivalent of a continuous-time filter. Indeed, from~(\ref{eq:is:noAliasingSpecEq}) and~(\ref{eq:is:interpSpec}) we obtain the chain of relations
\[
    Y_c(f) = T_s\, Y(e^{j2\pi f/F_s}) = T_s\, H(e^{j2\pi f/F_s})\, X(e^{j2\pi f/F_s}) = H(e^{j2\pi f/F_s})\, X_c(f), \qquad |f| < F_s/2,
\]
which corresponds to the device acting as a continuous-time filter with frequency response
\begin{equation}\label{eq:is:dtpcts}
H_c(f) = H(e^{j2\pi f/ F_s}).
\end{equation}
So, for instance, if $H(z)$ is a lowpass filter with cutoff frequency $\pi/3$, the processing chain in Figure~\ref{fig:is:ctproc} would implement an
analog lowpass filter with cutoff frequency $F_s / 6$. By adjusting the design of the filter (and/or the sampling rate) we can implement any desired characteristic purely in the digital domain.
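For instance, a small helper like the following (a Python sketch; the function name and the $48$~kHz rate are purely illustrative) maps a digital cutoff to the analog cutoff implemented by the chain:
\begin{verbatim}
import numpy as np

# Sketch: analog cutoff implemented by a digital lowpass with cutoff omega_c placed
# between a sampler and an interpolator running at F_s, via H_c(f) = H(exp(j 2 pi f / F_s)).
def analog_cutoff(omega_c, F_s):
    return omega_c * F_s / (2 * np.pi)

print(analog_cutoff(np.pi / 3, 48000.0))   # 8000.0 Hz, i.e. F_s / 6
\end{verbatim}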
Consider now the processing chain in Figure~\ref{fig:is:dtproc}; as opposed to the previous case, this is a purely virtual setup that we can use to understand how to perform certain operations that are simple in the continuous-time domain but not straightforward in discrete time. In the processing chain, the interpolator converts a discrete-time input to a continuous-time signal, a filter is applied and finally the result is sampled again. Linear filters cannot increase the bandwidth of a signal, so the rates at the interpolator and at the sampler can safely be the same. Since the timebase $T_s$ can be arbitrary, let us pick $T_s = 1$. With this we can use~(\ref{eq:is:interpSpec}) and~(\ref{eq:is:noAliasingSpecEq}) to write
\[
    Y(e^{j\omega}) = Y_c\!\left(\frac{\omega}{2\pi}\right) = H_c\!\left(\frac{\omega}{2\pi}\right) X_c\!\left(\frac{\omega}{2\pi}\right) = H_c\!\left(\frac{\omega}{2\pi}\right) X(e^{j\omega}).
\]
In other words, the relationship between the discrete-time input and output is the same as the relationship between their continuous-time counterparts, and it is determined by the characteristics of the continuous-time filter; the overall effect is that of a digital filter with frequency response
\[
    H(e^{j\omega}) = H_c\!\left(\frac{\omega}{2\pi}\right), \qquad \omega \in [-\pi, \pi].
\]
We know that a discrete-time delay of $d$ samples is a linear system with transfer function $H(z) = z^{-d}$ (or, alternatively, $H(e^{j\omega}) = e^{-j\omega d}$). If $d$ is {\em not} an integer, the transfer function is still formally valid and we have stated in the past that it implements a so-called {\em fractional delay,} i.e.\ a delay which lies ``in between'' two integer delays. The duality framework will allow us to understand exactly how this works.
Remember that in continuous time we can delay a signal by any amount and that the Fourier transform of a delayed signal is
\[
\mbox{FT}\left(x_c(t-d)\right) = e^{-j2\pi d f}X_c(f)
\]
that is, the delay operator can be seen as the result of a continuous-time filter with frequency response $H_c(f) = e^{-j2\pi d f}$. If we use this filter in
Figure~\ref{fig:is:dtproc}, we have that $y_c(t) = x_c(t - d)$ and the same relationship is valid for the associated discrete-time signals. Indeed, the overall transfer function implemented by the system is, according to~(\ref{eq:is:dtpcts}),
\[
    H(e^{j\omega}) = H_c\!\left(\frac{\omega}{2\pi}\right) = e^{-j\omega d}.
\]
The implicit action of a fractional delay is therefore to ``resample'' the signal at intermediate points between the original sampling instants. We can appreciate this also by considering the impulse response
\[
h[n] = \sinc(n - d);
\]
if we write out the convolution explicitly, we have
\[
    y[n] = \sum_{k = -\infty}^{+\infty} x[k]\, \sinc(n - d - k),
\]
which is the original signal, sinc-interpolated with $T_s = 1$ and evaluated at $t = n - d$.
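The following Python/NumPy sketch (filter length, test frequency and delay value are arbitrary) approximates the fractional delay by truncating the sinc impulse response to a finite FIR and checks the result against the analytically delayed sinusoid:
\begin{verbatim}
import numpy as np

# Sketch: truncated-sinc FIR approximating a fractional delay of d = 0.5 samples,
# applied to a slowly varying sinusoid (T_s = 1).
d = 0.5
k = np.arange(-40, 41)
h = np.sinc(k - d)                          # truncated ideal impulse response sinc(n - d)

n = np.arange(200)
x = np.cos(2 * np.pi * 0.05 * n)
y = np.convolve(x, h)[40:240]               # align so that y[n] approximates x_c(n - d)

err = np.abs(y[60:140] - np.cos(2 * np.pi * 0.05 * (n[60:140] - d)))
print(np.max(err))                          # small; shrinks (slowly) as the filter gets longer
\end{verbatim}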
\itempar{A Digital Differentiator}\index{differentiator!exact}
In Section~\ref{elopsSec} we introduced an approximation to the differentiation operator in discrete time as the first-order difference between neighboring samples. To obtain a better approximation, consider differentiation in continuous time; a classical result in Fourier analysis states that, under very broad conditions, if $x(t)$ is a differentiable function with Fourier transform $X(f)$, then\footnote{This is relatively easy to prove by applying integration by parts to the expression for the Fourier transform of $x'(t)$.}
\[
\textrm{FT}\left(x'(t)\right) = j2\pi f X(f).
\]
The differentiation operation in continuous time acts therefore as a filter with frequency response $H(f) = j2\pi f$. If we use this filter in Figure~\ref{fig:is:dtproc} we obtain the equivalent discrete-time frequency response
\begin{equation}\label{eq:is:diffFR}
H(e^{j\omega}) = j\omega
\end{equation}
which implements a true differentiator in discrete time: this should be interpreted in the sense that the digital differentiator computes the first derivative of the underlying interpolated continuous-time signal, but \textit{without an explicit computation of the latter!} Of course there is no free lunch: the impulse response, which we can compute via an inverse DTFT, is
\[
    h[n] = \frac{1}{2\pi}\int_{-\pi}^{\pi} j\omega\, e^{j\omega n}\, d\omega =
    \begin{cases}
        \displaystyle \frac{(-1)^{n}}{n} & n \neq 0 \\[2mm]
        0 & n = 0;
    \end{cases}
\]
this is an infinite, two-sided sequence, which places the digital differentiator in the category of ideal filters; nevertheless, good FIR approximations can be obtained using the Parks-McClellan algorithm. The filter's characteristics are illustrated in Figure~\ref{fig:is:diffint}.
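As an illustration (a Python/NumPy sketch; it uses a simple windowed truncation of the impulse response above rather than the Parks-McClellan design mentioned in the text, and the filter length and test frequency are arbitrary):
\begin{verbatim}
import numpy as np

# Sketch: FIR approximation of the ideal differentiator by windowing the two-sided
# impulse response h[n] = (-1)^n / n (with h[0] = 0).
L = 31
n = np.arange(-(L // 2), L // 2 + 1)
h = np.zeros(L)
h[n != 0] = ((-1.0) ** n[n != 0]) / n[n != 0]
h *= np.hamming(L)

# test on a sampled sinusoid (T_s = 1): d/dt cos(w0 t) = -w0 sin(w0 t)
w0 = 0.2 * np.pi
m = np.arange(300)
x = np.cos(w0 * m)
y = np.convolve(x, h)[L // 2 : L // 2 + 300]
err = np.abs(y[50:250] + w0 * np.sin(w0 * m[50:250]))
print(np.max(err))      # small approximation error at this mid-band frequency
\end{verbatim}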