As we have seen, a continuous-time signal can be converted to a discrete-time sequence via sampling. By changing the value of the sampling rate we can obtain an arbitrary number of discrete-time signals from the same original continuous-time source; the number of samples per second will increase or decrease linearly with the sampling rate and, according to whether the conditions of the sampling theorem are satisfied or not, the resulting discrete-time sequences will be an exact representation of the original signal or will be affected by aliasing. Multirate theory studies the relationship between such sequences; or, in other words, it addresses the question of whether we can transform a sampled representation into another with a different underlying sampling frequency purely from within discrete time.
The primary application of multirate theory is digital sampling rate conversion, an operation that becomes necessary when the original sampling frequency of a stored signal is incompatible with the working rate of the available D/A device. But we will see that multirate also plays an important role in a variety of applied signal processing algorithms; in particular, the technique of \textit{oversampling} is often used in situations where an increase of the data rate is used to improve the performance of the analog elements in the processing chain. Since speeding up digital computations is in general much cheaper than using high-performance analog circuits, oversampling is commonly used in consumer-level devices. Other applications of multirate range from efficient filter design, to spectral shaping for communication systems, to data compression standards. And, finally, from a more theoretical point of view, multirate is a fundamental ingredient of advanced analysis techniques which go under the name of time-frequency decompositions.
\section{Downsampling}\index{downsampling|(}
\label{sec:mr:down}
Downsampling by~$N$\index{downsampling|mie}\index{decimation} (also called subsampling or decimation\footnote{%
Technically, decimation means keeping $9$ out of $10$, the term referring to the Roman custom of killing every $10$th soldier of a defeated army\ldots})
produces a lower-rate sequence by keeping only one out of $N$ samples in the original signal. If we denote by $\mathcal{S}_N$ the downsampling operator\footnote{
We use the letter $\mathcal{S}$ rather than $\mathcal{D}$ since the latter indicates the delay operator.},
we can write
\[
x_{N\downarrow}[n] = \mathcal{S}_N\bigl\{x[\cdot]\bigr\}[n] = x[nN].
\]
Downsampling, as shown in Figure~\ref{fig:mr:down}, effectively {\em discards} $N-1$ out of every $N$ samples, so that a loss of information may be incurred. Indeed, in terms of the underlying sampling frequency, decimation produces the signal that would have been obtained by sampling $N$ times more slowly. The potential problems with this data reduction will take the form of aliasing, as we will see shortly.
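In code, downsampling is nothing more than a strided selection of samples (a minimal numpy sketch; all names are hypothetical):

```python
import numpy as np

def downsample(x, N):
    """Keep one sample out of N: y[n] = x[nN]."""
    return x[::N]

x = np.arange(12)        # a toy ramp signal
y = downsample(x, 4)     # keeps samples at indices 0, 4, 8
```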
One of the consequences of the lack of time-invariance is that complex sinusoids are not eigensequences for the downsampling operator; for instance, if $x[n]= e^{j \pi n} =(-1)^n$, we have
\begin{equation}
\label{eq:mr:12}
x_{2\downarrow}[n] = x[2n] = e^{j 2\pi n} = 1;
\end{equation}
in other words, the highest digital frequency has been mapped to the lowest frequency. This looks very much like aliasing, as we will now explore in detail.
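The frequency mapping is immediate to check numerically (a two-line numpy example):

```python
import numpy as np

n = np.arange(16)
x = (-1.0) ** n    # x[n] = e^{j pi n}, the highest digital frequency
y = x[::2]         # downsampling by 2: y[n] = x[2n]
# every retained sample has an even index, so the output is constant
```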
\caption{Downsampling by $4$ in the time domain: original signal (top panel); samples ``killed'' by the downsampler (middle panel); downsampled signal (bottom panel). Note the difference in time indexes between top and bottom panels.}\label{fig:mr:down}
where $\xi_N[n]$ is a ``killer sequence'' defined as
\[
\xi_N[n]=
\begin{cases}
1 & \mbox{ for $n$ multiple of $N$} \\
0 & \mbox{ otherwise}
\end{cases}
\]
and shown in Figure~\ref{fig:mr:xi}; indeed, $\xi_N[n]$ will ``kill off'' all the terms in the sum for which the index is not a multiple of $N$. The sequence $\xi_N[n]$ is $N$-periodic and one way to represent it is as the inverse DFS of size $N$ of a vector of all ones:
\[
\xi_N[n] = \frac{1}{N}\sum_{k=0}^{N-1} e^{j\frac{2\pi}{N}nk}.
\]
Substituting this representation into the sum,
we have the scaled sum of $N$ superimposed copies of the original spectrum $X(e^{j\omega})$ where each copy is shifted in frequency by a multiple of $2\pi/N$. We are in a situation similar to that of equation~(\ref{eq:is:periodizedFT}) where sampling created a periodization of the underlying spectrum; here the spectra are already inherently $2\pi$-periodic, and downsampling creates $N-1$ additional interleaved copies. The final spectrum $X_{N\downarrow} (e^{j\omega})$ is simply a stretched version of $A(e^{j\omega})$, so that the interval $[-\pi/N, \pi/N]$ becomes $[-\pi, \pi]$.
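The DFS characterization of the killer sequence is easy to verify numerically: the inverse DFT of a length-$N$ vector of all ones is one period of $\xi_N[n]$ (a small numpy check; names are hypothetical):

```python
import numpy as np

N = 4
# one period of xi_N as the inverse DFT of a vector of all ones
xi_period = np.fft.ifft(np.ones(N)).real
xi = np.tile(xi_period, 5)       # N-periodic extension of the sequence
```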
Because of the superposition, aliasing\index{aliasing!in multirate} can occur; this is a consequence of the potential loss of information that occurs when samples are discarded. For baseband signals, it is easy to verify that in order for the spectral copies in~(\ref{eq:mr:dss}) not to overlap, the maximum (positive) frequency $\omega_M$ of the original spectrum\footnote{
Here, for simplicity, we are imagining a lowpass real signal whose spectral magnitude is symmetric. More complex cases exist and some examples will be described next.}
must be less than $\pi/N$; this is the \emph{non-aliasing condition} for the downsampling operator. Conceptually, fulfillment of the non-aliasing condition indicates that the discrete-time representation of the original signal is intrinsically redundant; $(N-1)/N$ of the information can be safely discarded and this is mirrored by the fact that only $1/N$ of the spectral frequency support is nonzero. We will see shortly that, in this case, the original signal can be perfectly reconstructed with an upsampling and filtering operation.\index{downsampling|)}
\subsection{Examples}
In Figures~\ref{fig:mr:exA} to~\ref{fig:mr:exE} the top panel shows the original spectrum $X(e^{j\omega})$; the second panel shows the same spectrum but plotted over a wider interval so as to make its periodic nature explicit; the third panel shows in different colors the individual terms in the sum in~(\ref{eq:mr:nonscaled}); the fourth panel shows $A(e^{j\omega})$ \emph{before} scaling and stretching by $N$; finally, the last panel shows $X_{N\downarrow}(e^{j\omega})$ over the usual $[-\pi, \pi]$ interval.
\itempar{Downsampling by 2.} If the downsampling factor is $2$, the \ztrans\ and the Fourier transform of the output are simply
\begin{align*}
X_{2\downarrow}(z) &= \frac{1}{2}\left[ X\bigl(z^{1/2}\bigr) + X\bigl(-z^{1/2}\bigr) \right] \\
X_{2\downarrow}(e^{j\omega}) &= \frac{1}{2}\left[ X\bigl(e^{j\omega/2}\bigr) + X\bigl(e^{j(\omega-2\pi)/2}\bigr) \right].
\end{align*}
Figure~\ref{fig:mr:exA} shows an example for a lowpass signal whose maximum frequency is $\omega_M =\pi/2$ (i.e.\ a half-band signal). The non-aliasing condition is fulfilled and, in the superposition, the two shifted versions of the spectrum do not overlap. As the frequency axis stretches by a factor of $2$, the original half-band signal becomes full band.
Figure~\ref{fig:mr:exB} shows an example in which the non-aliasing condition is violated. In this case, $\omega_M =2\pi/3 > \pi/2$ and the spectral copies do overlap. We can see that, as a consequence, the downsampled signal loses its lowpass characteristics. Information is irretrievably lost and the original signal cannot be reconstructed.
\caption{Downsampling by $3$; the highest frequency is larger than $\pi/3$ (here, $\omega_M =2\pi/3$) and aliasing occurs. Notice how three spectral replicas contribute to the final spectrum.}\label{fig:mr:exC}
\itempar{Downsampling a Highpass Signal.} Figure~\ref{fig:mr:exD} shows an example of downsampling by $2$ applied to a half-band {\em highpass} signal. Since the signal occupies only the upper half of the $[0, \pi]$ frequency band (and, symmetrically, only the lower half of the $[-\pi, 0]$ interval), the interleaved copies do not overlap and, technically, there is no aliasing. The shape of the signal, however, is changed by the downsampling operation and what started out as a highpass signal is transformed into a lowpass signal. To make the details of the transformation clearer in this example we have used a {\em complex-valued} highpass signal for which the positive and negative parts of the spectrum have different shapes; it is apparent how the original left and right spectral halves end up in reverse positions in the final result. The original signal can be exactly reconstructed (since there is no destructive overlap between spectral copies) but the required procedure is a bit more creative and will be left as an exercise.
\caption{Downsampling by~$2$ of a {\em complex} highpass signal; the asymmetric spectrum helps to understand how non-destructive aliasing works.}\label{fig:mr:exD}
\subsection{Antialiasing Filters in Downsampling} We have seen in Section~\ref{sec:is:antialias} that, in order to control the error when sampling a non-bandlimited signal, our best strategy is to bandlimit the signal using a lowpass filter. The same holds when downsampling by $N$ a signal whose spectral support extends beyond $\pi/N$: before downsampling we should apply a lowpass filter with cutoff $\omega_c =\pi/N$ as shown in Figure~\ref{fig:mr:downfilt}. While a loss of information is still unavoidable, the filtering operation allows us to control said loss and prevent the disruptive effects of aliasing.
An example of the processing chain is shown in Figure~\ref{fig:mr:exE} for a downsampling factor of~$2$; a half-band lowpass \index{half-band filter} filter is used to truncate the signal's spectrum outside of the $[-\pi/2, \pi/2]$ interval and then downsampling proceeds as usual with non-overlapping spectral copies.
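The whole anti-aliasing chain can be sketched in a few lines of numpy (a didactic windowed-sinc lowpass, not a production filter design; all names are hypothetical):

```python
import numpy as np

def lowpass(wc, num_taps=63):
    """Windowed-sinc FIR approximating an ideal lowpass with cutoff wc
    (in radians); normalized for unit gain at omega = 0."""
    n = np.arange(num_taps) - (num_taps - 1) / 2
    h = (wc / np.pi) * np.sinc(wc * n / np.pi) * np.hamming(num_taps)
    return h / h.sum()

def decimate(x, N):
    """Anti-aliasing lowpass with cutoff pi/N, then downsampling by N."""
    return np.convolve(x, lowpass(np.pi / N), mode='same')[::N]

n = np.arange(400)
y_hi = decimate(np.cos(0.9 * np.pi * n), 2)  # out-of-band: mostly removed
y_lo = decimate(np.cos(0.1 * np.pi * n), 2)  # in-band: survives, stretched
```

After decimation by $2$, the in-band sinusoid at $\omega = 0.1\pi$ reappears at $\omega = 0.2\pi$, while the component above $\pi/2$ is suppressed instead of aliasing back into the band.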
\section{Upsampling}\index{upsampling|(}
Upsampling by $N$ produces a higher-rate sequence by creating $N$ samples for each sample in the original signal. In its simplest form, an upsampler just inserts $N-1$ zeros after every input sample, as shown in Figure~\ref{fig:mr:up}. If we denote by $\mathcal{U}_N$ the upsampling operator, we have
\begin{equation}
\label{eq:mr:up}
x_{N\uparrow}[n] = \mathcal{U}_N\bigl\{x[\cdot]\bigr\}[n] =
\begin{cases}
x[n/N] & \mbox{ for $n$ multiple of $N$} \\
0 & \mbox{ otherwise.}
\end{cases}
\end{equation}
\caption{Upsampling by $4$ in the time domain: original signal (top panel); upsampled signal, where 3 zeros have been appended to each original sample (bottom panel). Note the difference in time indexes between top and bottom panels. }\label{fig:mr:up}
Upsampling is a much ``nicer'' operation than downsampling since no information is lost; the original signal can always be exactly recovered by applying a congruent downsampling operation, since $\mathcal{S}_N\bigl\{\mathcal{U}_N\bigl\{x[\cdot]\bigr\}\bigr\}[n] = x[n]$.
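This recovery property is easy to check numerically (a minimal numpy sketch; names are hypothetical):

```python
import numpy as np

def upsample(x, N):
    """Insert N - 1 zeros after every sample of x."""
    y = np.zeros(len(x) * N)
    y[::N] = x
    return y

x = np.array([1.0, 2.0, 3.0, 4.0])
x_rec = upsample(x, 3)[::3]   # downsampling by 3 undoes upsampling by 3
```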
In the frequency domain, the only effect of upsampling is the substitution $X_{N\uparrow}(e^{j\omega}) = X(e^{j\omega N})$, so that upsampling is simply a contraction of the frequency axis by a factor of $N$. The inherent $2\pi$-periodicity of the spectrum must be taken into account so that, in this contraction, the periodic repetitions of the base spectrum are ``pulled into'' the $[-\pi, \pi]$ interval. The effects of upsampling on a signal's spectrum are shown graphically for a simple signal in Figures~\ref{fig:mr:upA} and~\ref{fig:mr:upB}; in all figures the top panel shows the original spectrum $X(e^{j\omega})$ over $[-\pi, \pi]$; the middle panel shows the same spectrum over a wider range to make the $2\pi$-periodicity explicit; the last panel shows the upsampled spectrum $X_{N\uparrow}(e^{j\omega})$, highlighting the rescaling of the $[-N\pi, N\pi]$ interval.\index{upsampling|)} As a rule of thumb, upsampling ``brings in'' exactly $N$ copies of the original spectrum over the $[-\pi, \pi]$ interval even if, in the case of an even upsampling factor, one copy is split between the negative and positive frequencies.
The upsampled signal in~(\ref{eq:mr:up}), with its $N-1$ zeros between original samples, exhibits two problems. In the time domain, the upsampled signal looks ``jumpy'' because of the periodic runs of zeros; this is the discrete-time equivalent of a lack of smoothness, since the signal keeps dropping back to zero, and it is apparent in the bottom panel of Figure~\ref{fig:mr:up}. In the frequency domain, simple upsampling has ``brought in'' copies of the original spectrum in the $[-\pi, \pi]$ interval, creating spurious high-frequency content. These two issues are actually one and the same and they can be solved, as we will see, by using an appropriate filter.
The problem of filling the gaps between nonzero samples in an upsampled sequence is, in many ways, similar to the discrete- to continuous-time interpolation problem of Section~\ref{sec:is:interp}, except that now we are operating entirely in discrete time. If we adapt the interpolation schemes that we have already studied, we have the following cases\index{interpolation!in multirate}:
\itempar{Zero-Order Hold.} In this discrete-time interpolation scheme, also known as \emph{piecewise-constant interpolation}, after upsampling by $N$, we use a filter with impulse response
\begin{equation}\label{eq:mr:zoh}
h_0 [n] =
\begin{cases}
1 &\mbox{ for $n = 0, 1, \ldots, N-1$} \\
0 &\mbox{ otherwise}
\end{cases}
\end{equation}
which is shown in Figure~\ref{fig:mr:zoh}-(a). This interpolation filter simply repeats each original input sample $N$ times, giving a staircase approximation as shown in the top panel of Figure~\ref{fig:mr:upinterp}.
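A quick numpy check (names are hypothetical) confirms that upsampling followed by filtering with $h_0$ is equivalent to repeating each sample $N$ times:

```python
import numpy as np

N = 4
x = np.array([1.0, -2.0, 3.0])
xu = np.zeros(len(x) * N)
xu[::N] = x                          # upsampling by N
h0 = np.ones(N)                      # zero-order hold impulse response
zoh = np.convolve(xu, h0)[:len(xu)]  # piecewise-constant interpolation
```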
We know that, in continuous time, the smoothest interpolation is obtained by using a sinc function. This holds in discrete time as well, and the resulting interpolation filter is the discrete-time sinc:
\begin{equation}
h[n] = \sinc\left(\frac{n}{N}\right).
\end{equation}
Note that the sinc above is equal to one for $n =0$ and is equal to zero at all integer multiples of $N$, $n = kN$; this fulfills the interpolation condition, that is, after interpolation, the output equals the input at multiples of $N$: $(h \ast x_{N\uparrow})[kN]= x_{N\uparrow}[kN]= x[k]$.
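The interpolation condition is easy to verify numerically; numpy's \texttt{sinc} is the normalized sinc $\sin(\pi x)/(\pi x)$, so the filter can be evaluated directly (a small sketch):

```python
import numpy as np

N = 4
n = np.arange(-20, 21)
h = np.sinc(n / N)   # np.sinc(x) = sin(pi x)/(pi x), the normalized sinc
# h equals 1 at n = 0 and 0 at every other multiple of N
```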
The three impulse responses above are all lowpass filters; in particular, the sinc interpolator is an ideal lowpass with cutoff frequency $\omega_c =\pi/N$ while the others are approximations of the same. As a consequence, the effect of the interpolator in the frequency domain is the removal of the $N-1$ spectral copies ``brought in'' the $[-\pi, \pi]$ interval by the upsampler. An example is shown in
Figure~\ref{fig:mr:upfilt} where the signal in Figure~\ref{upsamplingFigC} is filtered by an ideal lowpass filter with cutoff $\pi/4$.
It turns out that the smoothest possible interpolation in the time domain corresponds to the perfect removal of the spectral repetitions in the frequency domain. Interpolating with a zero-order or a first-order kernel, by contrast, only attenuates the replicas instead of performing a full removal, as we can readily see by considering their frequency responses. Since we are in discrete time, however, there are no difficulties associated with the design of a digital lowpass filter which closely approximates an ideal filter, so that alternative kernel designs (such as optimal FIRs) can be employed.
This is in contrast to discrete- to continuous-time interpolators, which are analog designs; that is why sampling rate changes are much more attractive when performed entirely in the discrete-time domain.
\section{Fractional Sampling Rate Changes}
The conversion from one sampling rate to another can always take the ``obvious'' route of interpolating to continuous time and resampling the resulting signal at the desired rate; but this would require dedicated analog equipment and would introduce some quality loss because of the inevitable analog noise. We have just seen, however, that we can increase or decrease the implicit rate of a sequence by an integer factor entirely in the discrete-time domain, by using an upsampler or a downsampler and a digital lowpass filter. For fractional sampling rate changes we simply need to cascade the two operators.
The order in which upsampling and downsampling are performed in a rate changer is crucial since, in general, the operators do not commute. It is easy to appreciate this fact by means of a simple example: downsampling by $2$ followed by upsampling by $2$ sets all the odd-indexed samples to zero, whereas upsampling by $2$ followed by downsampling by $2$ leaves the signal unchanged.
Intuitively, it is clear that, since no information is lost when using an upsampler, in a fractional sampling rate change the upsampling operation should come first. Typically, a rate change by $N/M$ is obtained by cascading an upsampler by~$N$, a lowpass filter, and a downsampler by~$M$. Since normally the upsampler is followed by a lowpass with cutoff $\pi/N$ while the downsampler is preceded by a lowpass with cutoff $\pi/M$, we can use a single lowpass whose cutoff frequency is the minimum of the two. A block diagram of this system is shown in Figure~\ref{fig:mr:frac}.\index{fractional sampling rate change}\index{rational sampling rate change}
As an example, suppose we want to increase the rate of a sequence originally sampled at 24~kHz up to 32~kHz. For this we need a fractional change of $32/24$ which, after simplification, corresponds to an upsampler by~4 followed by a downsampler by~3, as shown in the top panel of Figure~\ref{fig:mr:fracex}; the lowpass filter's cutoff frequency is $\pi/4$ and, in this case, the lowpass acts solely as an interpolator since the overall rate is increased. Conversely, if we want to convert a 32~kHz signal to 24~kHz, that is, apply a sampling rate change by $3/4$, we can use the cascade shown in the bottom panel of Figure~\ref{fig:mr:fracex}; the cutoff frequency of the filter does not change but the filter, in this case, acts as an anti-aliasing filter.
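The 24 to 32~kHz converter can be sketched end to end in numpy (didactic code with hypothetical names; the interpolation filter is a simple windowed sinc with passband gain $N$, not a production design):

```python
import numpy as np

def rate_change(x, N, M, num_taps=101):
    """Rate change by N/M: upsample by N, lowpass with cutoff
    min(pi/N, pi/M) and passband gain N, then downsample by M."""
    xu = np.zeros(len(x) * N)
    xu[::N] = x                      # upsampling by N
    wc = np.pi / max(N, M)           # the more restrictive of the two cutoffs
    k = np.arange(num_taps) - (num_taps - 1) / 2
    h = (wc / np.pi) * np.sinc(wc * k / np.pi) * np.hamming(num_taps)
    h *= N / h.sum()                 # gain N in the passband
    return np.convolve(xu, h, mode='same')[::M]

# a 24 kHz to 32 kHz conversion is a rate change by 4/3
n = np.arange(240)
x24 = np.cos(0.1 * np.pi * n)        # slow sinusoid at the original rate
x32 = rate_change(x24, 4, 3)         # same sinusoid at 4/3 the rate
```

A sinusoid at $\omega = 0.1\pi$ with respect to the 24~kHz clock reappears at $\omega = 0.075\pi$ with respect to the 32~kHz clock, as expected.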
If the ratio between the two sampling frequencies cannot be decomposed into a ratio of small coprime factors, the intermediate rates in a fractional rate changer can be prohibitively high. That was the rationale, for instance, behind an infamous engineering decision taken by the audio industry in the early 1990s, when the first digital audio tape recorders (DAT) were introduced on the market. In order to make it impossible for users to create perfect copies of CDs on digital tapes, the sampling frequency of the recorders was set to 48~kHz. Since CDs are encoded at 44.1~kHz, this resulted in a required fractional rate change of $160/147$. At the time, a 160-fold upsampling was simply not practical to implement, so users would necessarily have to go through the analog route to copy CDs.
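The reduced ratio is just the two rates in lowest terms, which Python's standard library computes directly (a small illustrative snippet):

```python
from fractions import Fraction

# 48 kHz (DAT) over 44.1 kHz (CD), automatically reduced to lowest terms
ratio = Fraction(48000, 44100)
# the numerator and denominator give the up/downsampling factors
```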
In practice, however, fractional resamplers can be implemented using local interpolation techniques. To understand the procedure, let's first analyze the practical version of a subsample interpolator and then apply the idea to a resampler.
\itempar{Subsample Interpolation.}
\index{subsample interpolation}
Consider an $F_s$-bandlimited continuous-time signal $x(t)$ and its sampled version $x[n]= x(nT_s)$, with $T_s \leq1/F_s$. Given a fractional delay $\tau T_s$, with $|\tau|< 1/2$, we want to determine the sequence
\[
x_\tau[n]= x(nT_s +\tau T_s)
\]
using only discrete-time processing; for simplicity, let's just assume $T_s=1$. We know from Section~\ref{sec:is:duality} that the theoretical way to obtain $x_\tau[n]$ from $x[n]$ is to use an ideal fractional delay filter:
\[
x_\tau[n] = (d_\tau \ast x)[n], \qquad \mbox{with } d_\tau[n] = \sinc(n + \tau).
\]
As we have seen in Section~\ref{sec:is:sincinterp}, the sinc interpolator originates as the limit of polynomial interpolation when the number of interpolation points goes to infinity. In this case we can work backwards, and replace the sinc with a low-degree, \textit{local} Lagrange interpolation as in Section~\ref{sec:is:lagrange}. Given $L$ samples to the left and to the right of $x[n]$, we can build the continuous-time signal\index{Lagrange interpolation}
\[
x_L(n; t) = \sum_{k=-L}^{L} x[n+k]\, L_k^{(L)}(t)
\]
where $L_k^{(L)}(t)$ is the $k$-th Lagrange polynomial of order $2L$ defined in~(\ref{eq:is:lagPoly}). With this, we can use the approximation
\begin{equation}\label{eq:mr:subApprox}
x_\tau[n] = x(n + \tau) \approx x_L(n; \tau).
\end{equation}
As an example, consider setting $L=1$, that is, using a quadratic Lagrange interpolation; the corresponding three Lagrange polynomials are shown in Figure~\ref{fig:mr:lagPoly} while the interpolation and approximation procedure are shown graphically in Figure~\ref{fig:mr:approx}.
\caption{Local quadratic Lagrange interpolation $x_L(n; t)$ around $n$, and the approximation $x_\tau[n] \approx x_L(n; \tau)$ for $\tau=0.3$.}\label{fig:mr:approx}
The approximation~(\ref{eq:mr:subApprox}) can therefore be computed as
\[
x_\tau[n] \approx \sum_{k=-L}^{L} x[n+k]\, L_k^{(L)}(\tau),
\]
which is the convolution, computed in $n$, of the input signal with the $(2L+1)$-tap FIR
\begin{equation}\label{eq:mr:FIRcoeffs}
\hat{d}_\tau[n] = \begin{cases}
L_{-n}^{(L)}(\tau) &\mbox{for $|n| \leq L$}\\
0 &\mbox{otherwise.}
\end{cases}
\end{equation}
For instance, for a quadratic interpolation as in Figure~\ref{fig:mr:approx}, the nonzero coefficients are
\begin{align*}
\hat{d}_\tau [-1] & = \tau\,\frac{\tau +1}{2}\\
\hat{d}_\tau [0] & = -(\tau +1)(\tau -1) \\
\hat{d}_\tau [1] & = \tau\,\frac{\tau -1}{2}.
\end{align*}
The FIR interpolator is expressed in noncausal form purely out of convenience; in practical implementations an additional delay would be used to make the whole processing chain causal.
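As a sanity check, the quadratic interpolator can be verified numerically: quadratic Lagrange interpolation reproduces second-degree polynomials exactly (a numpy sketch with hypothetical names; the weights below are the Lagrange polynomials $L_{-1}$, $L_0$, $L_1$ evaluated at $\tau$ and applied to $x[n-1]$, $x[n]$, $x[n+1]$):

```python
import numpy as np

def lagrange_weights(tau):
    """Quadratic (L = 1) Lagrange weights [L_{-1}, L_0, L_1] at tau,
    to be applied to the samples x[n-1], x[n], x[n+1]."""
    return np.array([tau * (tau - 1) / 2,
                     -(tau + 1) * (tau - 1),
                     tau * (tau + 1) / 2])

tau, n0 = 0.3, 5
x = np.arange(10, dtype=float) ** 2               # x[n] = n^2, a quadratic
x_tau = lagrange_weights(tau) @ x[n0 - 1:n0 + 2]  # estimate of x(n0 + tau)
```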
\itempar{Local resampling.} The principle of the subsample interpolator we just described can be used to perform fractional resampling. Again, consider an $F_s$-bandlimited continuous-time signal $x(t)$ and its sampled version $x_1[n]= x(n/F_1)$, with $F_1\geq F_s$. Given a second sampling frequency $F_2\geq F_s$, we want to estimate $x_2[n]= x(n/F_2)$ using only $x_1[n]$. We have:
\begin{equation}
x_2[n] = x(n/F_2) = x\left( \frac{m_n + \tau_n}{F_1} \right) \approx x_L(m_n; \tau_n)
\end{equation}
where $m_n$ is the integer nearest to $nF_1/F_2$ and $\tau_n = nF_1/F_2 - m_n$, so that $|\tau_n| \leq 1/2$; in other words, each output sample can be computed via a subsample interpolation of the input around sample $m_n$.
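A didactic implementation of this local resampling scheme, using the quadratic Lagrange interpolator described above (a numpy sketch with hypothetical names, not production code):

```python
import numpy as np

def resample(x1, F1, F2):
    """Estimate x2[n] = x(n / F2) from x1[n] = x(n / F1) with quadratic
    Lagrange interpolation around the nearest input sample."""
    out = []
    n = 0
    while True:
        t = n * F1 / F2            # target instant, in input-sample units
        m = max(int(round(t)), 1)  # anchor sample (clamped at the left edge)
        if m + 1 >= len(x1):
            break
        tau = t - m                # fractional offset, |tau| <= 1/2
        w = np.array([tau * (tau - 1) / 2,       # Lagrange weights L_{-1},
                      -(tau + 1) * (tau - 1),    # L_0 and L_1 evaluated
                      tau * (tau + 1) / 2])      # at tau
        out.append(w @ x1[m - 1:m + 2])
        n += 1
    return np.array(out)

# resample a 400 Hz sinusoid from 32 kHz down to 24 kHz
n1 = np.arange(400)
x1 = np.cos(2 * np.pi * 400.0 * n1 / 32000.0)
x2 = resample(x1, 32000.0, 24000.0)
```

For a signal this slow with respect to both rates, the quadratic approximation is already accurate to a few parts in ten thousand.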
Interestingly enough, however, if the downsampling and upsampling factors $N$ and $M$ are coprime, the operators do commute: $\mathcal{S}_M\bigl\{\mathcal{U}_N\bigl\{x[\cdot]\bigr\}\bigr\} = \mathcal{U}_N\bigl\{\mathcal{S}_M\bigl\{x[\cdot]\bigr\}\bigr\}$.
\end{document}