Consider a real function $x(t)$ for which the Fourier transform is well defined:
\begin{equation}\label{eq:is:exSampeq1}
X(f) = \int_{-\infty}^{\infty}x(t)\, e^{-j2\pi f t}\, dt
\end{equation}
Suppose that we only possess a sampled version of $x(t)$, that is, we only know the numeric value of $x(t)$ at integer multiples of a sampling interval $T_s$, and that we want to obtain an approximation of the Fourier transform above.
Assume we do not know about the DTFT; an intuitive (and standard) place to start is to express the Fourier integral as a Riemann sum:
\begin{equation}\label{eq:is:exSampeq2}
X(f) \approx\hat{X}(f) = \sum_{n=-\infty}^{\infty} T_s x(nT_s) \, e^{-j 2\pi f n T_s }
\end{equation}
an expression that only uses the known sampled values of $x(t)$. In order to understand whether~(\ref{eq:is:exSampeq2}) is a good approximation consider the periodization of $X(f)$:
\begin{equation}\label{eq:is:exSampeq3}
\tilde{X}(f) = \sum_{k=-\infty}^{\infty} X\left( f + kF_s \right)
\end{equation}
in which $X(f)$ is repeated (with overlap) with period $F_s = 1/T_s$. We will show that:
\[
\hat{X}(f)=\tilde{X}(f)
\]
that is, the Riemann approximation is equivalent to a periodization \index{periodization} of the original Fourier transform; in mathematics this is known as a particular form of the {\em Poisson sum formula}\index{Poisson sum formula}.
Consider the periodic nature of $\tilde{X}(f)$ and remember that any periodic function $s(\tau)$ of period $L$ admits a \emph{Fourier series}\index{Fourier series} expansion:
\begin{equation}\label{eq:is:fseEx}
s(\tau) = \sum_{n=-\infty}^{\infty} A_n \, e^{j(2\pi/L) n \tau}
\end{equation}
where
\begin{equation}\label{eq:is:fsecEx}
A_n = (1/L) \int_{-L/2}^{L/2} s(\tau) \, e^{-j(2\pi/L) n \tau}\, d\tau.
\end{equation}
To prove our result we will compute the Fourier \textit{series} expansion coefficients of $\tilde{X}(f)$ (that is, we take a Fourier transform of a Fourier transform). Replacing $L$ by $F_s =1/ T_s$ in~(\ref{eq:is:fsecEx}) we can write
\begin{align*}
A_n &= (1/F_s) \int_{-F_s/2}^{F_s/2}\tilde{X}(f) \, e^{-j(2\pi/F_s) f n}\, df \\[3mm]
&= T_s \int_{-F_s/2}^{F_s/2}\sum_{k=-\infty}^{+\infty} X\left( f + kF_s \right) e^{-j2\pi f nT_s }\, df
\end{align*}
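By interchanging integral and summation, which we can do if the Fourier transform~(\ref{eq:is:exSampeq1}) is well defined, and by using the change of variable $f' = f + kF_s$ (for which $e^{j2\pi k n} = 1$ leaves the complex exponential unchanged), we obtain
\begin{align*}
A_n &= T_s \sum_{k=-\infty}^{+\infty} \int_{-F_s/2}^{F_s/2} X\left( f + kF_s \right) e^{-j2\pi f nT_s }\, df \\[3mm]
&= T_s \sum_{k=-\infty}^{+\infty} \int_{(k-1/2)F_s}^{(k+1/2)F_s} X(f') \, e^{-j2\pi f' nT_s }\, df' \\[3mm]
&= T_s \int_{-\infty}^{\infty} X(f) \, e^{-j2\pi f nT_s }\, df \\[3mm]
&= T_s \, x(-nT_s)
\end{align*}
where the last step is the inverse Fourier transform evaluated at $t = -nT_s$. By replacing the values for all the $A_n$ in~(\ref{eq:is:fseEx}), and by changing the summation index from $n$ to $-n$, we obtain $\tilde{X}(f)=\hat{X}(f)$.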
What we just found is another derivation of the aliasing\index{aliasing} formula. Intuitively, there is a duality between the time domain and the frequency domain in that a discretization of the time domain leads to a periodization of the frequency domain; similarly, a discretization of the frequency domain leads to a periodization of the time domain (think of the DFS and see also Exercise~\ref{ex:is:aliasTimeEx}).
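As a quick numerical illustration of this result (an example chosen here for convenience, not part of the derivation above), consider the Gaussian $x(t) = e^{-\pi t^2}$, whose Fourier transform under the convention of~(\ref{eq:is:exSampeq1}) is $X(f) = e^{-\pi f^2}$. At $f=0$ the identity $\hat{X}(f)=\tilde{X}(f)$ reads
\[
T_s \sum_{n=-\infty}^{\infty} e^{-\pi n^2 T_s^2} = \sum_{k=-\infty}^{\infty} e^{-\pi k^2 F_s^2}, \qquad F_s = 1/T_s.
\]
For $T_s = 1$ both sides are equal to $1 + 2e^{-\pi} + 2e^{-4\pi} + \ldots \approx 1.0864$, so the Riemann sum overestimates the true value $X(0) = 1$ by about $8.6\%$; the excess is entirely due to the overlapping copies of $X(f)$ centered at $\pm F_s, \pm 2F_s, \ldots$ For $T_s = 1/2$ (that is, $F_s = 2$) the right-hand side becomes $1 + 2e^{-4\pi} + \ldots \approx 1.000007$: the copies are now farther apart and the approximation error at $f=0$ becomes negligible.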
A fundamental result of spectral analysis states that a function cannot have finite support both in time and in frequency; in other words, a signal cannot be both time-limited and band-limited. This can be easily shown by contradiction using the sampling theorem and the properties of the $z$-transform. Let's assume that the continuous-time signal $x_c(t)$ is $2f_0$-bandlimited (that is, $X_c(f)=0$ for $|f| > f_0$) and that there also exists a value $t_0 > 0$ so that
\[
x_c(t)=0\quad\mbox{for } |t| > t_0.
\]
Since the signal is bandlimited, we know that it can be perfectly represented by a sequence of equally spaced samples, provided that the sampling rate satisfies $F_s \ge 2f_0$. Let's for instance pick $F_s =4f_0$ and call $x[n]= x_c(nT_s)$ the resulting discrete-time signal for $T_s =1/(4f_0)$. Using~(\ref{eq:is:DTFTsampled}), the DTFT of the sampled sequence over $[-\pi, \pi]$ is simply the rescaled continuous-time spectrum over $[-2f_0, 2f_0]$; since the frequency $f_0$ is mapped to $\omega = 2\pi f_0/F_s = \pi/2$, we have
\begin{equation}\label{eq:is:timefreq}
X(e^{j\omega}) = 0 \quad\mbox{for } \pi/2 < |\omega| \le \pi.
\end{equation}
On the other hand, we assumed that $x_c(t)$ is also time-limited so the sequence $x[n]$ is going to have a finite support and its $z$-transform will contain only a finite number of terms:
\[
X(z)=\sum_{n=-M}^{M} x[n] z^{-n}
\]
where
\[
M =\bigg\lfloor\frac{t_0}{T_s} \bigg\rfloor.
\]
Since the DTFT is $X(z)$ for $z=e^{j\omega}$, because of~(\ref{eq:is:timefreq}) we have that $X(z)=0$ over an arc of the unit circle, that is, at infinitely many points; but, since $z^{M}X(z)$ is a polynomial of degree at most $2M$ and a nonzero polynomial has only a finite number of roots, $X(z)$ will necessarily be zero everywhere (see also Example~\ref{ex:fil:impossIdealProof}). And so the only signal that can be both time-limited and bandlimited is the null signal.
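To make the last step concrete, here is a small illustration with the hypothetical value $M=1$ (chosen only for simplicity). In this case
\[
X(z) = x[-1]\,z + x[0] + x[1]\,z^{-1}, \qquad z\,X(z) = x[-1]\,z^2 + x[0]\,z + x[1],
\]
and $z\,X(z)$ is a quadratic polynomial, which has at most two roots unless all three coefficients are zero. Since $z \neq 0$ on the unit circle, $X(e^{j\omega})$ can vanish for at most two values of $\omega$ in $[-\pi, \pi)$ and certainly not over a whole interval of frequencies; the same counting argument, applied to $z^{M}X(z)$, covers the general case.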
The trick of periodizing a function and then computing its Fourier series expansion also comes in very handy in proving that a function cannot be both bandlimited and time-limited (that is, have a finite support both in time and in frequency). The proof is by contradiction: assume $x(t)$ has finite time support, i.e. there exists a time $T_0$ such that
\[
x(t)=0\quad\mbox{for } |t| > T_0;
\]
assume that $x(t)$ has a well-defined Fourier transform $X(f)$ and that it is {\em also} bandlimited so that we can find a frequency $f_0$ for which
\[
X(f)=0\quad\mbox{for } |f| > f_0.
\]
Consider now the periodization\index{periodization} of the function in time with period $S$:
\[
\tilde{x}(t)=\sum_{k=-\infty}^{\infty} x(t - kS);
\]
since $x(t)=0$ for $|t| > T_0$, if we choose $S > 2T_0$ the copies in the sum do not overlap, as shown in Figure~\ref{fig:is:tlvsblFig}. If we compute the Fourier series coefficients~(\ref{eq:is:fsecEx}) for the $S$-periodic function $\tilde{x}(t)$ we have
\begin{align*}
A_n &= (1/S) \int_{-S/2}^{S/2} \tilde{x}(t) \, e^{-j(2\pi/S) n t}\, dt \\[3mm]
&= (1/S) \int_{-T_0}^{T_0} x(t) \, e^{-j(2\pi/S) n t}\, dt \\[3mm]
&= (1/S) \, X\left( n/S \right).
\end{align*}
This indicates that the Fourier series coefficients of the periodized function are samples of the Fourier transform of the original function (another duality between periodization and sampling). Since we assumed that $x(t)$ is bandlimited, there will be only a finite number of nonzero $A_n$ coefficients; indeed
\[
A_n =0\quad\mbox{for } |n| > \lfloor f_0 S \rfloor= N_0
\]
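and therefore we can write the reconstruction formula~(\ref{eq:is:fseEx}) as:
\[
\tilde{x}(t) = \sum_{n=-N_0}^{N_0} A_n \, e^{j(2\pi/S) n t}.
\]
Now consider the complex-valued polynomial in $z$ and $z^{-1}$ with $2N_0+1$ terms (that is, $z^{-N_0}$ times an ordinary polynomial of degree at most $2N_0$)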
\[
P(z)=\sum_{n =-N_0}^{N_0} A_n z^n
\]
obviously $P\bigl(e^{j(2\pi/S)t} \bigr)=\tilde{x}(t)$ but we also know that $\tilde{x}(t)$ is identically zero over the $[T_0\,,\, S-T_0]$ interval, as shown in Figure~\ref{fig:is:tlvsblFig}. However, a finite-degree polynomial $P(z)$ has only a finite number of roots\index{roots!of complex polynomial} and
therefore it cannot be identically zero over an interval unless it is zero everywhere (see also Example~\ref{ex:fil:impossIdealProof}). Hence, either $x(t)=0$ everywhere or $x(t)$ cannot be both bandlimited and time-limited.
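A familiar transform pair illustrates the result (a standard example, added here purely as an illustration). The rectangular pulse $x(t) = 1$ for $|t| \le T_0$ and zero elsewhere is clearly time-limited, and its Fourier transform is
\[
X(f) = \int_{-T_0}^{T_0} e^{-j2\pi f t}\, dt = \frac{\sin(2\pi f T_0)}{\pi f}
\]
which vanishes only at the isolated frequencies $f = k/(2T_0)$, $k = \pm 1, \pm 2, \ldots$, and is therefore nonzero at arbitrarily high frequencies: the pulse is not bandlimited. By the symmetry between the Fourier transform and its inverse, a signal whose spectrum is a rectangle over $[-f_0, f_0]$ is a sinc function in time, which in turn has infinite support.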
The sampling theorem is often credited to C.\ Shannon, and indeed it appears with an embryonic proof in his foundational 1948 paper ``A Mathematical Theory of Communication'', \textit{Bell System Technical Journal\/}, Vol.~27, 1948, pp.~379--423 and pp.~623--656.
Contemporary treatments can be found in all signal processing books, but also in more mathematical texts, such as S.\ Mallat's \textit{A Wavelet Tour of Signal Processing\/} (Academic Press, 1998). These more modern treatments take a Hilbert space point of view, which allows the extension of sampling theorems to more general spaces than just bandlimited functions. More recently, a renewed interest in sampling theory has been spurred by applications such as nonuniform sampling or the sampling of signals that, although not bandlimited, possess a finite rate of innovation (FRI sampling).