Page MenuHomec4science

non_linear.tex
No OneTemporary

File Metadata

Created
Thu, Mar 13, 11:13

non_linear.tex

\chapter{Non-linear system modelling}
Some music elements are non-linear, like for instance distortion effects, or amplifiers. In the last years, simulated such effects and amplifiers have become quite popular due to their cheap price and since they don't need no recording material to work with.\footnote{See reviews like \url{https://www.youtube.com/watch?v=GQvfVaHyHeE}} They allow not only to get rid of all physical gear while touring, such as cabinets or microphones, but they are usable directly in a Digital Audio Workstation as plugins which facilitates mixing considerably. \\
\begin{figure}[h]
\centering
\includegraphics[width=10cm]{figures/amplitube.png}
\caption{IK Multimedia's \textbf{AmpliTube} is one leading amplifier modelling plugin. It replicates existing amplifiers, allowing the user to tune all the amp parameters, as well as the positioning of the microphones and cabinet.}
\label{amplitube}
\end{figure}
More specifically, \textbf{system identification} is the process of automatically tuning a digital model's parameters to match at the best an existing system from which we can measure the output for a given input (for instance, an existing amplifier model replicated as a VST effect, as shown in figure \ref{amplitube}). For a LTI system, identification can be straight-forwardly done by feeding the real world system a unit impulse, and measuring the corresponding filtered IR. There unfortunately is no equivalent magic trick for modelling any non-linear system by using one precise impulse, and to this day there is no one state-of-the-art method for identifying a non-linear system. Over the last 50 years, many methods have been proposed to tackle this problem.\cite{review_disto} They can generally be classified in two classes of methods: white-box and black-box.\\
\textbf{White-box methods} consist in designing a digital system trying to replicate the electronics of the physical one. This requires knowing and understanding how the physical system works. Then one can simulate each electronic component, using for instance an electronic circuit simulator. These methods require a lot of knowledge to set up, and are different for each system to be modelled. They will be left aside in this class, as their implementation typically needs a deep dive into electronics, and are fit for one given system only.\\
\textbf{Black-box methods} on the other hand assume to know nothing about the physical system's components. They instead use a fixed model and tune its parameters to fit the physical system's behaviour as good as possible. This is typically done by feeding a carefully chosen input to the physical system, then measuring its output, and comparing both to understand what transformation has been applied. A black-box method can hence be used with many different physical system, and lot of implementations have been described.\cite{app10020638} In practice, one typically places a microphone in front of an amplifier's cabinet, and then feeds the amplifier a chosen audio sequence. The output of the amplifier is then recorded and compared to the input. This strategy allows to identify an amplifier by only using it for several minutes. \\
In this chapter, we will review several techniques for modelling and identifying non-linear systems. All input signals are generally normalized between $[-1,1]$ before processing, and the output is also limited between $[-1,1]$. Any output out of this range is truncated back to the closest boundary.
\section{Amplifier and distortion theory}
Before looking at actual implementations of nonlinear functions, let's review quickly some basic aspects of distortion, and what to be careful with when implementing such effects.
\subsection{Clipping}
The origin of nonlinear behaviour in music is physical. When amplifying a signal, there is a limit to the values the signal can take imposed by the amplifier itself, as well as by the speaker. Traditional amplifiers, such as lamp or tube amplifiers, increase the power of a signal using as the name suggests, vacuum tubes or lamp. Without going into the physical details, the electronics of these amplifiers can manage a finite quantity of power (or equivalently a maximum output amplitude), above which the signal will not be amplified "properly" and will be crushed back in the working range of the amplifier. This crushing is called \textbf{clipping} and is responsible for giving the sound a harsh, noisy texture. A visual example of clipping can be observed in figure \ref{clipping}. \\
The simplest way of creating a non-linearity in a digital environment is to use the effect called \textbf{clipping}. Clipping occurs when a signal exceeds the bounds of the support it is in. In our case, all our audio signals have always been bounded between $-1$ and $+1$, with $-1$ corresponding to the most retracted position the speaker can take when producing a sound, and $+1$ being the most pushed out position it can take. Any value beyond this range cannot be reproduced by the loudspeaker, and will be naturally (or in our case, numerically) pushed back to the closest boundary. This crushing of the exceeding data to $+1$ or $-1$ is clipping. Acoustically, it sounds like a distorted, aggressive version of the signal. One distinguishes 2 types of clipping: hard and soft clipping.
\subsubsection{Hard clipping}
Hard-clipping consists in simply replacing any sample outside the $[-1,1]$ range to either $-1$ or $+1$. To manually create a hard-clipping distortion, we simply amplify the signal with a constant factor $\alpha$, and clip all the excess data back into the range.
$$
f_\alpha(x) = \begin{cases}
-1, &\text{if } x \leq -1/\alpha \\
\alpha x, &\text{if } -1/\alpha \leq x \leq 1/\alpha \\
+1, &\text{if } x \geq 1/\alpha
\end{cases}
$$
A special case of hard-clipping distortion is the so called \textbf{infinite clipping} distortion, where all the samples are pushed to the closest value in the set $\{-1, 1\}$. This function hence resembles the $sign$ function, although differs from it by the fact that the input $0$ is going to be mapped to $1$, unlike the $sign$ function where the output of $0$ stays $0$.
$$
f(x) = \begin{cases}
-1, &\text{if } x < 0 \\
+1, &\text{if } x \geq 0
\end{cases}
$$
\begin{figure}[H]
\center
\begin{dspPlot}[xout=true,sidegap=0]{-1,1}{-1.1,1.1}
\dspFunc[linecolor=yellow!60]{x 0 ge {1} {-1} ifelse}
\dspFunc[linecolor=blue!60]{x .1 ge {1} {x -.1 le {-1} {10 x mul} ifelse} ifelse}
\dspFunc[linecolor=green!60]{x .5 ge {1} {x -.5 le {-1} {2 x mul} ifelse} ifelse}
\dspFunc[linecolor=red!60]{x 1 ge {1} {x -1 le {-1} {1 x mul} ifelse} ifelse}
\end{dspPlot}
\caption{Hard-clipping distortion for $\alpha=\{1, 2, 10\}$, and infinite-clipping.}
\end{figure}
\subsubsection{Soft clipping}
To fix the hard edge "problem" of hard-clipping, we can use a smooth, continuous function in $[-1, 1]$ that resembles the hard-clipping function (typically a function looking like an "S"). There are many such functions, and a lot have been used in different implementations of digital amplifiers and distortions over time. Such functions include
\begin{align}\label{arctan_disto}
f_\alpha(x) = \frac{ \arctan ( \alpha \cdot x ) }{\arctan ( \alpha ) }
\end{align}
or Yamaha's patented polynomial distortion \footnote{ \url{https://patents.google.com/patent/US5570424A/en}}
\begin{align}\label{poly_disto}
f(x) = \frac{3x}{2} \left( 1 - \frac{x^2}{3} \right) .
\end{align}
While increasing the power of the low amplitudes, it reduces the range of the high amplitudes. It can be tuned by a parameter $\alpha$ that controls the intensity of the distortion: the higher $\alpha$ the harder the clipping.
\begin{figure}[H]
\center
\begin{dspPlot}[xout=true,sidegap=0]{-1,1}{-1.1,1.1}
\dspFunc[linecolor=blue!60]{x abs 0.01 atan x x abs div mul 1 0.01 atan div}
\dspFunc[linecolor=green!60]{x abs 0.1 atan x x abs div mul 1 0.1 atan div}
\dspFunc[linecolor=red!60]{x abs 1 atan x x abs div mul 1 1 atan div}
\end{dspPlot}
\caption{The cited popular soft-clipping distortion \eqref{arctan_disto} for $\alpha=\{1, 10, 100\}$.}
\end{figure}
In figure \ref{clipping}, we compare the effect of a hard-clip and a sort-clip on a sweep signal. The smaller the dynamic range, the more intense the induced distortion.
\begin{figure}[H]
\centering
\includegraphics[width=.9\textwidth]{figures/clipping.pdf}
\caption{Soft clipping \eqref{arctan_disto} and hard clipping comparison on a simple sweep signal.}
\label{clipping}
\end{figure}
\subsection{Distortion harmonics}
Frequency-wise speaking, distorting a signal adds an infinite number of harmonics to it. The more distorted the signal, the more vertical slopes it will contain, requiring higher and higher sine frequencies to additively reproduce the signal. Similarly, the square wave seen in the chapter about synthesizers can be seen as an infinite-clipping distorted sine wave, and indeed contained an infinite number of harmonics. Which harmonics are added or not however, depends on the shape of the distortion function, and more specifically on its parity. We hence distinguish between \textbf{odd and even distortions}.
\subsubsection{Odd harmonics}
Most distortion functions, such as all the functions presented above, are odd functions (more than that, they are centrally symmetric around the origin). In other words, they preserve the sign of each sample of the signal they are applied on. This property has for consequence to generate only odd harmonics, i.e. harmonics whose frequency is an odd multiple of the fundamental. Figure \ref{odd_disto} shows the generated harmonics by equation \eqref{arctan_disto} on a sine wave. Remark that harmonics whose frequency exceeds the Nyquist frequency (in this case 2000Hz) are mirrored back into the Nyquist range. This unwanted effect is caused aliasing and will be fixed in the next section.
\begin{figure}[H]
\centering
\includegraphics[width=.8\textwidth]{figures/odd_disto.pdf}
\caption{Harmonics generated by passing a 600Hz sine wave into the odd soft-clipping distortion \eqref{arctan_disto}, when $f_s=4000$Hz.}
\label{odd_disto}
\end{figure}
\subsubsection{Even harmonics}
Conversely, using an even function (symmetrical with respect to the x axis) as a distortion function will generate even harmonics only. Since the original signal is itself an odd harmonic (number 1), we have to add the original signal back to the distorted signal in order not to loose the fundamental frequency. We can adapt the odd distortion \eqref{arctan_disto} into an even distortion, by taking its absolute value with $\alpha$ inverted, then adding the original signal back in, then normalizing, such as
\begin{align}\label{even_arctan_disto}
f_\alpha(x) = x + \bigg\lvert \frac{ \arctan ( x / \alpha ) }{\arctan ( 1 / \alpha ) } \bigg\rvert - 1 ,
\end{align}
where the $-1$ places the output back in $[-1,+1]$.
\begin{figure}[H]
\center
\begin{dspPlot}[xout=true,sidegap=0]{-1,1}{-1.1,1.1}
\dspFunc[linecolor=blue!60]{x abs 0.01 atan x x abs div mul 1 0.01 atan div abs x add -1 add}
\dspFunc[linecolor=green!60]{x abs 0.1 atan x x abs div mul 1 0.1 atan div abs x add -1 add}
\dspFunc[linecolor=red!60]{x abs 1 atan x x abs div mul 1 1 atan div abs x add -1 add}
\end{dspPlot}
\caption{The cited even soft-clipping distortion \eqref{even_arctan_disto} for $\alpha=\{.01, .1, 1\}$.}
\end{figure}
This generates the even harmonics shown in figure \ref{even_disto}. Here again, aliasing must be corrected.
\begin{figure}[H]
\centering
\includegraphics[width=.8\textwidth]{figures/even_disto.pdf}
\caption{Harmonics generated by passing a 600Hz sine wave into the even soft-clipping distortion \eqref{arctan_disto}, when $f_s=4000$Hz.}
\label{even_disto}
\end{figure}
\subsection{Aliasing}
Nonlinearly reshaping a signal does add harmonics that can often exceed the Nyquist frequency. Aliasing occurs when frequencies above the Nyquist frequency are sampled and become "mirrored" on the frequency axis with respect to the Nyquist frequency. They become audible and add unwanted noise to the signal. \\
Let's look again at figure \ref{odd_disto} as an example, where $f_s=4000$Hz. We see that the distortion has created odd harmonics that have bounced back on the right side of the plot (because the function itself is odd). The original sine had frequency $f_n=600$Hz. Its first odd harmonic hence is centered on $3 f_c = 1800$Hz. The second odd harmonic is centered on $5 f_c=3000$Hz but since this value is above the Nyquist frequency, the overflow in wrapped back. The overflow amount if $5 f_c - f_s/2 = 1000$Hz meaning that this harmonic will appear $1000$Hz left of the Nyquist frequency, hence here $1000$Hz. And so on.
On the table below are listed the first 5 odd harmonics of the signal, that all can be seen in \ref{odd_disto}.
\begin{figure}[H]
\begin{center}
\begin{tabular}{ |c c c| }
\hline
Harmonic number & Analog frequency & Frequency after wrap-up \\
\hline
\hline
1 & 600Hz & 600Hz \\
3 & 1800Hz & 1800Hz \\
5 & 3000Hz & 1000Hz \\
7 & 4200Hz & 200Hz \\
9 & 5400Hz & 1400Hz \\
\hline
\end{tabular}
\end{center}
\caption{Harmonics of the odd distortion from figure \ref{odd_disto}.}
\end{figure}
The same effect happens with the even harmonics in figure \ref{even_disto}. \\
\begin{figure}[H]
\begin{center}
\begin{tabular}{ |c c c| }
\hline
Harmonic number & Analog frequency & Frequency after wrap-up \\
\hline
\hline
1 & 600Hz & 600Hz \\
2 & 1200Hz & 1200Hz \\
4 & 2400Hz & 1600Hz \\
6 & 3600Hz & 400Hz \\
8 & 4800Hz & 800Hz \\
10 & 6000Hz & 2000Hz \\
\hline
\end{tabular}
\end{center}
\caption{Harmonics of the even distortion from figure \ref{even_disto}.}
\end{figure}
Adding a low-pass filter after the non-linear function is useless as the exceeding harmonics have already been mirrored back into the frequency spectrum. It is hence necessary to act directly during the nonlinear process. This is done by \textbf{oversampling} the signal before applying the nonlinearity, and by \textbf{undersampling} it after. By doing so, we increase the Nyquist frequency and hence have a lot more space "on the right" of the frequency plot for the harmonics to spread before they are bounced back to the left. When undersampling, we reduce the Nyquist frequency back to its original value, and simply discard everything that is greater than it. Hence, most of the harmonics simply disappear and no more (noticeable) aliasing occurs! This is shown in figure \ref{oversample} where 2 levels of oversampling are applied on a same sine signal prior to being distorted. \\
\begin{figure}[h]
\centering
\includegraphics[width=.49\textwidth]{figures/oversample2.pdf}
\includegraphics[width=.49\textwidth]{figures/oversample10.pdf}
\caption{Comparison of 2-times oversampling versus 10-times oversampling on a distorted sine wave. On the last row, one can notice the (almost complete) disappearance of the unwanted mirrored harmonics in the 10-times oversample, while they are still noticeable in the 2-times oversample.}
\label{oversample}
\end{figure}
Note that technically, it is impossible to completely get rid of aliasing since the harmonics spread up to infinity, however as their intensity quickly decreases, they become unnoticeable even with a relatively small oversampling factor.
\section{Static waveshapers}
Static waveshapers are the simplest family of non-linear models. They consist in applying one non-linear function to each sample of an incoming signal. Static waveshapers are memoryless and don't change behaviour depending on the input signal. \\
Identification of a static waveshaper can be done by feeding the physical system a white-noise whose envelope's amplitude linearly varies from $[0,0]$ to $[-1,1]$ and recording the corresponding output's envelope's amplitude and map it between $[-1,1]$. Then, a model such as the polynomial regression can be trained on the input-output set and a polynomial waveshaper can be generated. Note that although these implementations are computationally efficient, they limit the shape of the function to be a mathematical one.
\subsection{Arithmetic implementation}
The simplest implementation of a waveshaper consists in algebraically passing each input sample $x$ into a static non-linear function $f(x):[-1,1] \rightarrow [-1,1]$ as we did until now. This requires a dedicated chip able to process a possibly large set of mathematical functions (the arctangent used in \eqref{odd_disto} is by far not the most common math function). Approximating such functions numerically can be expensive in time and required power.
\subsection{Lookup tables}
To overcome these cost limitations, patents have been proposed as soon as 1991\footnote{\url{https://patents.google.com/patent/US4991218}} using lookup tables. A lookup table is simply a list of input-output tuples stored in memory. The table's entries should together cover the whole input domain $[-1,1]$. For each input sample, the lookup table is queried to return the corresponding output. The main advantage of this method is that it is very simple to configure in order to obtain any (quantized) non-linear shape. The main drawback is that a sufficiently fine grained lookup table is very memory costly, so generally a small lookup table is used. \\
For instance, the distortion \eqref{odd_disto} can be quantized in a size $n$ lookup table by allocating $n$ outputs uniformly in the output domain. The corresponding input ranges are computed by taking the minimum distance to the mathematical function. In figure \ref{lookup_fct}, the soft-clipping distortion \eqref{odd_disto} is quantized into 3 output values, which correspond to -1, 1, and +1.
\begin{figure}[H]
\center
\begin{dspPlot}[xout=true,sidegap=0]{-1,1}{-1.1,1.1}
\def\quantize{dup 0 gt {-0.5}{0.5} ifelse sub truncate}
\dspFunc[linestyle=dotted,linewidth=1pt]{x abs 0.1 atan x x abs div mul 1 0.1 atan div}
\dspFunc[linecolor=red!60,linewidth=2pt]{x abs 0.1 atan x x abs div mul 1 0.1 atan div \quantize}
\end{dspPlot}
\begin{tabular}{ | c | c | }
\hline
input & output \\
\hline \hline
{[}-1, -0.09{]} & -1 \\
{[}-0.09, +0.09{]} & 0 \\
{[}+0.09, +1{]} & +1 \\
\hline
\end{tabular}
\caption{Simple lookup table version of the soft-clipping distortion \eqref{even_arctan_disto}, with 3 input-output tuples.}
\label{lookup_fct}
\end{figure}
\subsection{Multiband waveshaping}
One simple improvement of static waveshaping is multiband waveshaping. it simply consists in first diving the signal into several frequency bands, then applying a different non-linearity to each band, and finally sum them up together again. Although this is conceptually simple, it can achieve perceptively nice results.
\begin{figure}[H]
\centering
\includegraphics[width=14cm]{figures/tritone.PNG}
\caption{The \textbf{SM Tritone Multiband Waveshaper} by Simon Robitaille, a popular VST allowing to perform 3-band waveshaping.\protect\footnotemark}
\label{multi_waveshaper}
\end{figure}
\footnotetext{\url{https://www.reasonstudios.com/shop/rack-extension/sm-tritone-multiband-waveshaper/}}
\section{Wiener-Hammerstein box model}
One of the main issues with the static waveshaper is that it is static, i.e. it does not have memory and can't capture dynamics of a system over time. A lot of music devices, such as distortion or amplifiers, are known to have a dynamic behaviour and hence cannot be properly modelled with a static nonlinearity. \\
One proposed solution to this issue is the Wiener-Hammerstein model (a combination of the Wiener model and the Hammerstein model). Both models are gray-box and make the assumption that it is possible to approximate a \textbf{nonlinear dynamic system} by chaining several blocks in series, that process the dynamic behaviour and the nonlinearity independently. The Wiener-Hammerstein model is hence composed of a LTI filter modeling a first linear dynamic behaviour, followed by a static nonlinearity, again followed by a second LTI filter.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{figure}[H]
\center
\begin{dspBlocks}{1}{.6}
% 1 2 3 4 5
$x[n]~~$ & \BDfilterMulti{Pre-amp\\LTI} & \BDfilterMulti{Static\\waveshaper} & \BDfilterMulti{Cabinet\\LTI} & $~~y[n]$
\ncline{->}{1,1}{1,2}\ncline{->}{1,2}{1,3}
\ncline{->}{1,3}{1,4}\ncline{->}{1,4}{1,5}
\end{dspBlocks}
\end{figure}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
In the case of a guitar amplifier, the first LTI system can be seen as a modelling of a pre-amplifier, the nonlinearity as the overdrive of the amplifier, and the second LTI system as the cabinet response (the loudspeaker). The advantage of this technique is that it is a lot simpler to create and identify LTI models than nonlinear models (a simple impulse response is enough!). One can hence record the impulse response of a speaker, and of a pre-amplifier, and convolve them with a guitar sound for the LTI systems, while using a static waveshaper for the middle nonlinearity.\footnote{It is possible to find online datasets of many guitar amplifiers and cabinets IRs for free. For instance, the website \url{https://overdriven.fr/overdriven/index.php/irdownloads/\#Free_Guitar_Cabinet_Impulse_Responses} provides many free IRs corresponding to different amplifiers.}
\begin{figure}[h]
\centering
\includegraphics[width=.6\textwidth]{figures/IR_cab.pdf}
\caption{Impulse response of the custom Zilla Fatboy 212 WGS ET65 cabinet\protect\footnotemark.}
\label{IR_cab}
\end{figure}
\footnotetext{\url{https://www.zillacabs.com/2x12-cabinets}}

Event Timeline