+ "<p style=\"font-size: 30pt; font-weight: bold; color: #B51F1F;\">Dynamic Range Compression</p>"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 76,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%matplotlib inline\n",
+ "import ipywidgets as widgets\n",
+ "import matplotlib.pyplot as plt\n",
+ "import numpy as np\n",
+ "from IPython.display import Audio\n",
+ "from scipy.io import wavfile\n",
+ "\n",
+ "import import_ipynb\n",
+ "#from Helpers import *\n",
+ "\n",
+ "import matplotlib\n",
+ "figsize=(10,5)\n",
+ "matplotlib.rcParams.update({'font.size': 16})"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 59,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "fs=44100"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Fundamentals\n",
+ "\n",
+ "<img src=\"img/RS124.jpg\" alt=\"The RS124, the Beatles compressor at Abbey Road\" style=\"float: right; width: 500px; margin: 20px 0px;\"/>\n",
+ "\n",
+ "A dynamic range compressor is a nonlinear device used to limit the amplitude excursion of an audio signal. The peak-to-peak range is reduced adaptively by applying a time-varying attenuation factor that depends on:\n",
+ " * the desired amount of compression\n",
+ " * the *reactivity* of the compressor to the input level\n",
+ " * the target p-p value\n",
+ " "
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Applications\n",
+ "\n",
+ "Typical use cases include:\n",
+ " * reduce spiky transients that would cause distortion (e.g. in recording bass)\n",
+ " * compensate for varying distance between source and microphone\n",
+ " * increase the overall loudness of a music piece\n",
+ " * perceptual loudness is related to RMS (average power)\n",
+ " * dynamic range is related to peak amplitude (peak power)\n",
+ " * PAPR: peak-to-average ratio\n",
+ " * loud tracks have smaller PAPR values\n",
+ " \n",
+ "A compressor reduces the PAPR."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "In this notebook, we will implement a **dynamic range compressor**, often abbreviated simply to *compressor*. It is an audio production effect and device that, as its name suggests, reduces the dynamic range of an audio signal. This is useful to bring sounds of different levels into a more uniform range of volume. For instance, when recording voice, the singer may move while singing, so the microphone captures the voice at varying loudness during the recording. A compressor can bring the voice to a nearly constant level over the whole recording.\n",
+ "To implement a very basic compressor, one could simply use a lookup table to remap the amplitude values above or below a threshold; however, this would audibly distort the signal. A compressor is hence built using several components.\n",
+ "\n",
+ "The strategy to avoid distortion and pumping involves:\n",
+ " * start applying the attenuation to the signal gradually according to a user-definable *attack* time $\\tau_A$\n",
+ " * when the instantaneous attenuation drops to zero dB, decrease its value gradually instead; the rate of decay is determined by a user-definable *release* time\n",
+ " \n",
+ "This is achieved via a pair of leaky integrators applied to the signal $x_L[n]$.\n",
+ "\n",
+ "Applying an instantaneous attenuation factor to the input would cause an effect called \"pumping\", where the amplitude envelope of the output varies too quickly.\n",
+ "\n",
+ "The difference $x - g(x)$ is the theoretical input attenuation (in dB) returned by the gain computer. \n",
+ "\n",
+ "In order to avoid \"crushing\" the transients in the input, the attenuation is integrated over time before being applied. This operation is performed by the so-called level detector, which is dependent on two parameters:\n",
+ "\n",
+ " * the attack time\n",
+ " * the release time\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Using a static compression has some drawbacks. The main one is that the sound becomes weak, because all the transients are systematically crushed, which removes an important part of the sound. Since the attack of a sound largely determines how we perceive it, it is necessary to preserve the transients while compressing the sustained part of the sound. This is done by adding a control to the compressor, the **attack time** $at$. The attack time is the time the compressor takes to start having an effect each time the input enters the \"on\" region (above the threshold). It lets the transient pass through the compressor unchanged.\n",
+ "\n",
+ "Complementarily, a control called the **release time** $rt$ sets how fast the compressor returns to its inactive mode once the input signal falls below the threshold. The attack time is generally set between 10 and 100 ms, while the release time is often set between 100 and 1000 ms.\n",
+ "We can now observe the influence of different attack and release times on a simple square pulse. The square pulse is used here as an example of an **envelope** for a signal (not the signal itself!). Note that if the attack time is long, the output does not have enough time to reach the input value before the input changes again. The release curve therefore has to be computed starting from the value the output had reached, to avoid jumps.\n",
+ "\n",
+ "    ax.plot(peak_detector(square, at, rt), label=\"Signal with envelope\")\n",
+ " plt.title(\"Attack-release envelope\")\n",
+ " plt.xlabel(\"Time [samples]\")\n",
+ " plt.ylabel(\"Envelope amplitude\")\n",
+ " plt.legend(loc=\"best\")\n",
+ " plt.show()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## 2. The compressor (itself)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Let us now move on to implementing the compressor itself. A compressor is often implemented as follows: first, the input signal $x$ is copied to a parallel processing path called the **sidechain**, where the time-varying gain to apply to the signal is computed. Then, the output of the sidechain is multiplied with the original signal to produce the output. Everything to be implemented hence happens inside the sidechain.\n",
+ "\n",
+ "As you have seen above, all the computations are done in the dB domain, in order to match human perception of sound. The first step of the sidechain processing is hence to convert the copy of the signal $x$ to dB. We call this dB signal $x_G$. Similarly, the last step before multiplying the sidechain output with the signal is to convert it back to the linear domain.\n",
+ "\n",
+ "$x_G$ is then sent to the gain computer, where it receives a static compression that does not depend on time. The statically compressed signal is called $y_G$. We then compute the difference between $x_G$ and $y_G$, which we call $x_L$: it is the amount in dB by which the signal must be reduced, and it concerns only the amplitudes in the region above the threshold.\n",
+ "\n",
+ "$x_L$ is subsequently sent to the peak detector to avoid compressing the transients. The output is called $y_L$ and contains the same reduction values, this time shaped by the attack-release envelope. This value is negated, since it corresponds to the amount of gain that has to be removed from the signal, i.e. the amount by which the volume is reduced.\n",
+ "\n",
+ "One last parameter, called the makeup gain $M$, is simply a constant gain added to the signal to compensate for the amplitude lost in compression. We add $M$ to $-y_L$ before converting the result back to the linear domain. Finally, this gain is multiplied with the original signal $x$.\n",
+ "\u001b[1;31mNameError\u001b[0m: name 'open_audio' is not defined"
+ ]
+ }
+ ],
+ "source": [
+ "speech = open_audio('samples/speech.wav')\n",
+ "Audio(\"samples/speech.wav\", autoplay=False)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "After compressing the speech and increasing the makeup gain, the louder samples are reduced while the quieter samples are increased, bringing everything closer to the threshold."
Binary files /dev/null and b/notebooks/COM418-staging/DynamicRangeCompression/snd/sm.wav differ
diff --git a/teaching/COM303/lectures/10 - from ideal to design/2_approximation.tex b/teaching/COM303/lectures/10 - from ideal to design/2_approximation.tex
index fdf62c2..c617e15 100644
--- a/teaching/COM303/lectures/10 - from ideal to design/2_approximation.tex
+++ b/teaching/COM303/lectures/10 - from ideal to design/2_approximation.tex
@@ -1,796 +1,818 @@
\documentclass[aspectratio=169]{beamer}
\def\stylepath{../styles}
\usepackage{\stylepath/com303}
\begin{document}
%\def\N{4 }
%\begin{frame} \frametitle{Mainlobe and sidelobes}