{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# `mcsm-benchs`: Using user-provided signals and performance metric\n",
"\n",
"`mcsm-benchs` can be helpful to create benchmarks with signals and performance metrics provided by the user.\n",
"This can be useful, for instance, when dealing with real-world signals and performance metrics that do not need a noiseless version of the signal.\n",
"\n",
"In this notebook, we create a benchmark with real-world audio signals and a performance metric that is computed based on the original signal and the processed one.\n"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"from numpy import pi as pi\n",
"import pandas as pd\n",
"from matplotlib import pyplot as plt\n",
"from mcsm_benchs.Benchmark import Benchmark\n",
"from mcsm_benchs.ResultsInterpreter import ResultsInterpreter\n",
"from mcsm_benchs.SignalBank import SignalBank\n",
"from utils import spectrogram_thresholding, get_stft\n",
"\n",
"from IPython.display import Audio"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. Creating a dictionary of methods\n",
"\n",
"Let's create a dictionary of methods to benchmark. As as example, we will compare two strategies for spectrogram thresholding.\n",
"The first one is hard thresholding, in which the thresholding function is defined as:\n",
"The second one is soft thresholding, here defined as:\n",
"\n",
"These two approaches are implemented in the python function ```thresholding(signal, lam, fun='hard')``` function, which receives a signal to clean, a positional argument ```lam``` and a keyword argument ```fun``` that can be either ```hard``` or ```soft```.\n",
" \n",
"Our dictionary of methods will consist then in two methods: hard thresholding and soft thresholding.\n",
"For both approaches, let's use a value of ```lam=1.0``` for now."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"\n",
"def method_1(noisy_signal, *args, **kwargs):\n",
" # If additional input parameters are needed, they can be passed in a tuple using \n",
" # *args or **kwargs and then parsed.\n",
" xr = spectrogram_thresholding(noisy_signal,1.0,fun='hard')\n",
" return xr\n",
"\n",
"def method_2(noisy_signal, *args, **kwargs):\n",
" # If additional input parameters are needed, they can be passed in a tuple using \n",
" # *args or **kwargs and then parsed.\n",
" xr = spectrogram_thresholding(noisy_signal,2.0,fun='soft') \n",
" return xr\n",
"\n",
"# Create a dictionary of the methods to test.\n",
"my_methods = {\n",
" 'Hard_Thr': method_1, \n",
" 'Soft_Thr': method_2,\n",
" }"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. Creating a dictionary of user-provided signals\n",
"\n",
"We load two synthesized speech signals, and fix the length to `N=2**13` samples.\n",
"With these signals, we create a dictionary, where the key is going to be used as an identifier of the signal in the benchmark final results."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"# Loading signals and creating dictionary\n",
"N = 2**13\n",
"signals_dic = {\n",
" 'speech_1': np.loadtxt('6_female.csv')[0:N],\n",
" 'speech_2': np.loadtxt('6_male.csv')[0:N]\n",
" }"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
" \n",
" "
],
"text/plain": [
""
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Listen to the signals\n",
"fs = 16000\n",
"Audio(signals_dic['speech_2'], rate=fs)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. Defining a performance metric\n",
"\n",
"We use the Perceptual Evaluation of Speech Quality (PESQ) metric as a performance metric.\n",
"\n",
"To do this, we first create a wrapper `perf_fun(...)` of the function `pesq(...)`.\n",
"Performarmance metrics must follow the signature `perf_fun(x, xest, **kwargs)`, where\n",
"- `x` is the original signal (without added noise).\n",
"- `xest` is the output of a denoising approach.\n",
"- `**kwargs` is used to receive a number of extra parameters passed by the benchmark class when the function `perf_fun(...)` is called."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"1.0"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from pesq import pesq\n",
"# Create a wrapper function for PESQ.\n",
"# Normalize by the PESQ of the original signal.\n",
"perfuns = {'pesq1':lambda x,xest,**kwargs: pesq(fs,x,xest,'nb')/pesq(fs,x,x,'nb'),\n",
" 'pesq2':lambda x,xest,**kwargs: pesq(fs,x,xest,'nb'),}\n",
"perfuns['pesq1'](signals_dic['speech_1'],signals_dic['speech_1'])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we are ready to instantiate a `Benchmark` object and run a test using the proposed methods and parameters. The benchmark constructor receives a name of a task (which defines the performance function of the test), a dictionary of the methods to test, the desired length of the signals used in the simulation, a dictionary of different parameters that should be passed to the methods, an array with different values of SNR to test, and the number of repetitions that should be used for each test. Once the object is created, use the class method `run_test()` to start the experiments.\n",
"\n",
"*Remark 1: You can use the ```verbosity``` parameter to show less or more messages during the progress of the experiments. There are 6 levels of verbosity, from ```verbosity=0``` (indicate just the start and the end of the experiments) to ```verbostiy = 5``` (show each method and parameter progress)*\n",
"\n",
"*Remark 2: Parallelize the experiments is also possible by passing the parameter ```parallelize = True```. *"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Running benchmark...\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"100%|██████████| 3/3 [00:03<00:00, 1.21s/it]\n",
"100%|██████████| 3/3 [00:03<00:00, 1.07s/it]\n"
]
}
],
"source": [
"benchmark = Benchmark(task = 'denoising',\n",
" N = N,\n",
" methods = my_methods, \n",
" SNRin = [0,10,20], \n",
" repetitions = 10,\n",
" signal_ids=signals_dic, # Input user-defined signals\n",
" verbosity=0,\n",
" obj_fun=perfuns, # Define a performance metric\n",
" )\n",
" \n",
"results_dic=benchmark.run() # Run the benchmark"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.microsoft.datawrangler.viewer.v0+json": {
"columns": [
{
"name": "index",
"rawType": "int64",
"type": "integer"
},
{
"name": "Method",
"rawType": "object",
"type": "string"
},
{
"name": "Parameter",
"rawType": "object",
"type": "string"
},
{
"name": "Signal_id",
"rawType": "object",
"type": "string"
},
{
"name": "Repetition",
"rawType": "int64",
"type": "integer"
},
{
"name": "0",
"rawType": "float64",
"type": "float"
},
{
"name": "10",
"rawType": "float64",
"type": "float"
},
{
"name": "20",
"rawType": "float64",
"type": "float"
}
],
"conversionMethod": "pd.DataFrame",
"ref": "c3aa84e4-d375-411c-9c2f-e9e9467ee13b",
"rows": [
[
"40",
"Hard_Thr",
"((), {})",
"speech_1",
"0",
"1.088252305984497",
"1.4201626777648926",
"2.316756248474121"
],
[
"41",
"Hard_Thr",
"((), {})",
"speech_1",
"1",
"1.0934972763061523",
"1.4408046007156372",
"2.4149160385131836"
],
[
"42",
"Hard_Thr",
"((), {})",
"speech_1",
"2",
"1.0854753255844116",
"1.4442657232284546",
"2.3834280967712402"
],
[
"43",
"Hard_Thr",
"((), {})",
"speech_1",
"3",
"1.0844477415084839",
"1.4300222396850586",
"2.3095014095306396"
],
[
"44",
"Hard_Thr",
"((), {})",
"speech_1",
"4",
"1.0884238481521606",
"1.4676872491836548",
"2.410154104232788"
],
[
"45",
"Hard_Thr",
"((), {})",
"speech_1",
"5",
"1.082132339477539",
"1.4230505228042603",
"2.281771421432495"
],
[
"46",
"Hard_Thr",
"((), {})",
"speech_1",
"6",
"1.1162354946136475",
"1.5638952255249023",
"2.5827765464782715"
],
[
"47",
"Hard_Thr",
"((), {})",
"speech_1",
"7",
"1.0995392799377441",
"1.5001294612884521",
"2.494778871536255"
],
[
"48",
"Hard_Thr",
"((), {})",
"speech_1",
"8",
"1.118208646774292",
"1.5957214832305908",
"2.7281980514526367"
],
[
"49",
"Hard_Thr",
"((), {})",
"speech_1",
"9",
"1.093117356300354",
"1.4787007570266724",
"2.5309834480285645"
],
[
"60",
"Hard_Thr",
"((), {})",
"speech_2",
"0",
"1.1423429250717163",
"1.5649162530899048",
"2.7233963012695312"
],
[
"61",
"Hard_Thr",
"((), {})",
"speech_2",
"1",
"1.1195447444915771",
"1.5034854412078857",
"2.566995620727539"
],
[
"62",
"Hard_Thr",
"((), {})",
"speech_2",
"2",
"1.1279807090759277",
"1.538451075553894",
"2.6909079551696777"
],
[
"63",
"Hard_Thr",
"((), {})",
"speech_2",
"3",
"1.132774829864502",
"1.5284581184387207",
"2.68407940864563"
],
[
"64",
"Hard_Thr",
"((), {})",
"speech_2",
"4",
"1.148443579673767",
"1.618736743927002",
"2.7857391834259033"
],
[
"65",
"Hard_Thr",
"((), {})",
"speech_2",
"5",
"1.1286216974258423",
"1.544001579284668",
"2.7128565311431885"
],
[
"66",
"Hard_Thr",
"((), {})",
"speech_2",
"6",
"1.1422004699707031",
"1.637576937675476",
"2.893669843673706"
],
[
"67",
"Hard_Thr",
"((), {})",
"speech_2",
"7",
"1.135804295539856",
"1.6201083660125732",
"2.9238059520721436"
],
[
"68",
"Hard_Thr",
"((), {})",
"speech_2",
"8",
"1.1628708839416504",
"1.6910293102264404",
"3.0214061737060547"
],
[
"69",
"Hard_Thr",
"((), {})",
"speech_2",
"9",
"1.1408934593200684",
"1.6020331382751465",
"2.900660276412964"
],
[
"50",
"Soft_Thr",
"((), {})",
"speech_1",
"0",
"1.603140950202942",
"2.514249086380005",
"3.585965633392334"
],
[
"51",
"Soft_Thr",
"((), {})",
"speech_1",
"1",
"1.6848340034484863",
"2.5231692790985107",
"3.656338930130005"
],
[
"52",
"Soft_Thr",
"((), {})",
"speech_1",
"2",
"1.6007087230682373",
"2.4177064895629883",
"3.6167256832122803"
],
[
"53",
"Soft_Thr",
"((), {})",
"speech_1",
"3",
"1.5973451137542725",
"2.612795114517212",
"3.5188703536987305"
],
[
"54",
"Soft_Thr",
"((), {})",
"speech_1",
"4",
"1.6161836385726929",
"2.6622891426086426",
"3.5719614028930664"
],
[
"55",
"Soft_Thr",
"((), {})",
"speech_1",
"5",
"1.4298336505889893",
"2.302718162536621",
"3.5348238945007324"
],
[
"56",
"Soft_Thr",
"((), {})",
"speech_1",
"6",
"1.6649541854858398",
"2.593343496322632",
"3.627356767654419"
],
[
"57",
"Soft_Thr",
"((), {})",
"speech_1",
"7",
"1.5404071807861328",
"2.2859041690826416",
"3.5640995502471924"
],
[
"58",
"Soft_Thr",
"((), {})",
"speech_1",
"8",
"1.5308319330215454",
"2.4137256145477295",
"3.6679325103759766"
],
[
"59",
"Soft_Thr",
"((), {})",
"speech_1",
"9",
"1.5807113647460938",
"2.2893879413604736",
"3.050673007965088"
],
[
"70",
"Soft_Thr",
"((), {})",
"speech_2",
"0",
"1.7368583679199219",
"2.8953075408935547",
"3.5835580825805664"
],
[
"71",
"Soft_Thr",
"((), {})",
"speech_2",
"1",
"1.7840569019317627",
"2.8154115676879883",
"3.5237953662872314"
],
[
"72",
"Soft_Thr",
"((), {})",
"speech_2",
"2",
"1.8165462017059326",
"3.0099735260009766",
"3.5584914684295654"
],
[
"73",
"Soft_Thr",
"((), {})",
"speech_2",
"3",
"1.8111028671264648",
"3.057044744491577",
"3.592641830444336"
],
[
"74",
"Soft_Thr",
"((), {})",
"speech_2",
"4",
"1.7999805212020874",
"2.716383457183838",
"3.569941282272339"
],
[
"75",
"Soft_Thr",
"((), {})",
"speech_2",
"5",
"1.6246907711029053",
"3.0139896869659424",
"3.6378655433654785"
],
[
"76",
"Soft_Thr",
"((), {})",
"speech_2",
"6",
"1.8974900245666504",
"2.854654312133789",
"3.4270081520080566"
],
[
"77",
"Soft_Thr",
"((), {})",
"speech_2",
"7",
"1.5105435848236084",
"2.5043632984161377",
"3.5780460834503174"
],
[
"78",
"Soft_Thr",
"((), {})",
"speech_2",
"8",
"2.0005927085876465",
"3.1368255615234375",
"3.5498127937316895"
],
[
"79",
"Soft_Thr",
"((), {})",
"speech_2",
"9",
"1.992071270942688",
"2.981548309326172",
"3.4559719562530518"
]
],
"shape": {
"columns": 7,
"rows": 40
}
},
"text/html": [
"