mcsm-benchs: Using user-provided signals and performance metrics
mcsm-benchs can be used to create benchmarks with signals and performance metrics provided by the user. This is useful, for instance, when dealing with real-world signals and with performance metrics that do not require a noiseless version of the signal.
In this notebook, we create a benchmark with real-world audio signals and a performance metric computed from the original signal and the processed one.
[4]:
import numpy as np
from numpy import pi as pi
import pandas as pd
from matplotlib import pyplot as plt
from mcsm_benchs.Benchmark import Benchmark
from mcsm_benchs.ResultsInterpreter import ResultsInterpreter
from mcsm_benchs.SignalBank import SignalBank
from utils import spectrogram_thresholding, get_stft
from IPython.display import Audio
1. Creating a dictionary of methods
Let’s create a dictionary of methods to benchmark. As an example, we will compare two strategies for spectrogram thresholding, applied to the STFT coefficients $F(t,f)$ of the noisy signal. The first one is hard thresholding, in which the thresholding function is defined as:

$$\mathrm{HT}_{\lambda}\left(F(t,f)\right) = \begin{cases} F(t,f) & \text{if } |F(t,f)| > \lambda,\\ 0 & \text{otherwise.}\end{cases}$$

The second one is soft thresholding, here defined as:

$$\mathrm{ST}_{\lambda}\left(F(t,f)\right) = \begin{cases} \dfrac{F(t,f)}{|F(t,f)|}\left(|F(t,f)| - \lambda\right) & \text{if } |F(t,f)| > \lambda,\\ 0 & \text{otherwise.}\end{cases}$$

These two approaches are implemented in the Python function spectrogram_thresholding(signal, lam, fun='hard'), which receives a signal to clean, a positional argument lam (the threshold $\lambda$) and a keyword argument fun that can be either 'hard' or 'soft'.
Our dictionary will then consist of two methods: hard thresholding and soft thresholding. For now, let's use lam=1.0 for hard thresholding and lam=2.0 for soft thresholding.
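The spectrogram_thresholding function is provided by the accompanying utils module and is not listed in this notebook. As a rough, hypothetical sketch of what such a routine could look like (using SciPy's STFT/ISTFT pair; the actual implementation in utils may use different transform parameters and threshold conventions):

# Hypothetical sketch of a spectrogram-thresholding routine; the actual
# spectrogram_thresholding in utils may differ (e.g., STFT parameters,
# threshold scaling).
import numpy as np
from scipy.signal import stft, istft

def spectrogram_thresholding_sketch(signal, lam, fun='hard', fs=1.0, nperseg=256):
    _, _, F = stft(signal, fs=fs, nperseg=nperseg)    # STFT of the noisy signal
    mag = np.abs(F)
    if fun == 'hard':
        # Keep only coefficients whose magnitude exceeds the threshold.
        F_thr = F * (mag > lam)
    elif fun == 'soft':
        # Shrink magnitudes by lam while preserving the phase.
        F_thr = F * np.maximum(0.0, 1.0 - lam / (mag + 1e-12))
    else:
        raise ValueError("fun must be 'hard' or 'soft'")
    _, xr = istft(F_thr, fs=fs, nperseg=nperseg)      # back to the time domain
    return xr[:len(signal)]

In practice, the threshold λ is often scaled by an estimate of the noise level; the convention used by utils is not detailed here.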
[5]:
def method_1(noisy_signal, *args, **kwargs):
    # If additional input parameters are needed, they can be passed in a tuple
    # using *args or **kwargs and then parsed.
    xr = spectrogram_thresholding(noisy_signal, 1.0, fun='hard')
    return xr

def method_2(noisy_signal, *args, **kwargs):
    # If additional input parameters are needed, they can be passed in a tuple
    # using *args or **kwargs and then parsed.
    xr = spectrogram_thresholding(noisy_signal, 2.0, fun='soft')
    return xr

# Create a dictionary of the methods to test.
my_methods = {
    'Hard_Thr': method_1,
    'Soft_Thr': method_2,
}
2. Creating a dictionary of user-provided signals
We load two synthesized speech signals and fix their length to N=2**13 samples. With these signals, we create a dictionary whose keys will be used as identifiers of the signals in the final benchmark results.
[6]:
# Loading signals and creating dictionary
N = 2**13
signals_dic = {
    'speech_1': np.loadtxt('6_female.csv')[0:N],
    'speech_2': np.loadtxt('6_male.csv')[0:N],
}
[7]:
# Listen to the signals
fs = 16000
Audio(signals_dic['speech_2'], rate=fs)
[7]:
3. Defining a performance metric
We use the Perceptual Evaluation of Speech Quality (PESQ) as the performance metric.
To do this, we first create a wrapper perf_fun(...) of the function pesq(...). Performance metrics must follow the signature perf_fun(x, xest, **kwargs), where
- x is the original signal (without added noise).
- xest is the output of a denoising approach.
- **kwargs is used to receive any extra parameters passed by the benchmark class when perf_fun(...) is called.
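For instance, the first metric defined in the next cell could equivalently be written as a regular function (assuming the sampling frequency fs defined above):

from pesq import pesq

def perf_fun(x, xest, **kwargs):
    # x: original (clean) signal; xest: output of a denoising method.
    # Extra keyword arguments passed by the Benchmark class are ignored here.
    return pesq(fs, x, xest, 'nb') / pesq(fs, x, x, 'nb')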
[8]:
from pesq import pesq

# Create a wrapper function for PESQ.
# Normalize by the PESQ of the original signal.
perfuns = {
    'pesq1': lambda x, xest, **kwargs: pesq(fs, x, xest, 'nb') / pesq(fs, x, x, 'nb'),
    'pesq2': lambda x, xest, **kwargs: pesq(fs, x, xest, 'nb'),
}

perfuns['pesq1'](signals_dic['speech_1'], signals_dic['speech_1'])
[8]:
1.0
Now we are ready to instantiate a Benchmark object and run a test using the proposed methods and parameters. The benchmark constructor receives the name of a task (which defines the default performance function of the test), a dictionary of the methods to test, the desired length of the signals used in the simulation, a dictionary of extra parameters that should be passed to the methods, an array with the SNR values to test, and the number of repetitions for each test. Here, we additionally pass the dictionary of user-provided signals (signal_ids) and the dictionary of performance metrics (obj_fun) defined above.
Once the object is created, use the class method run() to start the experiments.
Remark 1: You can use the ``verbosity`` parameter to show fewer or more messages during the progress of the experiments. There are 6 levels of verbosity, from ``verbosity=0`` (indicate just the start and the end of the experiments) to ``verbosity=5`` (show the progress of each method and parameter).
Remark 2: Parallelizing the experiments is also possible by passing the parameter ``parallelize=True``.
[9]:
benchmark = Benchmark(task='denoising',
                      N=N,
                      methods=my_methods,
                      SNRin=[0, 10, 20],
                      repetitions=10,
                      signal_ids=signals_dic,  # Input user-defined signals
                      verbosity=0,
                      obj_fun=perfuns,         # Define a performance metric
                      )

results_dic = benchmark.run()  # Run the benchmark
Running benchmark...
100%|██████████| 3/3 [00:03<00:00, 1.21s/it]
100%|██████████| 3/3 [00:03<00:00, 1.07s/it]
[10]:
results_df = benchmark.get_results_as_df()  # This formats the results as a DataFrame
results_df[1]
[10]:
| | Method | Parameter | Signal_id | Repetition | SNR 0 dB | SNR 10 dB | SNR 20 dB |
|---|---|---|---|---|---|---|---|
40 | Hard_Thr | ((), {}) | speech_1 | 0 | 1.088252 | 1.420163 | 2.316756 |
41 | Hard_Thr | ((), {}) | speech_1 | 1 | 1.093497 | 1.440805 | 2.414916 |
42 | Hard_Thr | ((), {}) | speech_1 | 2 | 1.085475 | 1.444266 | 2.383428 |
43 | Hard_Thr | ((), {}) | speech_1 | 3 | 1.084448 | 1.430022 | 2.309501 |
44 | Hard_Thr | ((), {}) | speech_1 | 4 | 1.088424 | 1.467687 | 2.410154 |
45 | Hard_Thr | ((), {}) | speech_1 | 5 | 1.082132 | 1.423051 | 2.281771 |
46 | Hard_Thr | ((), {}) | speech_1 | 6 | 1.116235 | 1.563895 | 2.582777 |
47 | Hard_Thr | ((), {}) | speech_1 | 7 | 1.099539 | 1.500129 | 2.494779 |
48 | Hard_Thr | ((), {}) | speech_1 | 8 | 1.118209 | 1.595721 | 2.728198 |
49 | Hard_Thr | ((), {}) | speech_1 | 9 | 1.093117 | 1.478701 | 2.530983 |
60 | Hard_Thr | ((), {}) | speech_2 | 0 | 1.142343 | 1.564916 | 2.723396 |
61 | Hard_Thr | ((), {}) | speech_2 | 1 | 1.119545 | 1.503485 | 2.566996 |
62 | Hard_Thr | ((), {}) | speech_2 | 2 | 1.127981 | 1.538451 | 2.690908 |
63 | Hard_Thr | ((), {}) | speech_2 | 3 | 1.132775 | 1.528458 | 2.684079 |
64 | Hard_Thr | ((), {}) | speech_2 | 4 | 1.148444 | 1.618737 | 2.785739 |
65 | Hard_Thr | ((), {}) | speech_2 | 5 | 1.128622 | 1.544002 | 2.712857 |
66 | Hard_Thr | ((), {}) | speech_2 | 6 | 1.142200 | 1.637577 | 2.893670 |
67 | Hard_Thr | ((), {}) | speech_2 | 7 | 1.135804 | 1.620108 | 2.923806 |
68 | Hard_Thr | ((), {}) | speech_2 | 8 | 1.162871 | 1.691029 | 3.021406 |
69 | Hard_Thr | ((), {}) | speech_2 | 9 | 1.140893 | 1.602033 | 2.900660 |
50 | Soft_Thr | ((), {}) | speech_1 | 0 | 1.603141 | 2.514249 | 3.585966 |
51 | Soft_Thr | ((), {}) | speech_1 | 1 | 1.684834 | 2.523169 | 3.656339 |
52 | Soft_Thr | ((), {}) | speech_1 | 2 | 1.600709 | 2.417706 | 3.616726 |
53 | Soft_Thr | ((), {}) | speech_1 | 3 | 1.597345 | 2.612795 | 3.518870 |
54 | Soft_Thr | ((), {}) | speech_1 | 4 | 1.616184 | 2.662289 | 3.571961 |
55 | Soft_Thr | ((), {}) | speech_1 | 5 | 1.429834 | 2.302718 | 3.534824 |
56 | Soft_Thr | ((), {}) | speech_1 | 6 | 1.664954 | 2.593343 | 3.627357 |
57 | Soft_Thr | ((), {}) | speech_1 | 7 | 1.540407 | 2.285904 | 3.564100 |
58 | Soft_Thr | ((), {}) | speech_1 | 8 | 1.530832 | 2.413726 | 3.667933 |
59 | Soft_Thr | ((), {}) | speech_1 | 9 | 1.580711 | 2.289388 | 3.050673 |
70 | Soft_Thr | ((), {}) | speech_2 | 0 | 1.736858 | 2.895308 | 3.583558 |
71 | Soft_Thr | ((), {}) | speech_2 | 1 | 1.784057 | 2.815412 | 3.523795 |
72 | Soft_Thr | ((), {}) | speech_2 | 2 | 1.816546 | 3.009974 | 3.558491 |
73 | Soft_Thr | ((), {}) | speech_2 | 3 | 1.811103 | 3.057045 | 3.592642 |
74 | Soft_Thr | ((), {}) | speech_2 | 4 | 1.799981 | 2.716383 | 3.569941 |
75 | Soft_Thr | ((), {}) | speech_2 | 5 | 1.624691 | 3.013990 | 3.637866 |
76 | Soft_Thr | ((), {}) | speech_2 | 6 | 1.897490 | 2.854654 | 3.427008 |
77 | Soft_Thr | ((), {}) | speech_2 | 7 | 1.510544 | 2.504363 | 3.578046 |
78 | Soft_Thr | ((), {}) | speech_2 | 8 | 2.000593 | 3.136826 | 3.549813 |
79 | Soft_Thr | ((), {}) | speech_2 | 9 | 1.992071 | 2.981548 | 3.455972 |
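Since get_results_as_df() returns ordinary pandas DataFrames, the usual pandas tooling can be used for quick summaries. For instance, the per-method mean over signals and repetitions for each SNR column could be obtained as follows (a small sketch that relies only on the column names printed above):

# Average score per method for each SNR, over all signals and repetitions.
df1 = results_df[1]
snr_cols = [c for c in df1.columns
            if c not in ('Method', 'Parameter', 'Signal_id', 'Repetition')]
df1.groupby('Method')[snr_cols].mean()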
Generating plots with the ResultsInterpreter.
[11]:
# Summary interactive plots with Plotly
from plotly.offline import iplot

interpreter = ResultsInterpreter(benchmark)
figs = interpreter.get_summary_plotlys(bars=True)

for fig in figs:
    fig.update_layout(yaxis_title="PESQ(x_est)/PESQ(x)")
    iplot(fig)
(Interactive Plotly summary figures are displayed here in the original notebook.)
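If a static copy of the summary figures is needed outside the notebook, Plotly figures can also be written to standalone HTML files (the file names below are just an example):

# Optional: save each summary figure as a standalone HTML file.
for k, fig in enumerate(figs):
    fig.write_html(f"summary_fig_{k}.html")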
Checking elapsed time for each method
[12]:
df = interpreter.elapsed_time_summary()
df
[12]:
| | Average time (s) | Std (s) |
|---|---|---|
speech_1-Hard_Thr-((), {}) | 0.000734 | 0.000064 |
speech_1-Soft_Thr-((), {}) | 0.000864 | 0.000090 |
speech_2-Hard_Thr-((), {}) | 0.000747 | 0.000075 |
speech_2-Soft_Thr-((), {}) | 0.000910 | 0.000103 |
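As an informal complement to the PESQ scores above, one can also listen to the output of a method on an artificially contaminated signal. The snippet below is only a quick sketch (the noise realization and the 10 dB SNR are arbitrary choices; it reuses objects already defined in this notebook):

# Quick listening test: contaminate one signal at roughly 10 dB SNR and
# listen to the soft-thresholded output.
x = signals_dic['speech_1']
noise = np.random.randn(N)
noise *= np.std(x) / np.std(noise) * 10 ** (-10 / 20)   # scale noise to ~10 dB SNR
xr = method_2(x + noise)                                 # soft thresholding
Audio(xr, rate=fs)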