mcsm-benchs: Using user-provided signals and performance metrics

mcsm-benchs can be used to create benchmarks with signals and performance metrics provided by the user. This is useful, for instance, when dealing with real-world signals, or with performance metrics that do not require a noiseless version of the signal.

In this notebook, we create a benchmark with real-world audio signals and a performance metric that is computed based on the original signal and the processed one.

[4]:
import numpy as np
from numpy import pi as pi
import pandas as pd
from matplotlib import pyplot as plt
from mcsm_benchs.Benchmark import Benchmark
from mcsm_benchs.ResultsInterpreter import ResultsInterpreter
from mcsm_benchs.SignalBank import SignalBank
from utils import spectrogram_thresholding, get_stft

from IPython.display import Audio

1. Creating a dictionary of methods

Let’s create a dictionary of methods to benchmark. As an example, we will compare two strategies for spectrogram thresholding. The first one is hard thresholding, in which the thresholding function applied to each coefficient $x$ of the short-time Fourier transform is defined as:

$$S_{\lambda}^{\mathrm{hard}}(x) = \begin{cases} x & \text{if } |x| > \lambda, \\ 0 & \text{otherwise,} \end{cases}$$

where $\lambda > 0$ is the threshold. The second one is soft thresholding, here defined as:

$$S_{\lambda}^{\mathrm{soft}}(x) = \begin{cases} x\left(1 - \lambda/|x|\right) & \text{if } |x| > \lambda, \\ 0 & \text{otherwise.} \end{cases}$$

These two approaches are implemented in the Python function spectrogram_thresholding(signal, lam, fun='hard'), which receives a signal to denoise, a positional argument lam (the threshold $\lambda$) and a keyword argument fun that can be either 'hard' or 'soft'.
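The spectrogram_thresholding function is imported from the local utils module, which is not listed in this notebook. For reference, here is a minimal sketch of what such a function could look like; the STFT parameters and the noise-level estimate below are assumptions, not the actual utils implementation:

import numpy as np
from scipy.signal import stft, istft

def spectrogram_thresholding_sketch(x, lam, fun='hard'):
    # Sketch only: the actual implementation lives in `utils` and may differ.
    _, _, F = stft(x, nperseg=256)                  # STFT coefficients of the signal
    sigma = np.median(np.abs(np.real(F))) / 0.6745  # robust noise-level estimate (assumed)
    thr = lam * sigma                               # scale the threshold by the noise level
    mag = np.abs(F)
    if fun == 'hard':
        F = F * (mag > thr)                         # zero out coefficients below the threshold
    else:  # 'soft'
        F = np.where(mag > thr, (1 - thr / np.maximum(mag, 1e-12)) * F, 0)  # shrink magnitudes
    _, xr = istft(F, nperseg=256)                   # back to the time domain
    return xr[:len(x)]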

Our dictionary will then consist of two methods: hard thresholding and soft thresholding. Let’s use lam=1.0 for hard thresholding and lam=2.0 for soft thresholding for now.

[5]:

def method_1(noisy_signal, *args, **kwargs):
    # If additional input parameters are needed, they can be passed in a tuple using
    # *args or **kwargs and then parsed.
    xr = spectrogram_thresholding(noisy_signal, 1.0, fun='hard')
    return xr

def method_2(noisy_signal, *args, **kwargs):
    # If additional input parameters are needed, they can be passed in a tuple using
    # *args or **kwargs and then parsed.
    xr = spectrogram_thresholding(noisy_signal, 2.0, fun='soft')
    return xr

# Create a dictionary of the methods to test.
my_methods = {
    'Hard_Thr': method_1,
    'Soft_Thr': method_2,
}
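Both functions follow the method(noisy_signal, *args, **kwargs) signature expected by the benchmark, so they can also be called directly. As a quick sanity check, on a white-noise input (illustrative only):

# Quick sanity check: the outputs should have the same length as the input.
rng = np.random.default_rng(0)
x_test = rng.standard_normal(2**13)
print(method_1(x_test).shape, method_2(x_test).shape)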

2. Creating a dictionary of user-provided signals

We load two synthesized speech signals and fix their length to N=2**13 samples. With these signals we create a dictionary, whose keys will be used as identifiers of the signals in the final results of the benchmark.

[6]:
# Loading signals and creating dictionary
N = 2**13
signals_dic = {
    'speech_1': np.loadtxt('6_female.csv')[0:N],
    'speech_2': np.loadtxt('6_male.csv')[0:N],
}
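The CSV files above already contain the raw samples. If your signals live in audio files instead, the same kind of dictionary can be built from them; a minimal sketch using SciPy, where the file name my_recording.wav is hypothetical:

# Hypothetical alternative: load user-provided signals from WAV files.
from scipy.io import wavfile

def load_wav(path, N):
    fs_file, x = wavfile.read(path)  # sampling rate and samples
    x = x.astype(float)
    x /= np.max(np.abs(x))           # normalize the amplitude
    return x[:N]

# signals_dic = {'speech_1': load_wav('my_recording.wav', N)}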
[7]:
# Listen to the signals
fs = 16000
Audio(signals_dic['speech_2'], rate=fs)
[7]:
(An audio player widget is displayed here.)
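The same widget can be used to listen to the output of one of the methods, for instance soft thresholding applied to the clean signal (an informal check; during the benchmark, noise is added before denoising):

# Listen to the soft-thresholded version of the signal (illustrative check).
Audio(method_2(signals_dic['speech_2']), rate=fs)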

3. Defining a performance metric

We use the Perceptual Evaluation of Speech Quality (PESQ) metric as a performance metric.

To do this, we first create a wrapper perf_fun(...) around the function pesq(...). Performance metrics must follow the signature perf_fun(x, xest, **kwargs), where

  • x is the original signal (without added noise).

  • xest is the output of a denoising approach.

  • **kwargs is used to receive a number of extra parameters passed by the benchmark class when the function perf_fun(...) is called.

[8]:
from pesq import pesq
# Create a wrapper function for PESQ.
# 'pesq1' normalizes by the PESQ of the original signal; 'pesq2' is the raw PESQ score.
perfuns = {
    'pesq1': lambda x, xest, **kwargs: pesq(fs, x, xest, 'nb') / pesq(fs, x, x, 'nb'),
    'pesq2': lambda x, xest, **kwargs: pesq(fs, x, xest, 'nb'),
}
perfuns['pesq1'](signals_dic['speech_1'],signals_dic['speech_1'])
[8]:
1.0
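Any function with this signature can serve as a metric. For instance, here is a minimal sketch of an output-SNR metric in dB; it is an illustration, not part of mcsm-benchs:

# Output SNR in dB, following the perf_fun(x, xest, **kwargs) signature.
def snr_out(x, xest, **kwargs):
    return 10 * np.log10(np.sum(x**2) / np.sum((x - xest)**2))

# It could be added to the dictionary of metrics: perfuns['snr'] = snr_out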

Now we are ready to instantiate a Benchmark object and run a test using the proposed methods and parameters. The benchmark constructor receives the name of a task (here 'denoising'), a dictionary of the methods to test, the desired length N of the signals used in the simulation, an array with the different values of SNR to test, and the number of repetitions to run for each test. Additionally, the dictionary of user-provided signals is passed via the parameter signal_ids, and the dictionary of performance metrics via obj_fun. Once the object is created, use the class method run() to launch the experiments.

Remark 1: You can use the ``verbosity`` parameter to control how many messages are printed during the experiments. There are 6 levels of verbosity, from ``verbosity=0`` (report just the start and the end of the experiments) to ``verbosity=5`` (show the progress of each method and parameter).

Remark 2: The experiments can also be parallelized by passing the parameter ``parallelize=True``.

[9]:
benchmark = Benchmark(task = 'denoising',
                        N = N,
                        methods = my_methods,
                        SNRin = [0,10,20],
                        repetitions = 10,
                        signal_ids=signals_dic, # Input user-defined signals
                        verbosity=0,
                        obj_fun=perfuns, # Define a performance metric
                        )

results_dic=benchmark.run() # Run the benchmark
Running benchmark...
100%|██████████| 3/3 [00:03<00:00,  1.21s/it]
100%|██████████| 3/3 [00:03<00:00,  1.07s/it]
[10]:
results_df = benchmark.get_results_as_df() # This formats the results as a list of DataFrames, one per performance metric
results_df[1] # Results for 'pesq2'; the numeric column labels are the SNRin values in dB
[10]:
Method Parameter Signal_id Repetition 0 10 20
40 Hard_Thr ((), {}) speech_1 0 1.088252 1.420163 2.316756
41 Hard_Thr ((), {}) speech_1 1 1.093497 1.440805 2.414916
42 Hard_Thr ((), {}) speech_1 2 1.085475 1.444266 2.383428
43 Hard_Thr ((), {}) speech_1 3 1.084448 1.430022 2.309501
44 Hard_Thr ((), {}) speech_1 4 1.088424 1.467687 2.410154
45 Hard_Thr ((), {}) speech_1 5 1.082132 1.423051 2.281771
46 Hard_Thr ((), {}) speech_1 6 1.116235 1.563895 2.582777
47 Hard_Thr ((), {}) speech_1 7 1.099539 1.500129 2.494779
48 Hard_Thr ((), {}) speech_1 8 1.118209 1.595721 2.728198
49 Hard_Thr ((), {}) speech_1 9 1.093117 1.478701 2.530983
60 Hard_Thr ((), {}) speech_2 0 1.142343 1.564916 2.723396
61 Hard_Thr ((), {}) speech_2 1 1.119545 1.503485 2.566996
62 Hard_Thr ((), {}) speech_2 2 1.127981 1.538451 2.690908
63 Hard_Thr ((), {}) speech_2 3 1.132775 1.528458 2.684079
64 Hard_Thr ((), {}) speech_2 4 1.148444 1.618737 2.785739
65 Hard_Thr ((), {}) speech_2 5 1.128622 1.544002 2.712857
66 Hard_Thr ((), {}) speech_2 6 1.142200 1.637577 2.893670
67 Hard_Thr ((), {}) speech_2 7 1.135804 1.620108 2.923806
68 Hard_Thr ((), {}) speech_2 8 1.162871 1.691029 3.021406
69 Hard_Thr ((), {}) speech_2 9 1.140893 1.602033 2.900660
50 Soft_Thr ((), {}) speech_1 0 1.603141 2.514249 3.585966
51 Soft_Thr ((), {}) speech_1 1 1.684834 2.523169 3.656339
52 Soft_Thr ((), {}) speech_1 2 1.600709 2.417706 3.616726
53 Soft_Thr ((), {}) speech_1 3 1.597345 2.612795 3.518870
54 Soft_Thr ((), {}) speech_1 4 1.616184 2.662289 3.571961
55 Soft_Thr ((), {}) speech_1 5 1.429834 2.302718 3.534824
56 Soft_Thr ((), {}) speech_1 6 1.664954 2.593343 3.627357
57 Soft_Thr ((), {}) speech_1 7 1.540407 2.285904 3.564100
58 Soft_Thr ((), {}) speech_1 8 1.530832 2.413726 3.667933
59 Soft_Thr ((), {}) speech_1 9 1.580711 2.289388 3.050673
70 Soft_Thr ((), {}) speech_2 0 1.736858 2.895308 3.583558
71 Soft_Thr ((), {}) speech_2 1 1.784057 2.815412 3.523795
72 Soft_Thr ((), {}) speech_2 2 1.816546 3.009974 3.558491
73 Soft_Thr ((), {}) speech_2 3 1.811103 3.057045 3.592642
74 Soft_Thr ((), {}) speech_2 4 1.799981 2.716383 3.569941
75 Soft_Thr ((), {}) speech_2 5 1.624691 3.013990 3.637866
76 Soft_Thr ((), {}) speech_2 6 1.897490 2.854654 3.427008
77 Soft_Thr ((), {}) speech_2 7 1.510544 2.504363 3.578046
78 Soft_Thr ((), {}) speech_2 8 2.000593 3.136826 3.549813
79 Soft_Thr ((), {}) speech_2 9 1.992071 2.981548 3.455972
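Since this is a regular pandas DataFrame, it can be summarized with the usual tools, for example averaging the PESQ scores over the repetitions:

# Average PESQ over repetitions, per method and signal (plain pandas, not an
# mcsm-benchs feature). Adjust the column labels if they are stored as strings.
results_df[1].groupby(['Method', 'Signal_id'])[[0, 10, 20]].mean()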

4. Generating plots with the ResultsInterpreter

[11]:
# Summary interactive plots with Plotly
from plotly.offline import iplot
interpreter = ResultsInterpreter(benchmark)
figs = interpreter.get_summary_plotlys(bars=True)
for fig in figs:
    fig.update_layout(yaxis_title="PESQ(x_est)/PESQ(x)")
    iplot(fig)

(Two interactive Plotly figures with the summary bar plots are displayed here.)

5. Checking elapsed time for each method

[12]:
df = interpreter.elapsed_time_summary()
df
[12]:
Average time (s)  Std (s)
speech_1-Hard_Thr-((), {}) 0.000734 0.000064
speech_1-Soft_Thr-((), {}) 0.000864 0.000090
speech_2-Hard_Thr-((), {}) 0.000747 0.000075
speech_2-Soft_Thr-((), {}) 0.000910 0.000103