mcsm-benchs: Using user-provided signals and performance metrics
mcsm-benchs can be used to create benchmarks with signals and performance metrics provided by the user. This is useful, for instance, when dealing with real-world signals and with performance metrics that do not require a noiseless version of the signal.
In this notebook, we create a benchmark with real-world audio signals and a performance metric computed from the original signal and the processed one.
[4]:
import numpy as np
from numpy import pi as pi
import pandas as pd
from matplotlib import pyplot as plt
from mcsm_benchs.Benchmark import Benchmark
from mcsm_benchs.ResultsInterpreter import ResultsInterpreter
from mcsm_benchs.SignalBank import SignalBank
from utils import spectrogram_thresholding, get_stft
from IPython.display import Audio
1. Creating a dictionary of methods
Let’s create a dictionary of methods to benchmark. As an example, we will compare two strategies for spectrogram thresholding, applied to the STFT coefficients $F(t,f)$ of the noisy signal. The first one is hard thresholding, in which the thresholding function is defined as:

$$\mathrm{HT}_{\lambda}\left(F(t,f)\right) = \begin{cases} F(t,f) & \text{if } |F(t,f)| > \lambda,\\ 0 & \text{otherwise.}\end{cases}$$

The second one is soft thresholding, here defined as:

$$\mathrm{ST}_{\lambda}\left(F(t,f)\right) = \begin{cases} \dfrac{F(t,f)}{|F(t,f)|}\left(|F(t,f)| - \lambda\right) & \text{if } |F(t,f)| > \lambda,\\ 0 & \text{otherwise.}\end{cases}$$

These two approaches are implemented in the Python function spectrogram_thresholding(signal, lam, fun='hard'), which receives a signal to clean, a positional argument lam (the threshold $\lambda$) and a keyword argument fun that can be either 'hard' or 'soft'.
Our dictionary will then consist of two methods: hard thresholding and soft thresholding. For now, let's use lam=1.0 for hard thresholding and lam=2.0 for soft thresholding.
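The spectrogram_thresholding function is provided by the accompanying utils module and is not listed in this notebook. As a rough, hypothetical sketch of what such a routine could look like (using SciPy's STFT/ISTFT pair; the actual implementation in utils may use different transform parameters and threshold conventions):

# Hypothetical sketch of a spectrogram-thresholding routine; the actual
# spectrogram_thresholding in utils may differ (e.g., STFT parameters,
# threshold scaling).
import numpy as np
from scipy.signal import stft, istft

def spectrogram_thresholding_sketch(signal, lam, fun='hard', fs=1.0, nperseg=256):
    _, _, F = stft(signal, fs=fs, nperseg=nperseg)    # STFT of the noisy signal
    mag = np.abs(F)
    if fun == 'hard':
        # Keep only coefficients whose magnitude exceeds the threshold.
        F_thr = F * (mag > lam)
    elif fun == 'soft':
        # Shrink magnitudes by lam while preserving the phase.
        F_thr = F * np.maximum(0.0, 1.0 - lam / (mag + 1e-12))
    else:
        raise ValueError("fun must be 'hard' or 'soft'")
    _, xr = istft(F_thr, fs=fs, nperseg=nperseg)      # back to the time domain
    return xr[:len(signal)]

In practice, the threshold λ is often scaled by an estimate of the noise level; the convention used by utils is not detailed here.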
[5]:
def method_1(noisy_signal, *args, **kwargs):
    # If additional input parameters are needed, they can be passed in a tuple
    # using *args or **kwargs and then parsed.
    xr = spectrogram_thresholding(noisy_signal, 1.0, fun='hard')
    return xr

def method_2(noisy_signal, *args, **kwargs):
    # If additional input parameters are needed, they can be passed in a tuple
    # using *args or **kwargs and then parsed.
    xr = spectrogram_thresholding(noisy_signal, 2.0, fun='soft')
    return xr

# Create a dictionary of the methods to test.
my_methods = {
    'Hard_Thr': method_1,
    'Soft_Thr': method_2,
}
2. Creating a dictionary of user-provided signals
We load two synthesized speech signals and fix their length to N=2**13 samples. With these signals, we create a dictionary whose keys will be used as identifiers of the signals in the final benchmark results.
[6]:
# Loading signals and creating dictionary
N = 2**13
signals_dic = {
    'speech_1': np.loadtxt('6_female.csv')[0:N],
    'speech_2': np.loadtxt('6_male.csv')[0:N],
}
[7]:
# Listen to the signals
fs = 16000
Audio(signals_dic['speech_2'], rate=fs)
[7]:
3. Defining a performance metric
We use the Perceptual Evaluation of Speech Quality (PESQ) as the performance metric.
To do this, we first create a wrapper perf_fun(...) of the function pesq(...). Performance metrics must follow the signature perf_fun(x, xest, **kwargs), where
- x is the original signal (without added noise).
- xest is the output of a denoising approach.
- **kwargs is used to receive any extra parameters passed by the benchmark class when perf_fun(...) is called.
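For instance, the first metric defined in the next cell could equivalently be written as a regular function (assuming the sampling frequency fs defined above):

from pesq import pesq

def perf_fun(x, xest, **kwargs):
    # x: original (clean) signal; xest: output of a denoising method.
    # Extra keyword arguments passed by the Benchmark class are ignored here.
    return pesq(fs, x, xest, 'nb') / pesq(fs, x, x, 'nb')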
[8]:
from pesq import pesq

# Create a wrapper function for PESQ.
# Normalize by the PESQ of the original signal.
perfuns = {
    'pesq1': lambda x, xest, **kwargs: pesq(fs, x, xest, 'nb') / pesq(fs, x, x, 'nb'),
    'pesq2': lambda x, xest, **kwargs: pesq(fs, x, xest, 'nb'),
}

perfuns['pesq1'](signals_dic['speech_1'], signals_dic['speech_1'])
[8]:
1.0
Now we are ready to instantiate a Benchmark object and run a test using the proposed methods and parameters. The benchmark constructor receives the name of a task (which defines the default performance function of the test), a dictionary of the methods to test, the desired length of the signals used in the simulation, a dictionary of extra parameters that should be passed to the methods, an array with the SNR values to test, and the number of repetitions for each test. Here, we additionally pass the dictionary of user-provided signals (signal_ids) and the dictionary of performance metrics (obj_fun) defined above.
Once the object is created, use the class method run() to start the experiments.
Remark 1: You can use the ``verbosity`` parameter to show fewer or more messages during the progress of the experiments. There are 6 levels of verbosity, from ``verbosity=0`` (indicate just the start and the end of the experiments) to ``verbosity=5`` (show the progress of each method and parameter).
Remark 2: Parallelizing the experiments is also possible by passing the parameter ``parallelize=True``.
[9]:
benchmark = Benchmark(task='denoising',
                      N=N,
                      methods=my_methods,
                      SNRin=[0, 10, 20],
                      repetitions=10,
                      signal_ids=signals_dic,  # Input user-defined signals
                      verbosity=0,
                      obj_fun=perfuns,         # Define a performance metric
                      )

results_dic = benchmark.run()  # Run the benchmark
Running benchmark...
100%|██████████| 3/3 [00:03<00:00, 1.21s/it]
100%|██████████| 3/3 [00:03<00:00, 1.07s/it]
[10]:
results_df = benchmark.get_results_as_df()  # This formats the results as a DataFrame
results_df[1]
[10]:
| | Method | Parameter | Signal_id | Repetition | SNR 0 dB | SNR 10 dB | SNR 20 dB |
|---|---|---|---|---|---|---|---|
40 | Hard_Thr | ((), {}) | speech_1 | 0 | 1.088252 | 1.420163 | 2.316756 |
41 | Hard_Thr | ((), {}) | speech_1 | 1 | 1.093497 | 1.440805 | 2.414916 |
42 | Hard_Thr | ((), {}) | speech_1 | 2 | 1.085475 | 1.444266 | 2.383428 |
43 | Hard_Thr | ((), {}) | speech_1 | 3 | 1.084448 | 1.430022 | 2.309501 |
44 | Hard_Thr | ((), {}) | speech_1 | 4 | 1.088424 | 1.467687 | 2.410154 |
45 | Hard_Thr | ((), {}) | speech_1 | 5 | 1.082132 | 1.423051 | 2.281771 |
46 | Hard_Thr | ((), {}) | speech_1 | 6 | 1.116235 | 1.563895 | 2.582777 |
47 | Hard_Thr | ((), {}) | speech_1 | 7 | 1.099539 | 1.500129 | 2.494779 |
48 | Hard_Thr | ((), {}) | speech_1 | 8 | 1.118209 | 1.595721 | 2.728198 |
49 | Hard_Thr | ((), {}) | speech_1 | 9 | 1.093117 | 1.478701 | 2.530983 |
60 | Hard_Thr | ((), {}) | speech_2 | 0 | 1.142343 | 1.564916 | 2.723396 |
61 | Hard_Thr | ((), {}) | speech_2 | 1 | 1.119545 | 1.503485 | 2.566996 |
62 | Hard_Thr | ((), {}) | speech_2 | 2 | 1.127981 | 1.538451 | 2.690908 |
63 | Hard_Thr | ((), {}) | speech_2 | 3 | 1.132775 | 1.528458 | 2.684079 |
64 | Hard_Thr | ((), {}) | speech_2 | 4 | 1.148444 | 1.618737 | 2.785739 |
65 | Hard_Thr | ((), {}) | speech_2 | 5 | 1.128622 | 1.544002 | 2.712857 |
66 | Hard_Thr | ((), {}) | speech_2 | 6 | 1.142200 | 1.637577 | 2.893670 |
67 | Hard_Thr | ((), {}) | speech_2 | 7 | 1.135804 | 1.620108 | 2.923806 |
68 | Hard_Thr | ((), {}) | speech_2 | 8 | 1.162871 | 1.691029 | 3.021406 |
69 | Hard_Thr | ((), {}) | speech_2 | 9 | 1.140893 | 1.602033 | 2.900660 |
50 | Soft_Thr | ((), {}) | speech_1 | 0 | 1.603141 | 2.514249 | 3.585966 |
51 | Soft_Thr | ((), {}) | speech_1 | 1 | 1.684834 | 2.523169 | 3.656339 |
52 | Soft_Thr | ((), {}) | speech_1 | 2 | 1.600709 | 2.417706 | 3.616726 |
53 | Soft_Thr | ((), {}) | speech_1 | 3 | 1.597345 | 2.612795 | 3.518870 |
54 | Soft_Thr | ((), {}) | speech_1 | 4 | 1.616184 | 2.662289 | 3.571961 |
55 | Soft_Thr | ((), {}) | speech_1 | 5 | 1.429834 | 2.302718 | 3.534824 |
56 | Soft_Thr | ((), {}) | speech_1 | 6 | 1.664954 | 2.593343 | 3.627357 |
57 | Soft_Thr | ((), {}) | speech_1 | 7 | 1.540407 | 2.285904 | 3.564100 |
58 | Soft_Thr | ((), {}) | speech_1 | 8 | 1.530832 | 2.413726 | 3.667933 |
59 | Soft_Thr | ((), {}) | speech_1 | 9 | 1.580711 | 2.289388 | 3.050673 |
70 | Soft_Thr | ((), {}) | speech_2 | 0 | 1.736858 | 2.895308 | 3.583558 |
71 | Soft_Thr | ((), {}) | speech_2 | 1 | 1.784057 | 2.815412 | 3.523795 |
72 | Soft_Thr | ((), {}) | speech_2 | 2 | 1.816546 | 3.009974 | 3.558491 |
73 | Soft_Thr | ((), {}) | speech_2 | 3 | 1.811103 | 3.057045 | 3.592642 |
74 | Soft_Thr | ((), {}) | speech_2 | 4 | 1.799981 | 2.716383 | 3.569941 |
75 | Soft_Thr | ((), {}) | speech_2 | 5 | 1.624691 | 3.013990 | 3.637866 |
76 | Soft_Thr | ((), {}) | speech_2 | 6 | 1.897490 | 2.854654 | 3.427008 |
77 | Soft_Thr | ((), {}) | speech_2 | 7 | 1.510544 | 2.504363 | 3.578046 |
78 | Soft_Thr | ((), {}) | speech_2 | 8 | 2.000593 | 3.136826 | 3.549813 |
79 | Soft_Thr | ((), {}) | speech_2 | 9 | 1.992071 | 2.981548 | 3.455972 |
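Since get_results_as_df() returns ordinary pandas DataFrames, the usual pandas tooling can be used for quick summaries. For instance, the per-method mean over signals and repetitions for each SNR column could be obtained as follows (a small sketch that relies only on the column names printed above):

# Average score per method for each SNR, over all signals and repetitions.
df1 = results_df[1]
snr_cols = [c for c in df1.columns
            if c not in ('Method', 'Parameter', 'Signal_id', 'Repetition')]
df1.groupby('Method')[snr_cols].mean()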
Generating plots with the ResultsInterpreter.
[11]:
# Summary interactive plots with Plotly
from plotly.offline import iplot

interpreter = ResultsInterpreter(benchmark)
figs = interpreter.get_summary_plotlys(bars=True)

for fig in figs:
    fig.update_layout(yaxis_title="PESQ(x_est)/PESQ(x)")
    iplot(fig)
(Interactive Plotly summary figures are displayed here in the original notebook.)
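If a static copy of the summary figures is needed outside the notebook, Plotly figures can also be written to standalone HTML files (the file names below are just an example):

# Optional: save each summary figure as a standalone HTML file.
for k, fig in enumerate(figs):
    fig.write_html(f"summary_fig_{k}.html")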
Checking elapsed time for each method
[12]:
df = interpreter.elapsed_time_summary()
df
[12]:
| | Average time (s) | Std (s) |
|---|---|---|
speech_1-Hard_Thr-((), {}) | 0.000734 | 0.000064 |
speech_1-Soft_Thr-((), {}) | 0.000864 | 0.000090 |
speech_2-Hard_Thr-((), {}) | 0.000747 | 0.000075 |
speech_2-Soft_Thr-((), {}) | 0.000910 | 0.000103 |
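As an informal complement to the PESQ scores above, one can also listen to the output of a method on an artificially contaminated signal. The snippet below is only a quick sketch (the noise realization and the 10 dB SNR are arbitrary choices; it reuses objects already defined in this notebook):

# Quick listening test: contaminate one signal at roughly 10 dB SNR and
# listen to the soft-thresholded output.
x = signals_dic['speech_1']
noise = np.random.randn(N)
noise *= np.std(x) / np.std(noise) * 10 ** (-10 / 20)   # scale noise to ~10 dB SNR
xr = method_2(x + noise)                                 # soft thresholding
Audio(xr, rate=fs)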