
Use case: Machine Learning Validation of models using the Treble SDK


The Treble SDK is a cutting-edge solution for efficiently evaluating and optimizing audio machine learning (ML) algorithms. It leverages high-fidelity simulations to assess algorithm performance and provides near-real-time, device-specific output rendering through post-processing. With a seamless and automated evaluation process enabled by its Python-based programmatic interface, the SDK can handle a vast array of acoustic scenarios. By replacing expensive and time-consuming physical measurements with virtual prototyping, Treble SDK significantly accelerates development cycles and enhances the accuracy of audio ML solutions.


In this tutorial we show how to set up a high-quality, physically accurate dataset, ideal for validating audio algorithms.

NOTE This notebook does not contain the data needed to run an actual validation. It serves as an example of how to use the Treble SDK for this use case.


The following documentation is presented as Python code running inside a Jupyter Notebook. To run it yourself you can copy/type each individual cell or directly download the full notebook, including all required files.

Quick guide: Calculate PESQ on a dataset

This is an example demo of how you can use Treble SDK data to evaluate and validate your audio algorithm and ML processing.

# Include some external dependencies
from pesq import pesq
from glob import glob
from os.path import join
import scipy.io.wavfile as wav
from scipy import signal
import numpy as np
from pathlib import Path
from treble_tsdk.tsdk import TSDK
from treble_tsdk import tsdk_namespace as treble
from treble_tsdk import display_data as dd
import random

# Our working directory for tsdk files.
# Defaults to tmp/tsdk inside your home directory, feel free to change it to what you like.
base_dir = Path.home() / "tmp" / "tsdk"
base_dir.mkdir(parents=True, exist_ok=True)

tsdk = TSDK()

project_name = "Machine Learning Validation"
project = tsdk.get_or_create_project(
    name=project_name,
    description="Tutorial demonstrating machine learning validation with the Treble SDK",
)
%env TSDK_DASH_PLOTTING=1

# Specify 3 directories with equally many files: the anechoic directory should contain the anechoic files,
# the reverberant directory the anechoic audio convolved with the RIRs, and
# the dereverberated directory the files that were dereverberated with the algorithm you developed.
anechoic_dir = base_dir / "anechoic"
reverberant_dir = base_dir / "reverberant"
dereverberated_dir = base_dir / "dereverberated"
Note: PESQ scoring relies on the external pesq package; the import above will raise ModuleNotFoundError if it is not available in your environment.
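
If the pesq package is missing, it can be installed from PyPI, for example directly from the notebook:

%pip install pesq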

Helper functions

Here we set up helper functions. Note that you have to plug in your own dataset and audio processing; these stubs are intended to show how the SDK can be used for ML validation.

def get_anechoic_data(dataset, outdir):
    # This function should load audio, write it to outdir and return the audio as a numpy array.
    # Implement it for whichever anechoic dataset you own.
    raise NotImplementedError


def dereverberation_processing(audio, outdir):
    # Process the audio with your own dereverberation algorithm and write the result to outdir.
    raise NotImplementedError


def read_wavs_from_directory(directory):
    # Read all .wav files from the specified directory
    wav_files = glob(join(directory, "*.wav"))
    all_audio = []
    for wav_file in wav_files:
        fs, audio = wav.read(wav_file)
        all_audio.append(audio)

    return all_audio, fs
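
As a rough sketch of how these stubs might be filled in, the snippet below reads a single file from a hypothetical local corpus and uses a trivial pass-through in place of a real dereverberation model; the corpus location, file naming and 16 kHz sample rate are placeholder assumptions, not part of the SDK.

# Hypothetical example implementations of the stubs above.
# The corpus path, output file names and 16 kHz sample rate are assumptions;
# replace them with your own dataset and dereverberation algorithm.
def get_anechoic_data(dataset, outdir):
    # Assumption: the anechoic corpus lives in ~/datasets/<dataset> as .wav files.
    corpus_dir = Path.home() / "datasets" / dataset
    Path(outdir).mkdir(parents=True, exist_ok=True)
    wav_file = sorted(corpus_dir.glob("*.wav"))[0]
    fs, audio = wav.read(wav_file)
    # Keep a copy in the anechoic directory so the PESQ step can pick it up later.
    wav.write(str(Path(outdir) / wav_file.name), fs, audio)
    return audio


def dereverberation_processing(audio, outdir, fs=16000):
    # Placeholder "algorithm": pass the audio through unchanged.
    # A real implementation would run your dereverberation model here.
    Path(outdir).mkdir(parents=True, exist_ok=True)
    processed = np.asarray(audio, dtype=np.float32)
    wav.write(str(Path(outdir) / "dereverberated.wav"), fs, processed)
    return processed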


Simulation

Let's select a meeting room from the geometry library. Next we insert a speech source and receivers at the suggested positions and assign materials to the model's layers.

rooms = tsdk.geometry_library.get_dataset(dataset="MeetingRoom")
room = rooms[15]

source = room.position_suggestions[0]
receivers = room.position_suggestions[1:]

speech_directivity = tsdk.source_directivity_library.query(name="Speech")[0]
source_properties = treble.SourceProperties(
    azimuth_angle=0.0,
    elevation_angle=0.0,
    source_directivity=speech_directivity,
)
source = treble.Source(
    x=source.position[0],
    y=source.position[1],
    z=source.position[2],
    source_type=treble.SourceType.directive,
    label="Speech_source_1",
    source_properties=source_properties,
)

source_list = [source]

receiver_list = [
    treble.Receiver(
        x=receiver.position[0],
        y=receiver.position[1],
        z=receiver.position[2],
        receiver_type=treble.ReceiverType.mono,
        label=receiver.name,
    )
    for receiver in receivers
]



material_assignment = []
mat_dict = {}
for layer in room.layer_names:
    if layer == "Monitor":
        mat_list = tsdk.material_library.search("window")
    else:
        mat_list = tsdk.material_library.search(layer.split()[0])
    material_assignment.append(treble.MaterialAssignment(layer, random.choice(mat_list)))

dd.as_table(material_assignment)

Now we set up a high-fidelity, wave-based-only simulation running up to 8 kHz.

sim_def = treble.SimulationDefinition(
    name="Validation_room",
    model=room,
    material_assignment=material_assignment,
    source_list=source_list,
    receiver_list=receiver_list,
    energy_decay_threshold=35,
    crossover_frequency=8000,
    simulation_type=treble.SimulationType.dg,
)

Add the simulation to the project and wait for an estimate of the computational resources required


simulation = project.add_simulation(definition=sim_def)
simulation.wait_for_estimate()

# Display the cost estimate for the simulation
dd.as_tree(simulation.estimate())

Start the simulation

project.start_simulations()

View the progress

project.as_live_progress()

When the simulation has finished, we can download the results

results_dir = base_dir / "irs"
results = simulation.download_results(destination_directory=results_dir)

Create the validation dataset

for receiver in simulation.receivers:
    anechoic_audio = get_anechoic_data(dataset="wsj", outdir=anechoic_dir)
    ir = results.get_mono_ir(source=simulation.sources[0], receiver=receiver)
    reverberant_audio = signal.convolve(ir, anechoic_audio)
    # Write the dereverberated output to the dereverberated directory defined earlier.
    dereverberated_audio = dereverberation_processing(audio=reverberant_audio, outdir=dereverberated_dir)
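
If you also want the convolved audio on disk (for example as a baseline for the PESQ comparison below), a variant of the loop could write it to the reverberant directory; the file naming, peak normalization and 16 kHz sample rate are illustrative assumptions and should match your anechoic corpus.

# Variant of the loop above that also writes the convolved (reverberant) audio to disk.
# File names, peak normalization and the 16 kHz sample rate are assumptions, not SDK requirements.
reverberant_dir.mkdir(parents=True, exist_ok=True)
for idx, receiver in enumerate(simulation.receivers):
    anechoic_audio = get_anechoic_data(dataset="wsj", outdir=anechoic_dir)
    ir = results.get_mono_ir(source=simulation.sources[0], receiver=receiver)
    reverberant_audio = signal.convolve(ir, anechoic_audio)
    # Peak-normalize so the result can be stored as a float32 wav file.
    normalized = (reverberant_audio / np.max(np.abs(reverberant_audio))).astype(np.float32)
    wav.write(str(reverberant_dir / f"reverberant_{idx:04d}.wav"), 16000, normalized)
    dereverberation_processing(audio=reverberant_audio, outdir=dereverberated_dir)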

Run the validation, using PESQ as the metric.


# Read .wav files from both directories
anechoic, fs = read_wavs_from_directory(anechoic_dir)
dereverberated, fs = read_wavs_from_directory(dereverberated_dir)

# For each anechoic file, compare it with the corresponding dereverberated file and calculate a PESQ score
all_pesq = []
for file_idx, __ in enumerate(anechoic):
    this_pesq = pesq(fs, anechoic[file_idx], dereverberated[file_idx], "wb")
    all_pesq.append(this_pesq)

# Calculate the average PESQ score over the entire dataset
avg_pesq = np.mean(all_pesq)
print(f"Average PESQ score over the entire dataset: {avg_pesq:.02f}.")
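
As a sanity check, you can also score the unprocessed reverberant files against the anechoic references; the gap between the two averages indicates how much PESQ improvement the dereverberation algorithm delivers. This sketch reuses the helpers defined above and assumes the reverberant directory was populated as described earlier.

# Optional baseline: PESQ of the unprocessed reverberant audio.
reverberant, fs = read_wavs_from_directory(reverberant_dir)

baseline_pesq = []
for file_idx, __ in enumerate(anechoic):
    baseline_pesq.append(pesq(fs, anechoic[file_idx], reverberant[file_idx], "wb"))

avg_baseline = np.mean(baseline_pesq)
print(f"Average PESQ before dereverberation: {avg_baseline:.02f}.")
print(f"Average PESQ improvement: {avg_pesq - avg_baseline:.02f}.")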