.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here ` to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_plot_train_denoiser.py:

=============================
Train a Denoising Autoencoder
=============================

Train a denoising autoencoder with clean and noisy acoustic features.

To see how soundpy implements this, see `soundpy.models.builtin.denoiser_train`,
`soundpy.builtin.denoiser_feats`, and `soundpy.builtin.create_denoise_data`.
.. code-block:: default

    import os, sys
    import inspect
    currentdir = os.path.dirname(os.path.abspath(
        inspect.getfile(inspect.currentframe())))
    parentdir = os.path.dirname(currentdir)
    parparentdir = os.path.dirname(parentdir)
    packagedir = os.path.dirname(parparentdir)
    sys.path.insert(0, packagedir)

    import matplotlib.pyplot as plt
    import IPython.display as ipd

    package_dir = '../../../'
    os.chdir(package_dir)
    sp_dir = package_dir
Let's import soundpy for handling sound.

.. code-block:: default

    import soundpy as sp
As well as the deep learning component of soundpy.

.. code-block:: default

    from soundpy import models as spdl
Prepare for Training: Data Organization
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Designate the path for accessing the audio data.

We'll load previously extracted features (sample data); see
`soundpy.feats.save_features_datasets` or `soundpy.builtin.denoiser_feats`.

.. code-block:: default

    feature_extraction_dir = '{}audiodata2/example_feats_models/'.format(sp_dir)+\
        'denoiser/example_feats_fbank/'
What is in this folder?

.. code-block:: default

    feature_extraction_dir = sp.utils.check_dir(feature_extraction_dir)
    files = list(feature_extraction_dir.glob('*.*'))
    for f in files:
        print(f.name)
.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    test_data_clean_fbank.npy
    dataset_audio_assignments.csv
    train_data_noisy_fbank.npy
    audiofiles_datasets_clean.csv
    log_extraction_settings.csv
    clean_audio.csv
    test_data_noisy_fbank.npy
    val_data_noisy_fbank.npy
    noisy_audio.csv
    train_data_clean_fbank.npy
    audiofiles_datasets_noisy.csv
    val_data_clean_fbank.npy
The .npy files contain the features themselves, in train, validation, and
test datasets:

.. code-block:: default

    files = list(feature_extraction_dir.glob('*.npy'))
    for f in files:
        print(f.name)
.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    test_data_clean_fbank.npy
    train_data_noisy_fbank.npy
    test_data_noisy_fbank.npy
    val_data_noisy_fbank.npy
    train_data_clean_fbank.npy
    val_data_clean_fbank.npy
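As a side note, these ``.npy`` files hold ordinary NumPy arrays, so they can be inspected directly with ``numpy.load``. A minimal sketch of the save/load round-trip, using a dummy array in place of real fbank features (the shape and filename here are illustrative only):

```python
import os
import tempfile

import numpy as np

# Dummy stand-in for extracted fbank features:
# 308 frames x 40 filters, matching the desired_shape
# recorded in the extraction settings.
feats = np.random.rand(308, 40).astype(np.float32)

tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, 'train_data_noisy_fbank.npy')
np.save(path, feats)

# Loading recovers the identical array, shape and dtype included.
loaded = np.load(path)
print(loaded.shape)  # (308, 40)
print(loaded.dtype)  # float32
```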
The .csv files contain information about how the features were extracted.

.. code-block:: default

    files = list(feature_extraction_dir.glob('*.csv'))
    for f in files:
        print(f.name)
.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    dataset_audio_assignments.csv
    audiofiles_datasets_clean.csv
    log_extraction_settings.csv
    clean_audio.csv
    noisy_audio.csv
    audiofiles_datasets_noisy.csv
We'll have a look at which features were extracted, along with other settings:

.. code-block:: default

    feat_settings = sp.utils.load_dict(
        feature_extraction_dir.joinpath('log_extraction_settings.csv'))
    for key, value in feat_settings.items():
        print(key, ' --> ', value)
.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    dur_sec  -->  3
    feature_type  -->  fbank noisy
    feat_type  -->  fbank
    complex_vals  -->  False
    sr  -->  22050
    num_feats  -->  40
    n_fft  -->  441
    win_size_ms  -->  20
    frame_length  -->  441
    percent_overlap  -->  0.5
    frames_per_sample  -->  11
    labeled_data  -->  False
    visualize  -->  True
    input_shape  -->  (28, 11, 40)
    desired_shape  -->  (308, 40)
    use_librosa  -->  True
    center  -->  True
    mode  -->  reflect
    subsection_data  -->  False
    divide_factor  -->  5
    kwargs  -->  {}
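Several of these settings follow directly from one another: at a 22050 Hz sample rate, a 20 ms window spans 22050 * 0.02 = 441 samples, which matches both ``n_fft`` and ``frame_length`` above. A small sketch of that arithmetic (the variable names mirror the setting names; the hop-length calculation is a standard convention, not a soundpy API call):

```python
# Reproduce the derived window settings from the logged values.
sr = 22050            # sample rate (Hz)
win_size_ms = 20      # window size in milliseconds
percent_overlap = 0.5

# Samples per analysis window: sr * window duration in seconds.
frame_length = int(sr * win_size_ms / 1000)

# Samples between consecutive windows, given 50% overlap.
hop_length = int(frame_length * (1 - percent_overlap))

print(frame_length)  # 441 -- matches n_fft and frame_length above
print(hop_length)    # 220
```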
For more about these settings, see `soundpy.feats.save_features_datasets`.

We'll have a look at the audio files that were assigned
to the train, val, and test datasets.

.. code-block:: default

    audio_datasets = sp.utils.load_dict(
        feature_extraction_dir.joinpath('audiofiles_datasets_clean.csv'))
    count = 0
    for key, value in audio_datasets.items():
        print(key, ' --> ', value)
        count += 1
        if count > 5:
            break
.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    train  -->  ['../../../../mini-audio-datasets/denoise/clean/S_20_03.wav', '../../../../mini-audio-datasets/denoise/clean/S_07_09.wav', '../../../../mini-audio-datasets/denoise/clean/S_29_06.wav', '../../../../mini-audio-datasets/denoise/clean/S_16_07.wav', '../../../../mini-audio-datasets/denoise/clean/S_04_10.wav', '../../../../mini-audio-datasets/denoise/clean/S_01_03.wav']
    val  -->  ['../../../../mini-audio-datasets/denoise/clean/S_01_02.wav', '../../../../mini-audio-datasets/denoise/clean/S_17_06.wav']
    test  -->  ['../../../../mini-audio-datasets/denoise/clean/S_18_08.wav', '../../../../mini-audio-datasets/denoise/clean/S_18_01.wav']
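soundpy produced this assignment during feature extraction. For intuition, a generic shuffled partition into train/val/test lists could be sketched as follows (a simple 60/20/20 split with a fixed seed; this is illustrative, not soundpy's actual implementation):

```python
import random

def assign_datasets(files, train=0.6, val=0.2, seed=40):
    """Shuffle file paths deterministically and split into train/val/test."""
    files = sorted(files)        # deterministic starting order
    rng = random.Random(seed)    # fixed seed -> reproducible split
    rng.shuffle(files)
    n_train = int(len(files) * train)
    n_val = int(len(files) * val)
    return {
        'train': files[:n_train],
        'val': files[n_train:n_train + n_val],
        'test': files[n_train + n_val:],
    }

wavfiles = ['S_{:02d}.wav'.format(i) for i in range(10)]
split = assign_datasets(wavfiles)
print(len(split['train']), len(split['val']), len(split['test']))  # 6 2 2
```

Keeping the split deterministic (seeded shuffle over a sorted list) matters here: the clean and noisy versions of each file must land in the same dataset, or the denoiser would train on mismatched pairs.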
Built-In Functionality: soundpy does everything for you
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For more about this, see `soundpy.builtin.denoiser_train`.

.. code-block:: default

    model_dir, history = spdl.denoiser_train(
        feature_extraction_dir = feature_extraction_dir,
        epochs = 10)
.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    The model will be trained 10 epochs per training session.
    Total possible epochs: 10
    TRAINING SESSION 1
    Training on:
    ../../../audiodata2/example_feats_models/denoiser/example_feats_fbank/train_data_noisy_fbank.npy
    ../../../audiodata2/example_feats_models/denoiser/example_feats_fbank/train_data_clean_fbank.npy
    Epoch 1/10
    Epoch 00001: val_loss improved from inf to 0.68425, saving model to ../../../audiodata2/example_feats_models/denoiser/example_feats_fbank/model_autoencoder_denoise_9m3d13h26m6s116ms/model_autoencoder_denoise_9m3d13h26m6s116ms.h5
    6/6 [==============================] - 1s 156ms/step - loss: 0.6882 - val_loss: 0.6843
    Epoch 2/10
    Epoch 00002: val_loss improved from 0.68425 to 0.67458, saving model to ../../../audiodata2/example_feats_models/denoiser/example_feats_fbank/model_autoencoder_denoise_9m3d13h26m6s116ms/model_autoencoder_denoise_9m3d13h26m6s116ms.h5
    6/6 [==============================] - 1s 137ms/step - loss: 0.6800 - val_loss: 0.6746
    Epoch 3/10
    Epoch 00003: val_loss improved from 0.67458 to 0.66083, saving model to ../../../audiodata2/example_feats_models/denoiser/example_feats_fbank/model_autoencoder_denoise_9m3d13h26m6s116ms/model_autoencoder_denoise_9m3d13h26m6s116ms.h5
    6/6 [==============================] - 1s 142ms/step - loss: 0.6681 - val_loss: 0.6608
    Epoch 4/10
    Epoch 00004: val_loss improved from 0.66083 to 0.64138, saving model to ../../../audiodata2/example_feats_models/denoiser/example_feats_fbank/model_autoencoder_denoise_9m3d13h26m6s116ms/model_autoencoder_denoise_9m3d13h26m6s116ms.h5
    6/6 [==============================] - 1s 141ms/step - loss: 0.6509 - val_loss: 0.6414
    Epoch 5/10
    Epoch 00005: val_loss improved from 0.64138 to 0.61368, saving model to ../../../audiodata2/example_feats_models/denoiser/example_feats_fbank/model_autoencoder_denoise_9m3d13h26m6s116ms/model_autoencoder_denoise_9m3d13h26m6s116ms.h5
    6/6 [==============================] - 1s 137ms/step - loss: 0.6262 - val_loss: 0.6137
    Epoch 6/10
    Epoch 00006: val_loss improved from 0.61368 to 0.57419, saving model to ../../../audiodata2/example_feats_models/denoiser/example_feats_fbank/model_autoencoder_denoise_9m3d13h26m6s116ms/model_autoencoder_denoise_9m3d13h26m6s116ms.h5
    6/6 [==============================] - 1s 139ms/step - loss: 0.5906 - val_loss: 0.5742
    Epoch 7/10
    Epoch 00007: val_loss improved from 0.57419 to 0.51923, saving model to ../../../audiodata2/example_feats_models/denoiser/example_feats_fbank/model_autoencoder_denoise_9m3d13h26m6s116ms/model_autoencoder_denoise_9m3d13h26m6s116ms.h5
    6/6 [==============================] - 1s 134ms/step - loss: 0.5408 - val_loss: 0.5192
    Epoch 8/10
    Epoch 00008: val_loss improved from 0.51923 to 0.44650, saving model to ../../../audiodata2/example_feats_models/denoiser/example_feats_fbank/model_autoencoder_denoise_9m3d13h26m6s116ms/model_autoencoder_denoise_9m3d13h26m6s116ms.h5
    6/6 [==============================] - 1s 134ms/step - loss: 0.4732 - val_loss: 0.4465
    Epoch 9/10
    Epoch 00009: val_loss improved from 0.44650 to 0.36376, saving model to ../../../audiodata2/example_feats_models/denoiser/example_feats_fbank/model_autoencoder_denoise_9m3d13h26m6s116ms/model_autoencoder_denoise_9m3d13h26m6s116ms.h5
    6/6 [==============================] - 1s 135ms/step - loss: 0.3891 - val_loss: 0.3638
    Epoch 10/10
    Epoch 00010: val_loss improved from 0.36376 to 0.29168, saving model to ../../../audiodata2/example_feats_models/denoiser/example_feats_fbank/model_autoencoder_denoise_9m3d13h26m6s116ms/model_autoencoder_denoise_9m3d13h26m6s116ms.h5
    6/6 [==============================] - 1s 138ms/step - loss: 0.3017 - val_loss: 0.2917
    Finished training the model. The model and associated files can be found here:
    ../../../audiodata2/example_feats_models/denoiser/example_feats_fbank/model_autoencoder_denoise_9m3d13h26m6s116ms
Where the model and logs are located:

.. code-block:: default

    model_dir

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    PosixPath('../../../audiodata2/example_feats_models/denoiser/example_feats_fbank/model_autoencoder_denoise_9m3d13h26m6s116ms')
Let's plot how the model performed (on this mini dataset).

.. code-block:: default

    import matplotlib.pyplot as plt
    plt.plot(history.history['loss'])
    plt.plot(history.history['val_loss'])
    plt.title('model loss')
    plt.ylabel('loss')
    plt.xlabel('epoch')
    plt.legend(['train', 'val'], loc='upper right')
    plt.savefig('loss.png')
.. image:: /auto_examples/images/sphx_glr_plot_train_denoiser_001.png
    :alt: model loss
    :class: sphx-glr-single-img

.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes 12.956 seconds)
.. _sphx_glr_download_auto_examples_plot_train_denoiser.py:

.. only:: html

    .. container:: sphx-glr-footer
        :class: sphx-glr-footer-example

        .. container:: sphx-glr-download sphx-glr-download-python

            :download:`Download Python source code: plot_train_denoiser.py `

        .. container:: sphx-glr-download sphx-glr-download-jupyter

            :download:`Download Jupyter notebook: plot_train_denoiser.ipynb `

.. only:: html

    .. rst-class:: sphx-glr-signature

        `Gallery generated by Sphinx-Gallery `_