.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here <sphx_glr_download_auto_examples_plot_implement_denoiser.py>`
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_plot_implement_denoiser.py:


=================================
Implement a Denoising Autoencoder
=================================

Implement a denoising autoencoder to denoise a noisy speech signal.

To see how soundpy implements this, see `soundpy.models.builtin.denoiser_run`.

Let's import soundpy and other packages:

.. code-block:: default

    import soundpy as sp
    import numpy as np
    # for playing audio in this notebook:
    import IPython.display as ipd

As well as the deep learning component of soundpy:

.. code-block:: default

    from soundpy import models as spdl

Prepare for Implementation: Data Organization
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Set the path to the audio data for this example:

.. code-block:: default

    sp_dir = '../../../'

Set model pathway
~~~~~~~~~~~~~~~~~

Currently, this expects a model saved with weights, with a .h5 extension.
(See `model` below.)

The soundpy repo offers a pre-trained denoiser, which we'll use.

.. code-block:: default

    model = '{}audiodata/models/'.format(sp_dir)+\
        'denoiser/example_denoiser_stft.h5'
    print(model)
    # ensure `model` is a pathlib.PosixPath object
    model = sp.utils.string2pathlib(model)
    model_dir = model.parent

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    ../../../audiodata/models/denoiser/example_denoiser_stft.h5

What is in this folder?

.. code-block:: default

    files = list(model_dir.glob('*.*'))
    for f in files:
        print(f.name)

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    example_denoiser_stft.h5
    log_extraction_settings.csv
    log.csv

Provide dictionary with feature extraction settings
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If soundpy extracts features for you, a 'log_extraction_settings.csv'
file will be saved, which includes relevant feature settings for implementing
the model; see `soundpy.feats.save_features_datasets`.

.. code-block:: default

    import ast

    feat_settings = sp.utils.load_dict(
        model_dir.joinpath('log_extraction_settings.csv'))
    for key, value in feat_settings.items():
        print(key, ' --> ', value)
        # change objects that were string to original format
        try:
            feat_settings[key] = ast.literal_eval(value)
        except (ValueError, SyntaxError):
            # plain strings (e.g. 'stft', 'hann') are left unchanged
            pass

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    dur_sec --> 3
    feature_type --> stft noisy
    feat_type --> stft
    complex_vals --> True
    sr --> 22050
    num_feats --> 177
    n_fft --> 352
    win_size_ms --> 16
    frame_length --> 352
    percent_overlap --> 0.5
    window --> hann
    frames_per_sample --> 11
    labeled_data --> False
    visualize --> True
    input_shape --> (35, 11, 177)
    desired_shape --> (385, 177)
    use_librosa --> True
    center --> True
    mode --> reflect
    subsection_data --> True
    divide_factor --> 10

For the purposes of plotting, let's use some of the settings defined:

.. code-block:: default

    feature_type = feat_settings['feature_type']
    sr = feat_settings['sr']

Provide new audio for the denoiser to denoise!
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

We'll use sample speech from the soundpy repo:

.. code-block:: default

    speech = sp.string2pathlib('{}audiodata/python.wav'.format(sp_dir))
    s, sr = sp.loadsound(speech, sr=sr)

Let's add some white noise (SNR of 10):

.. code-block:: default

    s_n = sp.augment.add_white_noise(s, sr=sr, snr=10)

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    /home/airos/Projects/github/a-n-rose/Python-Sound-Tool/soundpy/feats.py:1027: UserWarning: Warning: voice-activity-detection works best with sample rates above 44100 Hz. Current `sr` set at 22050.
      warnings.warn(msg)
    /home/airos/Projects/github/a-n-rose/Python-Sound-Tool/soundpy/dsp.py:2782: UserWarning: Warning: VAD works best with sample rates above 44100 Hz.
      warnings.warn(msg)
    /home/airos/Projects/github/a-n-rose/Python-Sound-Tool/soundpy/dsp.py:769: UserWarning: Warning: `soundpy.dsp.clip_at_zero` found no samples close to zero. Clipping was not applied.
      warnings.warn(msg)

What does the noisy audio sound like?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: default

    ipd.Audio(s_n, rate=sr)

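The noisy signal above was created by adding white noise at a chosen
signal-to-noise ratio (SNR). As a rough illustration of the idea, the sketch
below scales Gaussian white noise so that the mixture reaches a target SNR,
assuming the SNR is expressed in decibels. The helper ``add_noise_at_snr`` and
the toy sine-wave "speech" are hypothetical and only for illustration; this is
not how `soundpy.augment.add_white_noise` is implemented.

.. code-block:: python

    import numpy as np

    def add_noise_at_snr(signal, snr_db, seed=0):
        """Add Gaussian white noise at roughly `snr_db` dB (hypothetical helper)."""
        rng = np.random.default_rng(seed)
        noise = rng.standard_normal(signal.shape)
        signal_power = np.mean(signal ** 2)
        noise_power = np.mean(noise ** 2)
        # power the noise needs for the requested SNR: P_signal / 10^(SNR / 10)
        target_noise_power = signal_power / (10 ** (snr_db / 10))
        noise = noise * np.sqrt(target_noise_power / noise_power)
        return signal + noise

    # toy stand-in for speech: one second of a 220 Hz tone at 22050 Hz
    sr_demo = 22050
    t = np.arange(sr_demo) / sr_demo
    clean = 0.5 * np.sin(2 * np.pi * 220 * t)
    noisy = add_noise_at_snr(clean, snr_db=10)

    # measure the SNR that was actually achieved
    measured_snr_db = 10 * np.log10(
        np.mean(clean ** 2) / np.mean((noisy - clean) ** 2))
    print(round(measured_snr_db, 2))

Because the noise is rescaled from its measured power, the achieved SNR should
match the requested value up to floating-point error.
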

What does the noisy audio look like?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: default

    sp.plotsound(s_n, sr=sr, feature_type='signal', subprocess=True)

.. image:: /auto_examples/images/sphx_glr_plot_implement_denoiser_001.png
    :alt: SIGNAL Features
    :class: sphx-glr-single-img

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    /home/airos/Projects/github/a-n-rose/Python-Sound-Tool/soundpy/feats.py:117: UserWarning: Due to matplotlib using AGG backend, cannot display plot. Therefore, the plot will be saved here: current working directory
      warnings.warn(msg)

What does the clean audio sound like?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: default

    ipd.Audio(s, rate=sr)


What does the clean audio look like?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: default

    sp.plotsound(s, sr=sr, feature_type='signal', subprocess=True)

.. image:: /auto_examples/images/sphx_glr_plot_implement_denoiser_002.png
    :alt: SIGNAL Features
    :class: sphx-glr-single-img

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    /home/airos/Projects/github/a-n-rose/Python-Sound-Tool/soundpy/feats.py:117: UserWarning: Due to matplotlib using AGG backend, cannot display plot. Therefore, the plot will be saved here: current working directory
      warnings.warn(msg)

Built-In Denoiser Functionality
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

We just need to provide the model path, the noisy audio (here the samples in
`s_n`), and the feature settings dictionary we looked at above.

.. code-block:: default

    y, sr = spdl.denoiser_run(model, s_n, feat_settings)

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    WARNING:tensorflow:Model was constructed with shape (None, 11, 177, 1) for input Tensor("conv2d_1_input:0", shape=(None, 11, 177, 1), dtype=float32), but it was called on an input with incompatible shape (None, 35, 11, 177).
    /home/airos/Projects/github/a-n-rose/Python-Sound-Tool/soundpy/models/builtin.py:758: UserWarning: WARNING: adjustments to feature extraction in a more recent SoundPy version may result in imperfect feature alignmnet with a model trained with features generated with a previous SoundPy version. Sincerest apologies!
      warnings.warn(msg)

How does the output sound?
~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: default

    ipd.Audio(y, rate=sr)

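The feature settings list a `desired_shape` of (385, 177), an `input_shape` of
(35, 11, 177), and `frames_per_sample` of 11, while the TensorFlow warning
shows the model was constructed for inputs of shape (None, 11, 177, 1). These
numbers are consistent with 385 frames being split into 35 sections of 11
frames each. The NumPy sketch below walks through that shape arithmetic:
zero-pad or trim a feature matrix to `desired_shape`, split it into sections,
and add a channel axis. It is an assumption about the general approach, not
soundpy's actual code, and ``to_model_input`` is a hypothetical helper.

.. code-block:: python

    import numpy as np

    def to_model_input(feats, desired_shape=(385, 177), frames_per_sample=11):
        """Pad/trim a (frames, num_feats) matrix and split it into sections.

        Hypothetical helper, for illustrating the shapes only.
        """
        n_frames, n_feats = desired_shape
        fixed = np.zeros(desired_shape, dtype=feats.dtype)
        rows = min(n_frames, feats.shape[0])
        cols = min(n_feats, feats.shape[1])
        # copy what fits; anything missing stays zero-padded
        fixed[:rows, :cols] = feats[:rows, :cols]
        # 385 frames / 11 frames per section = 35 sections
        sections = fixed.reshape(
            n_frames // frames_per_sample, frames_per_sample, n_feats)
        # trailing channel axis, as expected by a Conv2D-style input layer
        return sections[..., np.newaxis]

    # random stand-in for extracted STFT features (400 frames, 177 features)
    feats = np.random.default_rng(0).random((400, 177))
    batch = to_model_input(feats)
    print(batch.shape)  # (35, 11, 177, 1)

Each of the 35 sections then has the (11, 177, 1) shape that the warning
reports for the model's input layer.
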

How does the output look?
~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: default

    sp.plotsound(y, sr=sr, feature_type=feature_type, subprocess=True)

.. image:: /auto_examples/images/sphx_glr_plot_implement_denoiser_003.png
    :alt: STFT NOISY Features
    :class: sphx-glr-single-img

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    /home/airos/Projects/github/a-n-rose/Python-Sound-Tool/soundpy/feats.py:117: UserWarning: Due to matplotlib using AGG backend, cannot display plot. Therefore, the plot will be saved here: current working directory
      warnings.warn(msg)

How do the features compare?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

STFT features of the noisy input speech:

.. code-block:: default

    sp.plotsound(s_n, sr=sr, feature_type='stft', energy_scale='power_to_db',
                 title='Noisy input: STFT features', subprocess=True)

.. image:: /auto_examples/images/sphx_glr_plot_implement_denoiser_004.png
    :alt: Noisy input: STFT features
    :class: sphx-glr-single-img

STFT features of the output:

.. code-block:: default

    sp.plotsound(y, sr=sr, feature_type='stft', energy_scale='power_to_db',
                 title='Denoiser Output: STFT features', subprocess=True)

.. image:: /auto_examples/images/sphx_glr_plot_implement_denoiser_005.png
    :alt: Denoiser Output: STFT features
    :class: sphx-glr-single-img

STFT features of the clean version of the audio:

.. code-block:: default

    sp.plotsound(s, sr=sr, feature_type='stft', energy_scale='power_to_db',
                 title='Clean "target" audio: STFT features', subprocess=True)

.. image:: /auto_examples/images/sphx_glr_plot_implement_denoiser_006.png
    :alt: Clean "target" audio: STFT features
    :class: sphx-glr-single-img

It's not perfect, but for a pretty simple implementation, the noise is gone
and you can hear the person speaking. Pretty cool!

.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes 5.950 seconds)

.. _sphx_glr_download_auto_examples_plot_implement_denoiser.py:

.. only :: html

 .. container:: sphx-glr-footer
    :class: sphx-glr-footer-example

  .. container:: sphx-glr-download sphx-glr-download-python

     :download:`Download Python source code: plot_implement_denoiser.py `

  .. container:: sphx-glr-download sphx-glr-download-jupyter

     :download:`Download Jupyter notebook: plot_implement_denoiser.ipynb `

.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery `_