.. rst-class:: sphx-glr-example-title
.. _sphx_glr_auto_examples_plot_featureprep_denoiser.py:
=======================================================
Feature Extraction for Denoising: Clean and Noisy Audio
=======================================================
Extract acoustic features from clean and noisy datasets for
training a denoising model, e.g. a denoising autoencoder.
For soundpy's implementation, see `soundpy.builtin.denoiser_feats`.
.. code-block:: default
import os, sys
import inspect
currentdir = os.path.dirname(os.path.abspath(
    inspect.getfile(inspect.currentframe())))
parentdir = os.path.dirname(currentdir)
parparentdir = os.path.dirname(parentdir)
packagedir = os.path.dirname(parparentdir)
sys.path.insert(0, packagedir)
import soundpy as sp
import IPython.display as ipd
package_dir = '../../../'
os.chdir(package_dir)
sp_dir = package_dir
Prepare for Extraction: Data Organization
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
I will use a mini denoising dataset as an example.
.. code-block:: default
# Example noisy data:
data_noisy_dir = '{}../mini-audio-datasets/denoise/noisy'.format(sp_dir)
# Example clean data:
data_clean_dir = '{}../mini-audio-datasets/denoise/clean'.format(sp_dir)
# Where to save extracted features:
data_features_dir = './audiodata/example_feats_models/denoiser/'
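Before extracting, it can help to confirm that both directories hold the same number of audio files. The following is a minimal sketch using only the standard library; the helper `list_audio` and the extension set `AUDIO_EXTS` are hypothetical names for illustration, not part of soundpy, and the demo runs on throwaway placeholder files rather than the mini dataset.

```python
import tempfile
from pathlib import Path

# Assumption: adjust this set to the audio formats in your dataset.
AUDIO_EXTS = {".wav", ".flac"}


def list_audio(directory, exts=AUDIO_EXTS):
    """Return sorted audio file paths in `directory`, skipping other files."""
    return sorted(p for p in Path(directory).iterdir()
                  if p.is_file() and p.suffix.lower() in exts)


# Demo on temporary directories filled with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    clean_dir = Path(tmp, "clean")
    noisy_dir = Path(tmp, "noisy")
    for d in (clean_dir, noisy_dir):
        d.mkdir()
        for i in range(3):
            (d / f"sample{i}.wav").touch()
    (noisy_dir / "README.txt").touch()  # non-audio file, not counted

    n_clean = len(list_audio(clean_dir))
    n_noisy = len(list_audio(noisy_dir))

print(n_clean, n_noisy)
```

If the two counts differ, some clean recordings have no noisy counterpart (or the reverse), which is worth resolving before training a denoiser.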
Choose Feature Type
~~~~~~~~~~~~~~~~~~~
The available feature types are 'mfcc', 'fbank', 'powspec', and 'stft'.
If you are working with speech, I suggest 'fbank', 'powspec', or 'stft'.
.. code-block:: default
feature_type = 'stft'
sr = 22050
Set Duration of Audio
~~~~~~~~~~~~~~~~~~~~~
Set how many seconds of audio to use from each audio file.
The speech samples are about 3 seconds long.
.. code-block:: default
dur_sec = 3
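To illustrate what a fixed duration means for the samples, here is a small NumPy sketch that trims longer signals and zero-pads shorter ones to `dur_sec` seconds. The function `set_duration` is a hypothetical helper, and zero-padding short files is an assumption; soundpy may handle short files differently.

```python
import numpy as np


def set_duration(samples, sr, dur_sec):
    """Trim or zero-pad a 1-D signal to exactly `dur_sec` seconds."""
    target = int(sr * dur_sec)
    if len(samples) >= target:
        return samples[:target]
    # np.pad defaults to constant mode, which pads with zeros.
    return np.pad(samples, (0, target - len(samples)))


sr = 22050
short_signal = np.ones(sr * 2)  # 2 seconds of placeholder samples
long_signal = np.ones(sr * 4)   # 4 seconds of placeholder samples

fixed_short = set_duration(short_signal, sr, 3)
fixed_long = set_duration(long_signal, sr, 3)
print(len(fixed_short), len(fixed_long))
```

Giving every file the same number of samples is what lets the extracted features share one shape across the dataset.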
Option 1: Built-In Functionality: soundpy does everything for you
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Define which data to use and which features to extract.
NOTE: because the dataset is very small, `perc_train` is set
lower than the usual 0.8 (otherwise an error is raised).
Everything else uses the defaults. A feature folder with
the feature data will be created in the current working directory
(although you can change this with the parameter `data_features_dir`).
`visualize` periodically saves images of the extracted features.
This is useful if you want to know what's going on during the process.
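To make the effect of `perc_train` concrete, here is a rough sketch of how a training fraction could map to file counts. The helper `split_counts` is hypothetical and assumes the files left over after training are divided evenly between validation and test; soundpy's own splitting logic may differ. With 10 files and `perc_train = 0.6` it gives 6, 2, and 2 files, which agrees with the number of progress steps in the extraction output for this example.

```python
def split_counts(n_files, perc_train):
    """Return (train, val, test) file counts for a given training fraction.

    Assumption: the remainder after training is split evenly between
    validation and test. This is an illustration, not soundpy's code.
    """
    n_train = round(n_files * perc_train)
    remaining = n_files - n_train
    n_val = remaining // 2
    n_test = remaining - n_val
    return n_train, n_val, n_test


print(split_counts(10, 0.6))
print(split_counts(10, 0.8))
```

Under this assumption, a fraction of 0.8 would leave only a single file each for validation and test in a 10-file dataset.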
.. code-block:: default
perc_train = 0.6 # with larger datasets this would be around 0.8
extraction_dir = sp.denoiser_feats(
    data_clean_dir=data_clean_dir,
    data_noisy_dir=data_noisy_dir,
    sr=sr,
    feature_type=feature_type,
    dur_sec=dur_sec,
    perc_train=perc_train,
    visualize=True)
extraction_dir
.. image:: /auto_examples/images/sphx_glr_plot_featureprep_denoiser_001.png
:alt: test STFT features: label NOISY
:class: sphx-glr-single-img
.. rst-class:: sphx-glr-script-out
Out:
.. code-block:: none
/home/airos/Projects/github/a-n-rose/Python-Sound-Tool/soundpy/files.py:352: UserWarning: Some files did not match those acceptable by this program. (i.e. non-audio files) The number of files not included: 3
warnings.warn(message)
/home/airos/Projects/github/a-n-rose/Python-Sound-Tool/soundpy/feats.py:2383: UserWarning:
WARNING: `win_size_ms` was not set. Setting it to 20 ms
warnings.warn(msg)
/home/airos/Projects/github/a-n-rose/Python-Sound-Tool/soundpy/feats.py:2390: UserWarning:
WARNING: `percent_overlap` was not set. Setting it to 0.5
warnings.warn(msg)
16% through train stft feature extraction
33% through train stft feature extraction
50% through train stft feature extraction
66% through train stft feature extraction
83% through train stft feature extraction
100% through train stft feature extraction
Features saved at audiodata/example_feats_models/denoiser/features_9m3d13h26m1s569ms/train_data_clean.npy
50% through val stft feature extraction
100% through val stft feature extraction
Features saved at audiodata/example_feats_models/denoiser/features_9m3d13h26m1s569ms/val_data_clean.npy
50% through test stft feature extraction
100% through test stft feature extraction
Features saved at audiodata/example_feats_models/denoiser/features_9m3d13h26m1s569ms/test_data_clean.npy
16% through train stft feature extraction
33% through train stft feature extraction
50% through train stft feature extraction
66% through train stft feature extraction
83% through train stft feature extraction
100% through train stft feature extraction
Features saved at audiodata/example_feats_models/denoiser/features_9m3d13h26m1s569ms/train_data_noisy.npy
50% through val stft feature extraction
100% through val stft feature extraction
Features saved at audiodata/example_feats_models/denoiser/features_9m3d13h26m1s569ms/val_data_noisy.npy
50% through test stft feature extraction
100% through test stft feature extraction
Features saved at audiodata/example_feats_models/denoiser/features_9m3d13h26m1s569ms/test_data_noisy.npy
Finished! Total duration: 2.32 seconds.
PosixPath('audiodata/example_feats_models/denoiser/features_9m3d13h26m1s569ms')
The extracted features, the extraction settings applied, and
the assignment of audio files to datasets
are saved in the `extraction_dir` directory.
Logged Information
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Let's look at the files in `extraction_dir`. Files with the
.npy extension contain the feature data; the .csv files contain
logged information.
.. code-block:: default
featfiles = list(extraction_dir.glob('*.*'))
for f in featfiles:
    print(f.name)
.. rst-class:: sphx-glr-script-out
Out:
.. code-block:: none
val_data_noisy.npy
dataset_audio_assignments.csv
val_data_clean.npy
test_data_noisy.npy
audiofiles_datasets_clean.csv
test_data_clean.npy
log_extraction_settings.csv
train_data_clean.npy
clean_audio.csv
noisy_audio.csv
train_data_noisy.npy
audiofiles_datasets_noisy.csv
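The .npy files can be read back with NumPy. Below is a minimal sketch that loads a matching clean/noisy pair and takes the magnitude of complex values. The helper `load_pair` is hypothetical; the file naming follows the listing above, but the demo writes small synthetic arrays to a temporary folder, so the shapes shown are not those of the real features.

```python
import tempfile
from pathlib import Path

import numpy as np


def load_pair(extraction_dir, split):
    """Load the clean and noisy feature arrays for one split."""
    extraction_dir = Path(extraction_dir)
    clean = np.load(extraction_dir / f"{split}_data_clean.npy")
    noisy = np.load(extraction_dir / f"{split}_data_noisy.npy")
    if clean.shape != noisy.shape:
        raise ValueError("clean and noisy features differ in shape")
    return clean, noisy


# Demo with synthetic complex arrays standing in for STFT features.
with tempfile.TemporaryDirectory() as tmp:
    rng = np.random.default_rng(0)
    fake = rng.standard_normal((2, 4, 3)) + 1j * rng.standard_normal((2, 4, 3))
    np.save(Path(tmp, "val_data_clean.npy"), fake)
    np.save(Path(tmp, "val_data_noisy.npy"), fake + 0.1)
    clean, noisy = load_pair(tmp, "val")

# Complex STFT values are often reduced to magnitudes before training.
magnitude = np.abs(noisy)
print(clean.shape, noisy.dtype, magnitude.dtype)
```

With real data, you would pass `extraction_dir` and one of 'train', 'val', or 'test' instead of the temporary folder.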
Feature Settings
~~~~~~~~~~~~~~~~~~
Since much happened behind the scenes, it helps to know how the features
were extracted, for example the sample rate and the number of frequency bins applied.
.. code-block:: default
feat_settings = sp.utils.load_dict(
    extraction_dir.joinpath('log_extraction_settings.csv'))
for key, value in feat_settings.items():
    print(key, ' ---> ', value)
.. rst-class:: sphx-glr-script-out
Out:
.. code-block:: none
dataset_dirs ---> ['../../../../mini-audio-datasets/denoise/noisy', '../../../../mini-audio-datasets/denoise/noisy', '../../../../mini-audio-datasets/denoise/noisy']
feat_base_shape ---> (299, 221)
feat_model_shape ---> (299, 221)
complex_vals ---> True
context_window --->
frames_per_sample --->
labeled_data ---> False
decode_dict --->
visualize ---> True
vis_every_n_frames ---> 50
subsection_data ---> False
divide_factor ---> 5
total_audiofiles ---> 10
kwargs ---> {'sr': 22050, 'feature_type': 'stft', 'dur_sec': 3, 'win_size_ms': 20, 'percent_overlap': 0.5}
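The logged `feat_base_shape` of (299, 221) can be related to the settings in `kwargs` with a little arithmetic. The sketch below assumes a common framing convention (no padding, hop size truncated to a whole number of samples); the function `stft_shape` is hypothetical, and although this convention reproduces the logged shape, soundpy's internals may compute it differently.

```python
def stft_shape(sr, dur_sec, win_size_ms, percent_overlap):
    """Estimate (number of frames, number of frequency bins) for an STFT.

    Assumptions: no padding at the signal edges and a hop size
    truncated to an integer number of samples.
    """
    n_samples = int(sr * dur_sec)
    win = int(sr * win_size_ms / 1000)      # samples per window
    hop = int(win * (1 - percent_overlap))  # samples between window starts
    n_frames = 1 + (n_samples - win) // hop
    n_bins = win // 2 + 1                   # one-sided spectrum
    return n_frames, n_bins


shape = stft_shape(sr=22050, dur_sec=3, win_size_ms=20, percent_overlap=0.5)
print(shape)
```

Here a 20 ms window at 22050 Hz spans 441 samples, giving 221 one-sided frequency bins, and a hop of 220 samples fits 299 frames into 3 seconds of audio.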
.. rst-class:: sphx-glr-timing
**Total running time of the script:** ( 0 minutes 2.400 seconds)