.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here ` to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_plot_featureprep_denoiser.py:


=======================================================
Feature Extraction for Denoising: Clean and Noisy Audio
=======================================================

Extract acoustic features from clean and noisy datasets for
training a denoising model, e.g. a denoising autoencoder.

To see how soundpy implements this, see `soundpy.builtin.denoiser_feats`.


.. code-block:: default


    import os, sys
    import inspect
    # Make the directory three levels above this script importable,
    # so that the local soundpy package can be found.
    currentdir = os.path.dirname(os.path.abspath(
        inspect.getfile(inspect.currentframe())))
    parentdir = os.path.dirname(currentdir)
    parparentdir = os.path.dirname(parentdir)
    packagedir = os.path.dirname(parparentdir)
    sys.path.insert(0, packagedir)

    import soundpy as sp
    import IPython.display as ipd
    # Work from the package directory so the relative paths below resolve.
    package_dir = '../../../'
    os.chdir(package_dir)
    sp_dir = package_dir


Prepare for Extraction: Data Organization
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

I will use a mini denoising dataset as an example.


.. code-block:: default


    # Example noisy data:
    data_noisy_dir = '{}../mini-audio-datasets/denoise/noisy'.format(sp_dir)
    # Example clean data:
    data_clean_dir = '{}../mini-audio-datasets/denoise/clean'.format(sp_dir)
    # Where to save extracted features:
    data_features_dir = './audiodata/example_feats_models/denoiser/'


Choose Feature Type
~~~~~~~~~~~~~~~~~~~

We can extract 'mfcc', 'fbank', 'powspec', and 'stft'.
If you are working with speech, I suggest 'fbank', 'powspec', or 'stft'.


.. code-block:: default


    feature_type = 'stft'
    sr = 22050


Set Duration of Audio
~~~~~~~~~~~~~~~~~~~~~

How much audio, in seconds, to use from each audio file.
The speech samples are about 3 seconds long.


.. 
code-block:: default


    dur_sec = 3


Option 1: Built-In Functionality: soundpy does everything for you
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Define which data to use and which features to extract.
NOTE: because the dataset is very small, `perc_train` is set
lower than 0.8 (otherwise an error is raised).
Everything else uses the defaults. A folder with
the feature data will be created in the current working directory
(you can change this with the parameter `data_features_dir`).
`visualize` saves periodic images of the extracted features, which
is useful if you want to know what's going on during the process.


.. code-block:: default


    perc_train = 0.6 # with larger datasets this would be around 0.8
    extraction_dir = sp.denoiser_feats(
        data_clean_dir = data_clean_dir,
        data_noisy_dir = data_noisy_dir,
        sr = sr,
        feature_type = feature_type,
        dur_sec = dur_sec,
        perc_train = perc_train,
        visualize=True);
    extraction_dir


.. image:: /auto_examples/images/sphx_glr_plot_featureprep_denoiser_001.png
    :alt: test STFT features: label NOISY
    :class: sphx-glr-single-img


.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    /home/airos/Projects/github/a-n-rose/Python-Sound-Tool/soundpy/files.py:352: UserWarning: Some files did not match those acceptable by this program. (i.e. non-audio files) The number of files not included: 3
      warnings.warn(message)
    /home/airos/Projects/github/a-n-rose/Python-Sound-Tool/soundpy/feats.py:2383: UserWarning: WARNING: `win_size_ms` was not set. Setting it to 20 ms
      warnings.warn(msg)
    /home/airos/Projects/github/a-n-rose/Python-Sound-Tool/soundpy/feats.py:2390: UserWarning: WARNING: `percent_overlap` was not set. 
    Setting it to 0.5
      warnings.warn(msg)
    16% through train stft feature extraction
    33% through train stft feature extraction
    50% through train stft feature extraction
    66% through train stft feature extraction
    83% through train stft feature extraction
    100% through train stft feature extraction

    Features saved at audiodata/example_feats_models/denoiser/features_9m3d13h26m1s569ms/train_data_clean.npy

    50% through val stft feature extraction
    100% through val stft feature extraction

    Features saved at audiodata/example_feats_models/denoiser/features_9m3d13h26m1s569ms/val_data_clean.npy

    50% through test stft feature extraction
    100% through test stft feature extraction

    Features saved at audiodata/example_feats_models/denoiser/features_9m3d13h26m1s569ms/test_data_clean.npy

    16% through train stft feature extraction
    33% through train stft feature extraction
    50% through train stft feature extraction
    66% through train stft feature extraction
    83% through train stft feature extraction
    100% through train stft feature extraction

    Features saved at audiodata/example_feats_models/denoiser/features_9m3d13h26m1s569ms/train_data_noisy.npy

    50% through val stft feature extraction
    100% through val stft feature extraction

    Features saved at audiodata/example_feats_models/denoiser/features_9m3d13h26m1s569ms/val_data_noisy.npy

    50% through test stft feature extraction
    100% through test stft feature extraction

    Features saved at audiodata/example_feats_models/denoiser/features_9m3d13h26m1s569ms/test_data_noisy.npy

    Finished! Total duration: 2.32 seconds.

    PosixPath('audiodata/example_feats_models/denoiser/features_9m3d13h26m1s569ms')


The extracted features, the extraction settings applied, and a record of
which audio files were assigned to which datasets are saved in the
`extraction_dir` directory.

Logged Information
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Let's have a look at the files in `extraction_dir`. The files ending
with the .npy extension contain the feature data; the .csv files contain
logged information.


.. 
code-block:: default


    # List every file saved in the extraction directory.
    featfiles = list(extraction_dir.glob('*.*'))
    for f in featfiles:
        print(f.name)


.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    val_data_noisy.npy
    dataset_audio_assignments.csv
    val_data_clean.npy
    test_data_noisy.npy
    audiofiles_datasets_clean.csv
    test_data_clean.npy
    log_extraction_settings.csv
    train_data_clean.npy
    clean_audio.csv
    noisy_audio.csv
    train_data_noisy.npy
    audiofiles_datasets_noisy.csv


Feature Settings
~~~~~~~~~~~~~~~~~~

Since much happens behind the scenes, it's useful to see how the
features were extracted, for example the sample rate and the number
of frequency bins applied.


.. code-block:: default


    # Load the logged extraction settings and print each entry.
    feat_settings = sp.utils.load_dict(
        extraction_dir.joinpath('log_extraction_settings.csv'))
    for key, value in feat_settings.items():
        print(key, ' ---> ', value)


.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    dataset_dirs  --->  ['../../../../mini-audio-datasets/denoise/noisy', '../../../../mini-audio-datasets/denoise/noisy', '../../../../mini-audio-datasets/denoise/noisy']
    feat_base_shape  --->  (299, 221)
    feat_model_shape  --->  (299, 221)
    complex_vals  --->  True
    context_window  --->
    frames_per_sample  --->
    labeled_data  --->  False
    decode_dict  --->
    visualize  --->  True
    vis_every_n_frames  --->  50
    subsection_data  --->  False
    divide_factor  --->  5
    total_audiofiles  --->  10
    kwargs  --->  {'sr': 22050, 'feature_type': 'stft', 'dur_sec': 3, 'win_size_ms': 20, 'percent_overlap': 0.5}


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes 2.400 seconds)


.. _sphx_glr_download_auto_examples_plot_featureprep_denoiser.py:


.. only :: html

 .. container:: sphx-glr-footer
    :class: sphx-glr-footer-example


  .. container:: sphx-glr-download sphx-glr-download-python

     :download:`Download Python source code: plot_featureprep_denoiser.py `


  .. container:: sphx-glr-download sphx-glr-download-jupyter

     :download:`Download Jupyter notebook: plot_featureprep_denoiser.ipynb `


.. only:: html

 .. 
rst-class:: sphx-glr-signature `Gallery generated by Sphinx-Gallery `_