Feature Extraction for Denoising: Clean and Noisy Audio

Extract acoustic features from clean and noisy datasets for training a denoising model, e.g. a denoising autoencoder.

To see how soundpy implements this, see soundpy.builtin.denoiser_feats.

import os, sys
import inspect
# Add the repository root (three levels above this script's directory)
# to sys.path so that the local copy of soundpy is imported.
currentdir = os.path.dirname(os.path.abspath(
    inspect.getfile(inspect.currentframe())))
parentdir = os.path.dirname(currentdir)
parparentdir = os.path.dirname(parentdir)
packagedir = os.path.dirname(parparentdir)
sys.path.insert(0, packagedir)

import soundpy as sp
import IPython.display as ipd
package_dir = '../../../'
os.chdir(package_dir)
sp_dir = package_dir

Prepare for Extraction: Data Organization

I will use a mini denoising dataset as an example.

# Example noisy data:
data_noisy_dir = '{}../mini-audio-datasets/denoise/noisy'.format(sp_dir)
# Example clean data:
data_clean_dir = '{}../mini-audio-datasets/denoise/clean'.format(sp_dir)
# Where to save extracted features:
data_features_dir = './audiodata/example_feats_models/denoiser/'
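Before extracting, it can help to check how many audio files each directory holds, since a denoising model is usually trained on matching clean and noisy recordings. The sketch below is not part of soundpy: `count_audio_files` is a hypothetical helper, and the list of extensions is an assumption that you should adjust to your data.

```python
from pathlib import Path

# Assumed set of audio extensions; not taken from soundpy.
AUDIO_EXTENSIONS = {'.wav', '.flac', '.ogg'}


def count_audio_files(directory, extensions=AUDIO_EXTENSIONS):
    """Count files in `directory` whose suffix looks like an audio format."""
    directory = Path(directory)
    if not directory.is_dir():
        raise FileNotFoundError('Not a directory: {}'.format(directory))
    # Compare suffixes case-insensitively so 'A.WAV' counts as audio.
    return sum(1 for f in directory.iterdir()
               if f.is_file() and f.suffix.lower() in extensions)
```

For example, `count_audio_files(data_clean_dir) == count_audio_files(data_noisy_dir)` should hold if every clean file has a noisy counterpart.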

Choose Feature Type

We can extract ‘mfcc’, ‘fbank’, ‘powspec’, and ‘stft’ features. If you are working with speech, I suggest ‘fbank’, ‘powspec’, or ‘stft’.

feature_type = 'stft'
sr = 22050

Set Duration of Audio

Set how many seconds of audio to use from each audio file. The speech samples are about 3 seconds long.

dur_sec = 3

Option 1: Built-In Functionality: soundpy does everything for you

Define which data to use and which features to extract. NOTE: because the dataset is very small, perc_train is set lower than the usual 0.8; otherwise an error is raised. Everything else is based on defaults.

A folder with the feature data will be created in the current working directory, although you can change this with the parameter data_features_dir. Setting visualize saves periodic images of the extracted features, which is useful if you want to know what’s going on during the process.

perc_train = 0.6 # with larger datasets this would be around 0.8
extraction_dir = sp.denoiser_feats(
    data_clean_dir = data_clean_dir,
    data_noisy_dir = data_noisy_dir,
    sr = sr,
    feature_type = feature_type,
    dur_sec = dur_sec,
    perc_train = perc_train,
    visualize = True)
extraction_dir
Figure: test STFT features (label: NOISY)

Out:

/home/airos/Projects/github/a-n-rose/Python-Sound-Tool/soundpy/files.py:352: UserWarning: Some files did not match those acceptable by this program. (i.e. non-audio files) The number of files not included: 3
  warnings.warn(message)
/home/airos/Projects/github/a-n-rose/Python-Sound-Tool/soundpy/feats.py:2383: UserWarning:
WARNING: `win_size_ms` was not set. Setting it to 20 ms
  warnings.warn(msg)
/home/airos/Projects/github/a-n-rose/Python-Sound-Tool/soundpy/feats.py:2390: UserWarning:
WARNING: `percent_overlap` was not set. Setting it to 0.5
  warnings.warn(msg)

16% through train stft feature extraction
33% through train stft feature extraction
50% through train stft feature extraction
66% through train stft feature extraction
83% through train stft feature extraction
100% through train stft feature extraction
Features saved at audiodata/example_feats_models/denoiser/features_9m3d13h26m1s569ms/train_data_clean.npy


50% through val stft feature extraction
100% through val stft feature extraction
Features saved at audiodata/example_feats_models/denoiser/features_9m3d13h26m1s569ms/val_data_clean.npy


50% through test stft feature extraction
100% through test stft feature extraction
Features saved at audiodata/example_feats_models/denoiser/features_9m3d13h26m1s569ms/test_data_clean.npy


16% through train stft feature extraction
33% through train stft feature extraction
50% through train stft feature extraction
66% through train stft feature extraction
83% through train stft feature extraction
100% through train stft feature extraction
Features saved at audiodata/example_feats_models/denoiser/features_9m3d13h26m1s569ms/train_data_noisy.npy


50% through val stft feature extraction
100% through val stft feature extraction
Features saved at audiodata/example_feats_models/denoiser/features_9m3d13h26m1s569ms/val_data_noisy.npy


50% through test stft feature extraction
100% through test stft feature extraction
Features saved at audiodata/example_feats_models/denoiser/features_9m3d13h26m1s569ms/test_data_noisy.npy


Finished! Total duration: 2.32 seconds.

PosixPath('audiodata/example_feats_models/denoiser/features_9m3d13h26m1s569ms')

The extracted features, the extraction settings applied, and the record of which audio files were assigned to which datasets are saved in the extraction_dir directory.
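The progress lines in the output advance in six steps for train and in two steps each for val and test, which is consistent with 10 files per condition being divided 6 / 2 / 2 at perc_train = 0.6. A minimal sketch of that arithmetic follows. It assumes the files left over after the training share are split evenly between validation and test; this is an assumption for illustration, not confirmed soundpy behavior, and `split_sizes` is a hypothetical helper.

```python
def split_sizes(n_files, perc_train):
    """Return (train, val, test) counts under an assumed even val/test split."""
    n_train = int(round(n_files * perc_train))
    remaining = n_files - n_train
    n_val = remaining // 2          # assumed: half of the remainder
    n_test = remaining - n_val      # whatever is left goes to test
    return n_train, n_val, n_test


print(split_sizes(10, 0.6))  # (6, 2, 2)
```

Under the same assumption, perc_train = 0.8 would leave only one file each for validation and test in a dataset this small.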

Logged Information

Let’s have a look at the files in extraction_dir. The files ending with the .npy extension contain the feature data; the .csv files contain logged information.

featfiles = list(extraction_dir.glob('*.*'))
for f in featfiles:
    print(f.name)

Out:

val_data_noisy.npy
dataset_audio_assignments.csv
val_data_clean.npy
test_data_noisy.npy
audiofiles_datasets_clean.csv
test_data_clean.npy
log_extraction_settings.csv
train_data_clean.npy
clean_audio.csv
noisy_audio.csv
train_data_noisy.npy
audiofiles_datasets_noisy.csv
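The .npy file names in this listing follow the pattern `<split>_data_<condition>.npy`, so the clean and noisy features of each split can be matched up by name. The sketch below groups them that way; `pair_feature_files` is a hypothetical helper written for this listing, not a soundpy function.

```python
from pathlib import Path


def pair_feature_files(filenames):
    """Group '<split>_data_<condition>.npy' names as {split: {condition: name}}."""
    pairs = {}
    for name in filenames:
        path = Path(name)
        if path.suffix != '.npy':
            continue  # skip the .csv logs
        split, sep, condition = path.stem.partition('_data_')
        if not sep:
            continue  # name does not follow the expected pattern
        pairs.setdefault(split, {})[condition] = path.name
    return pairs
```

With the files above, `pair_feature_files(f.name for f in featfiles)['train']` gives the clean and noisy training file names, and each one can then be loaded with `numpy.load(extraction_dir.joinpath(name))`.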

Feature Settings

Since much was done behind the scenes, it’s useful to know how the features were extracted: for example, the sample rate and the number of frequency bins applied.

feat_settings = sp.utils.load_dict(
    extraction_dir.joinpath('log_extraction_settings.csv'))
for key, value in feat_settings.items():
    print(key, ' ---> ', value)

Out:

dataset_dirs  --->  ['../../../../mini-audio-datasets/denoise/noisy', '../../../../mini-audio-datasets/denoise/noisy', '../../../../mini-audio-datasets/denoise/noisy']
feat_base_shape  --->  (299, 221)
feat_model_shape  --->  (299, 221)
complex_vals  --->  True
context_window  --->
frames_per_sample  --->
labeled_data  --->  False
decode_dict  --->
visualize  --->  True
vis_every_n_frames  --->  50
subsection_data  --->  False
divide_factor  --->  5
total_audiofiles  --->  10
kwargs  --->  {'sr': 22050, 'feature_type': 'stft', 'dur_sec': 3, 'win_size_ms': 20, 'percent_overlap': 0.5}
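The logged feat_base_shape of (299, 221) can be reproduced from the values in kwargs. The sketch below is one way to arrive at it, assuming no padding, a hop size truncated to whole samples, and an FFT size equal to the window length. These assumptions happen to match the logged shape but are not confirmed soundpy internals, and `stft_shape` is a hypothetical helper.

```python
def stft_shape(sr, dur_sec, win_size_ms, percent_overlap):
    """Estimate (num_frames, num_freq_bins) for one audio file.

    Assumes no padding, a hop truncated to whole samples, and an FFT
    size equal to the window length (assumptions, not soundpy facts).
    """
    total_samples = int(sr * dur_sec)                 # 22050 * 3 = 66150
    win_samples = int(sr * win_size_ms / 1000)        # 20 ms -> 441 samples
    hop_samples = int(win_samples * (1 - percent_overlap))  # 220 samples
    num_frames = 1 + (total_samples - win_samples) // hop_samples
    num_freq_bins = win_samples // 2 + 1              # one-sided spectrum
    return num_frames, num_freq_bins


print(stft_shape(sr=22050, dur_sec=3, win_size_ms=20, percent_overlap=0.5))
# (299, 221)
```

Changing dur_sec, win_size_ms, or percent_overlap in the call shows how each setting affects the number of frames and frequency bins.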

Total running time of the script: ( 0 minutes 2.400 seconds)
