{ "cells": [ { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n=======================================================\nFeature Extraction for Denoising: Clean and Noisy Audio\n=======================================================\n\nExtract acoustic features from clean and noisy datasets for \ntraining a denoising model, e.g. a denoising autoencoder.\n\nTo see how soundpy implements this, see `soundpy.builtin.denoiser_feats`.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import os, sys\nimport inspect\ncurrentdir = os.path.dirname(os.path.abspath(\n inspect.getfile(inspect.currentframe())))\nparentdir = os.path.dirname(currentdir)\nparparentdir = os.path.dirname(parentdir)\npackagedir = os.path.dirname(parparentdir)\nsys.path.insert(0, packagedir)\n\nimport soundpy as sp \nimport IPython.display as ipd\npackage_dir = '../../../'\nos.chdir(package_dir)\nsp_dir = package_dir" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Prepare for Extraction: Data Organization\n^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "I will use a mini denoising dataset as an example\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Example noisy data:\ndata_noisy_dir = '{}../mini-audio-datasets/denoise/noisy'.format(sp_dir)\n# Example clean data:\ndata_clean_dir = '{}../mini-audio-datasets/denoise/clean'.format(sp_dir)\n# Where to save extracted features:\ndata_features_dir = './audiodata/example_feats_models/denoiser/'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Choose Feature Type \n~~~~~~~~~~~~~~~~~~~\nWe can extract 'mfcc', 'fbank', 'powspec', and 'stft'.\nif you are working with speech, I suggest 
'fbank', 'powspec', or 'stft'.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "feature_type = 'stft'\nsr = 22050" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Set Duration of Audio\n~~~~~~~~~~~~~~~~~~~~~\nHow many seconds of audio to use from each audio file.\nThe speech samples are about 3 seconds long.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "dur_sec = 3" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Option 1: Built-In Functionality (soundpy does everything for you)\n^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Define which data to use and which features to extract.\nNOTE: because the dataset is very small, we will set\n`perc_train` lower than 0.8; otherwise an error will be raised.\nEverything else is based on defaults. 
A feature folder with\nthe feature data will be created in the current working directory.\n(You can change this location via the parameter `data_features_dir`.)\nSetting `visualize=True` saves periodic images of the extracted features,\nwhich is useful if you want to see what is happening during the process.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "perc_train = 0.6  # with larger datasets this would be around 0.8\nextraction_dir = sp.denoiser_feats(\n    data_clean_dir = data_clean_dir,\n    data_noisy_dir = data_noisy_dir,\n    sr = sr,\n    feature_type = feature_type,\n    dur_sec = dur_sec,\n    perc_train = perc_train,\n    visualize = True)\nextraction_dir" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The extracted features, the extraction settings applied, and\nwhich audio files were assigned to which datasets\nare all saved in the `extraction_dir` directory.\n\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Logged Information\n~~~~~~~~~~~~~~~~~~\nLet's have a look at the files in `extraction_dir`. The files ending\nwith the .npy extension contain the feature data; the .csv files contain\nlogged information.
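The feature data itself is stored as plain NumPy arrays, so the .npy files can be loaded directly with `numpy.load`. Below is a minimal, self-contained sketch; the commented file name and the array shape are hypothetical examples, not the exact names soundpy produces, so check the files in your own `extraction_dir`:

```python
import os
import tempfile

import numpy as np

# Hypothetical example of loading one of the saved feature files, e.g.:
#     train_noisy = np.load(extraction_dir.joinpath('train_data_noisy.npy'))
# (the exact file names depend on your extraction settings)

# Self-contained sketch: save a dummy feature array and load it back.
feats = np.random.rand(6, 120, 101).astype('float32')  # (samples, frames, freq bins)
path = os.path.join(tempfile.mkdtemp(), 'example_feats.npy')
np.save(path, feats)
loaded = np.load(path)
print(loaded.shape, loaded.dtype)
```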
\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "featfiles = list(extraction_dir.glob('*.*'))\nfor f in featfiles:\n print(f.name)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Feature Settings\n~~~~~~~~~~~~~~~~~~\nSince much was conducted behind the scenes, it's nice to know how the features\nwere extracted, for example, the sample rate and number of frequency bins applied, etc.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "feat_settings = sp.utils.load_dict(\n extraction_dir.joinpath('log_extraction_settings.csv'))\nfor key, value in feat_settings.items():\n print(key, ' ---> ', value)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.2" } }, "nbformat": 4, "nbformat_minor": 0 }