{ "cells": [ { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n# Implement a Denoising Autoencoder\n\n\nImplement a denoising autoencoder to denoise a noisy speech signal.\n\nTo see how soundpy implements this, see `soundpy.models.builtin.denoiser_run`.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's import soundpy and other packages\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import soundpy as sp\nimport numpy as np\n# for playing audio in this notebook:\nimport IPython.display as ipd" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As well as the deep learning component of soundpy\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "from soundpy import models as spdl" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Prepare for Implementation: Data Organization\n^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Set the path to the audio data for this example\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "sp_dir = '../../../'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Set model pathway\n~~~~~~~~~~~~~~~~~\nCurrently, this expects a model saved with weights, with a .h5 extension.\n(See `model` below)\n\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The soundpy repo offers a pre-trained denoiser, which we'll use.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "model = '{}audiodata/models/'.format(sp_dir)+\\\n 'denoiser/example_denoiser_stft.h5'\n# ensure model is a pathlib.PosixPath 
object\nprint(model)\nmodel = sp.utils.string2pathlib(model)\nmodel_dir = model.parent" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "What is in this folder?\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "files = list(model_dir.glob('*.*'))\nfor f in files:\n    print(f.name)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Provide dictionary with feature extraction settings\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If soundpy extracts features for you, a 'log_extraction_settings.csv' \nfile will be saved, which includes relevant feature settings for implementing \nthe model; see `soundpy.feats.save_features_datasets`\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import ast\n\nfeat_settings = sp.utils.load_dict(\n    model_dir.joinpath('log_extraction_settings.csv'))\nfor key, value in feat_settings.items():\n    print(key, ' --> ', value)\n    # convert values that were saved as strings back to their original types\n    try:\n        feat_settings[key] = ast.literal_eval(value)\n    except (ValueError, SyntaxError):\n        # values that are plain strings stay as they are\n        pass" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For the purposes of plotting, let's use some of the settings defined:\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "feature_type = feat_settings['feature_type']\nsr = feat_settings['sr']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Provide new audio for the denoiser to denoise!\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We'll use sample speech from the soundpy repo:\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "speech = 
sp.string2pathlib('{}audiodata/python.wav'.format(sp_dir))\ns, sr = sp.loadsound(speech, sr=sr)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's add some white noise at a signal-to-noise ratio (SNR) of 10\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "s_n = sp.augment.add_white_noise(s, sr=sr, snr=10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "What does the noisy audio sound like?\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "ipd.Audio(s_n,rate=sr)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "What does the noisy audio look like?\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "sp.plotsound(s_n, sr = sr, feature_type='signal', subprocess=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "What does the clean audio sound like?\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "ipd.Audio(s,rate=sr)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "What does the clean audio look like?\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "sp.plotsound(s, sr = sr, feature_type='signal', subprocess=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Built-In Denoiser Functionality\n^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We just need to provide the model path, the noisy audio (here, the samples in `s_n`), and \nthe feature settings dictionary we loaded above.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": 
[], "source": [ "y, sr = spdl.denoiser_run(model, s_n, feat_settings)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "How does the output sound?\n~~~~~~~~~~~~~~~~~~~~~~~~~~\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "ipd.Audio(y,rate=sr)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "How does the output look?\n~~~~~~~~~~~~~~~~~~~~~~~~~\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "sp.plotsound(y, sr=sr, feature_type = feature_type, subprocess=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "How do the features compare?\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "STFT features of the noisy input speech:\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "sp.plotsound(s_n, sr=sr, feature_type = 'stft', energy_scale = 'power_to_db',\n title = 'Noisy input: STFT features', subprocess=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "STFT features of the denoiser output:\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "sp.plotsound(y, sr=sr, feature_type = 'stft', energy_scale = 'power_to_db',\n title = 'Denoiser Output: STFT features', subprocess=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "STFT features of the clean version of the audio:\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "sp.plotsound(s, sr=sr, feature_type = 'stft', energy_scale = 'power_to_db',\n title = 'Clean \"target\" audio: STFT features', subprocess=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It's not perfect, but for a pretty simple implementation, the noise is gone\nand you 
can hear the person speaking. Pretty cool! \n\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.2" } }, "nbformat": 4, "nbformat_minor": 0 }