.. only:: html .. note:: :class: sphx-glr-download-link-note Click :ref:`here ` to download the full example code .. rst-class:: sphx-glr-example-title .. _sphx_glr_auto_examples_plot_dataset_info_formatting.py: ======================================== Audio Dataset Exploration and Formatting ======================================== Examine audio files within a dataset, and reformat them if desired. To see how soundpy implements this, see `soundpy.builtin.dataset_logger` and `soundpy.builtin.dataset_formatter`. Let's import soundpy .. code-block:: default import soundpy as sp Dataset Exploration ^^^^^^^^^^^^^^^^^^^ Designate path relevant for accessing audiodata .. code-block:: default sp_dir = '../../../' I will explore files in a small dataset on my computer with varying file formats. .. code-block:: default dataset_path = '{}audiodata2/'.format(sp_dir) dataset_info_dict = sp.builtin.dataset_logger('{}audiodata2/'.format(sp_dir)); .. rst-class:: sphx-glr-script-out Out: .. code-block:: none /home/airos/Projects/github/a-n-rose/Python-Sound-Tool/soundpy/files.py:352: UserWarning: Some files did not match those acceptable by this program. (i.e. non-audio files) The number of files not included: 3300 warnings.warn(message) /home/airos/Projects/github/a-n-rose/Python-Sound-Tool/soundpy/files.py:105: UserWarning: WARNING: Most functionality has not been tested with stereo sound. Many functions may fail or not work as expected. Apologies for the inconvenience! warnings.warn(msg) 3% through logging audio file details 6% through logging audio file details 9% through logging audio file details 12% through logging audio file details 16% through logging audio file details 19% through logging audio file details 22% through logging audio file details 25% through logging audio file details 29% through logging audio file details 32% through logging audio file details 35% through logging audio file details 38% through logging audio file details 41% through logging audio file details/home/airos/Projects/github/a-n-rose/Python-Sound-Tool/p3_test/lib/python3.8/site-packages/librosa/core/audio.py:162: UserWarning: PySoundFile failed. Trying audioread instead. warnings.warn("PySoundFile failed. Trying audioread instead.") 45% through logging audio file details 48% through logging audio file details 51% through logging audio file details 54% through logging audio file details 58% through logging audio file details 61% through logging audio file details 64% through logging audio file details/home/airos/Projects/github/a-n-rose/Python-Sound-Tool/p3_test/lib/python3.8/site-packages/librosa/core/audio.py:162: UserWarning: PySoundFile failed. Trying audioread instead. warnings.warn("PySoundFile failed. Trying audioread instead.") /home/airos/Projects/github/a-n-rose/Python-Sound-Tool/soundpy/files.py:105: UserWarning: WARNING: Most functionality has not been tested with stereo sound. Many functions may fail or not work as expected. Apologies for the inconvenience! warnings.warn(msg) 67% through logging audio file details 70% through logging audio file details 74% through logging audio file details 77% through logging audio file details 80% through logging audio file details 83% through logging audio file details 87% through logging audio file details 90% through logging audio file details 93% through logging audio file details 96% through logging audio file details 100% through logging audio file details This returns our data in a dictionary, perfect for exploring via Pandas .. code-block:: default import pandas as pd all_data = pd.DataFrame(dataset_info_dict).T all_data.head() .. only:: builder_html .. raw:: html
audio sr num_channels dur_sec format_type bitdepth
../../../audiodata2/dogbark_2channels.wav ../../../audiodata2/dogbark_2channels.wav 48000 2 0.389 WAV PCM_16
../../../audiodata2/python_traffic_pf.wav ../../../audiodata2/python_traffic_pf.wav 48000 1 1.86 WAV DOUBLE
../../../audiodata2/259672__nooc__this-is-not-right.wav ../../../audiodata2/259672__nooc__this-is-not-... 44100 1 2.48454 WAV PCM_16
../../../audiodata2/259672__nooc__this-is-not-right.flac ../../../audiodata2/259672__nooc__this-is-not-... 44100 1 2.48454 FLAC PCM_16
../../../audiodata2/505803__skennison__new-recording.wav ../../../audiodata2/505803__skennison__new-rec... 48000 1 5.63067 WAV PCM_16


Let's have a look at the audio files and how uniform they are: .. code-block:: default print('formats: ', all_data.format_type.unique()) print('bitdepth (types): ', all_data.bitdepth.unique()) print('mean duration (sec): ', all_data.dur_sec.mean()) print('std dev duration (sec): ', all_data.dur_sec.std()) print('min sample rate: ', all_data.sr.min()) print('max sample rate: ', all_data.sr.max()) print('number of channels: ', all_data.num_channels.unique()) .. rst-class:: sphx-glr-script-out Out: .. code-block:: none formats: ['WAV' 'FLAC' 'M4A' 'MP3' 'OGG' 'AIFF'] bitdepth (types): ['PCM_16' 'DOUBLE' 'FLOAT' 'unknown' 'PCM_24' 'VORBIS' 'PCM_U8'] mean duration (sec): 3.8668231521103054 std dev duration (sec): 3.394200829629172 min sample rate: 16000 max sample rate: 48000 number of channels: [2 1] For a visual example, let's plot the count of various sample rates. (48000 Hz is high definition sound, 16000 Hz is wideband, and 8000 Hz is narrowband, similar to how speech sounds on the telephone.) .. code-block:: default all_data.groupby('sr').count().plot(kind = 'bar', title = 'Sample Rate Counts') .. image:: /auto_examples/images/sphx_glr_plot_dataset_info_formatting_001.png :alt: Sample Rate Counts :class: sphx-glr-single-img .. rst-class:: sphx-glr-script-out Out: .. code-block:: none Reformat a Dataset ^^^^^^^^^^^^^^^^^^ Let's say we have a dataset that we want to make consistent. We can do that with soundpy .. code-block:: default new_dataset_dir = sp.builtin.dataset_formatter( dataset_path, recursive = True, # we want all the audio, even in nested directories format='WAV', bitdepth = 16, # if set to None, a default bitdepth will be applied sr = 16000, # wideband mono = True, # ensure data all have 1 channel dur_sec = 3, # audio will be limited to 3 seconds zeropad = True, # audio shorter than 3 seconds will be zeropadded new_dir = './example_dir/', # if None, a time-stamped directory will be created for you overwrite = False # can set to True if you want to overwrite files ); .. rst-class:: sphx-glr-script-out Out: .. code-block:: none /home/airos/Projects/github/a-n-rose/Python-Sound-Tool/soundpy/files.py:352: UserWarning: Some files did not match those acceptable by this program. (i.e. non-audio files) The number of files not included: 3300 warnings.warn(message) File example_dir/audiodata2/dogbark_2channels.wav already exists. 3% through reformatting datasetFile example_dir/audiodata2/python_traffic_pf.wav already exists. 6% through reformatting datasetFile example_dir/audiodata2/259672__nooc__this-is-not-right.wav already exists. 9% through reformatting datasetFile example_dir/audiodata2/259672__nooc__this-is-not-right.wav already exists. 12% through reformatting datasetFile example_dir/audiodata2/505803__skennison__new-recording.wav already exists. 16% through reformatting datasetFile example_dir/audiodata2/male_noisy60snr_marvin.wav already exists. 19% through reformatting datasetFile example_dir/audiodata2/python_traffic_wiener.wav already exists. 22% through reformatting datasetFile example_dir/audiodata2/male_noisy30snr_marvin.wav already exists. 25% through reformatting datasetFile example_dir/audiodata2/female_noisy10snr_python.wav already exists. 29% through reformatting datasetFile example_dir/audiodata2/rain.wav already exists. 32% through reformatting datasetFile example_dir/audiodata2/python.wav already exists. 35% through reformatting datasetFile example_dir/audiodata2/python_traffic_bs.wav already exists. 38% through reformatting datasetFile example_dir/audiodata2/left.wav already exists. 41% through reformatting dataset/home/airos/Projects/github/a-n-rose/Python-Sound-Tool/p3_test/lib/python3.8/site-packages/librosa/core/audio.py:162: UserWarning: PySoundFile failed. Trying audioread instead. warnings.warn("PySoundFile failed. Trying audioread instead.") File example_dir/audiodata2/505803__skennison__new-recording.wav already exists. 45% through reformatting datasetFile example_dir/audiodata2/240674__zajo__you-have-been-denied.wav already exists. 48% through reformatting datasetFile example_dir/audiodata2/male_noisy10snr_marvin.wav already exists. 51% through reformatting datasetFile example_dir/audiodata2/python_traffic.wav already exists. 54% through reformatting datasetFile example_dir/audiodata2/marvin.wav already exists. 58% through reformatting datasetFile example_dir/audiodata2/male_noisy20snr_marvin.wav already exists. 61% through reformatting datasetFile example_dir/audiodata2/female_noisy60snr_python.wav already exists. 64% through reformatting dataset/home/airos/Projects/github/a-n-rose/Python-Sound-Tool/p3_test/lib/python3.8/site-packages/librosa/core/audio.py:162: UserWarning: PySoundFile failed. Trying audioread instead. warnings.warn("PySoundFile failed. Trying audioread instead.") File example_dir/audiodata2/244287__kleinhirn2000__toast-glas-langsam.wav already exists. 67% through reformatting datasetFile example_dir/audiodata2/car_horn.wav already exists. 70% through reformatting datasetFile example_dir/audiodata2/wow.wav already exists. 74% through reformatting datasetFile example_dir/audiodata2/240674__zajo__you-have-been-denied.wav already exists. 77% through reformatting datasetFile example_dir/audiodata2/traffic.wav already exists. 80% through reformatting datasetFile example_dir/audiodata2/traffic.wav already exists. 83% through reformatting datasetFile example_dir/audiodata2/female_noisy30snr_python.wav already exists. 87% through reformatting datasetFile example_dir/audiodata2/female_noisy20snr_python.wav already exists. 90% through reformatting datasetFile example_dir/audiodata2/244287__kleinhirn2000__toast-glas-langsam.wav already exists. 93% through reformatting datasetFile example_dir/audiodata2/audio2channels.wav already exists. 96% through reformatting datasetFile example_dir/audiodata2/187915__vasotelvi__transfer-is-complete.wav already exists. 100% through reformatting dataset Let's see what the audio data looks like now: .. code-block:: default dataset_formatted_dict = sp.builtin.dataset_logger(new_dataset_dir, recursive=True); formatted_data = pd.DataFrame(dataset_formatted_dict).T .. rst-class:: sphx-glr-script-out Out: .. code-block:: none 3% through logging audio file details 7% through logging audio file details 11% through logging audio file details 15% through logging audio file details 19% through logging audio file details 23% through logging audio file details 26% through logging audio file details 30% through logging audio file details 34% through logging audio file details 38% through logging audio file details 42% through logging audio file details 46% through logging audio file details 50% through logging audio file details 53% through logging audio file details 57% through logging audio file details 61% through logging audio file details 65% through logging audio file details 69% through logging audio file details 73% through logging audio file details 76% through logging audio file details 80% through logging audio file details 84% through logging audio file details 88% through logging audio file details 92% through logging audio file details 96% through logging audio file details 100% through logging audio file details .. code-block:: default formatted_data.head() .. only:: builder_html .. raw:: html
audio sr num_channels dur_sec format_type bitdepth
example_dir/audiodata2/dogbark_2channels.wav example_dir/audiodata2/dogbark_2channels.wav 8000 1 3 WAV PCM_16
example_dir/audiodata2/python_traffic_pf.wav example_dir/audiodata2/python_traffic_pf.wav 8000 1 3 WAV PCM_16
example_dir/audiodata2/259672__nooc__this-is-not-right.wav example_dir/audiodata2/259672__nooc__this-is-n... 8000 1 3 WAV PCM_16
example_dir/audiodata2/505803__skennison__new-recording.wav example_dir/audiodata2/505803__skennison__new-... 8000 1 3 WAV PCM_16
example_dir/audiodata2/male_noisy60snr_marvin.wav example_dir/audiodata2/male_noisy60snr_marvin.wav 8000 1 3 WAV PCM_16


.. code-block:: default print('audio formats: ', formatted_data.format_type.unique()) print('bitdepth (types): ', formatted_data.bitdepth.unique()) print('mean duration (sec): ', formatted_data.dur_sec.mean()) print('std dev duration (sec): ', formatted_data.dur_sec.std()) print('min sample rate: ', formatted_data.sr.min()) print('max sample rate: ', formatted_data.sr.max()) print('number of channels: ', formatted_data.num_channels.unique()) .. rst-class:: sphx-glr-script-out Out: .. code-block:: none audio formats: ['WAV'] bitdepth (types): ['PCM_16'] mean duration (sec): 3.0 std dev duration (sec): 0.0 min sample rate: 8000 max sample rate: 8000 number of channels: [1] Now all the audio data is sampled at the same rate: 8000 Hz .. code-block:: default formatted_data.groupby('sr').count().plot(kind = 'bar', title = 'Sample Rate Counts') .. image:: /auto_examples/images/sphx_glr_plot_dataset_info_formatting_002.png :alt: Sample Rate Counts :class: sphx-glr-single-img .. rst-class:: sphx-glr-script-out Out: .. code-block:: none There we go! You can reformat only parts of the audio files, e.g. format or bitdepth. If you leave parameters in sp.builtin.dataset_formatter as None, the original settings of the audio file will be maintained (except for bitdepth. A default bitdepth will be applied according to the format of the file); see `soundfile.default_subtype`. .. rst-class:: sphx-glr-timing **Total running time of the script:** ( 0 minutes 3.239 seconds) .. _sphx_glr_download_auto_examples_plot_dataset_info_formatting.py: .. only :: html .. container:: sphx-glr-footer :class: sphx-glr-footer-example .. container:: sphx-glr-download sphx-glr-download-python :download:`Download Python source code: plot_dataset_info_formatting.py ` .. container:: sphx-glr-download sphx-glr-download-jupyter :download:`Download Jupyter notebook: plot_dataset_info_formatting.ipynb ` .. only:: html .. rst-class:: sphx-glr-signature `Gallery generated by Sphinx-Gallery `_