noize package


noize.buildsmartfilter module

noize.buildsmartfilter.mysmartfilter(name_dataset, headpath, audio_classes_dir, feature_type='mfcc', num_filters=40, sounddata=None, scale=1, segment_length_ms=1000, apply_postfilter=False, augment_data=False, limit=None, use_rand_noisefile=False, force_label=None, classify_noise=True, max_vol=0.4)[source]

Applies feature prep, model training, and filtering to wavfile.

noize.exceptions module

The noize.exceptions module includes customized errors.

noize.exceptions.notsufficientdata_error(numtrain, numval, numtest, expected_numtrain)[source]

noize.templates module

noize.templates.noizeclassifier(classifer_project_name, headpath, target_wavfile=None, audiodir=None, feature_type='fbank', audioclass_wavfile_limit=None)[source]

Example code for implementing NoIze as just a sound classifier.

noize.templates.noizefilter(filter_project_name, headpath, target_wavfile, noise_wavfile=None, scale=1, apply_postfilter=False, max_vol=0.4)[source]

Example code for implementing NoIze as just a noise filter.

Module contents

Framework to build a smart, low-computational noise filter. NoIze offers low-computational algorithms both for training deep learning models as noise classifiers and for filtering noise.

class noize.PathSetup(project_name='test_data', smartfilt_headpath='/home/airos/Desktop/testing_ground/default_model/', audiodata_dir=None, feature_type='mfcc', num_filters=40, segment_length_ms=1000)[source]

Bases: object

Manages paths for files specific to this smart filter instance

Based on the headpath and feature settings, directories and files are created. Data pertaining to feature extraction and model training are stored and accessed via paths built by this class instance.


The path to the project’s directory where all feature, model and sound files will be saved. The name of this directory is created by the project_name parameter when initializing this class.

Hint: as both the features and models rely heavily on the data used, include a reference to that data here. Only features and models trained on the same dataset should be stored here.




The path to the directory where audio training data can be found. Ensure that folders exist here, named according to the sound data stored inside them.

For example, to train a model on classifying sounds as either a dishwasher, air conditioner, or running toilet, you should have three folders, titled ‘dishwasher’, ‘air_conditioner’ and ‘running_toilet’, respectively.
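The expected layout can be sketched with pathlib (the folder names below come from the example above; the root path is illustrative):

```python
from pathlib import Path
import tempfile

# Hypothetical root directory; in practice this is audiodata_dir
root = Path(tempfile.mkdtemp()) / "audio_classes"

# One folder per audio class, named after the sound data inside
for label in ("dishwasher", "air_conditioner", "running_toilet"):
    (root / label).mkdir(parents=True)

class_labels = sorted(p.name for p in root.iterdir() if p.is_dir())
```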




Once created by the program, the path to the .csv file containing the labels found in the audiodata_dir and the integers to which those labels were encoded.

These pairings of label names (e.g. ‘air_conditioner’, ‘dishwasher’, ‘toilet’) and the integers they are encoded with (0, 1, 2) are important for training the neural network - it won’t understand letters - and for knowing which label the network assigns to new acoustic data.
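A minimal sketch of such a label-to-integer encoding (the label set is taken from the example above; the alphabetical encoding order is an assumption - NoIze stores the actual pairings in its .csv file):

```python
# Hypothetical label set; encoded alphabetically for determinism
labels = sorted(["air_conditioner", "dishwasher", "toilet"])

# label -> integer, for training the network
encoded = {label: i for i, label in enumerate(labels)}
# integer -> label, for interpreting the network's output
decoded = {i: label for label, i in encoded.items()}
```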


None, pathlib.PosixPath


Once created by the program, the path to the .csv file that stores the audio file paths belonging to each audio class. None otherwise.


None, pathlib.PosixPath


The name this program expects to find when looking for the .csv containing audio class labels and the audiofile paths belonging to that class.


The name this program expects to find when looking for the .csv containing the audio class labels and their encoded pairings.

features : None, True

None if features have not yet been successfully extracted and True if features have been fully extracted from entire dataset and saved.

These are relevant for the training of the CNN model for scene classification.

powspec : None, True

True if audio class average power spectrum data have been collected. None otherwise.

These values are relevant for the noise filtering of future sound data.

model : None, pathlib.PosixPath

Once a model has been trained on these features and saved, the path and filename of that model. None otherwise.


The generated directory name to store all data from this instance of feature extraction.

This directory is named according to the type of features extracted, the number of filters applied during extraction, as well as the number of seconds of audio data from each audio file used to extract features.

For example, if ‘mfcc’ features with 40 filters are extracted, and a 1.5 second segment of audio data from each audio file is used for that extraction, the directory name is: ‘mfcc_40_1.5’.
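Assuming the naming scheme described above, the directory name could be built like this (the function name and the formatting of whole seconds are illustrative assumptions, not NoIze's actual implementation):

```python
def feat_dirname(feature_type, num_filters, segment_length_ms):
    """Build a feature directory name like 'mfcc_40_1.5'.

    Hypothetical helper mirroring the documented naming scheme.
    """
    # Convert milliseconds to seconds: 1500 ms -> 1.5
    seconds = segment_length_ms / 1000
    return f"{feature_type}_{num_filters}_{seconds}"
```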


The path to the directory titled feature_dirname, generated by the program.


The name this program expects to find when looking for the .csv containing the settings used when calculating the average power spectrum of each audio class. This is relevant for applying the filter: the same settings are ideally used when calculating the power spectrum of the signal that needs filtering.


The path to where the audio class average power spectrum files will be or are located for the entire dataset. These values are calculated independent from the features extracted for machine learning.


The name generated and applied to models trained on these features.


The path to the directory where model and related data will be or are currently stored.

model_settings_path : None, pathlib.PosixPath

If a model has been trained and saved, the path to the .csv file holding the settings for that specific model.


Checks for feature extraction settings and training data files.

If setting files (i.e. csv files) exist without training data files (i.e. npy files), and a directory for training data has been provided, delete csv files.


Checks for model creation settings and model files.

If setting files (i.e. csv files) exist without model file(s) (i.e. h5 files), delete csv files.


Checks for power spectrum settings and filter data files.

If setting files (i.e. csv files) exist without or with too few data files (should be one data file for each audio class in training data), the setting files and data files will be deleted.


Expects model-related information to be in the same directory as the model.


Sets the path to the model settings file.

If the model already exists, uses that model’s parent directory. Otherwise sets the path to where a new model will be trained and saved.

prep_feat_dirname(feature_type, num_filters, segment_length_ms)[source]
noize.audio2datasets(audio_classes_dir, encoded_labels_path, label_wavfiles_path, perc_train=0.8, limit=None)[source]

Organizes all audio in audio class directories into datasets.

If they don’t already exist, dictionaries with the encoded labels of the audio classes as well as the wavfiles belonging to each class are saved.

  • audio_classes_dir (str, pathlib.PosixPath) – Directory path to where all audio class folders are located.

  • encoded_labels_path (str, pathlib.PosixPath) – path to the dictionary where audio class labels and their encoded integers are stored or will be stored.

  • label_wavfiles_path (str, pathlib.PosixPath) – path to the dictionary where audio class labels and the paths of all audio files belonging to each class are or will be stored.

  • perc_train (int, float) – The proportion, as a percentage or decimal, of the data used for training relative to the validation and test data (default 0.8)


dataset_audio – Named tuple including three lists of tuples: the train, validation, and test lists, respectively. The tuples within the lists contain the encoded label integer (e.g. 0 instead of ‘air_conditioner’) and the audio paths associated with that class and dataset.

Return type
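A minimal sketch of such a split for one audio class, assuming the remainder after the training portion is divided evenly between validation and test (the helper name, shuffling seed, and file names are illustrative):

```python
import random

def split_dataset(wavfiles, perc_train=0.8, seed=0):
    """Split one class's wavfiles into train/val/test lists (sketch)."""
    files = list(wavfiles)
    # Shuffle deterministically so the split is reproducible
    random.Random(seed).shuffle(files)
    n_train = int(len(files) * perc_train)
    n_val = (len(files) - n_train) // 2
    train = files[:n_train]
    val = files[n_train:n_train + n_val]
    test = files[n_train + n_val:]
    return train, val, test

# Hypothetical file list: 10 clips -> 8 train, 1 val, 1 test
train, val, test = split_dataset([f"clip{i}.wav" for i in range(10)])
```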


class noize.PrepFeatures(feature_type='fbank', sampling_rate=48000, num_filters=40, num_mfcc=None, window_size=25, window_shift=12.5, training_segment_ms=1000, num_columns=None, num_images_per_audiofile=None, num_waves=None, feature_sets=None, window_type=None, augment_data=False)[source]

Bases: object


Calculates how many feature sets create a full image, given the window size, window shift, and desired image length in milliseconds.
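Assuming a standard framing formula, that count could be computed as follows (this is a sketch, not necessarily NoIze's exact calculation):

```python
def feature_sets_per_image(window_size_ms, window_shift_ms, segment_ms):
    """Number of analysis windows that fit in one audio segment.

    Uses the common framing formula: one full window, then one more
    frame per shift that still fits inside the segment.
    """
    return int((segment_ms - window_size_ms) // window_shift_ms + 1)
```

With the class defaults (25 ms windows, 12.5 ms shift, 1000 ms segments), this yields 79 feature sets per image.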

extractfeats(sounddata, dur_sec=None, augment_data=None)[source]

Organizes feat extraction of each audiofile according to class attributes.

get_feats(list_waves, dur_sec=None)[source]

Collects fbank or mfcc features of an entire wavfile list.

get_max_samps(filter_features, num_sets)[source]

Calculates the maximum number of samples of a particular wave’s features that would still create a full image.

get_save_feats(wave_list, directory4features, filename)[source]
samps2feats(y, augment_data=None)[source]

Gets features from section of samples, at varying volumes.

save_class_settings(path, replace=False)[source]

Saves class settings to a dictionary.

noize.run_featprep(filter_class, feature_type='mfcc', num_filters=40, segment_dur_ms=1000, limit=None, augment_data=False, sampling_rate=48000)

Pulls info from the ‘filter_class’ instance to extract and save features.

  • filter_class (class) – The class instance holding attributes relating to path structure and filenames necessary for feature extraction

  • feature_type (str, optional) – Acceptable inputs: ‘mfcc’ and ‘fbank’. These are the features that will be extracted from the audio and saved (default ‘mfcc’)

  • num_filters (int, optional) – The number of mel filters used during feature extraction. This number ranges for ‘mfcc’ extraction between 13 and 40 and for ‘fbank’ extraction between 20 and 128. The higher the number, the greater the computational load and memory requirement. (default 40)

  • segment_dur_ms (int, optional) – The length in milliseconds of the acoustic data to extract features from. If 1000 ms, 1 second of acoustic data will be processed; 1 sec of feature data will be extracted. If not enough audio data is present, the feature data will be zero padded. (default 1000)


  • feats_class (class) – The class instance holding attributes relating to the current feature extraction session

  • filter_class (class) – The updated class instance holding attributes relating to path structure
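The zero padding described for segment_dur_ms can be sketched as follows (the helper name is illustrative; NoIze's internals may differ):

```python
import numpy as np

def pad_or_truncate(samples, num_expected):
    """Zero-pad (or cut) a 1-D sample array to a fixed length."""
    if len(samples) >= num_expected:
        return samples[:num_expected]
    padded = np.zeros(num_expected, dtype=samples.dtype)
    padded[:len(samples)] = samples
    return padded

# 1000 ms at 48 kHz needs 48000 samples; a shorter clip is zero padded
sr = 48000
segment = pad_or_truncate(np.ones(30000), sr)
```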


Loads previously extracted feature settings into a new feature class instance.

This is useful if one wants to extract new features that match the dimensions and settings of previously extracted features.


feature_info (dict, class) – Either a dictionary or a class instance that holds the path attribute to a dictionary.


feats_class – Feature extraction class instance with the same settings as the settings dictionary

Return type


class noize.WienerFilter(smooth_factor=0.98, first_iter=None, max_vol=0.4)[source]

Bases: noize.filterfun.filters.FilterSettings

Interactive class to explore Wiener filter settings on audio signals.

These class methods implement research-based algorithms with low computational cost, aimed at noise reduction on mobile phones.


Value applied in Wiener filter that smooths the application of ‘gain’; default set according to previous research. (default 0.98)
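A common form of such smoothing is an exponential moving average of the gain across frames; a sketch under that assumption (the helper name is hypothetical, not a WienerFilter method):

```python
def smooth_gain(prev_gain, new_gain, smooth_factor=0.98):
    """Exponentially smooth the Wiener gain between frames.

    A high smooth_factor (default 0.98, per the documented default)
    keeps the applied gain close to the previous frame's value.
    """
    return smooth_factor * prev_gain + (1 - smooth_factor) * new_gain
```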




Keeps track if first_iter is relevant in filtering. If True, filtering has just started, and calculations made for filtering cannot use information from previous frames; if False, calculations for filtering use information from previous frames; if None, no difference is applied when processing the 1st vs subsequent frames. (default None)


bool, optional


The number of total subsections within the total number of samples belonging to the target signal (i.e. wavfile being filtered). Until target_subframes is calculated, it is set to None. (default None)


int, None


The number of total subsections within the total number of samples belonging to the noise signal. If noise power spectrum is used, this doesn’t need to be calculated. Until noise_subframes is calculated, it is set to None. (default None)


int, None


Once calculated, the attenuation values to be applied to the fft for noise reduction. Until calculated, None. (default None)


ndarray, None


The maximum volume allowed for the filtered signal. (default 0.4)


float, int


Ensures the volume of the filtered signal is within the bounds of the original.

get_samples(wavfile, dur_sec=None)[source]

Load signal and save original volume

  • wavfile (str) – Path and name of wavfile to be loaded

  • dur_sec (int, float, optional) – Max length of time in seconds (default None)


samples – Array containing signal amplitude values in time domain

Return type



Loads and checks shape compatibility of averaged power values


path_npy (str, pathlib.PosixPath) – Path to .npy file containing power information.


power_values – The power values, provided they have the shape (self.num_fft_bins, 1)

Return type


save_filtered_signal(output_file, samples, overwrite=False)[source]
set_num_subframes(len_samples, is_noise=False)[source]

Sets the number of target or noise subframes available for processing

  • len_samples (int) – The total number of samples in a given signal

  • is_noise (bool) – If False, subframe number saved under self.target_subframes, otherwise self.noise_subframes (default False)


Return type


set_volume(samples, max_vol=0.4, min_vol=0.15)[source]

Records and limits the maximum amplitude of original samples.

This enables the output wave to be within a range of volume that does not fall below or rise too far above the original maximum amplitude of the signal.

  • samples (ndarray) – The original samples of a signal (1 dimensional), of any length

  • max_vol (float) – The maximum volume level. If a signal has values higher than this number, the signal is curtailed to remain at and below this number.

  • min_vol (float) – The minimum volume level. If a signal has only values lower than this number, the signal is amplified to be at this number and below.


Return type
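The behavior described for set_volume can be sketched as peak normalization into a [min_vol, max_vol] band (the helper name and the exact scaling are assumptions; silent signals are returned unchanged):

```python
import numpy as np

def limit_volume(samples, max_vol=0.4, min_vol=0.15):
    """Scale a 1-D signal so its peak lies within [min_vol, max_vol]."""
    peak = np.abs(samples).max()
    if peak == 0:
        return samples          # nothing to scale in a silent signal
    if peak > max_vol:
        return samples * (max_vol / peak)   # curtail loud signals
    if peak < min_vol:
        return samples * (min_vol / peak)   # amplify quiet signals
    return samples

loud = limit_volume(np.array([0.0, 0.8, -0.8]))
quiet = limit_volume(np.array([0.05, -0.05]))
```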


noize.welch2class(path_class, dur_ms=1000, augment_data=False)

Uses class’s path settings to set up Welch’s method for audio classes.

The settings applied for average power spectrum collection are also saved in a .csv file.

  • path_class (class) – Class with attributes for necessary paths to load relevant wavfiles and save average power spectrum values.

  • dur_ms (int, float) – Time in milliseconds for the Welch’s method / average power spectrum calculation to be applied for each wavfile. (default 1000)

  • augment_data (bool) – Whether or not the sound data should be augmented. If True, the sound data will be processed three times: with low energy, mid energy, and high energy. (default False)


Return type
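Welch's method averages the periodograms of overlapping frames; a numpy-only sketch with a rectangular window (scipy.signal.welch provides a full implementation with proper windowing and scaling):

```python
import numpy as np

def average_power_spectrum(samples, frame_len=256, hop=128):
    """Welch-style averaged periodogram (rectangular window sketch)."""
    frames = [samples[i:i + frame_len]
              for i in range(0, len(samples) - frame_len + 1, hop)]
    # Periodogram of each frame, then average across frames
    spectra = [np.abs(np.fft.rfft(f)) ** 2 for f in frames]
    return np.mean(spectra, axis=0)

# Sine at 0.25 cycles/sample concentrates power in a single bin
tone = np.sin(2 * np.pi * 0.25 * np.arange(1024))
ps = average_power_spectrum(tone)
```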


noize.save_class_noise(path_class, feature_class, num_each_audioclass=1, dur_ms=1000)

Saves dur_ms of sample data from num_each_audioclass wavfiles of each audio class.

This is an option for using noise data that comes from an audio class but is not an average of the entire class. It is raw sample data from one or more random noise wavfiles from each class.

  • path_class (class) – Class with attributes for necessary paths to load relevant wavfiles and save sample values.

  • feature_class (class) – Class with attributes for sampling rate used in feature extraction and/or filtering. This is useful to maintain consistency in sampling rate throughout the modules.

  • num_each_audioclass (int) – The number of random wavfiles from each audio class chosen for raw sample collection. (default 1)

  • dur_ms (int, float) – Time in milliseconds of raw sample data to be saved. (default 1000)


Return type


noize.filtersignal(output_filename, wavfile, noise_file=None, scale=1, apply_postfilter=False, duration_ms=1000, max_vol=0.4)[source]

Apply Wiener filter to signal using noise. Saves at output_filename.

  • output_filename (str) – Path and filename under which the filtered signal is saved

  • wavfile (str) – Path to the wavfile to be filtered; if None, a signal will be generated (default None)

  • noise_file (str, optional) – Path to either a noise wavfile or a .npy file containing average power spectrum values or noise samples. If None, the beginning of the wavfile will be used for noise data. (default None)

  • scale (int or float) – The scale at which the filter should be applied. (default 1) Note: scale cannot be set to 0.

  • apply_postfilter (bool) – Whether or not the post filter should be applied. The post filter reduces musical noise (i.e. distortion) introduced as a byproduct of filtering.

  • duration_ms (int or float) – The amount of noise, in milliseconds, to which Welch’s method is applied; in other words, how much of the noise to use when approximating the average noise power spectrum.

  • max_vol (int or float) – The maximum volume level of the filtered signal.


Return type
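The core gain computation of such a filter can be sketched as a spectral-subtraction-style attenuation per frequency bin (a simplified stand-in, not NoIze's exact research algorithm; the function name is hypothetical):

```python
import numpy as np

def filter_gain(signal_power, noise_power, scale=1.0):
    """Per-bin attenuation: suppress bins dominated by noise.

    scale > 1 filters more aggressively; scale cannot be 0,
    matching the documented constraint on the 'scale' parameter.
    """
    gain = 1.0 - scale * noise_power / np.maximum(signal_power, 1e-12)
    # Clamp to [0, 1]: never amplify, never go negative
    return np.clip(gain, 0.0, 1.0)
```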