Train or Load Sound Classifier

Submodules

noize.models.cnn module

class noize.models.cnn.ClassifySound(sounddata, filter_class, feature_class, model_class)

Bases: object

Takes new audio and classifies it with a trained classifier.

check_for_best_model()
extract_feats()
get_label()
load_assigned_avepower(label_encoded, raw_samples=False)
load_modelsettings()

class noize.models.cnn.SoundClassifier(modelname, models_dir, features_dir, encoded_labels_path, feature_session, modelsettings_path, newmodel=True)

Bases: object

Builds a mobile-compatible CNN to classify noise or acoustic scenes.

References

A. Sehgal and N. Kehtarnavaz, “A Convolutional Neural Network Smartphone App for Real-Time Voice Activity Detection,” in IEEE Access, vol. 6, pp. 9017-9026, 2018.

build_cnn_model()

Sets up the model architecture using the Keras Sequential() class.

build_cnn_reduced()

Reduces the number of CNN layers until the model can be built.

If the number of filters for ‘mfcc’ or ‘fbank’ features is low (e.g. around 13), the default settings of the CNN architecture cause problems: the architecture was designed on the assumption that at least 40 filters are applied during feature extraction. To work around this, the number of CNN layers is reduced.
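
The reduction described above can be illustrated with a minimal sketch. It is not the library's implementation: it assumes unpadded ('valid') convolutions, a kernel size of 3 and three requested layers, none of which are stated in this reference (only strides=2 appears as a default in set_model_params). The helper names conv_output_size and max_conv_layers are hypothetical.

```python
def conv_output_size(size, kernel_size, stride):
    """Output length of an unpadded ('valid') convolution along one axis."""
    if size < kernel_size:
        return 0  # the kernel no longer fits into the input
    return (size - kernel_size) // stride + 1


def max_conv_layers(num_filters, kernel_size=3, stride=2, requested_layers=3):
    """Lower the layer count until every layer still produces a valid output.

    kernel_size and requested_layers are illustrative assumptions.
    """
    layers = requested_layers
    while layers > 0:
        size = num_filters
        fits = True
        for _ in range(layers):
            size = conv_output_size(size, kernel_size, stride)
            if size < 1:
                fits = False
                break
        if fits:
            return layers
        layers -= 1  # drop one layer and try again
    return 0


print(max_conv_layers(40))  # 3: sizes shrink 40 -> 19 -> 9 -> 4
print(max_conv_layers(13))  # 2: sizes shrink 13 -> 6 -> 2, a third layer does not fit
```

Under these assumptions, 40 filters leave room for all three strided layers, while 13 filters only support two, which is the kind of situation build_cnn_reduced() is documented to handle.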

compile_model(model, optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
create_model_path(models_dir, modelname, newmodel=True)
load_labels()
load_train_val_data(test_model=False)

Loads training data and trains the model.

Expects at least training and validation data; test data is optional.

save_class_settings(overwrite=False)

Saves the class settings to a dictionary.

set_model_params(color_scale=1, num_layers=None, feature_maps=None, kernel_size=None, halve_feature_maps=True, strides=2, dense_hidden_units=100, epochs=100, activation_layer='relu', activation_output='softmax', dropout=0.25)
set_up_callbacks(early_stop=True, patience=15, csv_log=True, csv_filename=None, save_bestmodel=True, best_modelname=None, monitor='val_loss', verbose=1, save_best_only=True, mode='min')
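
The early_stop, patience, monitor and mode arguments of set_up_callbacks resemble the options of Keras's EarlyStopping callback. The following is a simplified, framework-free sketch of what patience-based early stopping with mode='min' does; it is an illustration only, not this library's code, and the function name early_stopping_epoch is hypothetical.

```python
def early_stopping_epoch(val_losses, patience=15):
    """Return (stop_epoch, best_epoch) when monitoring a value to minimize.

    Training stops at the first epoch where the monitored value has failed
    to improve for `patience` consecutive epochs; otherwise it runs to the
    last epoch.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # New best value: remember it and reset the counter.
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch
    return len(val_losses) - 1, best_epoch


# Made-up validation losses, one per epoch (0-indexed).
losses = [0.9, 0.7, 0.6, 0.65, 0.62, 0.61, 0.5]
print(early_stopping_epoch(losses, patience=3))   # (5, 2)
print(early_stopping_epoch(losses, patience=15))  # (6, 6)
```

With patience=3, training stops at epoch 5 and never reaches the lower loss at epoch 6, which is why the best epoch is reported separately and why a larger patience (the default here is 15) trades training time for a lower risk of stopping too soon.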
train_scene_classifier()

Loads all data and trains the model.

The advantage of loading all data at once is that it is easier to scale and normalize.
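
To illustrate why having all data available simplifies scaling, here is a minimal sketch of per-feature standardization in which the statistics are computed once on the training data and then reused for validation data. This is an assumption about the general technique, not a description of how this library scales its features; fit_scalers and apply_scalers are hypothetical names.

```python
from statistics import mean, pstdev


def fit_scalers(train_rows):
    """Compute a (mean, std) pair per column from the training rows only."""
    columns = list(zip(*train_rows))
    # Fall back to std = 1.0 for constant columns to avoid division by zero.
    return [(mean(col), pstdev(col) or 1.0) for col in columns]


def apply_scalers(rows, scalers):
    """Standardize rows with previously fitted (mean, std) pairs."""
    return [
        [(value - mu) / sigma for value, (mu, sigma) in zip(row, scalers)]
        for row in rows
    ]


# Toy feature matrices: rows are samples, columns are features.
train = [[1.0, 10.0], [3.0, 10.0], [5.0, 10.0]]
val = [[3.0, 12.0]]

scalers = fit_scalers(train)                 # statistics from training data
train_scaled = apply_scalers(train, scalers)
val_scaled = apply_scalers(val, scalers)     # same statistics reused
print(val_scaled)  # [[0.0, 2.0]]
```

Reusing the training statistics keeps validation and test data on the same scale as the data the model was trained on, without leaking information from those sets into the scaling step.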

noize.models.cnn.buildclassifier(filter_class)
noize.models.cnn.loadclassifier(filter_class)
noize.models.cnn.prepdata_ml(matrix, is_train=True, scalars=None)

Module contents

class noize.models.SoundClassifier(modelname, models_dir, features_dir, encoded_labels_path, feature_session, modelsettings_path, newmodel=True)

Bases: object

Builds a mobile-compatible CNN to classify noise or acoustic scenes.

References

A. Sehgal and N. Kehtarnavaz, “A Convolutional Neural Network Smartphone App for Real-Time Voice Activity Detection,” in IEEE Access, vol. 6, pp. 9017-9026, 2018.

build_cnn_model()

Sets up the model architecture using the Keras Sequential() class.

build_cnn_reduced()

Reduces the number of CNN layers until the model can be built.

If the number of filters for ‘mfcc’ or ‘fbank’ features is low (e.g. around 13), the default settings of the CNN architecture cause problems: the architecture was designed on the assumption that at least 40 filters are applied during feature extraction. To work around this, the number of CNN layers is reduced.

compile_model(model, optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
create_model_path(models_dir, modelname, newmodel=True)
load_labels()
load_train_val_data(test_model=False)

Loads training data and trains the model.

Expects at least training and validation data; test data is optional.

save_class_settings(overwrite=False)

Saves the class settings to a dictionary.

set_model_params(color_scale=1, num_layers=None, feature_maps=None, kernel_size=None, halve_feature_maps=True, strides=2, dense_hidden_units=100, epochs=100, activation_layer='relu', activation_output='softmax', dropout=0.25)
set_up_callbacks(early_stop=True, patience=15, csv_log=True, csv_filename=None, save_bestmodel=True, best_modelname=None, monitor='val_loss', verbose=1, save_best_only=True, mode='min')
train_scene_classifier()

Loads all data and trains the model.

The advantage of loading all data at once is that it is easier to scale and normalize.

class noize.models.ClassifySound(sounddata, filter_class, feature_class, model_class)

Bases: object

Takes new audio and classifies it with a trained classifier.

check_for_best_model()
extract_feats()
get_label()
load_assigned_avepower(label_encoded, raw_samples=False)
load_modelsettings()

noize.models.buildclassifier(filter_class)
noize.models.loadclassifier(filter_class)