Feeding large datasets to models
-
class soundpy.models.dataprep.Generator(data_matrix1, data_matrix2=None, timestep=None, axis_timestep=0, normalize=True, apply_log=False, context_window=None, axis_context_window=-2, labeled_data=False, gray2color=False, zeropad=True, desired_input_shape=None, combine_axes_0_1=False)
Bases: object
Methods
Shapes, normalizes, and feeds data, depending on whether the data is labeled.
-
__init__(data_matrix1, data_matrix2=None, timestep=None, axis_timestep=0, normalize=True, apply_log=False, context_window=None, axis_context_window=-2, labeled_data=False, gray2color=False, zeropad=True, desired_input_shape=None, combine_axes_0_1=False)
This generator pulls data out in sections (i.e. batch sizes). It is prepared for 3-dimensional data.
Note: Keras adds a dimension to the input to represent the “Tensor” that handles the input. This means that you sometimes have to add a dimension of size 1, i.e. (1,), to the shape of the data.
- Parameters
data_matrix1 (np.ndarray [size=(num_samples, batch_size, num_frames, num_features) or (num_samples, num_frames, num_features+label_column)]) – The training data. This can contain both the feature and label data or just the input feature data.
data_matrix2 (np.ndarray [size = (num_samples,) `data_matrix1.shape`], optional) – Either label data for data_matrix1 or, for example, the clean version of data_matrix1 if training an autoencoder. (default None)
normalize (bool) – If False, the data is assumed to have already been normalized and will not be normalized by the generator. (default True)
apply_log (bool) – If True, log will be applied to the data.
timestep (int) – The number of frames that constitute a timestep.
axis_timestep (int) – The axis to apply the timestep to. (default 0)
context_window (int) – The size of context_window or number of samples padding a central frame. This may be useful for models training on small changes occurring in the signal, e.g. to break up the image of sound into smaller parts.
axis_context_window (int) – The axis to apply the context window to, if context_window is not None. Ideally, this should be the axis preceding the feature column. (default -2)
zeropad (bool) – Whether features should be zeropadded in reshaping functions.
desired_input_shape (int or tuple, optional) – The desired number of features or shape of data to feed a neural network. If type int, only the last column of features will be adjusted (zeropadded or limited). If tuple, the entire data shape will be adjusted (all columns). If the int or shape is larger than that of the data provided, data will be zeropadded. If the int or shape is smaller, the data will be restricted. (default None)
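To make the parameter descriptions above more concrete, here is a minimal sketch in plain NumPy of a generator that behaves roughly as described: it optionally normalizes each sample, zeropads or limits it to a desired shape, and adds a leading dimension of size 1. This is not soundpy's implementation. The names `adjust_shape` and `simple_generator`, and the choice of min-max normalization, are assumptions made purely for illustration.

```python
import numpy as np


def adjust_shape(data, desired_shape):
    """Zeropad or limit `data` to `desired_shape` (hypothetical helper)."""
    out = np.zeros(desired_shape, dtype=data.dtype)
    # Copy only the region that fits in both arrays; the rest stays zero.
    overlap = tuple(slice(0, min(d, s)) for d, s in zip(data.shape, desired_shape))
    out[overlap] = data[overlap]
    return out


def simple_generator(features, labels, desired_shape=None, normalize=True):
    """Yield one (x, y) pair per sample, looping over the data indefinitely."""
    while True:
        for x, y in zip(features, labels):
            x = x.astype(np.float64)
            if normalize:
                # Min-max scaling to [0, 1]; an assumption, not soundpy's method.
                span = x.max() - x.min()
                if span > 0:
                    x = (x - x.min()) / span
            if desired_shape is not None:
                x = adjust_shape(x, desired_shape)
            # Add a leading dimension of size 1 before handing data to the model.
            yield x[np.newaxis, ...], np.asarray(y).reshape(1, 1)


# Example: 4 samples, each with 10 frames and 3 features.
feats = np.random.default_rng(0).normal(size=(4, 10, 3))
labs = np.array([0, 1, 0, 1])
gen = simple_generator(feats, labs, desired_shape=(12, 3))
x_batch, y_batch = next(gen)
```

Here the desired shape (12, 3) is larger than the data along the frame axis, so the last two frames of each sample are zeros; a smaller desired shape would truncate the data instead.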
-
The models.dataprep module covers functionality for feeding features to models.
-
class soundpy.models.dataprep.GeneratorFeatExtraction(datalist, datalist2=None, model_name=None, normalize=True, apply_log=False, randomize=True, random_seed=None, desired_input_shape=None, timestep=None, axis_timestep=0, context_window=None, axis_context_window=-2, batch_size=1, gray2color=False, visualize=False, vis_every_n_items=50, visuals_dir=None, decode_dict=None, dataset='train', augment_dict=None, label_silence=False, vad_start_end=False, **kwargs)
Bases: soundpy.models.dataprep.Generator
Methods
Extracts features and feeds them to model according to desired_input_shape.
-
soundpy.models.dataprep.randomize_augs(aug_dict, random_seed=None)
Creates a copy of the dict and randomly chooses which augmentations are applied.
A random seed can be set to control both the number of augmentations applied and the shuffled order of the possible augmentations.
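The idea of copying a settings dict and randomly switching augmentations on or off can be sketched as follows. This is only an illustration under stated assumptions: the function name `pick_random_augs` is hypothetical, the dict is assumed to map augmentation names to booleans, and the actual selection logic in soundpy may differ.

```python
import random


def pick_random_augs(aug_dict, random_seed=None):
    """Return a copy of `aug_dict` with a random subset of augmentations enabled."""
    rng = random.Random(random_seed)
    new_dict = dict(aug_dict)  # shallow copy; the input dict is left untouched
    names = list(new_dict)
    rng.shuffle(names)  # shuffle the order of the candidate augmentations
    num_to_apply = rng.randint(0, len(names))  # how many to switch on
    enabled = set(names[:num_to_apply])
    for name in new_dict:
        new_dict[name] = name in enabled
    return new_dict


augs = {"add_white_noise": True, "time_shift": True, "pitch_increase": True}
chosen = pick_random_augs(augs, random_seed=40)
```

Using a private `random.Random` instance keeps the selection reproducible for a given seed without altering the global random state.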
-
soundpy.models.dataprep.augment_features(sound, sr, add_white_noise=False, snr=[5, 10, 20], speed_increase=False, speed_decrease=False, speed_perc=0.15, time_shift=False, shufflesound=False, num_subsections=3, harmonic_distortion=False, pitch_increase=False, pitch_decrease=False, num_semitones=2, vtlp=False, bilinear_warp=True, augment_settings_dict=None, random_seed=None)
Randomly applies augmentations to audio. If no augment_settings_dict is provided, the defaults are applied.
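As one example of the kind of augmentation listed in the signature, the following sketch adds white noise to a signal at a target signal-to-noise ratio in decibels. It is a generic NumPy illustration, not soundpy's own code; the function name `add_white_noise_at_snr` is hypothetical, and how soundpy chooses among the values in `snr` is not shown here.

```python
import numpy as np


def add_white_noise_at_snr(sound, snr_db, random_seed=None):
    """Add Gaussian white noise so the result has roughly `snr_db` dB SNR."""
    rng = np.random.default_rng(random_seed)
    signal_power = np.mean(sound ** 2)
    # SNR (dB) = 10 * log10(signal_power / noise_power), solved for noise_power.
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=sound.shape)
    return sound + noise


# One second of a 440 Hz tone sampled at 16 kHz.
sr = 16000
t = np.arange(sr) / sr
tone = 0.5 * np.sin(2 * np.pi * 440 * t)
noisy = add_white_noise_at_snr(tone, snr_db=10, random_seed=0)
```

Lower SNR values produce noisier output, so a list such as [5, 10, 20] spans heavy to light noise.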
-
soundpy.models.dataprep.get_input_shape(kwargs_get_feats, labeled_data=False, frames_per_sample=None, use_librosa=True, mode='reflect')
-
soundpy.models.dataprep.make_gen_callable(_gen)
Prepares a Python generator for tf.data.Dataset.from_generator.
Bug fix: a Python generator fails to work in TensorFlow 2.2.0+.
- Parameters
_gen (generator) – The generator function to feed to a deep neural network.
- Returns
x (np.ndarray [shape=(batch_size, num_frames, num_features, 1)]) – The feature data.
y (np.ndarray [shape=(1,1)]) – The label for the feature data.
References
Shu, Nicolas (2020) https://stackoverflow.com/a/62186572 CC BY-SA 4.0
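The wrapping technique described above can be sketched in plain Python. tf.data.Dataset.from_generator expects a callable that returns an iterator, not a generator object, so the generator is wrapped in a zero-argument function. The names `make_callable` and `toy_generator` are hypothetical and used only for illustration; this is not necessarily how soundpy implements it.

```python
def make_callable(_gen):
    """Wrap a generator object in a zero-argument function (illustrative sketch)."""
    def gen():
        for x, y in _gen:
            yield x, y
    return gen


def toy_generator():
    """Stand-in for a feature generator; yields (features, label) pairs."""
    for i in range(3):
        yield [float(i)], [i % 2]


callable_gen = make_callable(toy_generator())
pairs = list(callable_gen())
# With TensorFlow installed, `callable_gen` could be passed as the first
# argument to tf.data.Dataset.from_generator, together with the expected
# output types or signature.
```

Note that this wrapper iterates over a single underlying generator object, so once that generator is exhausted, further calls to the wrapper yield nothing; a generator that loops indefinitely avoids this.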