Filters: Wiener and Band Spectral Subtraction
The filters module provides classes for filtering noise out of a target signal.
- 
class soundpy.filters.FilterSettings(win_size_ms=None, percent_overlap=None, sr=None, window_type=None, zeropad=None)
- Bases: object
- Basic settings for filter-related classes to inherit from.
- 
sr
- Desired sampling rate of the audio; audio with a different sampling rate is resampled to match. (default 48000) - Type
 
 - 
frame_length
- Number of audio samples in each frame: frame_dur multiplied by sr, divided by 1000. (default 960) - Type
 
 - 
overlap_length
- Number of overlapping audio samples between subsequent frames: frame_length multiplied by percent_overlap, floored. (default 480) - Type
 
 - 
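The frame arithmetic described for frame_length and overlap_length can be sketched as follows. This is only an illustration, not soundpy's own code: the helper name `frame_sizes` is hypothetical, and the defaults of a 20 ms window and 0.5 overlap are assumptions inferred from the documented defaults of 960 and 480 samples at 48000 Hz.

```python
def frame_sizes(sr=48000, win_size_ms=20, percent_overlap=0.5):
    """Hypothetical helper mirroring the documented frame arithmetic.

    win_size_ms=20 and percent_overlap=0.5 are assumed defaults, chosen
    because they reproduce the documented values 960 and 480.
    """
    # frame duration (ms) multiplied by the sampling rate, divided by 1000
    frame_length = sr * win_size_ms // 1000
    # frame_length multiplied by percent_overlap, floored
    overlap_length = int(frame_length * percent_overlap)
    return frame_length, overlap_length


frame_length, overlap_length = frame_sizes()
print(frame_length, overlap_length)
```

With these assumed defaults the sketch yields 960 and 480, matching the values listed above.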
num_fft_bins
- The number of frequency bins used when calculating the FFT. Currently, frame_length is used to set num_fft_bins. - Type
 
 - 
zeropad
- If False, only full frames of audio data are processed. If True, the last partial frame is zero-padded. (default False) - Type
- bool, optional
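To make the effect of zeropad concrete, the sketch below counts how many frames would be processed with and without padding the last partial frame. It is an illustration under stated assumptions, not soundpy's set_num_subframes: the function name `count_frames` is hypothetical, and the hop size is assumed to be frame_length minus overlap_length.

```python
import math


def count_frames(num_samples, frame_length=960, overlap_length=480, zeropad=False):
    """Hypothetical frame counter; assumes hop = frame_length - overlap_length."""
    hop = frame_length - overlap_length
    if num_samples < frame_length:
        # Less than one full frame: only usable if padding is allowed.
        return 1 if (zeropad and num_samples > 0) else 0
    steps = (num_samples - frame_length) / hop
    # Without padding, a trailing partial frame is dropped (floor);
    # with padding, it is kept and filled with zeros (ceil).
    return 1 + (math.ceil(steps) if zeropad else math.floor(steps))


print(count_frames(2000, zeropad=False))
print(count_frames(2000, zeropad=True))
```

For a signal whose length lines up exactly with the frame grid, both settings give the same count; they differ only when leftover samples remain.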
 
 - Methods - get_window() - Returns window acc. 
- 
- 
class soundpy.filters.Filter(win_size_ms=None, percent_overlap=None, sr=None, window_type=None, max_vol=None, zeropad=None)
- Bases: soundpy.filters.FilterSettings
- Interactive class to explore Wiener filter settings on audio signals.
- These class methods implement research-based algorithms with low computational cost, aimed at noise reduction on mobile phones.
- 
beta
- Value applied in the Wiener filter that smooths the application of 'gain'; default set according to previous research. (default 0.98) - Type
 
 - 
first_iter
- Tracks whether the first iteration is handled differently during filtering. If True, filtering has just started, and calculations cannot use information from previous frames; if False, calculations use information from previous frames; if None, the first frame is processed the same way as subsequent frames. (default None) - Type
- bool, optional
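How a smoothing value such as beta and a first-frame flag such as first_iter typically interact can be sketched with the textbook decision-directed Wiener gain. This is an assumption-laden illustration: the function `wiener_gain` and its arguments are hypothetical, and the source does not confirm that soundpy uses exactly this formula.

```python
import numpy as np


def wiener_gain(noisy_power, noise_power, prev_gain=None, prev_posteri=None, beta=0.98):
    """Textbook decision-directed Wiener gain (hypothetical helper, not soundpy's API)."""
    eps = np.finfo(float).eps
    # A posteriori SNR: noisy power relative to the noise estimate.
    posteri = noisy_power / (noise_power + eps)
    posteri_prime = np.maximum(posteri - 1.0, 0.0)
    if prev_gain is None or prev_posteri is None:
        # First frame: no history available, so only the current frame is used.
        priori = posteri_prime
    else:
        # Later frames: blend the previous frame's estimate with the current one.
        priori = beta * (prev_gain ** 2) * prev_posteri + (1.0 - beta) * posteri_prime
    gain = priori / (1.0 + priori)
    return gain, posteri


noisy = np.array([4.0, 1.0, 10.0])
noise = np.ones(3)
gain, posteri = wiener_gain(noisy, noise)                 # first frame
gain2, _ = wiener_gain(noisy, noise, gain, posteri)       # subsequent frame
print(gain, gain2)
```

A beta close to 1 weights the previous frame heavily, which smooths the gain over time at the cost of slower adaptation.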
 
 - 
target_subframes
- The total number of subframes within the samples belonging to the target signal (i.e. the audio file being filtered). Until target_subframes is calculated, it is set to None. (default None) 
 - 
noise_subframes
- The total number of subframes within the samples belonging to the noise signal. If a noise power spectrum is used, this doesn't need to be calculated. Until noise_subframes is calculated, it is set to None. (default None) 
 - 
gain
- Once calculated, the attenuation values to be applied to the FFT for noise reduction. Until calculated, None. (default None) - Type
- ndarray, None
 
 - Methods
- check_volume(samples) - Ensures volume of filtered signal is within the bounds of the original
- get_samples(audiofile[, dur_sec]) - Load signal and save original volume
- get_window() - Returns window acc.
- set_num_subframes(len_samples[, is_noise, …]) - Sets the number of target or noise subframes available for processing
- set_volume(samples[, max_vol, min_vol]) - Records and limits the maximum amplitude of original samples.
- 
set_volume(samples, max_vol=0.4, min_vol=0.15)
- Records and limits the maximum amplitude of the original samples.
- This keeps the output wave within a volume range that does not fall below, or rise too far above, the original maximum amplitude of the signal.
- Parameters
- samples (ndarray) – The original samples of a signal (1 dimensional), of any length
- max_vol (float) – The maximum volume level. If a signal has values higher than this number, the signal is curtailed to remain at or below this number.
- min_vol (float) – The minimum volume level. If a signal has only values lower than this number, the signal is amplified to be at or below this number.
 
- Returns
- Return type
 
 
- 
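The volume limiting described for set_volume can be illustrated with a small sketch. It is not soundpy's implementation: the name `limit_volume` is hypothetical, and rescaling the whole signal by its peak (rather than clipping individual samples) is an assumption about what "curtailed" and "amplified" mean here.

```python
import numpy as np


def limit_volume(samples, max_vol=0.4, min_vol=0.15):
    """Hypothetical peak-based limiter; assumes scaling rather than clipping."""
    samples = np.asarray(samples, dtype=float)
    peak = np.max(np.abs(samples))
    if peak == 0:
        return samples                      # silent signal: nothing to scale
    if peak > max_vol:
        return samples * (max_vol / peak)   # too loud: scale peak down to max_vol
    if peak < min_vol:
        return samples * (min_vol / peak)   # too quiet: scale peak up to min_vol
    return samples                          # already within bounds


loud = limit_volume(np.array([0.0, 0.8, -0.4]))
quiet = limit_volume(np.array([0.01, -0.05]))
print(np.max(np.abs(loud)), np.max(np.abs(quiet)))
```

Scaling preserves the waveform's shape, whereas hard clipping would distort it; which of the two the library applies is not stated in this reference.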
- 
class soundpy.filters.WienerFilter(win_size_ms=None, percent_overlap=None, sr=None, window_type=None, max_vol=0.4, smooth_factor=0.98, first_iter=None, zeropad=None)
- Bases: soundpy.filters.Filter
- Methods
- check_volume(samples) - Ensures volume of filtered signal is within the bounds of the original
- get_samples(audiofile[, dur_sec]) - Load signal and save original volume
- get_window() - Returns window acc.
- set_num_subframes(len_samples[, is_noise, …]) - Sets the number of target or noise subframes available for processing
- set_volume(samples[, max_vol, min_vol]) - Records and limits the maximum amplitude of original samples.
- apply_postfilter
- apply_wienerfilter
- 
class soundpy.filters.BandSubtraction(win_size_ms=None, percent_overlap=None, sr=None, window_type=None, max_vol=0.4, num_bands=6, band_spacing='linear', zeropad=None, smooth_factor=0.98, first_iter=None)
- Bases: soundpy.filters.Filter
- Methods
- calc_oversub_factor() - Calculate over subtraction factor used in the cited paper.
- calc_relevant_band(target_powspec) - Calculates band with highest energy levels.
- check_volume(samples) - Ensures volume of filtered signal is within the bounds of the original
- get_samples(audiofile[, dur_sec]) - Load signal and save original volume
- get_window() - Returns window acc.
- set_num_subframes(len_samples[, is_noise, …]) - Sets the number of target or noise subframes available for processing
- set_volume(samples[, max_vol, min_vol]) - Records and limits the maximum amplitude of original samples.
- setup_bands() - Provides starting and ending frequency bins/indices for each band.
- update_posteri_bands(target_powspec, …) - Updates SNR of each set of bands.
- apply_bandspecsub
- apply_floor
- apply_postfilter
- sub_noise
- 
setup_bands()
- Provides starting and ending frequency bins/indices for each band. - Parameters
- self (class) – Contains variables num_bands (if None, set to 6) and frame_length
- Returns
- Sets the class variables band_start_freq and band_end_freq. 
- Return type
 - Examples
>>> import soundpy as sp
>>> import numpy as np
>>> # Default is set to 6 bands:
>>> fil = sp.BandSubtraction()
>>> fil.setup_bands()
>>> fil.band_start_freq
array([  0.,  80., 160., 240., 320., 400.])
>>> fil.band_end_freq
array([ 80., 160., 240., 320., 400., 480.])
>>> # change default settings
>>> fil = sp.BandSubtraction(num_bands=5)
>>> fil.setup_bands()
>>> fil.band_start_freq
array([  0.,  96., 192., 288., 384.])
>>> fil.band_end_freq
array([ 96., 192., 288., 384., 480.])
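The band edges in the example are consistent with splitting the first frame_length / 2 frequency bins into equally sized bands. The sketch below reproduces that arithmetic; `linear_band_edges` is a hypothetical standalone helper, and the assumption that only half of the bins are divided is inferred from the documented outputs rather than taken from the library source.

```python
import numpy as np


def linear_band_edges(frame_length=960, num_bands=6):
    """Hypothetical helper: linearly spaced band start/end bins."""
    num_bins = frame_length // 2          # assumed: only the first half of the bins is split
    band_size = num_bins / num_bands
    starts = np.arange(num_bands) * band_size
    ends = starts + band_size
    return starts, ends


starts, ends = linear_band_edges()
print(starts)
print(ends)
```

With frame_length=960 this gives bands of 80 bins for 6 bands and 96 bins for 5 bands, in line with the values shown above.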
 - 
update_posteri_bands(target_powspec, noise_powspec)
- Updates SNR of each set of bands.
- The MATLAB code from the speech enhancement book takes power, converts it to magnitude (via square root), then converts it back to power. It also uses some sort of 'norm' function, which appears to be just the sum. The original equation can be found in the paper below (page 117 of the book?).
- paper: Kamath, S. D. & Loizou, P. C. (____), A multi-band spectral subtraction method for enhancing speech corrupted by colored noise.
- Power is used for the time being.
- Examples
>>> import soundpy as sp
>>> import numpy as np
>>> # setting to 4 bands for space:
>>> fil = sp.BandSubtraction(num_bands=4)
>>> fil.setup_bands()
>>> # generate sine signal with and without noise
>>> time = np.arange(0, 10, 0.01)
>>> signal = np.sin(time)[:fil.frame_length]
>>> np.random.seed(0)
>>> noise = np.random.normal(np.mean(signal),np.mean(signal)+0.3,960)
>>> powerspec_clean = np.abs(np.fft.fft(signal))**2
>>> powerspec_noisy = np.abs(np.fft.fft(signal + noise))**2
>>> fil.update_posteri_bands(powerspec_clean, powerspec_noisy)
>>> fil.snr_bands
array([ -1.91189028, -39.22078063, -44.16682922, -45.65265895])
>>> # compare with no noise in signal:
>>> fil.update_posteri_bands(powerspec_clean, powerspec_clean)
>>> fil.snr_bands
array([0., 0., 0., 0.])
 - 
calc_oversub_factor()
- Calculate over subtraction factor used in the cited paper.
- Uses decibel SNR values calculated in update_posteri_bands()
- paper: Kamath, S. D. & Loizou, P. C. (____), A multi-band spectral subtraction method for enhancing speech corrupted by colored noise.
- Examples
>>> import soundpy as sp
>>> import numpy as np
>>> # setting to 4 bands for space:
>>> fil = sp.BandSubtraction(num_bands=4)
>>> fil.setup_bands()
>>> # generate sine signal with and without noise
>>> time = np.arange(0, 10, 0.01)
>>> signal = np.sin(time)[:fil.frame_length]
>>> np.random.seed(0)
>>> noise = np.random.normal(np.mean(signal),np.mean(signal)+0.3,960)
>>> powerspec_clean = np.abs(np.fft.fft(signal))**2
>>> powerspec_noisy = np.abs(np.fft.fft(signal + noise))**2
>>> fil.update_posteri_bands(powerspec_clean, powerspec_noisy)
>>> fil.snr_bands
array([ -1.91189028, -39.22078063, -44.16682922, -45.65265895])
>>> a = fil.calc_oversub_factor()
>>> a
array([4.28678354, 4.75      , 4.75      , 4.75      ])
>>> # compare with no noise in signal:
>>> fil.update_posteri_bands(powerspec_clean, powerspec_clean)
>>> fil.snr_bands
array([0., 0., 0., 0.])
>>> a = fil.calc_oversub_factor()
>>> a
array([4., 4., 4., 4.])
 - 
calc_relevant_band(target_powspec)
- Calculates band with highest energy levels. - Parameters
- self (class instance) – Contains class variables band_start_freq and band_end_freq.
- target_powspec (np.ndarray) – Power spectrum of the target signal.
 
- Returns
- rel_band_index (int) – Index of the band that contains the most energy.
- band_energy_matrix (np.ndarray [size=(num_bands,), dtype=np.float]) – Power levels of each band.
 
 - Examples
>>> import soundpy as sp
>>> import numpy as np
>>> # setting to 4 bands for this example (default is 6):
>>> fil = sp.BandSubtraction(num_bands=4)
>>> fil.setup_bands()
>>> # generate sine signal with frequency 25
>>> time = np.arange(0, 10, 0.01)
>>> full_circle = 2 * np.pi
>>> freq = 25
>>> signal = np.sin((freq*full_circle)*time)[:fil.frame_length]
>>> powerspec_clean = np.abs(np.fft.fft(signal))**2
>>> rel_band_index, band_power_energies = fil.calc_relevant_band(powerspec_clean)
>>> rel_band_index
2
>>> # and with frequency 50
>>> freq = 50
>>> signal = np.sin((freq*full_circle)*time)[:fil.frame_length]
>>> powerspec_clean = np.abs(np.fft.fft(signal))**2
>>> rel_band_index, band_power_energies = fil.calc_relevant_band(powerspec_clean)
>>> rel_band_index
3
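Selecting the band with the most energy amounts to summing the power spectrum over each band and taking the index of the largest sum. A minimal sketch follows, assuming band edges are given as bin indices; `most_energetic_band` is a hypothetical helper, not the library method.

```python
import numpy as np


def most_energetic_band(powspec, starts, ends):
    """Hypothetical helper: index of the band with the largest summed power."""
    energies = np.array(
        [np.sum(powspec[int(s):int(e)]) for s, e in zip(starts, ends)]
    )
    return int(np.argmax(energies)), energies


# Four bands of 120 bins each; a single strong component at bin 130.
starts = np.array([0.0, 120.0, 240.0, 360.0])
ends = starts + 120
powspec = np.zeros(480)
powspec[130] = 5.0
index, energies = most_energetic_band(powspec, starts, ends)
print(index, energies)
```

Bin 130 lies in the second band (bins 120 to 239), so the returned index is 1 in this toy case.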
 
- 
- 