mcs module
This module defines the MultiChannelSignal class for augmenting multichannel audio files.
- class mcs.MultiChannelSignal(np_data: ndarray = None, sampling_rate: int = -1, seed: int = -1)
Provides support for multichannel sound data synthesized or loaded from a WAV file.
- channels_count() → int
Returns the number of channels in the MultiChannelSignal object.
- channels_len() → int
Returns the number of samples in one channel of the MultiChannelSignal object.
- copy() → MultiChannelSignal
Makes a deep copy of the MultiChannelSignal object.
- cpy() → MultiChannelSignal
Makes a deep copy of the MultiChannelSignal object.
- gen(frequency_list: list[int], duration: float = 5, sampling_rate: int = -1, mode='sine') → MultiChannelSignal
Generates a multichannel sound from the given frequency list, duration, sampling rate, and mode. The mode can be ‘sine’ or ‘speech’. In ‘sine’ mode, each output channel is a sine wave. In ‘speech’ mode, each output channel is a speech-like signal; the input frequencies are used as the basic tone frequency of the corresponding channel and should be in the interval 600..300.
- Parameters:
frequency_list (list) – A list of frequencies to generate sound for.
duration (float) – The duration of the sound in seconds.
sampling_rate (int) – The sample rate of the sound. Defaults to -1.
mode (str) – The mode of sound generation. Can be ‘sine’ or ‘speech’. Defaults to ‘sine’. Mode ‘speech’ generates speech-like signals.
- Returns:
The object representing the generated multichannel sound.
- Return type:
self (MultiChannelSignal)
- generate(frequency_list: list[int], duration: float = 5, sampling_rate: int = -1, mode='sine') → MultiChannelSignal
Generates a multichannel sound from the given frequency list, duration, sampling rate, and mode. The mode can be ‘sine’ or ‘speech’. In ‘sine’ mode, each output channel is a sine wave. In ‘speech’ mode, each output channel is a speech-like signal; the input frequencies are used as the basic tone frequency of the corresponding channel and should be in the interval 600..300.
- Parameters:
frequency_list (list) – A list of frequencies to generate sound for.
duration (float) – The duration of the sound in seconds.
sampling_rate (int) – The sample rate of the sound. Defaults to -1.
mode (str) – The mode of sound generation. Can be ‘sine’ or ‘speech’. Defaults to ‘sine’. Mode ‘speech’ generates speech-like signals.
- Returns:
The object representing the generated multichannel sound.
- Return type:
self (MultiChannelSignal)
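As a rough illustration of what ‘sine’ mode plausibly produces, the sketch below builds one sine wave per requested frequency. It does not call the module: plain Python lists stand in for the ndarray, and the helper name `sine_channels` and its default sampling rate of 44100 are assumptions made for this example only.

```python
import math


def sine_channels(frequency_list, duration=5.0, sampling_rate=44100):
    """Return one list of samples per frequency (hypothetical helper)."""
    n_samples = int(duration * sampling_rate)
    return [
        [math.sin(2 * math.pi * f * n / sampling_rate) for n in range(n_samples)]
        for f in frequency_list
    ]


# Two channels, 10 ms long, sampled at 8 kHz: 80 samples per channel.
channels = sine_channels([400, 1000], duration=0.01, sampling_rate=8000)
```

Each inner list corresponds to one channel, in the same order as the frequency list.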
- get() → ndarray
Returns the multichannel sound data stored in the MultiChannelSignal instance.
- Returns:
The multichannel sound data.
- Return type:
np.ndarray
- info() → dict
Returns a dictionary containing metadata about the audio data.
The dictionary contains the following information:
path: The file path where the audio data was loaded from.
channels_count: The number of audio channels in the data (1 for mono, 2 for stereo, more for other layouts).
sample_rate: The sampling rate at which the audio data is stored.
length_s: The duration of the audio data in seconds.
If the data is not loaded, the path, channels_count, and length_s values will be -1. Otherwise, they will be populated with actual values.
- Returns:
A dictionary containing metadata about the audio data.
- Return type:
dict
- merge() → MultiChannelSignal
Merges the channels of the MultiChannelSignal object into a single channel.
- Parameters:
none
- Returns:
The MultiChannelSignal object containing a single channel with the merge result.
- Return type:
self (MultiChannelSignal)
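The reference does not state how channels are combined. The sketch below assumes a sample-wise sum, which is one common way to mix channels down; `merge_channels` is a hypothetical helper working on plain lists, not the module's implementation.

```python
def merge_channels(channels):
    """Sum equally long channels sample by sample (assumed merge rule)."""
    return [sum(samples) for samples in zip(*channels)]


# Two channels of three samples each collapse into one channel.
merged = merge_channels([[1, 2, 3], [10, 20, 30]])
```

If the actual method averages instead of summing, the result differs only by a constant factor equal to the channel count.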
- mrg() → MultiChannelSignal
Merges the channels of the MultiChannelSignal object into a single channel.
- Parameters:
none
- Returns:
The MultiChannelSignal object containing a single channel with the merge result.
- Return type:
self (MultiChannelSignal)
- pause_detect(relative_level: list[float]) → ndarray[int]
Detects pauses in a multichannel sound.
- Parameters:
relative_level (list[float]) – The list of relative levels for each channel; signal below this level is marked as a pause.
- Returns:
The mask indicating the pauses in the multichannel sound. The mask has the same shape as the input sound and contains zeros and ones: 0 for a pause, 1 for not a pause.
- Return type:
np.ndarray[int]
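The following sketch shows one way such a mask could be computed. It assumes that each relative level is a fraction of the channel's peak absolute amplitude; the reference does not say what the level is relative to, so treat that, and the helper name `pause_mask`, as assumptions.

```python
def pause_mask(channels, relative_level):
    """Return a 0/1 mask per channel: 0 marks a pause, 1 marks signal."""
    masks = []
    for samples, level in zip(channels, relative_level):
        peak = max(abs(s) for s in samples)
        threshold = level * peak  # assumed: level is relative to the peak
        masks.append([0 if abs(s) < threshold else 1 for s in samples])
    return masks


mask = pause_mask([[0.0, 0.9, 1.0, 0.05, 0.0]], [0.5])
```

Here the threshold is 0.5, so only the samples 0.9 and 1.0 are marked as signal.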
- pause_shrink(mask: ndarray[int], min_pause: list[int]) → MultiChannelSignal
Shrinks pauses in a multichannel sound.
- Parameters:
mask (np.ndarray[int]) – The mask indicating the pauses in the multichannel sound.
min_pause (list[int]) – The list of minimum pause lengths for each channel in object.
- Returns:
The MultiChannelSignal object with pauses shrunk.
- Return type:
self (MultiChannelSignal)
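A single-channel sketch of the idea, assuming that "shrinking" means cutting every pause longer than the minimum down to exactly that many samples. How the module keeps channels the same length afterwards is not described in the reference and is not modelled here; `shrink_pauses` is a hypothetical helper.

```python
def shrink_pauses(samples, mask, min_pause):
    """Drop pause samples beyond the first `min_pause` of each pause run."""
    out = []
    run = 0  # length of the current run of pause samples
    for sample, flag in zip(samples, mask):
        run = run + 1 if flag == 0 else 0
        if flag == 1 or run <= min_pause:
            out.append(sample)
    return out


samples = [5, 0, 0, 0, 0, 7, 0, 8]
mask = [1, 0, 0, 0, 0, 1, 0, 1]
shrunk = shrink_pauses(samples, mask, 2)
```

The four-sample pause is reduced to two samples, while the one-sample pause is left untouched.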
- pdt(relative_level: list[float]) → ndarray[int]
Detects pauses in a multichannel sound.
- Parameters:
relative_level (list[float]) – The list of relative levels for each channel; signal below this level is marked as a pause.
- Returns:
The mask indicating the pauses in the multichannel sound. The mask has the same shape as the input sound and contains zeros and ones: 0 for a pause, 1 for not a pause.
- Return type:
np.ndarray[int]
- put(signal: MultiChannelSignal) → MultiChannelSignal
Updates the multichannel sound data and sample rate of the MultiChannelSignal instance.
- Parameters:
signal (MultiChannelSignal) – source of multichannel sound data.
- Returns:
The updated MultiChannelSignal instance.
- Return type:
self (MultiChannelSignal)
- rd(source_path: str) → MultiChannelSignal
Reads a multichannel sound from a WAV file.
- Parameters:
source_path (str) – The path to the WAV file.
- Returns:
The MultiChannelSignal object containing the sample rate and the multichannel sound data.
- Return type:
self (MultiChannelSignal)
- read(source_path: str) → MultiChannelSignal
Reads a multichannel sound from a WAV file.
- Parameters:
source_path (str) – The path to the WAV file.
- Returns:
The MultiChannelSignal object containing the sample rate and the multichannel sound data.
- Return type:
self (MultiChannelSignal)
- rms(last_index_of_sample: int = -1, decimals: int = -1)
Calculate the root mean square (RMS) of a multichannel sound.
- Parameters:
last_index_of_sample (int) – The last index to consider when calculating the RMS. If -1, consider the entire array. Defaults to -1.
decimals (int) – Number of decimal places to round the RMS value. If -1, do not round. Defaults to -1.
- Returns:
A list of RMS values for each channel in the multichannel sound.
- Return type:
list
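The RMS of a channel is the square root of the mean of its squared samples. The sketch below mirrors the documented parameters on plain lists; it assumes the last index is exclusive (a Python slice bound), which the reference does not specify, and `channel_rms` is a hypothetical name.

```python
import math


def channel_rms(channels, last_index=-1, decimals=-1):
    """Return the RMS of each channel, optionally truncated and rounded."""
    result = []
    for samples in channels:
        # Assumed: last_index is an exclusive slice bound; -1 means "all".
        part = samples if last_index == -1 else samples[:last_index]
        value = math.sqrt(sum(s * s for s in part) / len(part))
        result.append(value if decimals == -1 else round(value, decimals))
    return result


values = channel_rms([[1.0, -1.0, 1.0, -1.0], [3.0, 4.0]], decimals=3)
```

The first channel has an RMS of 1.0; the second has sqrt(12.5), rounded to three decimals.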
- sbs(signal2: MultiChannelSignal) → MultiChannelSignal
Concatenates two multichannel sound signals side by side.
- Parameters:
signal2 (MultiChannelSignal) – The second multichannel sound signal.
- Returns:
The concatenated sound signal containing the channels of both MultiChannelSignal objects.
- Return type:
self (MultiChannelSignal)
- set_seed(seed: int = -1)
Sets the seed value.
- shape() → tuple
Returns the shape of the multichannel sound object ‘data’ field.
- Returns:
A tuple containing the shape of the multichannel sound data.
- Return type:
tuple
- side_by_side(signal2: MultiChannelSignal) → MultiChannelSignal
Concatenates two multichannel sound signals side by side.
- Parameters:
signal2 (MultiChannelSignal) – The second multichannel sound signal.
- Returns:
The concatenated sound signal containing the channels of both MultiChannelSignal objects.
- Return type:
self (MultiChannelSignal)
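"Side by side" here means stacking along the channel axis rather than appending in time. A minimal sketch on plain lists, with a hypothetical free function standing in for the method:

```python
def side_by_side(channels_a, channels_b):
    """Return a new channel list: all channels of a, then all channels of b."""
    return [list(ch) for ch in channels_a] + [list(ch) for ch in channels_b]


stereo = [[1, 2, 3], [4, 5, 6]]
mono = [[7, 8, 9]]
combined = side_by_side(stereo, mono)
```

The result has three channels and the same number of samples per channel as the inputs.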
- split(channels_count: int) → MultiChannelSignal
Splits a multichannel signal containing a single channel into multiple identical channels.
- Parameters:
channels_count (int) – The number of channels to split the signal into.
- Returns:
The split multichannel signal, with each channel identical.
- Return type:
self (MultiChannelSignal)
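A sketch of the replication step on plain lists. The check for exactly one input channel and the name `split_channel` are assumptions for this example; the reference does not say how the module reacts to multi-channel input.

```python
def split_channel(channels, channels_count):
    """Replicate a single channel into `channels_count` independent copies."""
    if len(channels) != 1:
        raise ValueError("expected a signal with exactly one channel")
    return [list(channels[0]) for _ in range(channels_count)]


copies = split_channel([[1, 2, 3]], 3)
copies[0][0] = 99  # editing one copy leaves the others unchanged
```

Copying the samples for every channel, instead of reusing one list, lets later per-channel processing change each channel separately.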
- splt(channels_count: int) → MultiChannelSignal
Splits a multichannel signal containing a single channel into multiple identical channels.
- Parameters:
channels_count (int) – The number of channels to split the signal into.
- Returns:
The split multichannel signal, with each channel identical.
- Return type:
self (MultiChannelSignal)
- sum(signal2: MultiChannelSignal) → MultiChannelSignal
Sums two multichannel signals.
- Parameters:
signal2 (MultiChannelSignal) – The second multichannel sound signal.
- Returns:
The sum of the self.data and signal2.data signals as a MultiChannelSignal.
- Return type:
self (MultiChannelSignal)
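Summing two signals is an element-wise addition of matching channels. The sketch below assumes both signals have the same shape and raises an error otherwise; how the module handles mismatched shapes is not documented, and `sum_signals` is a hypothetical helper.

```python
def sum_signals(channels_a, channels_b):
    """Add two equally shaped signals channel by channel, sample by sample."""
    if [len(ch) for ch in channels_a] != [len(ch) for ch in channels_b]:
        raise ValueError("signals must have the same shape")
    return [
        [a + b for a, b in zip(ch_a, ch_b)]
        for ch_a, ch_b in zip(channels_a, channels_b)
    ]


total = sum_signals([[1, 2], [3, 4]], [[10, 20], [30, 40]])
```

Unlike merging, the channel count is preserved: two stereo inputs give a stereo output.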
- wr(dest_path: str) → MultiChannelSignal
Writes the multichannel sound data to a WAV file at the specified path.
- Parameters:
dest_path (str) – The path to the WAV file.
- Returns:
The object representing the saved multichannel sound.
- Return type:
self (MultiChannelSignal)
- wrbc(dest_path: str) → MultiChannelSignal
Writes each channel of the multichannel sound data to a separate WAV file, one per channel.
The file name is modified to include the channel number. If the path is ./outputwav/sound_augmented.wav, the output file names will be ./outputwav/sound_augmented_1.wav, ./outputwav/sound_augmented_2.wav, and so on.
- Parameters:
dest_path (str) – The path to the WAV file. The filename will be modified to include the channel number.
- Returns:
The instance itself, allowing for method chaining.
- Return type:
self (MultiChannelSignal)
- write(dest_path: str) → MultiChannelSignal
Writes the multichannel sound data to a WAV file at the specified path.
- Parameters:
dest_path (str) – The path to the WAV file.
- Returns:
The object representing the saved multichannel sound.
- Return type:
self (MultiChannelSignal)
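For readers unfamiliar with multichannel WAV files, the sketch below writes two channels as interleaved 16-bit PCM using only Python's standard `wave` module, then reads the header back. It illustrates the file layout and is not the module's implementation; the helper names and the 16-bit sample format are assumptions.

```python
import math
import os
import struct
import tempfile
import wave


def write_wav(path, channels, sampling_rate):
    """Write float channels in [-1, 1] as interleaved 16-bit PCM."""
    frames = bytearray()
    for frame in zip(*channels):  # one sample from every channel
        for sample in frame:
            frames += struct.pack("<h", int(round(sample * 32767)))
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(len(channels))
        wav_file.setsampwidth(2)  # 2 bytes = 16 bits per sample
        wav_file.setframerate(sampling_rate)
        wav_file.writeframes(bytes(frames))


def read_wav_header(path):
    """Return (channel count, sampling rate, frames per channel)."""
    with wave.open(path, "rb") as wav_file:
        return (
            wav_file.getnchannels(),
            wav_file.getframerate(),
            wav_file.getnframes(),
        )


tone = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(800)]
path = os.path.join(tempfile.mkdtemp(), "tone.wav")
write_wav(path, [tone, tone], 8000)
header = read_wav_header(path)
```

The header reports two channels, an 8000 Hz sampling rate, and 800 frames, matching what was written.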
- write_by_channel(dest_path: str) → MultiChannelSignal
Writes each channel of the multichannel sound data to a separate WAV file, one per channel.
The file name is modified to include the channel number. If the path is ./outputwav/sound_augmented.wav, the output file names will be ./outputwav/sound_augmented_1.wav, ./outputwav/sound_augmented_2.wav, and so on.
- Parameters:
dest_path (str) – The path to the WAV file. The filename will be modified to include the channel number.
- Returns:
The instance itself, allowing for method chaining.
- Return type:
self (MultiChannelSignal)
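The naming rule described above can be reproduced with the standard library. The helper `channel_paths` is hypothetical and only derives the file names; it writes nothing to disk.

```python
import os


def channel_paths(dest_path, channels_count):
    """Insert a 1-based channel number before the file extension."""
    root, ext = os.path.splitext(dest_path)
    return [f"{root}_{i}{ext}" for i in range(1, channels_count + 1)]


paths = channel_paths("./outputwav/sound_augmented.wav", 2)
```

For a two-channel signal this yields the two file names given in the description.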
- mcs.pause_measure(mask: ndarray[int]) → list
Measures pauses in a multichannel sound.
- Parameters:
mask (np.ndarray) – A mask indicating the pauses in the multichannel sound.
- Returns:
A list of lists containing pairs of (index, length) of pauses for each channel. Length is in samples.
- Return type:
list
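Measuring pauses amounts to run-length encoding the zeros of each channel's mask. A sketch on plain lists, assuming the index in each pair is the position of the first sample of the pause; `measure_pauses` is a hypothetical stand-in for the function.

```python
def measure_pauses(mask):
    """Return, per channel, a list of (start_index, length) for each run of 0s."""
    result = []
    for channel_mask in mask:
        pauses = []
        start = None  # start index of the pause currently being scanned
        for i, flag in enumerate(channel_mask):
            if flag == 0 and start is None:
                start = i
            elif flag != 0 and start is not None:
                pauses.append((start, i - start))
                start = None
        if start is not None:  # the channel ends inside a pause
            pauses.append((start, len(channel_mask) - start))
        result.append(pauses)
    return result


pauses = measure_pauses([[0, 0, 1, 1, 0, 1], [1, 1, 1, 0, 0, 0]])
```

The first channel contains two pauses, and the second contains one that runs to the end of the channel.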