mcs module

This module defines MultiChannelSignal, a class for augmenting multichannel audio files.

class mcs.MultiChannelSignal(np_data: ndarray = None, sampling_rate: int = -1, seed: int = -1)

Provides support for multichannel sound data that is synthesized or loaded from a WAV file.

channels_count() → int: Returns the number of channels in the MultiChannelSignal object.

channels_len() → int: Returns the number of samples in one channel of the MultiChannelSignal object.

copy() → MultiChannelSignal: Makes a deep copy of the MultiChannelSignal object.

cpy() → MultiChannelSignal: Makes a deep copy of the MultiChannelSignal object.

gen(frequency_list: list[int], duration: float = 5, sampling_rate: int = -1, mode='sine') → MultiChannelSignal

Generates a multichannel sound from the given frequency list, duration, sample rate, and mode. The mode can be ‘sine’ or ‘speech’. In ‘sine’ mode, the output is a list of sine waves. In ‘speech’ mode, the output is a list of speech-like signals; each input frequency is used as the base tone frequency of the corresponding channel and should lie in the interval 300..600.

Parameters:

frequency_list (list) – A list of frequencies to generate sound for.
duration (float) – The duration of the sound in seconds.
sampling_rate (int) – The sample rate of the sound. Defaults to -1.
mode (str) – The mode of sound generation. Can be ‘sine’ or ‘speech’. Defaults to ‘sine’. Mode ‘speech’ generates speech-like signals.

Returns:

The MultiChannelSignal object representing the generated multichannel sound.

Return type:

self (MultiChannelSignal)

generate(frequency_list: list[int], duration: float = 5, sampling_rate: int = -1, mode='sine') → MultiChannelSignal

Generates a multichannel sound from the given frequency list, duration, sample rate, and mode. The mode can be ‘sine’ or ‘speech’. In ‘sine’ mode, the output is a list of sine waves. In ‘speech’ mode, the output is a list of speech-like signals; each input frequency is used as the base tone frequency of the corresponding channel and should lie in the interval 300..600.

Parameters:

frequency_list (list) – A list of frequencies to generate sound for.
duration (float) – The duration of the sound in seconds.
sampling_rate (int) – The sample rate of the sound. Defaults to -1.
mode (str) – The mode of sound generation. Can be ‘sine’ or ‘speech’. Defaults to ‘sine’. Mode ‘speech’ generates speech-like signals.

Returns:

The MultiChannelSignal object representing the generated multichannel sound.

Return type:

self (MultiChannelSignal)
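The ‘sine’ mode described above can be pictured with a small standard-library sketch that builds one sine wave per requested frequency. This is not the library's implementation: the helper name `sine_channels`, the explicit sampling rate, and the use of nested lists (one list per channel) in place of an ndarray are assumptions made for illustration.

```python
import math


def sine_channels(frequency_list, duration=5.0, sampling_rate=44100):
    """Return one list of samples per frequency (hypothetical helper)."""
    samples = int(round(duration * sampling_rate))
    return [
        [math.sin(2 * math.pi * freq * n / sampling_rate) for n in range(samples)]
        for freq in frequency_list
    ]


# Two channels, 10 ms each at 8 kHz: a 400 Hz tone and a 1000 Hz tone.
channels = sine_channels([400, 1000], duration=0.01, sampling_rate=8000)
print(len(channels), len(channels[0]))
```

Each entry of the frequency list yields its own channel, which matches the documented behaviour of one generated signal per frequency.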

get() → ndarray

Returns the multichannel sound data stored in the MultiChannelSignal instance.

Returns: The multichannel sound data.
Return type: np.ndarray

info() → dict

Returns a dictionary containing metadata about the audio data.

The dictionary contains the following information:

path: The file path where the audio data was loaded from.

channels_count: The number of audio channels in the data (1 for mono, 2 for stereo, more for other layouts).

sample_rate: The sampling rate at which the audio data is stored.

length_s: The duration of the audio data in seconds.

If the data is not loaded, the path, channels_count, and length_s values will be -1. Otherwise, they will be populated with actual values.

Returns: dict: A dictionary containing metadata about the audio data.
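As a rough picture of the dictionary described above, the sketch below assembles the four documented keys from channel data. The helper `describe`, its arguments, and the channel-first nested-list layout are assumptions for the example, not part of the module; only the key names and the -1 placeholders come from the description.

```python
def describe(data, sample_rate, path=-1):
    """Build a metadata dict with the documented keys (hypothetical helper)."""
    if data is None:
        # Nothing loaded: path, channels_count and length_s stay at -1.
        return {"path": -1, "channels_count": -1,
                "sample_rate": sample_rate, "length_s": -1}
    return {
        "path": path,
        "channels_count": len(data),             # assumed layout: one list per channel
        "sample_rate": sample_rate,
        "length_s": len(data[0]) / sample_rate,  # samples divided by samples per second
    }


meta = describe([[0.0] * 8000, [0.0] * 8000], sample_rate=8000, path="./in.wav")
print(meta)
```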

merge() → MultiChannelSignal

Merges channels of MultiChannelSignal object into a single channel.

Parameters:

none

Returns:

The MultiChannelSignal object containing a single channel with the merged result.

Return type:

self (MultiChannelSignal)

mrg() → MultiChannelSignal

Merges channels of MultiChannelSignal object into a single channel.

Parameters:

none

Returns:

The MultiChannelSignal object containing a single channel with the merged result.

Return type:

self (MultiChannelSignal)
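The text does not say how the channels are combined. The sketch below assumes a sample-wise sum, which is one common way to mix channels down; the helper name `merge_channels` and the nested-list layout are likewise assumptions, so treat it as an illustration of the idea rather than the method's actual rule.

```python
def merge_channels(channels):
    """Collapse equal-length channels into one by summing (assumed mixing rule)."""
    return [[sum(samples) for samples in zip(*channels)]]


merged = merge_channels([[1, 2, 3], [10, 20, 30]])
print(merged)  # [[11, 22, 33]]
```

The result is still a list of channels, now holding exactly one, mirroring the documented return of an object with a single channel.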

pause_detect(relative_level: list[float]) → ndarray[int]

Detects pauses in a multichannel sound.

Parameters: relative_level (list[float]) – The list of relative levels, one per channel. Signal below this level is marked as a pause.

Returns: The mask indicating the pauses in the multichannel sound. The mask has the same shape as the input sound and contains zeros and ones: 0 marks a pause, 1 marks no pause.
Return type: np.ndarray[int]

pause_shrink(mask: ndarray[int], min_pause: list[int]) → MultiChannelSignal

Shrinks pauses in a multichannel sound.

Parameters:

mask (np.ndarray[int]) – The mask indicating the pauses in the multichannel sound.
min_pause (list[int]) – The list of minimum pause lengths for each channel in the object.

Returns:

The MultiChannelSignal object with pauses shrunk.

Return type:

self (MultiChannelSignal)

pdt(relative_level: list[float]) → ndarray[int]

Detects pauses in a multichannel sound.

Parameters: relative_level (list[float]) – The list of relative levels, one per channel. Signal below this level is marked as a pause.

Returns: The mask indicating the pauses in the multichannel sound. The mask has the same shape as the input sound and contains zeros and ones: 0 marks a pause, 1 marks no pause.
Return type: np.ndarray[int]
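A minimal sketch of how such a mask could be built is shown below. It assumes that each relative level is a fraction of the channel's peak absolute amplitude, which the description does not state; the helper name `pause_mask` and the nested-list layout are also assumptions.

```python
def pause_mask(channels, relative_level):
    """Mark samples below a per-channel threshold as pause (0), others as 1."""
    mask = []
    for samples, level in zip(channels, relative_level):
        peak = max((abs(s) for s in samples), default=0)
        threshold = level * peak  # assumption: level is relative to the channel peak
        mask.append([0 if abs(s) < threshold else 1 for s in samples])
    return mask


mask = pause_mask([[0.0, 0.9, 0.05, -1.0], [0.3, 0.0, 0.1, 0.4]], [0.1, 0.5])
print(mask)  # [[0, 1, 0, 1], [1, 0, 0, 1]]
```

The mask keeps the shape of the input: one row per channel, one flag per sample.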

put(signal: MultiChannelSignal) → MultiChannelSignal

Updates the multichannel sound data and sample rate of the MultiChannelSignal instance.

Parameters: signal (MultiChannelSignal) – The source of the multichannel sound data.
Returns: The updated MultiChannelSignal instance.
Return type: self (MultiChannelSignal)

rd(source_path: str) → MultiChannelSignal

Reads a multichannel sound from a WAV file.

Parameters:

source_path (str) – The path to the WAV file.

Returns:

A MultiChannelSignal object containing the sample rate and the multichannel sound data.

Return type:

self (MultiChannelSignal)

read(source_path: str) → MultiChannelSignal

Reads a multichannel sound from a WAV file.

Parameters:

source_path (str) – The path to the WAV file.

Returns:

A MultiChannelSignal object containing the sample rate and the multichannel sound data.

Return type:

self (MultiChannelSignal)

rms(last_index_of_sample: int = -1, decimals: int = -1)

Calculate the root mean square (RMS) of a multichannel sound.

Parameters:

last_index_of_sample (int) – The last index to consider when calculating the RMS. If -1, consider the entire array. Defaults to -1.
decimals (int) – Number of decimal places to round the RMS value. If -1, do not round. Defaults to -1.

Returns:

A list of RMS values for each channel in the multichannel sound.

Return type:

list
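The per-channel RMS is the square root of the mean of the squared samples. The sketch below applies that formula to nested lists using only the standard library. The helper name `rms_per_channel` is hypothetical, and treating the last index as an exclusive slice bound is an assumption, since the description does not say whether it is inclusive.

```python
import math


def rms_per_channel(channels, last_index=-1, decimals=-1):
    """Return one RMS value per channel (illustrative helper)."""
    result = []
    for samples in channels:
        # -1 means "use the whole channel"; otherwise take samples[:last_index].
        window = samples if last_index == -1 else samples[:last_index]
        value = math.sqrt(sum(s * s for s in window) / len(window))
        result.append(value if decimals == -1 else round(value, decimals))
    return result


print(rms_per_channel([[3, -3, 3, -3], [1, 1, 1, 1]]))  # [3.0, 1.0]
```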

sbs(signal2: MultiChannelSignal) → MultiChannelSignal

Concatenates two multichannel sound signals side by side.

Parameters:

signal2 (MultiChannelSignal) – The second multichannel sound signal.

Returns:

The concatenated sound signal containing the channels of both MultiChannelSignal objects.

Return type:

self (MultiChannelSignal)

set_seed(seed: int = -1): Sets the seed value.

shape() → tuple

Returns the shape of the multichannel sound object ‘data’ field.

Returns: A tuple containing the shape of the multichannel sound data.
Return type: tuple

side_by_side(signal2: MultiChannelSignal) → MultiChannelSignal

Concatenates two multichannel sound signals side by side.

Parameters:

signal2 (MultiChannelSignal) – The second multichannel sound signal.

Returns:

The concatenated sound signal containing the channels of both MultiChannelSignal objects.

Return type:

self (MultiChannelSignal)
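Side-by-side concatenation joins along the channel axis rather than in time, so the result has as many channels as both inputs together. A minimal sketch with nested lists follows; the free function `side_by_side` here is a stand-in written for this example, not the method itself.

```python
def side_by_side(first, second):
    """Return a new channel list holding the channels of both signals."""
    return [list(ch) for ch in first] + [list(ch) for ch in second]


combined = side_by_side([[1, 2], [3, 4]], [[5, 6]])
print(len(combined))  # 3
```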

split(channels_count: int) → MultiChannelSignal

Splits a multichannel signal (containing a single channel) into multiple identical channels.

Parameters:

channels_count (int) – The number of channels to split the signal into.

Returns:

The split multichannel signal, with each channel identical.

Return type:

self (MultiChannelSignal)

splt(channels_count: int) → MultiChannelSignal

Splits a multichannel signal (containing a single channel) into multiple identical channels.

Parameters:

channels_count (int) – The number of channels to split the signal into.

Returns:

The split multichannel signal, with each channel identical.

Return type:

self (MultiChannelSignal)
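Splitting in this sense means replicating the single channel. The sketch below shows the idea with nested lists; the helper `split_channel` and its error handling for multi-channel input are assumptions, since the description does not say what happens in that case.

```python
def split_channel(channels, channels_count):
    """Replicate a single channel into channels_count independent copies."""
    if len(channels) != 1:
        raise ValueError("expected a signal with exactly one channel")
    return [list(channels[0]) for _ in range(channels_count)]


copies = split_channel([[0.1, 0.2, 0.3]], 3)
print(copies)
```

Copying each channel, instead of reusing one list, keeps later per-channel edits from leaking into the other channels.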

sum(signal2: MultiChannelSignal) → MultiChannelSignal

Sums two multichannel signals.

Parameters:

signal2 (MultiChannelSignal) – The second multichannel sound signal.

Returns:

The sum of the self.data and signal2.data signals as a MultiChannelSignal.

Return type:

self (MultiChannelSignal)
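Summing two signals can be read as adding them sample by sample, channel by channel. The sketch below assumes both signals have the same number of channels and samples; how the method treats mismatched shapes is not stated here, and the helper name `sum_signals` is hypothetical.

```python
def sum_signals(first, second):
    """Add two equally shaped signals sample by sample (assumes matching shapes)."""
    return [
        [a + b for a, b in zip(ch1, ch2)]
        for ch1, ch2 in zip(first, second)
    ]


total = sum_signals([[1, 2], [3, 4]], [[10, 20], [30, 40]])
print(total)  # [[11, 22], [33, 44]]
```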

wr(dest_path: str) → MultiChannelSignal

Writes the given multichannel sound data to a WAV file at the specified path.

Parameters: dest_path (str) – The path to the WAV file.

Returns: self (MultiChannelSignal): representing the saved multichannel sound.

wrbc(dest_path: str) → MultiChannelSignal

Writes each channel of the multichannel sound data to a separate WAV file, one per channel.

The file name is modified to include the channel number. If the path is ./outputwav/sound_augmented.wav, the output file names will be ./outputwav/sound_augmented_1.wav, ./outputwav/sound_augmented_2.wav, and so on.

Parameters:

dest_path (str) – The path to the WAV file. The filename will be modified to include the channel number.

Returns:

The instance itself, allowing for method chaining.

Return type:

self (MultiChannelSignal)

write(dest_path: str) → MultiChannelSignal

Writes the given multichannel sound data to a WAV file at the specified path.

Parameters: dest_path (str) – The path to the WAV file.

Returns: self (MultiChannelSignal): representing the saved multichannel sound.

write_by_channel(dest_path: str) → MultiChannelSignal

Writes each channel of the multichannel sound data to a separate WAV file, one per channel.

The file name is modified to include the channel number. If the path is ./outputwav/sound_augmented.wav, the output file names will be ./outputwav/sound_augmented_1.wav, ./outputwav/sound_augmented_2.wav, and so on.

Parameters:

dest_path (str) – The path to the WAV file. The filename will be modified to include the channel number.

Returns:

The instance itself, allowing for method chaining.

Return type:

self (MultiChannelSignal)
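The naming scheme from the description (a 1-based channel number inserted before the extension) can be reproduced with `os.path.splitext`. The helper `channel_paths` below is hypothetical and only derives the file names; it does not write any audio.

```python
import os


def channel_paths(dest_path, channels_count):
    """Derive one output path per channel, numbered from 1."""
    root, ext = os.path.splitext(dest_path)
    return [f"{root}_{i}{ext}" for i in range(1, channels_count + 1)]


paths = channel_paths("./outputwav/sound_augmented.wav", 2)
print(paths)
# ['./outputwav/sound_augmented_1.wav', './outputwav/sound_augmented_2.wav']
```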

mcs.pause_measure(mask: ndarray[int]) → list

Measures pauses in multichannel sound.

Parameters:

mask (np.ndarray) – A mask indicating the pauses in the multichannel sound.

Returns:

A list of lists containing pairs of (index, length) of pauses for each channel. Length is in samples.

Return type:

list
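Measuring pauses amounts to a run-length scan over each row of the mask: every maximal run of zeros becomes one (index, length) pair. The sketch below shows that scan on nested lists. The helper name `measure_pauses` is an assumption, and it follows the 0 = pause convention of the mask produced by pause_detect.

```python
def measure_pauses(mask):
    """Return, per channel, a list of (start_index, length) pairs for runs of zeros."""
    result = []
    for channel in mask:
        pauses = []
        start = None  # index where the current pause began, if any
        for i, flag in enumerate(channel):
            if flag == 0 and start is None:
                start = i
            elif flag != 0 and start is not None:
                pauses.append((start, i - start))
                start = None
        if start is not None:
            # The channel ends while still inside a pause.
            pauses.append((start, len(channel) - start))
        result.append(pauses)
    return result


print(measure_pauses([[1, 0, 0, 1, 0], [0, 0, 0, 1, 1]]))
# [[(1, 2), (4, 1)], [(0, 3)]]
```

Lengths are counted in samples, as in the description above.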
