mcs module

This module defines the MultiChannelSignal class for augmenting multichannel audio files.

class mcs.MultiChannelSignal(np_data: ndarray = None, sampling_rate: int = -1, seed: int = -1)

The class supports multichannel sound data, either synthesized or loaded from a WAV file.

channels_count() int

Returns the number of channels in the MultiChannelSignal object.

channels_len() int

Returns the number of samples in one channel of the MultiChannelSignal object.

copy() MultiChannelSignal

Makes a deep copy of the MultiChannelSignal object.

cpy() MultiChannelSignal

Makes a deep copy of the MultiChannelSignal object.

gen(frequency_list: list[int], duration: float = 5, sampling_rate: int = -1, mode='sine') MultiChannelSignal

Generates a multichannel sound from the given frequency list, duration, sample rate, and mode. The mode can be ‘sine’ or ‘speech’. In ‘sine’ mode, the output is a list of sine waves. In ‘speech’ mode, the output is a list of speech-like signals; each input frequency is used as the basic tone frequency of the corresponding channel and should be in the interval 600..300.

Parameters:
  • frequency_list (list) – A list of frequencies to generate sound for.

  • duration (float) – The duration of the sound in seconds.

  • sampling_rate (int) – The sample rate of the sound. Defaults to -1.

  • mode (str) – The mode of sound generation. Can be ‘sine’ or ‘speech’. Defaults to ‘sine’. Mode ‘speech’ generates speech-like signals.

Returns:

The MultiChannelSignal object representing the generated multichannel sound.

Return type:

self (MultiChannelSignal)

generate(frequency_list: list[int], duration: float = 5, sampling_rate: int = -1, mode='sine') MultiChannelSignal

Generates a multichannel sound from the given frequency list, duration, sample rate, and mode. The mode can be ‘sine’ or ‘speech’. In ‘sine’ mode, the output is a list of sine waves. In ‘speech’ mode, the output is a list of speech-like signals; each input frequency is used as the basic tone frequency of the corresponding channel and should be in the interval 600..300.

Parameters:
  • frequency_list (list) – A list of frequencies to generate sound for.

  • duration (float) – The duration of the sound in seconds.

  • sampling_rate (int) – The sample rate of the sound. Defaults to -1.

  • mode (str) – The mode of sound generation. Can be ‘sine’ or ‘speech’. Defaults to ‘sine’. Mode ‘speech’ generates speech-like signals.

Returns:

The MultiChannelSignal object representing the generated multichannel sound.

Return type:

self (MultiChannelSignal)
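The ‘sine’ mode described above can be illustrated with a minimal standard-library sketch. It is not the library's implementation: the helper name generate_sine_channels, the use of plain lists instead of an ndarray, the unit amplitude, and the 44100 Hz default are all assumptions made for the example.

```python
import math


def generate_sine_channels(frequency_list, duration=5.0, sampling_rate=44100):
    """Hypothetical helper: one list of sine samples per requested frequency."""
    n_samples = int(round(duration * sampling_rate))
    return [
        # Sample n of a unit-amplitude sine wave at `freq` Hz.
        [math.sin(2 * math.pi * freq * n / sampling_rate) for n in range(n_samples)]
        for freq in frequency_list
    ]


# Two channels (400 Hz and 1000 Hz), 10 ms long at 8 kHz.
channels = generate_sine_channels([400, 1000], duration=0.01, sampling_rate=8000)
print(len(channels), len(channels[0]))
```

Each entry of the frequency list yields one channel, which mirrors the one-frequency-per-channel behaviour the method description implies.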

get() ndarray

Returns the multichannel sound data stored in the MultiChannelSignal instance.

Returns:

The multichannel sound data.

Return type:

np.ndarray

info() dict

Returns a dictionary containing metadata about the audio data.

The dictionary contains the following information:

  • path: The file path where the audio data was loaded from.

  • channels_count: The number of audio channels in the data (1 for mono, 2 or more for stereo and other layouts).

  • sample_rate: The sampling rate at which the audio data is stored.

  • length_s: The duration of the audio data in seconds.

If the data is not loaded, the path, channels_count, and length_s values will be -1. Otherwise, they will be populated with actual values.

Returns:

A dictionary containing metadata about the audio data.

Return type:

dict

merge() MultiChannelSignal

Merges channels of MultiChannelSignal object into a single channel.

Parameters:

none

Returns:

The MultiChannelSignal object containing a single channel with the merged result.

Return type:

self (MultiChannelSignal)

mrg() MultiChannelSignal

Merges channels of MultiChannelSignal object into a single channel.

Parameters:

none

Returns:

The MultiChannelSignal object containing a single channel with the merged result.

Return type:

self (MultiChannelSignal)

pause_detect(relative_level: list[float]) ndarray[int]

Detects pauses in a multichannel sound.

Parameters:

relative_level (list[float]) – The list of relative levels, one per channel. Signal below this level is marked as a pause.

Returns:

The mask indicating the pauses in the multichannel sound. The mask has the same shape as the input sound. It contains zeros and ones: 0 marks a pause, 1 marks a non-pause sample.

Return type:

np.ndarray[int]
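As a rough illustration of the mask described above, the sketch below thresholds each sample against a per-channel level. The reference for "relative" is not stated in this documentation; the sketch assumes it is the channel's peak absolute amplitude and applies the threshold sample by sample. The helper name detect_pauses and the use of plain lists are also assumptions.

```python
def detect_pauses(channels, relative_level):
    """Hypothetical helper: 0/1 mask per channel (0 = pause, 1 = not a pause)."""
    mask = []
    for samples, level in zip(channels, relative_level):
        # Assumption: the level is relative to the channel's peak amplitude.
        peak = max((abs(s) for s in samples), default=0.0)
        threshold = level * peak
        mask.append([0 if abs(s) < threshold else 1 for s in samples])
    return mask


# One channel, pause threshold at 10% of the peak.
print(detect_pauses([[0.0, 0.5, 1.0, 0.05]], [0.1]))
```

The resulting mask has one entry per input sample, matching the "same shape as the input sound" statement.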

pause_shrink(mask: ndarray[int], min_pause: list[int]) MultiChannelSignal

Shrinks pauses in a multichannel sound.

Parameters:
  • mask (np.ndarray[int]) – The mask indicating the pauses in the multichannel sound.

  • min_pause (list[int]) – The list of minimum pause lengths for each channel in the object.

Returns:

The MultiChannelSignal object with pauses shrunk.

Return type:

self (MultiChannelSignal)

pdt(relative_level: list[float]) ndarray[int]

Detects pauses in a multichannel sound.

Parameters:

relative_level (list[float]) – The list of relative levels, one per channel. Signal below this level is marked as a pause.

Returns:

The mask indicating the pauses in the multichannel sound. The mask has the same shape as the input sound. It contains zeros and ones: 0 marks a pause, 1 marks a non-pause sample.

Return type:

np.ndarray[int]

put(signal: MultiChannelSignal) MultiChannelSignal

Updates the multichannel sound data and sample rate of the MultiChannelSignal instance.

Parameters:

signal (MultiChannelSignal) – source of multichannel sound data.

Returns:

The updated MultiChannelSignal instance.

Return type:

self (MultiChannelSignal)

rd(source_path: str) MultiChannelSignal

Reads a multichannel sound from a WAV file.

Parameters:

source_path (str) – The path to the WAV file.

Returns:

The MultiChannelSignal object containing the sample rate and the multichannel sound data.

Return type:

self (MultiChannelSignal)

read(source_path: str) MultiChannelSignal

Reads a multichannel sound from a WAV file.

Parameters:

source_path (str) – The path to the WAV file.

Returns:

The MultiChannelSignal object containing the sample rate and the multichannel sound data.

Return type:

self (MultiChannelSignal)

rms(last_index_of_sample: int = -1, decimals: int = -1)

Calculates the root mean square (RMS) of each channel of a multichannel sound.

Parameters:
  • last_index_of_sample (int) – The last index to consider when calculating the RMS. If -1, consider the entire array. Defaults to -1.

  • decimals (int) – Number of decimal places to round the RMS value. If -1, do not round. Defaults to -1.

Returns:

A list of RMS values for each channel in the multichannel sound.

Return type:

list
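The per-channel RMS computation can be sketched as follows. This is an illustrative stand-in, not the library code: the helper name rms_per_channel is hypothetical, plain lists replace the ndarray, and treating the last index as an exclusive slice bound is an assumption the documentation does not confirm.

```python
import math


def rms_per_channel(channels, last_index=-1, decimals=-1):
    """Hypothetical helper: RMS value of each channel."""
    result = []
    for samples in channels:
        # Assumption: last_index is an exclusive bound; -1 means "use everything".
        segment = samples if last_index == -1 else samples[:last_index]
        value = math.sqrt(sum(s * s for s in segment) / len(segment))
        # -1 means "do not round", as in the parameter description.
        result.append(value if decimals == -1 else round(value, decimals))
    return result


print(rms_per_channel([[1.0, -1.0, 1.0, -1.0], [3.0, 4.0]], decimals=3))
```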

sbs(signal2: MultiChannelSignal) MultiChannelSignal

Concatenates two multichannel sound signals side by side.

Parameters:

signal2 (MultiChannelSignal) – The second multichannel sound signal.

Returns:

The concatenated sound signal containing channels of both MultiChannelSignal objects.

Return type:

self (MultiChannelSignal)

set_seed(seed: int = -1)

Sets the seed value.

shape() tuple

Returns the shape of the ‘data’ field of the multichannel sound object.

Returns:

A tuple containing the shape of the multichannel sound data.

Return type:

tuple

side_by_side(signal2: MultiChannelSignal) MultiChannelSignal

Concatenates two multichannel sound signals side by side.

Parameters:

signal2 (MultiChannelSignal) – The second multichannel sound signal.

Returns:

The concatenated sound signal containing channels of both MultiChannelSignal objects.

Return type:

self (MultiChannelSignal)
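To make the difference from sum() concrete: side-by-side concatenation increases the channel count rather than adding samples together. The sketch below shows the idea on plain lists of channels; the helper name and the channel-per-row layout are assumptions, not the library's internals.

```python
def side_by_side(channels_a, channels_b):
    """Hypothetical helper: stack the channels of two signals into one list."""
    # The samples themselves are copied unchanged; only the channel count grows.
    return [list(ch) for ch in channels_a] + [list(ch) for ch in channels_b]


stereo = [[0.1, 0.2], [0.3, 0.4]]
mono = [[0.5, 0.6]]
combined = side_by_side(stereo, mono)
print(len(combined))
```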

split(channels_count: int) MultiChannelSignal

Splits a multichannel signal (containing a single channel) into multiple identical channels.

Parameters:
  • channels_count (int) – The number of channels to split the signal into.

Returns:

The split multichannel signal, with each channel identical.

Return type:

self (MultiChannelSignal)

splt(channels_count: int) MultiChannelSignal

Splits a multichannel signal (containing a single channel) into multiple identical channels.

Parameters:
  • channels_count (int) – The number of channels to split the signal into.

Returns:

The split multichannel signal, with each channel identical.

Return type:

self (MultiChannelSignal)

sum(signal2: MultiChannelSignal) MultiChannelSignal

Sums two multichannel signals.

Parameters:

signal2 (MultiChannelSignal) – The second multichannel sound signal.

Returns:

The sum of the self.data and signal2.data signals as a MultiChannelSignal.

Return type:

self (MultiChannelSignal)

wr(dest_path: str) MultiChannelSignal

Writes the given multichannel sound data to a WAV file at the specified path.

Parameters:

dest_path (str) – The path to the WAV file.

Returns:

The MultiChannelSignal object representing the saved multichannel sound.

Return type:

self (MultiChannelSignal)

wrbc(dest_path: str) MultiChannelSignal

Writes each channel of the multichannel sound data to a separate WAV file, one per channel.

The file name is modified to include the channel number. If the path is ./outputwav/sound_augmented.wav, the output file names will be ./outputwav/sound_augmented_1.wav, ./outputwav/sound_augmented_2.wav, and so on.

Parameters:
  • dest_path (str) – The path to the WAV file. The filename will be modified to include the channel number.

Returns:

The instance itself, allowing for method chaining.

Return type:

self (MultiChannelSignal)

write(dest_path: str) MultiChannelSignal

Writes the given multichannel sound data to a WAV file at the specified path.

Parameters:

dest_path (str) – The path to the WAV file.

Returns:

The MultiChannelSignal object representing the saved multichannel sound.

Return type:

self (MultiChannelSignal)

write_by_channel(dest_path: str) MultiChannelSignal

Writes each channel of the multichannel sound data to a separate WAV file, one per channel.

The file name is modified to include the channel number. If the path is ./outputwav/sound_augmented.wav, the output file names will be ./outputwav/sound_augmented_1.wav, ./outputwav/sound_augmented_2.wav, and so on.

Parameters:
  • dest_path (str) – The path to the WAV file. The filename will be modified to include the channel number.

Returns:

The instance itself, allowing for method chaining.

Return type:

self (MultiChannelSignal)
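The file naming rule above can be reproduced with a short sketch. The helper name channel_paths is hypothetical and only derives the names; it writes no files and is not taken from the library.

```python
import os


def channel_paths(dest_path, channels_count):
    """Hypothetical helper: per-channel file names, numbered from 1."""
    root, ext = os.path.splitext(dest_path)
    # Insert "_<channel number>" between the base name and the extension.
    return [f"{root}_{i}{ext}" for i in range(1, channels_count + 1)]


print(channel_paths("./outputwav/sound_augmented.wav", 2))
# ['./outputwav/sound_augmented_1.wav', './outputwav/sound_augmented_2.wav']
```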

mcs.pause_measure(mask: ndarray[int]) list

Measures pauses in a multichannel sound.

Parameters:
  • mask (np.ndarray) – A mask indicating the pauses in the multichannel sound.

Returns:

A list of lists containing pairs of (index, length) of pauses for each channel. Length is in samples.

Return type:

list
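The (index, length) pairs can be pictured as a run-length scan over the zeros of each channel's mask. The sketch below is one way to compute them on plain lists; it assumes that "index" means the first sample of a pause, and the helper name measure_pauses is hypothetical.

```python
def measure_pauses(mask):
    """Hypothetical helper: (start index, length in samples) of each zero run."""
    result = []
    for channel_mask in mask:
        pauses = []
        start = None  # Start index of the pause currently being scanned.
        for i, value in enumerate(channel_mask):
            if value == 0 and start is None:
                start = i
            elif value != 0 and start is not None:
                pauses.append((start, i - start))
                start = None
        if start is not None:
            # The channel ends inside a pause.
            pauses.append((start, len(channel_mask) - start))
        result.append(pauses)
    return result


print(measure_pauses([[0, 0, 1, 1, 0, 1], [1, 1, 0, 0, 0, 0]]))
# [[(0, 2), (4, 1)], [(2, 4)]]
```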