src.embeddings

Attributes

logger

Classes

BaseEmbedding

Base class for embedding models used to generate embeddings from audio data.

Functions

_split_audio_into_chunks(→ pandas.DataFrame)

Splits an audio file into non-overlapping chunks of specified duration.

Package Contents

src.embeddings.logger
src.embeddings._split_audio_into_chunks(filename: str, chunk_duration: float, sampling_rate: int | None = None) → pandas.DataFrame

Splits an audio file into non-overlapping chunks of specified duration.

Args:

filename (str): Path to the audio file.

chunk_duration (float): Duration of each chunk in seconds.

sampling_rate (int, optional): Sampling rate at which to load the audio file. If None, librosa uses its default sampling rate.

Returns:
pd.DataFrame: A DataFrame with two columns:
  • ‘filename’: The chunked filename with the format ‘original_filename_starttime_endtime.ext’.

  • ‘audio_data’: The corresponding chunked audio data.
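The chunking described above can be sketched as follows. This is an illustrative sketch, not the module's implementation: `split_into_chunks` is a hypothetical helper that takes an already loaded signal instead of reading the file with librosa, and both the formatting of the start and end times and the dropping of a trailing partial chunk are assumptions.

```python
import os

import numpy as np
import pandas as pd


def split_into_chunks(audio, sampling_rate, filename, chunk_duration):
    """Split a 1-D signal into non-overlapping chunks of chunk_duration seconds."""
    samples_per_chunk = int(chunk_duration * sampling_rate)
    if samples_per_chunk <= 0:
        raise ValueError("chunk_duration * sampling_rate must cover at least one sample")

    stem, ext = os.path.splitext(os.path.basename(filename))
    rows = []
    # Assumption: a trailing remainder shorter than one chunk is dropped.
    n_chunks = len(audio) // samples_per_chunk
    for i in range(n_chunks):
        start = i * samples_per_chunk
        end = start + samples_per_chunk
        start_s = start / sampling_rate
        end_s = end / sampling_rate
        rows.append(
            {
                # Assumed naming: original_filename_starttime_endtime.ext
                "filename": f"{stem}_{start_s:g}_{end_s:g}{ext}",
                "audio_data": audio[start:end],
            }
        )
    return pd.DataFrame(rows, columns=["filename", "audio_data"])


# Seven seconds of silence at 16 kHz, split into 3-second chunks.
signal = np.zeros(16000 * 7, dtype=np.float32)
chunks = split_into_chunks(signal, 16000, "recordings/bird.wav", 3.0)
```

With these inputs the sketch yields two rows, one per complete 3-second chunk; the final second of audio is discarded under the stated assumption.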

class src.embeddings.BaseEmbedding(dataset_name: str, clip_duration: float = 3.0, model_path: str | pathlib.Path | None = None, sampling_rate: int | None = None)

Base class for embedding models used to generate embeddings from audio data.

This class provides core functionality for embedding models, including loading the model, reading and processing audio datasets, and optionally using Dask for distributed processing. It is meant to be subclassed by specific embedding models, which should implement their own model loading and processing logic.

Attributes:

data : pd.DataFrame or None

DataFrame holding the processed audio data (e.g., file paths, audio features).

embeddings : pd.DataFrame or None

DataFrame containing the generated embeddings for the audio dataset.

Methods:

load_model():

Abstract method for loading the model. Must be implemented by subclasses.

Raises NotImplementedError: This method must be implemented by subclasses for model loading.

process(audio_files):

Abstract method for processing audio files to generate embeddings. Must be implemented by subclasses.

Raises NotImplementedError: This method must be implemented by subclasses for audio processing.

read_audio_dataset() -> pd.DataFrame:

Reads and processes an audio dataset, optionally using Dask for parallel processing. Returns a pandas DataFrame containing the audio file paths and other metadata.

Returns: A pandas DataFrame indexed by ‘filename’, with a data column ‘audio_data’ containing the processed audio chunks.
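The subclassing contract described above can be illustrated with a small self-contained sketch. Everything here is hypothetical: `BaseEmbeddingStandIn` mirrors only the documented constructor arguments and the `data` and `embeddings` attributes so that the example runs without the package, and `MeanStdEmbedding` is a toy "model" rather than a real embedding network.

```python
import numpy as np
import pandas as pd


class BaseEmbeddingStandIn:
    """Hypothetical stand-in for BaseEmbedding (documented arguments only)."""

    def __init__(self, dataset_name, clip_duration=3.0, model_path=None, sampling_rate=None):
        self.dataset_name = dataset_name
        self.clip_duration = clip_duration
        self.model_path = model_path
        self.sampling_rate = sampling_rate
        self.data = None
        self.embeddings = None

    def load_model(self):
        raise NotImplementedError("Subclasses must implement load_model().")

    def process(self):
        raise NotImplementedError("Subclasses must implement process().")


class MeanStdEmbedding(BaseEmbeddingStandIn):
    """Toy subclass: represents each clip by its mean and standard deviation."""

    def load_model(self):
        # No real model is loaded; a plain callable stands in for one.
        self.model = lambda clip: np.array([clip.mean(), clip.std()])

    def process(self):
        vectors = [self.model(clip) for clip in self.data["audio_data"]]
        self.embeddings = pd.DataFrame(
            vectors, index=self.data.index, columns=["mean", "std"]
        )
        return self.embeddings


emb = MeanStdEmbedding("demo", sampling_rate=16000)
# Assumed layout: indexed by 'filename' with an 'audio_data' column.
emb.data = pd.DataFrame(
    {"audio_data": [np.ones(4), np.arange(4.0)]},
    index=pd.Index(["a_0_3.wav", "b_0_3.wav"], name="filename"),
)
emb.load_model()
result = emb.process()
```

In the real class, `data` would presumably be populated by `read_audio_dataset()` rather than assigned by hand as it is here.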

dataset_name
model_path
sampling_rate
clip_duration
data
embeddings
list_of_audio_files = []
path_dataset = None
load_model()

Placeholder method for loading the model. This should be implemented by subclasses if needed.

abstract process()

Placeholder method for processing audio files. This should be implemented by subclasses.

get_path_dataset(url_db: str | None = None)
read_audio_dataset() → pandas.DataFrame

Read the dataset of audio files and process it using Dask for parallelization if available.

Returns:

A pandas DataFrame containing audio file paths and any other relevant metadata.
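One common way to use Dask "if available" is to attempt the import and fall back to a sequential loop. The sketch below shows that pattern under assumptions: `load_one` and `read_all` are hypothetical names, `load_one` returns a placeholder instead of decoded audio, and nothing here is confirmed to match how `read_audio_dataset` is actually written.

```python
import pandas as pd

try:
    import dask  # optional dependency

    HAS_DASK = True
except ImportError:
    HAS_DASK = False


def load_one(path):
    """Hypothetical per-file loader; the real code would return audio chunks."""
    return pd.DataFrame({"filename": [path], "audio_data": [None]})


def read_all(paths):
    """Load every file, in parallel through Dask when it is installed."""
    if HAS_DASK:
        # Build lazy tasks, then evaluate them together; order is preserved.
        tasks = [dask.delayed(load_one)(p) for p in paths]
        frames = list(dask.compute(*tasks))
    else:
        frames = [load_one(p) for p in paths]

    if not frames:
        return pd.DataFrame(
            columns=["audio_data"], index=pd.Index([], name="filename")
        )
    return pd.concat(frames, ignore_index=True).set_index("filename")


df = read_all(["a.wav", "b.wav"])
```

Both branches produce the same DataFrame, so callers do not need to know whether Dask was used.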

save_embeddings(embedding_method_name: str, embeddings)
_save_embedding_metadata_to_db(dataset_id: int, embedding_id: int, file_path: str) → None