src.embeddings.vae

Attributes

logger

Classes

VAEEmbedding

VAEEmbedding class for computing spectrograms from audio data and fitting a Variational Autoencoder (VAE).

Functions

_compute_spectrogram(audio, sample_rate, resolution, ...)

Module Contents

src.embeddings.vae.logger
class src.embeddings.vae.VAEEmbedding(dataset_name: str, clip_duration: float = 3.0, model_path: str or pathlib.Path or None = None, sampling_rate: int or None = None, learning_rate=0.05, batch_size=16, epochs=10, latent_dim=128, beta_kl=1, kw_spectrograms: dict or None = None)

Bases: embeddings.BaseEmbedding

VAEEmbedding class for computing spectrograms from audio data and fitting a Variational Autoencoder (VAE).

This class extends BaseEmbedding and provides functionality for computing spectrograms and training a VAE on those spectrograms.

Attributes:

model_path : str

Path where the VAE model will be saved or loaded.

data : pd.DataFrame or None

DataFrame holding the computed spectrograms.

vae : tensorflow.keras.Model

The VAE model used to fit the spectrogram data.

Methods:

load_model():

Loads a pre-trained VAE model if available.

process(dataset_name: str, extension: str = '.wav', sampling_rate: int = 48000, **kwargs):

Processes the audio dataset by computing spectrograms and fitting a VAE.
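
The constructor takes latent_dim and beta_kl, but this page does not say how the VAE objective is assembled. As a rough illustration only, the sketch below assumes the common beta-VAE convention: a reconstruction term plus beta_kl times the KL divergence between the encoder's Gaussian and a standard normal prior. The function name beta_vae_loss_sketch, the squared-error reconstruction term, and the z_mean / z_log_var arguments are all hypothetical and may differ from what VAEEmbedding does internally.

```python
import numpy as np


def beta_vae_loss_sketch(x, x_recon, z_mean, z_log_var, beta_kl=1.0):
    """Hypothetical loss: reconstruction error plus a beta_kl-weighted KL term."""
    x = np.asarray(x, dtype=float)
    x_recon = np.asarray(x_recon, dtype=float)
    z_mean = np.asarray(z_mean, dtype=float)
    z_log_var = np.asarray(z_log_var, dtype=float)

    # Sum of squared errors per sample (an assumed choice of reconstruction term).
    recon = np.sum((x - x_recon) ** 2, axis=tuple(range(1, x.ndim)))

    # Closed-form KL divergence between N(z_mean, exp(z_log_var)) and N(0, I).
    kl = -0.5 * np.sum(1.0 + z_log_var - z_mean ** 2 - np.exp(z_log_var), axis=1)

    # Average over the batch; beta_kl scales how strongly the prior is enforced.
    return float(np.mean(recon + beta_kl * kl))
```

Under this assumption, beta_kl=1 corresponds to the standard VAE objective, and larger values trade reconstruction quality for a latent space closer to the prior.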

learning_rate
epochs
batch_size
latent_dim
beta_kl
kw_spectrograms
model = None
spectrograms = None
compute_spectrograms()
load_model()
train_model()
process()
src.embeddings.vae._compute_spectrogram(audio: numpy.array, sample_rate: int, resolution: float, overlap: float, freq_min: float, freq_max: float or None, n_freqs: int, **kwargs)
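
The signature above does not document its parameters, so the following is a minimal NumPy sketch of one way a function with these arguments could work, not the module's implementation. It assumes that resolution is the analysis window length in seconds, overlap is the fraction of a window shared by consecutive frames, freq_min and freq_max bound the retained band (with None meaning the Nyquist frequency), and n_freqs is the number of output frequency bins. The name compute_spectrogram_sketch, the default values, and the returned (frequencies, spectrogram) pair are hypothetical.

```python
import numpy as np


def compute_spectrogram_sketch(audio, sample_rate, resolution=0.02, overlap=0.5,
                               freq_min=0.0, freq_max=None, n_freqs=64):
    """Hypothetical stand-in, not the module's _compute_spectrogram."""
    audio = np.asarray(audio, dtype=float)
    win = max(2, int(round(resolution * sample_rate)))  # samples per window
    hop = max(1, int(round(win * (1.0 - overlap))))     # samples between windows
    if audio.size < win:
        raise ValueError("audio is shorter than one analysis window")
    if freq_max is None:
        freq_max = sample_rate / 2.0                    # Nyquist frequency

    # Slice the signal into overlapping frames and taper each with a Hann window.
    n_frames = 1 + (audio.size - win) // hop
    idx = np.arange(win)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = audio[idx] * np.hanning(win)

    # Power spectrum per frame, shape (n_frames, win // 2 + 1).
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    freqs = np.fft.rfftfreq(win, d=1.0 / sample_rate)

    # Resample every frame onto n_freqs evenly spaced frequencies in the band.
    target = np.linspace(freq_min, freq_max, n_freqs)
    spec = np.stack([np.interp(target, freqs, row) for row in power], axis=1)

    # Log scale in dB; the small constant avoids log10(0).
    return target, 10.0 * np.log10(spec + 1e-12)
```

The result has shape (n_freqs, n_frames), so each column is one time frame. A fixed-size array like this is convenient as input to a convolutional or dense VAE, which is presumably why the real function exposes n_freqs.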