src.clustering
==============

.. py:module:: src.clustering


Submodules
----------

.. toctree::
   :maxdepth: 1

   /autoapi/src/clustering/affinity/index
   /autoapi/src/clustering/dbscan/index
   /autoapi/src/clustering/hdbscan/index
   /autoapi/src/clustering/kmeans/index
   /autoapi/src/clustering/optics/index
   /autoapi/src/clustering/spectral/index


Classes
-------

.. autoapisummary::

   src.clustering.BaseClustering


Functions
---------

.. autoapisummary::

   src.clustering.get_clustering_model


Package Contents
----------------

.. py:class:: BaseClustering(dataset_name: str = None, embedding_method: str = None, dataset_id: int = None, embedding_id: int = None, embeddings: pandas.DataFrame = None)

   Base class for clustering models used to group data based on embeddings or other features.

   This class provides core functionality for clustering models, including loading data and storing
   clustering results. It is meant to be subclassed by specific clustering algorithms, which should
   implement their own logic for fitting the model and predicting clusters.

   .. rubric:: Attributes

   data : pd.DataFrame or None
       DataFrame containing the data to be clustered.
   labels : pd.Series or None
       Series containing the cluster labels assigned to the data.

   .. rubric:: Methods

   load_data() -> pd.DataFrame
       Loads a dataset from a CSV or pickle file into a pandas DataFrame.
   save_labels(labels)
       Saves the cluster labels to a CSV or pickle file.
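
   As a rough illustration of the subclassing pattern described above, the sketch below
   defines a stand-in base class (``_BaseClusteringStandIn``, hypothetical, mirroring only the
   documented constructor) and a toy subclass (``SignClustering``, also hypothetical) that
   overrides ``fit_predict``. The real ``BaseClustering`` may behave differently, and the
   labelling rule is a placeholder, not an actual clustering algorithm.

   .. code-block:: python

      import pandas as pd


      class _BaseClusteringStandIn:
          """Hypothetical stand-in mirroring the documented constructor only."""

          def __init__(self, dataset_name=None, embedding_method=None,
                       dataset_id=None, embedding_id=None, embeddings=None):
              self.dataset_name = dataset_name
              self.embedding_method = embedding_method
              self.dataset_id = dataset_id
              self.embedding_id = embedding_id
              self.embeddings = embeddings
              self.data = None
              self.labels = None

          def fit_predict(self):
              raise NotImplementedError("subclasses implement fit_predict")


      class SignClustering(_BaseClusteringStandIn):
          """Toy subclass: label rows by the sign of their first column."""

          def fit_predict(self):
              first = self.embeddings.iloc[:, 0]
              # Placeholder rule: 1 where the first coordinate is non-negative, else 0.
              self.labels = (first >= 0).astype(int).rename("cluster")
              return self.labels.to_frame()


      embeddings = pd.DataFrame({"x": [-1.0, 0.5, 2.0], "y": [0.0, 1.0, -1.0]})
      model = SignClustering(dataset_name="demo", embeddings=embeddings)
      result = model.fit_predict()

   Here ``result`` is a one-column DataFrame of labels, and the same labels are kept on
   ``model.labels`` as a Series, matching the attribute types listed for the class.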


   .. py:attribute:: dataset_name


   .. py:attribute:: embedding_method


   .. py:attribute:: dataset_id


   .. py:attribute:: embedding_id


   .. py:attribute:: embeddings


   .. py:attribute:: data
      :value: None


   .. py:attribute:: labels
      :value: None


   .. py:method:: load_data() -> pandas.DataFrame

      Load the data to be clustered from a CSV or pickle file.

      :return: DataFrame containing the loaded data.


   .. py:method:: scale_data(data)


   .. py:method:: save_labels(labels)

      Save the cluster labels to a CSV or pickle file.

      :param labels: Cluster labels to save.


   .. py:method:: fit_predict()

      Cluster the data held in ``self.embeddings`` and return the assigned labels.

      :return: DataFrame containing the cluster labels assigned to the data.


.. py:function:: get_clustering_model(method_name: str, *args, **kwargs)