New class structure in the package and incorporation of distance matrix clustering #36

Merged
merged 11 commits into from
Dec 4, 2023
220 changes: 129 additions & 91 deletions CONTRIBUTING.md

Large diffs are not rendered by default.

14 changes: 12 additions & 2 deletions README.md
@@ -2,7 +2,7 @@
[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/HiPart)](https://pypi.org/project/HiPart/)
[![example workflow](https://github.com/panagiotisanagnostou/HiPart/actions/workflows/python-app.yml/badge.svg)](https://github.com/panagiotisanagnostou/HiPart/blob/main/.github/workflows/python-app.yml)
[![codecov](https://codecov.io/gh/panagiotisanagnostou/HiPart/branch/main/graph/badge.svg?token=FHoZrLjqfj)](https://codecov.io/gh/panagiotisanagnostou/HiPart)
[![Codacy Badge](https://app.codacy.com/project/badge/Grade/60c751d914474e288b369461e6e3466a)](https://www.codacy.com/gh/panagiotisanagnostou/HiPart/dashboard?utm_source=github.com&utm_medium=referral&utm_content=panagiotisanagnostou/HiPart&utm_campaign=Badge_Grade)
[![Codacy Badge](https://app.codacy.com/project/badge/Grade/60c751d914474e288b369461e6e3466a)](https://app.codacy.com/gh/panagiotisanagnostou/HiPart/dashboard?utm_source=gh&utm_medium=referral&utm_content=&utm_campaign=Badge_grade)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://github.com/panagiotisanagnostou/HiPart/blob/main/LICENSE)
[![DOI](https://joss.theoj.org/papers/10.21105/joss.05024/status.svg)](https://doi.org/10.21105/joss.05024)

@@ -31,7 +31,17 @@ X, y = make_blobs(n_samples=1500, centers=6, random_state=0)
clustered_class = DePDDP(max_clusters_number=6).fit_predict(X)
```

Users can find complete execution examples for all the algorithms of the HiPart package in the [clustering_example](https://github.com/panagiotisanagnostou/HiPart/blob/main/examples/clustering_example.py) file of the repository. Also, the users can find a KernelPCA method usage example in the [clustering_with_kpca_example](https://github.com/panagiotisanagnostou/HiPart/blob/main/examples/clustering_with_kpca_example.py) file of the repository. Finally, the file [interactive_visualization_example](https://github.com/panagiotisanagnostou/HiPart/blob/main/examples/interactive_visualization_example.py) contains an example execution of the interactive visualization. The instructions for the interactive visualization GUI can be found with the execution of this visualization.
The HiPart package offers a comprehensive suite of examples to guide users in utilizing its various algorithms. These examples are conveniently located in the repository's examples directory.

For a general understanding of the package's capabilities, users can refer to the [clustering_example](https://github.com/panagiotisanagnostou/HiPart/blob/main/examples/clustering_example.py) file. This file serves as a foundational guide, providing complete examples of the package's algorithms in action.

Additionally, for those interested in incorporating KernelPCA methods, the [clustering_with_kpca_example](https://github.com/panagiotisanagnostou/HiPart/blob/main/examples/clustering_with_kpca_example.py) file is an invaluable resource. It offers a detailed example of how to apply KernelPCA within the context of the HiPart package.

Recognizing the importance of clustering via similarity or dissimilarity matrices, such as distance matrices, the HiPart package includes the [clustering_with_distance_matrix_example](https://github.com/panagiotisanagnostou/HiPart/blob/main/examples/distance_matrix_example.py) file. This specific example demonstrates the use of the DePDDP algorithm with a distance matrix, offering a practical application scenario.
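
As a rough illustration of the kind of input this paragraph refers to, the sketch below builds a pairwise Euclidean distance matrix and checks the properties one would normally expect of it (square, symmetric, non-negative, zero diagonal). It uses only the standard library; the helper names `pairwise_distances` and `is_valid_distance_matrix` are hypothetical and are not part of HiPart, and which checks HiPart itself performs on its input is not shown in this diff.

```python
from math import dist, isclose


def pairwise_distances(points):
    """Return the full matrix of Euclidean distances between all pairs of points."""
    return [[dist(p, q) for q in points] for p in points]


def is_valid_distance_matrix(matrix, tol=1e-12):
    """Check that the matrix is square, symmetric, non-negative and zero on the diagonal."""
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        return False
    for i in range(n):
        if not isclose(matrix[i][i], 0.0, abs_tol=tol):
            return False
        for j in range(i + 1, n):
            if matrix[i][j] < 0:
                return False
            if not isclose(matrix[i][j], matrix[j][i], abs_tol=tol):
                return False
    return True


# Three collinear points, chosen so the distances are easy to check by hand.
points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
D = pairwise_distances(points)

print(D[0][1], D[0][2])             # distances from the first point to the other two
print(is_valid_distance_matrix(D))
```

In the repository's example the matrix is produced with `scipy.spatial.distance_matrix(X, X)` instead; the pure-Python version here only makes the structure of the input explicit.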

Lastly, the package features an interactive visualization component, which is exemplified in the [interactive_visualization_example](https://github.com/panagiotisanagnostou/HiPart/blob/main/examples/interactive_visualization_example.py) file. This example not only showcases the execution of the interactive visualization but also provides comprehensive instructions for navigating the visualization GUI.

These resources collectively ensure that users of the HiPart package have a well-rounded and practical understanding of its functionalities and applications.

Documentation
-------------
1 change: 1 addition & 0 deletions docs/HiPart.rst
@@ -2,6 +2,7 @@ Package contents
================

.. toctree::
:maxdepth: 2
:hidden:

.. automodule:: HiPart
2 changes: 1 addition & 1 deletion docs/conf.py
@@ -21,7 +21,7 @@
author = 'Panagiotis Anagnostou'

# The full version, including alpha/beta/rc tags
release = '0.4.2'
release = '1.0.0'

# -- General configuration ---------------------------------------------------

20 changes: 15 additions & 5 deletions docs/index.rst
@@ -13,7 +13,7 @@ Welcome to HiPart's documentation!
.. image:: https://codecov.io/gh/panagiotisanagnostou/HiPart/branch/main/graph/badge.svg?token=FHoZrLjqfj
:target: https://codecov.io/gh/panagiotisanagnostou/HiPart
.. image:: https://app.codacy.com/project/badge/Grade/60c751d914474e288b369461e6e3466a
:target: https://www.codacy.com/gh/panagiotisanagnostou/HiPart/dashboard?utm_source=github.com&utm_medium=referral&utm_content=panagiotisanagnostou/HiPart&utm_campaign=Badge_Grade
:target: https://app.codacy.com/gh/panagiotisanagnostou/HiPart/dashboard?utm_source=gh&utm_medium=referral&utm_content=&utm_campaign=Badge_grade
.. image:: https://img.shields.io/badge/License-MIT-yellow.svg
:target: https://github.com/panagiotisanagnostou/HiPart/blob/main/LICENSE
.. image:: https://joss.theoj.org/papers/10.21105/joss.05024/status.svg
@@ -25,7 +25,7 @@ HiPart is a package created for the implementation of hierarchical divisive clus

Installation
------------
For the installation of the package, the only necessary actions and requirements are a version of Python higher or equal to 3.7 and the execution of the following command.
For the installation of the package, the only necessary actions and requirements are a version of Python higher or equal to 3.8 and the execution of the following command.

.. code-block:: sh

@@ -44,7 +44,17 @@ The example below is the simplest form of the package's execution. Shortly, it

clustered_class = DePDDP(max_clusters_number=6).fit_predict(X)

Users can find complete execution examples for all the algorithms of the HiPart package in the `clustering_example <https://github.com/panagiotisanagnostou/HiPart/blob/main/examples/clustering_example.py>`_ file of the repository. Also, the users can find a KernelPCA method usage example in the `clustering_with_kpca_example <https://github.com/panagiotisanagnostou/HiPart/blob/main/examples/clustering_with_kpca_example.py>`_ file of the repository. Finally, the file `interactive_visualization_example <https://github.com/panagiotisanagnostou/HiPart/blob/main/examples/interactive_visualization_example.py>`_ contains an example execution of the interactive visualization. The instructions for the interactive visualization GUI can be found with the execution of this visualization.
The HiPart package offers a comprehensive suite of examples to guide users in utilizing its various algorithms. These examples are conveniently located in the repository's examples directory.

For a general understanding of the package's capabilities, users can refer to the `clustering_example <https://github.com/panagiotisanagnostou/HiPart/blob/main/examples/clustering_example.py>`_ file. This file serves as a foundational guide, providing complete examples of the package's algorithms in action.

Additionally, for those interested in incorporating KernelPCA methods, the `clustering_with_kpca_example <https://github.com/panagiotisanagnostou/HiPart/blob/main/examples/clustering_with_kpca_example.py>`_ file is an invaluable resource. It offers a detailed example of how to apply KernelPCA within the context of the HiPart package.

Recognizing the importance of clustering via similarity or dissimilarity matrices, such as distance matrices, the HiPart package includes the `clustering_with_distance_matrix_example <https://github.com/panagiotisanagnostou/HiPart/blob/main/examples/distance_matrix_example.py>`_ file. This specific example demonstrates the use of the DePDDP algorithm with a distance matrix, offering a practical application scenario.

Lastly, the package features an interactive visualization component, which is exemplified in the `interactive_visualization_example <https://github.com/panagiotisanagnostou/HiPart/blob/main/examples/interactive_visualization_example.py>`_ file. This example not only showcases the execution of the interactive visualization but also provides comprehensive instructions for navigating the visualization GUI.

These resources collectively ensure that users of the HiPart package have a well-rounded and practical understanding of its functionalities and applications.


Citation
@@ -76,10 +86,10 @@ Contents
-------------

.. toctree::
:maxdepth: 2
:maxdepth: 3

self
modules
HiPart
examples

* :ref:`genindex`
7 changes: 0 additions & 7 deletions docs/modules.rst

This file was deleted.

2 changes: 1 addition & 1 deletion examples/clustering_example.py
@@ -8,9 +8,9 @@
from sklearn.metrics import adjusted_rand_score as ari
from sklearn.metrics import normalized_mutual_info_score as nmi

import HiPart.visualizations as viz
import matplotlib.pyplot as plt
import time
import HiPart.visualizations as viz

# %% Example data creation
# number of clusters in the data
8 changes: 4 additions & 4 deletions examples/clustering_with_kpca_example.py
@@ -1,11 +1,11 @@
import HiPart.visualizations as viz
import matplotlib.pyplot as plt
import numpy as np

from HiPart.clustering import DePDDP
from sklearn.decomposition import KernelPCA
from sklearn.datasets import make_circles

import HiPart.visualizations as viz
import matplotlib.pyplot as plt
import numpy as np


def plot_manafolds(X, y, vals, gamma, title):
"""
47 changes: 47 additions & 0 deletions examples/distance_matrix_example.py
@@ -0,0 +1,47 @@
from HiPart.clustering import DePDDP
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score as ari
from sklearn.metrics import normalized_mutual_info_score as nmi
from scipy.spatial import distance_matrix

import HiPart.visualizations as viz
import HiPart.interactive_visualization as iviz
import time

# %% Example data creation
# number of clusters in the data
clusters = 5

X, y = make_blobs(
    n_samples=500,
    centers=5,
    cluster_std=.8,
    random_state=123,
)
print("Example data shape: {}\n".format(X.shape))

# Calculate distance matrix
dist_matrix = distance_matrix(X, X)

# %% dePDDP algorithm execution
# timer for the execution time in the form of tic-toc
tic = time.perf_counter()
depddp = DePDDP(
    decomposition_method="mds",
    max_clusters_number=clusters,
    bandwidth_scale=0.5,
    percentile=0.1,
    distance_matrix=True,
    random_state=12,
).fit(dist_matrix)
toc = time.perf_counter()

# results evaluation in terms of execution time, NMI and ARI metrics
print("depddp_time= {val:.5f}".format(val=toc - tic))
print("depddp_nmi= {val:.5f}".format(val=nmi(y, depddp.labels_)))
print("depddp_ari= {val:.5f}\n".format(val=ari(y, depddp.labels_)))

# scatter visualization
viz.split_visualization(depddp).show()
# interactive visualization
iviz.main(depddp)
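
The example above scores the clustering with scikit-learn's `adjusted_rand_score`. To make clear what that score measures, here is a minimal pure-Python sketch of the standard adjusted Rand index formula, computed from pair counts in the contingency table. It is an illustration only, not scikit-learn's implementation, and `adjusted_rand_index` is a hypothetical helper name.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index of two labelings, computed from pair counts."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("both labelings must cover the same samples")
    n = len(labels_true)
    if n < 2:  # no pairs to compare, treat as perfect agreement
        return 1.0

    # Number of sample pairs placed together by both labelings, and by each one.
    joint = Counter(zip(labels_true, labels_pred))
    pairs_joint = sum(comb(c, 2) for c in joint.values())
    pairs_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    pairs_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = pairs_true * pairs_pred / comb(n, 2)  # agreement expected by chance
    maximum = (pairs_true + pairs_pred) / 2          # upper bound on the agreement
    if maximum == expected:  # degenerate case, e.g. both labelings are one cluster
        return 1.0
    return (pairs_joint - expected) / (maximum - expected)


truth = [0, 0, 0, 1, 1, 1]
renamed = [5, 5, 5, 2, 2, 2]  # same partition, different label names
one_off = [0, 0, 1, 1, 1, 1]  # one sample assigned to the wrong cluster

print(adjusted_rand_index(truth, renamed))
print(adjusted_rand_index(truth, one_off))
```

Because the index depends only on which samples are grouped together, renaming the cluster labels leaves the score unchanged, which is why `depddp.labels_` can be compared with `y` directly even though the label values need not match.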
2 changes: 1 addition & 1 deletion setup.py
@@ -3,7 +3,7 @@
with open("README.md", "r", encoding="utf-8") as fh:
long_description = fh.read()

__version__ = "0.4.2"
__version__ = "1.0.0"

setuptools.setup(
name="HiPart",
2 changes: 1 addition & 1 deletion src/HiPart/__init__.py
@@ -37,7 +37,7 @@
from KDEpy.TreeKDE import TreeKDE
from KDEpy.FFTKDE import FFTKDE

__version__ = "0.4.2"
__version__ = "1.0.0"
__author__ = "Panagiotis Anagnostou"

TreeKDE = TreeKDE