This is an implementation of the Spectral Clustering algorithm in Python using the concept of Algebraic Connectivity. For an introduction to these two topics, please check out our presentation here and our summary here.
The algorithm works as follows:
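As a rough illustration of the idea behind algebraic connectivity, and not necessarily what `main.py` does, the sketch below splits a graph into two clusters using the sign of the Fiedler vector (the eigenvector belonging to the second-smallest eigenvalue of the graph Laplacian). The function name `fiedler_bipartition` and the six-vertex graph are hypothetical, chosen only for this example, and NumPy is assumed to be installed. It covers only the two-cluster case.

```python
import numpy as np

def fiedler_bipartition(adjacency):
    """Split a connected weighted graph in two using the Fiedler vector.

    Hypothetical helper for illustration; not taken from this repository.
    """
    W = np.asarray(adjacency, dtype=float)
    # Unnormalized graph Laplacian: L = D - W, with D the diagonal degree matrix.
    laplacian = np.diag(W.sum(axis=1)) - W
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigenvalues, eigenvectors = np.linalg.eigh(laplacian)
    algebraic_connectivity = eigenvalues[1]
    fiedler_vector = eigenvectors[:, 1]
    # Vertices are grouped by the sign of their Fiedler vector entry.
    labels = (fiedler_vector >= 0).astype(int)
    return algebraic_connectivity, labels

# Made-up graph: triangles {0, 1, 2} and {3, 4, 5} joined by the edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
W = np.zeros((6, 6))
for u, v in edges:
    W[u, v] = W[v, u] = 1.0

connectivity, labels = fiedler_bipartition(W)
print(connectivity, labels)
```

Which triangle receives label 0 and which receives label 1 depends on the sign the eigensolver happens to pick for the eigenvector, so only the grouping is meaningful.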
This project was made as our final project for APMA 2812G: Combinatorial Theory in the Fall 2022 semester.
This section explains how the input files should be formatted. You can find some examples under the example folder.
The program takes in a text file of the following format:
- The first line of the file is as follows:
<Number of Nodes/Points> <Type of Input> <Additional Information>
We currently support 3 types of file inputs:
Type of Input | File Type |
---|---|
g | weighted graph |
e | data points with epsilon neighborhood |
s | data points with similarity function |
Let n be the number of vertices of the graph. The first line of a weighted graph file is as follows:
n g
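The graph will have vertices numbered from 0 to n-1. After the first line, there is one line per vertex, in order from vertex 0 to vertex n-1, listing that vertex's neighbors in the following format: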
<index of neighbor 1> <weight of edge to neighbor 1> ...
For example:
5 g
3 1 4 1
2 1
1 1 3 1
2 1 0 1 4 1
0 1 3 1
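To make the format concrete, here is a minimal sketch of a reader for the weighted graph input. It assumes each line after the header lists the neighbors of vertices 0 to n-1 in order, which is how the example above reads. The function name `parse_weighted_graph` is hypothetical and is not taken from this repository's code.

```python
def parse_weighted_graph(text):
    """Parse the weighted graph ('g') format into a dense adjacency matrix.

    Hypothetical helper for illustration only.
    """
    lines = text.strip().splitlines()
    header = lines[0].split()
    n, kind = int(header[0]), header[1]
    if kind != "g":
        raise ValueError("expected input type 'g'")
    adjacency = [[0.0] * n for _ in range(n)]
    # Line number `vertex` after the header describes that vertex's neighbors.
    for vertex, line in enumerate(lines[1:n + 1]):
        tokens = line.split()
        # Tokens come in (neighbor index, edge weight) pairs.
        for k in range(0, len(tokens), 2):
            neighbor, weight = int(tokens[k]), float(tokens[k + 1])
            adjacency[vertex][neighbor] = weight
    return adjacency

example = """5 g
3 1 4 1
2 1
1 1 3 1
2 1 0 1 4 1
0 1 3 1
"""
adjacency = parse_weighted_graph(example)
for row in adjacency:
    print(row)
```

The resulting matrix is symmetric for this example because every edge is listed from both of its endpoints.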
Let n be the number of data points and m be the epsilon value. The first line of the file is as follows:
n e m
Each line after the first should contain one data point. If the data points are d-dimensional, each line is formatted as follows:
<x_1 component of the point> ... <x_d component of the point>
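The sketch below shows one way an epsilon-neighborhood graph could be built from such a file. It rests on several assumptions that the text above does not confirm: distances are Euclidean, two points are connected when their distance is at most epsilon, and every edge gets weight 1. The names `parse_points_file` and `epsilon_graph` and the sample data are hypothetical.

```python
import math

def parse_points_file(text):
    """Return (input type, parameter, points) from a data points file."""
    lines = [line for line in text.strip().splitlines() if line.strip()]
    n, kind, parameter = lines[0].split()
    points = [[float(x) for x in line.split()] for line in lines[1:int(n) + 1]]
    return kind, float(parameter), points

def epsilon_graph(points, epsilon):
    """Connect every pair of points whose Euclidean distance is <= epsilon."""
    n = len(points)
    adjacency = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            distance = math.sqrt(
                sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            )
            if distance <= epsilon:  # assumption: non-strict comparison
                adjacency[i][j] = adjacency[j][i] = 1.0
    return adjacency

# Made-up input: two pairs of nearby points in the plane, epsilon = 1.5.
sample = """4 e 1.5
0 0
1 0
5 5
5 6
"""
kind, epsilon, points = parse_points_file(sample)
graph = epsilon_graph(points, epsilon)
for row in graph:
    print(row)
```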
The similarity function we use is the Gaussian similarity, which takes a parameter sigma. Let n be the number of data points. The first line of the file is as follows:
n s sigma
Each line after the first should contain one data point. If the data points are d-dimensional, each line is formatted as follows:
<x_1 component of the point> ... <x_d component of the point>
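For reference, a common form of the Gaussian similarity is exp(-||x - y||^2 / (2 * sigma^2)). The sketch below builds a fully connected similarity matrix with that formula. The exact normalization used by this project is not stated above, so treat the factor of 2 and the zero diagonal (no self-loops) as assumptions. The function names are hypothetical.

```python
import math

def gaussian_similarity(x, y, sigma):
    """Gaussian similarity exp(-||x - y||^2 / (2 * sigma^2)) (assumed form)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))

def similarity_matrix(points, sigma):
    """Dense similarity matrix; the diagonal is zero by assumption."""
    n = len(points)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            w = gaussian_similarity(points[i], points[j], sigma)
            weights[i][j] = weights[j][i] = w
    return weights

# Made-up points: two close together, one far away.
sample_points = [[0.0, 0.0], [1.0, 0.0], [3.0, 4.0]]
weights = similarity_matrix(sample_points, sigma=1.0)
for row in weights:
    print(row)
```

A smaller sigma makes the similarity fall off faster with distance, so distant points contribute almost nothing to the graph.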
To install dependencies (note that this program was made in Python 3.7.3):
pip install -r requirements.txt
To run tests:
cd tests
pytest
To run the clustering method:
python3 main.py <path_to_file> <number of clusters K>
To re-create the examples from our presentation, run the following commands:
python3 main.py example/graph/two_triangles.txt 2
python3 main.py example/data/two_epsilon.txt 2
python3 main.py example/data/two.txt 2
python3 main.py example/data/similarity.txt 4