- Step 1: Run python diamond_distance_matrix.py to compute the pairwise distances between all AMP pairs in the dataset D1.csv. (A precomputed matrix is provided in diamond_distance_matrix.zip, so you can skip this step.)
- Step 2: Run python diamond_get_amp_prototype.py to write the prototypes to the file amp_prototype_diamond_distance.txt
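
As a rough illustration of what a pairwise distance matrix over sequences looks like, here is a minimal sketch. It does not call DIAMOND and is not the code in diamond_distance_matrix.py: the distance used is a stand-in (normalized edit distance), and the function names `edit_distance` and `pairwise_distance_matrix` as well as the toy sequences are hypothetical.

```python
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (stand-in, NOT DIAMOND)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def pairwise_distance_matrix(seqs):
    """Symmetric matrix of normalized edit distances, with a zero diagonal."""
    n = len(seqs)
    dist = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        longest = max(len(seqs[i]), len(seqs[j]))
        d = edit_distance(seqs[i], seqs[j]) / longest if longest else 0.0
        dist[i][j] = dist[j][i] = d  # fill both triangles
    return dist


# Toy sequences (hypothetical, not taken from D1.csv)
toy_seqs = ["AAA", "AAB", "CCC"]
toy_matrix = pairwise_distance_matrix(toy_seqs)
```

The matrix is filled from the upper triangle only, so each pair is compared once and the result is symmetric by construction.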
A precomputed distance matrix for our current D1 set (used for the real oracle) is accessible here
To run: python k_medoids_better.py --n_medoids 500
- This will load the default distance matrix built on the full D1 set (real oracle), the list of medoids (initially 500), and the full list of associated neighbors (cluster members)
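
To make the relationship between a distance matrix, medoids, and cluster members concrete, here is a minimal k-medoids sketch. It uses a simple alternating scheme (assign each point to its nearest medoid, then re-pick each cluster's medoid) and is not necessarily what k_medoids_better.py does. The function names, the choice of the first k points as initial medoids, and the toy data are all assumptions made for illustration.

```python
def assign_clusters(dist, medoids):
    """Map each medoid index to the sorted list of point indices nearest to it."""
    clusters = {m: [] for m in medoids}
    for p in range(len(dist)):
        if p in clusters:
            clusters[p].append(p)  # a medoid always belongs to its own cluster
        else:
            nearest = min(medoids, key=lambda m: dist[p][m])
            clusters[nearest].append(p)
    return clusters


def k_medoids(dist, n_medoids, max_iter=100):
    """Alternating k-medoids on a precomputed square distance matrix."""
    medoids = list(range(n_medoids))  # assumption: first k points as initial medoids
    for _ in range(max_iter):
        clusters = assign_clusters(dist, medoids)
        # New medoid of a cluster: the member with the smallest total
        # distance to all other members of that cluster.
        new_medoids = [
            min(clusters[m], key=lambda c: sum(dist[c][p] for p in clusters[m]))
            for m in medoids
        ]
        if new_medoids == medoids:
            break  # converged: no medoid changed
        medoids = new_medoids
    return medoids, assign_clusters(dist, medoids)


# Toy example: six points on a line, forming two obvious groups
points = [0, 1, 2, 10, 11, 12]
toy_dist = [[abs(a - b) for b in points] for a in points]
toy_medoids, toy_clusters = k_medoids(toy_dist, n_medoids=2)
```

On this toy input the sketch returns one medoid per group, together with a dictionary mapping each medoid index to the indices of its neighbors.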
The HackMD note summarizing the approach is here