Skip to content

Latest commit

 

History

History
124 lines (101 loc) · 4.26 KB

README.md

File metadata and controls

124 lines (101 loc) · 4.26 KB

This repository contains scripts and notebooks to reproduce the experiments in

From t-SNE to UMAP with contrastive learning ICLR 2023 (openreview, arxiv)
Sebastian Damrich, Niklas Böhm, Fred A Hamprecht, Dmitry Kobak

@inproceedings{damrich2023from,
  title={From $t$-{SNE} to {UMAP} with contrastive learning},
  author={Damrich, Sebastian and B{\"o}hm, Jan Niklas  and Hamprecht, Fred A and Kobak, Dmitry},
  booktitle={International Conference on Learning Representations},
  year={2023},
}

It depends on several other repositories, in particular contrastive-ne, which implement the actual logic and contain utilities.

Installation

Create and activate the conda environment

conda env create -f environment.yml
conda activate cl_tsne_umap

Install openTSNE, vis_utis, umap, ncvis and cne

git clone https://github.com/sdamrich/openTSNE
cd openTSNE
python setup.py install
cd ..

git clone https://github.com/sdamrich/vis_utils
cd vis_utils
python setup.py install
cd ..

git clone https://github.com/sdamrich/UMAPs-true-loss
cd UMAPs-true-loss
python setup.py install
cd ..

git clone https://github.com/sdamrich/ncvis
cd ncvis
make libs
make wrapper

git clone -b iclr2023 https://github.com/berenslab/contrastive-ne
pip install --no-deps . 
cd ..

Usage

To reproduce the Neg-t-SNE embeddings from Fig. 1 a)-e), run

python scripts/compute_embds_cne.py

and check out the results in notebooks/negtsne.ipynb.

Neg-t-SNE on MNIST

To reproduce the UMAP embeddings from Fig. S1 a)-c), run

python scripts/compute_embds_umap.py

and check out the results in notebooks/umap_vs_negtsne.ipynb.

UMAP no annealing

To compute the metrics for the Neg-t-SNE embedding spectra (Fig. S4), run

python scripts/compute_metrics.py

and check out the results in notebooks/metrics.ipynb.

kNN recall and Spearmann correlation over spectra

To reproduce the run time by batch size analysis from Fig. S6, run

python scripts/run_time_by_batch_size.py

and check out the results in notebooks/speed_up.ipynb.

Run time by batch size

To reproduce the SimCLR experiments with m=16 and random seed r=0, run

python cne_scripts_notebooks/scripts/cifar10_acc.py -m 16 -r 0

The results will be printed in terminal but can also be checked out in notebooks/eval_cifar.ipynb.

For other experiments adapt the parameters at the top of compute_embds_cne.py and compute_embds_umap.py or at the top of the main function in cifar10_acc.py accordingly. The number of negative samples and the random seed for cifar10_acc.py can be passed as command line arguments, as above. Downloaded datasets and neighbor embedding results will be saved in cne_scripts_notebooks/data and figures will be saved in cne_scripts_notebooks/figures.

All neighbor embedding results alongside their parameters can be inspected in the jupyter notebooks in cne_scripts_notebooks/notebooks. This list details which figures can be inspected using which notebooks:

  • Fig 1: negtsne.ipynb, tsne.ipynb, ncvis.ipynb
  • Fig 2: umap_vs_negtsne.ipynb
  • Fig 3: parametric.ipynb
  • Fig S1: umap_vs_negtsne.ipynb
  • Fig S2: umap_vs_negtsne.ipynb
  • Fig S3: trimap.ipynb
  • Fig S4: metrics.ipynb
  • Fig S5: toy_experiment.ipynb
  • Fig S6: speed_up.ipynb
  • Fig S7: attr_rep_plot_UMAP_neg.ipynb
  • Fig S8: umap_vs_negtsne_vary_n_noise.ipynb
  • Fig S9: tsne_vs_ncvis.ipynb
  • Fig S10: tsne_vs_ncvis.ipynb
  • Fig S11: negtsne.ipynb, tsne_ipynb, ncvis.ipynb
  • Fig S12: imba_mnist_negtsne.ipynb, imba_mnist_tsne_ncvis_umap.ipynb
  • Fig S13: human_negtsne.ipynb, human_tsne_ncvis_umap.ipynb
  • Fig S14: zebrafish_negtsne.ipynb, zebrafish_tsne_ncvis_umap.ipynb
  • Fig S15: c_elegans_negtsne.ipynb, c_elegans_tsne_ncvis_umap.ipynb
  • Fig S16: k49_negtsne.ipynb, k49_tsne_ncvis_umap.ipynb
  • Fig S17: k49_negtsne.ipynb
  • Fig S18: ncvis.ipynb, tsne.ipynb
  • Fig S19: infonctsne.ipynb, tsne.ipynb
  • Tab 1: eval_cifar.ipynb