SIMPLEs2020: Reproducible code for SIMPLEs experiments
Data sets used in the manuscript
- Can be downloaded from Zenodo: https://doi.org/10.5281/zenodo.3958371
NOTE: The GitHub links below might be broken. Please see the corresponding code and data on Zenodo.
-
mESC (Deng's) data
I used this dataset to show that:
- SIMPLEs can discover subtypes of cells: t-SNE plot of the imputed data
- Imputation results for differentially expressed genes: heatmap
Raw data: https://hemberg-lab.github.io/scRNA.seq.datasets/mouse/edev/#deng
Preprocessed data:
- deng_dat: log-normalized data with an outlier cell and some genes filtered out.
  - It has 4 objects: celltype_true (6 major cell types), cl2 (10 subtypes), Y2 (preprocessed data), clus (initial clustering results from k-means)
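To illustrate what "log-normalized and filtered" can mean in practice, here is a minimal Python sketch. It is an assumption, not the repository's preprocessing script: the scaling factor, the `log2(x + 1)` transform, the `min_cells` threshold, and the function names `log_normalize` and `filter_genes` are all hypothetical choices made for illustration.

```python
import math


def log_normalize(counts, scale=1e4):
    """Log-normalize a genes x cells count matrix (list of rows).

    Each cell is scaled by its library size (column sum), multiplied by
    `scale`, and transformed with log2(x + 1). The scale factor and the
    transform are illustrative assumptions.
    """
    n_cells = len(counts[0])
    # Library size of each cell = sum of its counts over all genes.
    lib = [sum(row[j] for row in counts) for j in range(n_cells)]
    return [
        [
            math.log2(row[j] / lib[j] * scale + 1) if lib[j] > 0 else 0.0
            for j in range(n_cells)
        ]
        for row in counts
    ]


def filter_genes(mat, min_cells=2):
    """Keep genes detected (value > 0) in at least `min_cells` cells.

    Returns the kept row indices and the filtered matrix.
    The threshold is a hypothetical default.
    """
    keep = [i for i, row in enumerate(mat) if sum(v > 0 for v in row) >= min_cells]
    return keep, [mat[i] for i in keep]
```

Calling `filter_genes(log_normalize(counts))` on a small count matrix drops genes that are detected in fewer than two cells and returns the indices of the genes that remain.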
The script for preprocessing and plotting the results: plot
The scripts for the different imputation methods: SCRABBLE_VIPER_SAVER, SIMPLE, and the results.
-
hESC (Chu's) data
Chu's data has two parts: 1) chu1: different cell types; 2) chu2: time series. Both have corresponding bulk RNASeq data. Because Chu's data is of good quality, I added dropouts to the original data to test whether the imputation methods can recover the truth.
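The dropout-based evaluation described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the repository's simulation script: dropouts are drawn uniformly at random with a fixed rate, and the names `add_dropouts` and `mse_on_dropped` are hypothetical. The actual experiments may use a different dropout model and different error metrics.

```python
import random


def add_dropouts(mat, rate=0.3, seed=0):
    """Set a random subset of positive entries to zero.

    mat  : genes x cells matrix (list of rows) of normalized expression.
    rate : probability that a positive entry is dropped (assumed uniform).
    Returns the masked copy and the list of (gene, cell) positions dropped.
    """
    rng = random.Random(seed)
    masked = [row[:] for row in mat]  # copy so the ground truth stays intact
    dropped = []
    for i, row in enumerate(masked):
        for j, value in enumerate(row):
            if value > 0 and rng.random() < rate:
                row[j] = 0.0
                dropped.append((i, j))
    return masked, dropped


def mse_on_dropped(truth, imputed, dropped):
    """Mean squared error between truth and imputed values at dropped positions."""
    if not dropped:
        return 0.0
    total = sum((truth[i][j] - imputed[i][j]) ** 2 for i, j in dropped)
    return total / len(dropped)
```

An imputation method would take `masked` as input. Comparing its output with the original matrix at the `dropped` positions gives one simple measure of how well the truth is recovered.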
Raw data: downloaded from GEO (GSE75748)
Preprocessed data:
- Chu_celltype: log-normalized data with genes filtered, for part 1.
  - Preprocessing script: preprocess
  - It has 5 objects: bulk_norm (log-normalized bulk RNASeq), dat_norm (log-normalized scRNASeq), celltype (cell type labels for scRNASeq), celltype0 (cell type labels for bulk RNASeq), bulk_norm_mean (mean gene expression for each cell type from bulk RNASeq). The rows (genes) of the bulk and scRNASeq data are the same.
- Chu_ts: log-normalized data with genes filtered, for part 2.
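As an illustration of how a per-cell-type mean such as bulk_norm_mean relates to a bulk matrix and its labels, here is a minimal Python sketch. It assumes a genes x samples matrix and one label per sample. The function name `mean_by_cell_type` and the toy labels are hypothetical, and the repository's own objects are stored as R data, not Python lists.

```python
from collections import defaultdict


def mean_by_cell_type(mat, labels):
    """Average each gene over the samples that share a cell type label.

    mat    : genes x samples matrix (list of rows).
    labels : one cell type label per sample (column).
    Returns a dict mapping each label to a list of per-gene means.
    """
    columns = defaultdict(list)
    for j, label in enumerate(labels):
        columns[label].append(j)
    return {
        label: [sum(row[j] for j in idx) / len(idx) for row in mat]
        for label, idx in columns.items()
    }
```

For example, `mean_by_cell_type(bulk, labels)` with two samples of one hypothetical type and one sample of another returns one mean profile per type, with genes in the same order as the input rows.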
Simulation based on Chu's cell type data (Chu1), i.e., adding dropouts to the original data set:
- Rscripts: SCRABBLE_VIPER_SAVER, SIMPLE_MAGIC_SCIMPUTE
Chu's cell type data:
- all cells (Chu_all):
  - Rscripts: SCRABBLE_VIPER_SAVER, MAGIC_SCIMPUTE, SIMPLES
  - Results: Others_result/Chu_all_*.RData, SIMPLEs
- only DE and EC cell types: plot
Chu's time series data (Chu2):
- Rscripts: SCRABBLE_VIPER_SAVER, MAGIC_SCIMPUTE, SIMPLES
- Results: Others_result/Chu_ts__.RData, scimpute_ts, SIMPLE-B
-
SIMPLEs: a single-cell RNA sequencing imputation strategy preserving gene modules and cell clusters variation. Zhirui Hu, Songpeng Zu, Jun S. Liu. bioRxiv, 2020.