SCITUNA

Single-Cell data Integration Tool Using Network Alignment

SCITUNA is a novel single-cell data integration approach that combines graph-based and anchor-based techniques. It constructs a graph for each batch to represent intra-batch cell similarities, and a bipartite graph to capture inter-batch similarities. This turns the integration problem into a many-to-one matching problem, in which cells from a query batch are matched with cells from a reference batch. The resulting matches are then used to transform the query cell space into the reference cell space.
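The idea described above can be illustrated with a toy example. The following is a minimal sketch, not SCITUNA's implementation: it assumes Euclidean distances, a k-nearest-neighbour intra-batch graph, a nearest-cell many-to-one matching, and a simple neighbour-averaged shift. All function names (`knn_graph`, `match_to_reference`, `integrate`) are hypothetical.

```python
import numpy as np

def pairwise_distances(a, b):
    """Euclidean distances between rows of a (n x d) and rows of b (m x d)."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

def knn_graph(x, k):
    """Intra-batch graph: indices of the k nearest neighbours of each cell."""
    d = pairwise_distances(x, x)
    np.fill_diagonal(d, np.inf)  # a cell is not its own neighbour
    return np.argsort(d, axis=1)[:, :k]

def match_to_reference(query, reference):
    """Many-to-one matching: each query cell picks its closest reference cell."""
    d = pairwise_distances(query, reference)  # weights of the bipartite graph
    return d.argmin(axis=1)

def integrate(query, reference, k=2):
    """Shift query cells toward the reference using neighbour-smoothed vectors."""
    matches = match_to_reference(query, reference)
    shift = reference[matches] - query  # per-cell correction vector
    neighbours = knn_graph(query, k)
    # Average each cell's shift with the shifts of its intra-batch neighbours.
    smoothed = (shift + shift[neighbours].sum(axis=1)) / (k + 1)
    return query + smoothed, matches

# Toy data: the query batch repeats reference cells with a constant offset.
reference = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
query = reference[[0, 0, 1, 2, 2]] + np.array([0.2, 0.1])

corrected, matches = integrate(query, reference)
print(matches)    # several query cells may share one reference cell
print(corrected)
```

In this toy setting the batch effect is a constant offset, so the smoothed shift removes it and each corrected query cell lands on its matched reference cell.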

SCITUNA operates directly in the original gene expression space.
The method introduces a novel batch ordering strategy based on optimal transport cost.
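
As an illustration of ordering batches by optimal transport cost, the sketch below estimates the cost between each pair of batches with a basic Sinkhorn iteration and then builds a greedy order. The README does not specify SCITUNA's exact ordering rule, so the greedy rule, the uniform cell weights, the squared Euclidean cost, and the names `sinkhorn_cost` and `greedy_order` are all assumptions made for this example.

```python
import numpy as np

def sinkhorn_cost(x, y, reg=0.1, n_iter=500):
    """Entropy-regularised OT cost between two point clouds, uniform weights."""
    cost = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=2)
    kernel = np.exp(-(cost / cost.max()) / reg)  # scale cost for stability
    a = np.full(len(x), 1.0 / len(x))
    b = np.full(len(y), 1.0 / len(y))
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (kernel.T @ u)
        u = a / (kernel @ v)
    plan = u[:, None] * kernel * v[None, :]  # approximate transport plan
    return float((plan * cost).sum())

def greedy_order(batches):
    """Start from the cheapest pair, then append the batch closest to the set."""
    names = list(batches)
    cost = {(p, q): sinkhorn_cost(batches[p], batches[q])
            for i, p in enumerate(names) for q in names[i + 1:]}

    def lookup(p, q):
        return cost[(p, q)] if (p, q) in cost else cost[(q, p)]

    order = list(min(cost, key=cost.get))
    remaining = [n for n in names if n not in order]
    while remaining:
        nxt = min(remaining, key=lambda n: min(lookup(n, o) for o in order))
        order.append(nxt)
        remaining.remove(nxt)
    return order

# Toy batches: the same cells shifted by increasing amounts.
rng = np.random.default_rng(0)
base = rng.normal(size=(20, 2))
batches = {"A": base, "B": base + [1.0, 0.0], "C": base + [10.0, 0.0]}

order = greedy_order(batches)
print(order)
```

Batches that are cheap to transport onto each other are placed together early in the order, and the most distant batch comes last.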

For more information, please refer to the article, which can be found here.

The main stages of the SCITUNA workflow: a) preprocessing and normalization, b) dimensionality reduction and clustering, c) construction of the intra-graphs and the inter-graph, d) anchor selection, e) integration, and f) visualization of the integration results.

Run SCITUNA

Below are the steps to obtain the results in the paper.

Get Datasets

To download the employed datasets, follow these steps:

Navigate to the data directory:
```
cd data
```
Run the script to download a dataset. The dataset argument can be pancreas, lung, small_atac_peaks, or small_atac_windows:
```
python get_data.py [dataset]
```

Example usage:

```
python get_data.py pancreas
```

Multi-batch Integrations

To integrate multiple batches using SCITUNA, run the following command:

```
python multi_batch_integration.py --i [input_dataset] --b [batch_id] --c [num_cores]
```

Arguments

--i (input_dataset): The dataset file located in "data/" (supported formats: H5AD).

--b (batch_id): The column name in ".obs" that indicates batch labels for integration.

--c (num_cores): Number of CPU cores to use for parallel processing.
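
To show how these three flags map onto values inside a script, here is a minimal `argparse` sketch. It is not taken from the SCITUNA source: the help strings, the default of one core, and the example values `pancreas.h5ad` and `batch` are assumptions for illustration.

```python
import argparse

def build_parser():
    """Hypothetical parser mirroring the --i, --b and --c flags listed above."""
    parser = argparse.ArgumentParser(description="SCITUNA-style command line")
    parser.add_argument("--i", required=True,
                        help="dataset file located in data/ (H5AD)")
    parser.add_argument("--b", required=True,
                        help="column name in .obs that holds batch labels")
    parser.add_argument("--c", type=int, default=1,  # assumed default
                        help="number of CPU cores for parallel processing")
    return parser

# Example values are placeholders, not files shipped with the repository.
args = build_parser().parse_args(
    ["--i", "pancreas.h5ad", "--b", "batch", "--c", "4"]
)
print(args.i, args.b, args.c)
```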

Pairwise Integrations

To perform pairwise batch integration using SCITUNA, run the following command:

```
python pairwise_integration.py --i [input_dataset] --b [batch_id] --c [num_cores]
```
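
If you want to script both integration modes for the same dataset, the commands can be assembled in Python and passed to `subprocess.run`. The sketch below only builds and prints the commands. The helper `build_command`, the file name `pancreas.h5ad`, and the batch column `batch` are hypothetical placeholders.

```python
import shlex

def build_command(script, dataset, batch_key, cores):
    """Return the argument list for one SCITUNA script invocation."""
    return ["python", script, "--i", dataset, "--b", batch_key, "--c", str(cores)]

# Hypothetical file name and batch column; replace them with your own.
runs = [
    build_command("multi_batch_integration.py", "pancreas.h5ad", "batch", 4),
    build_command("pairwise_integration.py", "pancreas.h5ad", "batch", 4),
]

for cmd in runs:
    # To execute instead of print: subprocess.run(cmd, check=True)
    print(shlex.join(cmd))
```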

We provide t-SNE and UMAP plots for a deeper analysis of the results. You can access them through this Google Drive link.

