1-soupX_template.Rmd

---
title: "SoupX Template"
author: "Jane Velghe"
date: "2022-08-03"
output: "pdf_document"
---

This is an [R Markdown](http://rmarkdown.rstudio.com) Notebook. When you execute code within the notebook, the results appear beneath the code. 

Try executing this chunk by clicking the *Run* button within the chunk or by placing your cursor inside it and pressing *Ctrl+Shift+Enter*. 

```{r} 
library(dplyr)
library(Seurat)
library(ggplot2)
library(patchwork)
library(SoupX)
library(DoubletFinder)
library(Matrix)
library(DropletUtils)
# toc - table of counts
# tod = table of droplets
```

Name of the Sample folder from the sample sheet to run SoupX on
```{r}
name <- '/R441P' #change
folder <- '/R441' #change
```

Let's get organized before we start
```{r}
#path to project folder
data_path <- c('~/Verchere/jdrf2')

#path to data
seq_outputs <- ("~/Sequencing_Outputs/JV")

#path for soupX outputs in sample folder
dir.create(paste0(data_path,folder,name,'/SoupX/'))
soupX_path <- paste0(data_path,folder,name,'/SoupX/')

data_dir_filt <- paste0(seq_outputs,folder,name,'/outs/filtered_feature_bc_matrix')
list.files(data_dir_filt) # Should show barcodes.tsv.gz, features.tsv.gz, and matrix.mtx.gz
data_dir_raw <- paste0(seq_outputs,folder,name,'/outs/raw_feature_bc_matrix')
list.files(data_dir_raw) # Should show barcodes.tsv.gz, features.tsv.gz, and matrix.mtx.gz
```

```{r}
toc = Read10X(data.dir = data_dir_filt) # toc - table of counts
tod = Read10X(data.dir = data_dir_raw) # tod = table of droplets
sc = SoupChannel(tod, toc)
sc #Channel with _____ genes and _____ cells

sample_prefilt <- CreateSeuratObject(toc, min.cells = 0, min.features = 0)
sample_prefilt #_____ features across _____ samples within 1 assay 

sample_prefilt    <- SCTransform(sample_prefilt, verbose = F)
sample_prefilt    <- RunPCA(sample_prefilt, verbose = F)
ElbowPlot(sample_prefilt)
sample_prefilt    <- RunUMAP(sample_prefilt, dims = 1:15, verbose = F)
sample_prefilt    <- FindNeighbors(sample_prefilt, dims = 1:15, verbose = F)
sample_prefilt    <- FindClusters(sample_prefilt, verbose = T)

meta <- sample_prefilt@meta.data
umap  <- sample_prefilt@reductions$umap@cell.embeddings

#set clusters by adding in clusters from full object metadata
sc = setClusters(sc, setNames(meta$seurat_clusters, rownames(meta)))
umap  <- sample_prefilt@reductions$umap@cell.embeddings

#set data reduction
sc  <- setDR(sc, umap)

#run soup!
sc  <- autoEstCont(sc)
# ____ genes passed tf-idf cut-off and ____ soup quantile filter.  Taking the top 100.
# Using ____ independent estimates of rho.
# Estimated global rho of 0.__

ggsave(paste0(soupX_path,"soup_autoEstCont.pdf"), plot = last_plot(), height = 8, width = 10)

head(sc$soupProfile[order(sc$soupProfile$est, decreasing = T), ], n = 20)

#We will use roundToInt option to make sure we output integer matrix.
adj.matrix  <- adjustCounts(sc, roundToInt = T)

#Finally, let’s write the directory with corrected read counts.
DropletUtils:::write10xCounts(paste0(soupX_path,"soupX_filt"), adj.matrix)
```