shoal is a tool that jointly quantifies transcript abundances across multiple samples. Specifically, shoal learns an empirical prior on transcript-level abundances across all of the samples in an experiment, and then uses a variant of the variational Bayesian expectation maximization algorithm to apply this prior adaptively across multi-mapping groups of reads.
When applied to multi-condition RNA-seq experiments, shoal can increase quantification accuracy and inter-sample consistency, and reduce false positives in downstream differential analysis. Moreover, shoal runs downstream of Salmon and requires less than a minute per sample to re-estimate transcript abundances while accounting for the learned empirical prior.
shoal is designed and developed by Avi Srivastava, Michael Love and Rob Patro.
shoal requires Salmon output for every sample in the experiment, with each sample quantified separately using the latest version of Salmon (either built from the develop branch of the Salmon repo or, for Linux, a pre-compiled binary that you can grab from here). Please run Salmon with the --dumpEqWeights option, which will produce output suitable for shoal.
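As a rough illustration of the per-sample Salmon step, the loop below quantifies each sample into its own subdirectory with --dumpEqWeights enabled. The index path, reads directory, sample names, and the paired-end `*_1.fastq.gz` / `*_2.fastq.gz` naming are assumptions made for this sketch, not something shoal prescribes. By default it only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/usr/bin/env bash
# Sketch only: INDEX, READS_DIR, SAMPLES and the *_1/_2.fastq.gz naming are
# hypothetical placeholders; adjust them to your own experiment.
INDEX="${INDEX:-salmon_index}"
READS_DIR="${READS_DIR:-reads}"
QUANT_DIR="${QUANT_DIR:-exp_quants}"
SAMPLES="${SAMPLES:-A1 A2 A3 B1 B2 B3}"

# Print the command when DRY_RUN=1 (the default here); execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$@"
  else
    "$@"
  fi
}

# One Salmon run per sample, each writing to <QUANT_DIR>/<sample>.
for s in $SAMPLES; do
  run salmon quant -i "$INDEX" -l A \
    -1 "$READS_DIR/${s}_1.fastq.gz" -2 "$READS_DIR/${s}_2.fastq.gz" \
    --dumpEqWeights -o "$QUANT_DIR/$s"
done
```

Writing every sample under a single top-level directory keeps the results in the layout that the shoal script expects for its -q option.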
- clone shoal to your local machine:
git clone https://github.com/COMBINE-lab/shoal.git
- run shoal (see footnote 2):
./run_shoal.sh -q <salmon_quant_directory_path> -o <output_directory_path>
This script assumes that all of the Salmon quantification directories are subdirectories of the path that you provide via the -q option. So, e.g., if you have an experiment with six samples across 2 conditions (say, A{1,2,3} and B{1,2,3}), then the shoal script would expect a layout like:
exp_quants
|
|--- A1
|    |
|    |--- quant.sf
|--- A2
|    |
|    |--- quant.sf
|--- A3
|    |
|    |--- quant.sf
|--- B1
|    |
|    |--- quant.sf
|--- B2
|    |
|    |--- quant.sf
|--- B3
     |
     |--- quant.sf
The script would then be invoked by passing -q exp_quants to provide the top-level quantification directory for the entire experiment.
Specifically, a command like ./run_shoal.sh -q exp_quants -o exp_shoal_quants would produce a modified (Salmon-format) quantification file for each of the samples ({A,B}{1,2,3}) in the directory exp_shoal_quants, as described below (the script will create the output directory if it does not already exist).
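Before invoking the script, it can help to confirm that the quantification directory really has one quant.sf per sample subdirectory. The helper below is a minimal sketch of such a check; the function name check_layout is hypothetical and is not part of shoal.

```shell
#!/usr/bin/env bash
# check_layout is a hypothetical helper, not part of shoal.
# It returns 0 only if every subdirectory of the given path contains quant.sf
# and at least one sample subdirectory exists.
check_layout() {
  local top="$1" found=0 missing=0 d
  [ -d "$top" ] || { echo "not a directory: $top" >&2; return 2; }
  for d in "$top"/*/; do
    [ -d "$d" ] || continue          # glob matched nothing
    found=$((found + 1))
    if [ ! -f "${d}quant.sf" ]; then
      echo "missing quant.sf in ${d%/}" >&2
      missing=1
    fi
  done
  [ "$found" -gt 0 ] || { echo "no sample directories in $top" >&2; return 1; }
  return "$missing"
}
```

A typical use would be `check_layout exp_quants && ./run_shoal.sh -q exp_quants -o exp_shoal_quants`, so that shoal only runs when the layout looks complete.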
- shoal output:
-- shoal generates .sf files for each sample in the experiment, named as follows:
<output_directory>/<sample_name>_adapt.sf
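For downstream steps it is often convenient to map each output file back to its sample name. The sketch below relies only on the `_adapt.sf` suffix described above; the function names are hypothetical and chosen for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for working with shoal output; not part of shoal.

# Strip the directory and the _adapt.sf suffix to recover the sample name.
adapt_sample_name() {
  basename "$1" _adapt.sf
}

# Print "<sample_name><TAB><path>" for every *_adapt.sf file in a directory.
list_adapt_files() {
  local out_dir="$1" f
  for f in "$out_dir"/*_adapt.sf; do
    [ -f "$f" ] || continue          # glob matched nothing
    printf '%s\t%s\n' "$(adapt_sample_name "$f")" "$f"
  done
}
```

For example, `list_adapt_files exp_shoal_quants` would print one tab-separated line per sample, which can be fed to a sample sheet or an import script.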
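If the script fails with an error like the following (the stock readlink on macOS does not support the -f option):
readlink: illegal option -- f
usage: readlink [-n] [file ...]
install coreutils, which provides the greadlink command:
brew install coreutils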
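If you write your own wrapper scripts around shoal, one way to sidestep this difference is to pick whichever readlink variant is available at run time. The function below is a sketch of that idea; abspath is a hypothetical name, and this is not how run_shoal.sh itself is implemented.

```shell
#!/usr/bin/env bash
# abspath is a hypothetical helper: resolve a path to an absolute one,
# preferring readlink -f, then greadlink -f (from coreutils), then a
# plain cd/pwd fallback that only resolves the parent directory.
abspath() {
  if readlink -f / >/dev/null 2>&1; then
    readlink -f "$1"
  elif command -v greadlink >/dev/null 2>&1; then
    greadlink -f "$1"
  else
    (cd "$(dirname "$1")" && printf '%s/%s\n' "$(pwd -P)" "$(basename "$1")")
  fi
}
```

The final branch does not follow a symlink in the last path component, so it is only an approximation of readlink -f.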
1 This image is from the Wikipedia article on shoaling. It is licensed under CC-BY-SA.
2 The shell script can be given executable permission with the command: chmod +x run_shoal.sh