Commit
fix: mv configuration docs to specific README file
ftabaro committed Nov 15, 2023
1 parent bbf4926 commit a8adec4
Showing 1 changed file with 3 additions and 87 deletions.
90 changes: 3 additions & 87 deletions README.md
@@ -2,7 +2,7 @@

## Overview

This is a Snakemake pipeline for the integrated analysis of single copy genes, transposable elements and tRNAs. It performs standard quality control checks and genome alignment in three different ways specialized either for single copy genes or transposable elements. It then quantifies gene expression depending on how the alignement step was performed. Finally it performs differential gene expression analysis yielding lists of genes significantly deregulated between two given conditions.
This is a Snakemake workflow for the integrated analysis of single copy genes, transposable elements and tRNAs. It performs standard quality control checks and genome alignment in three different ways specialized either for single copy genes or transposable elements. It then quantifies gene expression depending on how the alignement step was performed. Finally it performs differential gene expression analysis yielding lists of genes significantly deregulated between two given conditions.

![Overall 3t-seq workflow](docs/figures/3t-wf.png)

@@ -47,92 +47,8 @@ After the pipeline completes, you can find the results in the `results/` directo

## Configuration

Adjust parameters in the `config.yaml` file to match your experimental setup.

Here is an example config file:

```yaml
# config/config.yaml

# A list of datasets
sequencing_libraries:
  - name: GSE13073
    sample_sheet: sample-sheet.csv
    trimmomatic: >-
      "ILLUMINACLIP:TruSeq3-PE.fa:1:0:15:2
      SLIDINGWINDOW:20:22
      MAXINFO:20:0.6
      LEADING:22
      TRAILING:20
      MINLEN:75"
    star: >-
      "--seedSearchStartLmax 30
      --outFilterMismatchNoverReadLmax 0.04
      --winAnchorMultimapNmax 40"
    bamCoverage: "--binSize 50 --normalizeUsing None"

  # - name: ...
  #   sample_sheet: ...
  #   trimmomatic: ...
  #   star: ...
  #   bamCoverage: ...

#
globals:
  # path to reads folder
  # NB: ./GSE13073 is expected to exist
  reads_folder: .

  # path to results folder
  results_folder: results/

  # path to qc
  qc_folder: results/qc

  # path to log
  log_folder: results/log

  # path to references
  references_folder: results/references

  # temp folder
  tmp_folder: /tmp

  # path to analysis
  analysis_folder: results/analysis

# genome informations
genome:
  # genome label
  label: mm10

  # annotation type
  # can be ensembl, mgi, gencode
  annotation_type: ensembl

  # URL or path to genome sequence
  fasta_url: <Genome fasta URL>

  # URL or path to genome annotation file
  gtf_url: <Genome annotation URL>

  # URL to gtRNAdb zip file
  gtrnadb_url: <GtRNADb bundle URL>

# Differential expression analysis parameters
deseq2:
  # wd
  working_directory: ../../..

  # DESeq2 test name, can be Wald or LRT
  test: Wald

  # name of the column from sample sheet with experimental variable
  variable: genotype

  # base level from variable column
  reference_level: wt
```
Adjust parameters in the `config.yaml` file to match your experimental setup. See `config/README.md` for further instructions.
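
As a rough illustration of the checks a user might run before launching the workflow, the sketch below validates an already-parsed configuration dictionary. It is not part of 3t-seq: `check_config` is a hypothetical helper, and it assumes that `sequencing_libraries`, `globals`, `genome` and `deseq2` are top-level sections and that every library entry needs at least `name` and `sample_sheet`. The allowed values for `annotation_type` and `test` are taken from the comments in the example config.

```python
from typing import List

# Hypothetical helper, not part of the 3t-seq code base.
# Assumption: these four sections sit at the top level of config.yaml.
REQUIRED_SECTIONS = ("sequencing_libraries", "globals", "genome", "deseq2")
# Assumption: these two keys are mandatory for every library entry.
REQUIRED_LIBRARY_KEYS = ("name", "sample_sheet")
# Allowed values as listed in the comments of the example config.
ALLOWED_ANNOTATION_TYPES = ("ensembl", "mgi", "gencode")
ALLOWED_DESEQ2_TESTS = ("Wald", "LRT")


def check_config(config: dict) -> List[str]:
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []

    # Report every missing top-level section.
    for section in REQUIRED_SECTIONS:
        if section not in config:
            problems.append(f"missing section: {section}")

    # Check each sequencing library entry for its mandatory keys.
    for index, library in enumerate(config.get("sequencing_libraries") or []):
        for key in REQUIRED_LIBRARY_KEYS:
            if key not in library:
                problems.append(f"sequencing_libraries[{index}]: missing key {key}")

    # Validate enumerated values only when they are present.
    genome = config.get("genome") or {}
    if "annotation_type" in genome and genome["annotation_type"] not in ALLOWED_ANNOTATION_TYPES:
        problems.append(f"genome.annotation_type: unsupported value {genome['annotation_type']!r}")

    deseq2 = config.get("deseq2") or {}
    if "test" in deseq2 and deseq2["test"] not in ALLOWED_DESEQ2_TESTS:
        problems.append(f"deseq2.test: unsupported value {deseq2['test']!r}")

    return problems


if __name__ == "__main__":
    # Values copied from the example config; a real run would parse config.yaml instead.
    example = {
        "sequencing_libraries": [{"name": "GSE13073", "sample_sheet": "sample-sheet.csv"}],
        "globals": {"reads_folder": ".", "results_folder": "results/"},
        "genome": {"label": "mm10", "annotation_type": "ensembl"},
        "deseq2": {"test": "Wald", "variable": "genotype", "reference_level": "wt"},
    }
    for problem in check_config(example):
        print(problem)
```

Collecting all problems in a list, rather than raising on the first one, lets a user fix several configuration mistakes in a single pass.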


### Sample sheet preparation
