consHLA

A Next Generation Sequencing Consensus-based HLA Typing Workflow

[Figure: overall workflow]

The sub-workflow (blue box) consists of three steps:
A: Bowtie2 alignment to the IMGT/HLA reference (generates .sam)
B: Mapped-read extraction with samtools (generates .fastq.gz)
C: HLA typing with HLA-HD (generates .txt and .json)
[Figure: sub-workflow]
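Steps A and B above can be sketched as command lines built in Python. This is a minimal sketch under stated assumptions, not the commands consHLA itself runs: the function name, index prefix, file names, and thread count are hypothetical, and step C is left out because the HLA-HD invocation depends on the local HLA-HD installation.

```python
from typing import List


def sub_workflow_commands(index_prefix: str, reads_1: str, reads_2: str,
                          sample: str, threads: int = 4) -> List[List[str]]:
    """Build argument lists for steps A and B (all file names are hypothetical)."""
    sam = f"{sample}.hla.sam"
    bam = f"{sample}.hla.mapped.bam"
    # A: align paired-end reads to a Bowtie2 index built from the IMGT/HLA reference.
    align = ["bowtie2", "-p", str(threads), "-x", index_prefix,
             "-1", reads_1, "-2", reads_2, "-S", sam]
    # B (part 1): keep only mapped reads; -F 4 drops records flagged as unmapped.
    keep_mapped = ["samtools", "view", "-b", "-F", "4", "-o", bam, sam]
    # B (part 2): write the mapped reads back out as gzipped paired FASTQ files.
    to_fastq = ["samtools", "fastq",
                "-1", f"{sample}_1.fastq.gz", "-2", f"{sample}_2.fastq.gz",
                "-0", "/dev/null", "-s", "/dev/null", bam]
    return [align, keep_mapped, to_fastq]
```

Each returned list can be passed to subprocess.run(cmd, check=True) on a machine where bowtie2 and samtools are installed.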

Running the workflow

RAM Requirements (indicative only)

RAM requirements depend on input file size:
For WGS data with 30x coverage: minimum RAM = 2 GB
For WGS data with 100x coverage: minimum RAM = 30 GB

Cloud platforms

You can run the workflow (CWL v1.0 or v1.2) on any cloud platform that supports CWL execution (e.g. Cavatica).

You can also run consHLA on a Cromwell instance that uses the Azure backend. Use Cromwell version 79 or earlier, because later versions no longer support CWL. Cromwell also supports only CWL v1.0, so the Cromwell-compatible consHLA workflows are those under ./cwl/v1.0.

Because CWL v1.0 does not support conditional execution of workflow steps, the CWL v1.0 version of consHLA is split into two modes:

  • ./cwl/v1.0/consHLA WGS contains the consHLA workflow that accepts two NGS inputs (germline and tumour WGS). Workflow dependencies are zipped.
  • ./cwl/v1.0/consHLA WGS and RNA-seq contains the consHLA workflow that accepts three NGS inputs (germline and tumour WGS and tumour RNA-seq). Workflow dependencies are zipped.
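The choice between these workflow variants can be summarised in a small helper. This is an illustrative sketch only: the function names and the "cromwell" engine label are hypothetical, and it assumes that every non-Cromwell runner uses the single CWL v1.2 workflow at ./cwl/v1.2/consHLA.cwl.

```python
def cromwell_supports_cwl(version: int) -> bool:
    """Cromwell dropped CWL support after version 79."""
    return version <= 79


def select_workflow(engine: str, has_rnaseq: bool) -> str:
    """Return the workflow path to use (engine labels are hypothetical)."""
    if engine.lower() == "cromwell":
        # Cromwell only runs CWL v1.0, which has no conditional steps,
        # so the mode is chosen by whether tumour RNA-seq is supplied.
        if has_rnaseq:
            return "./cwl/v1.0/consHLA WGS and RNA-seq"
        return "./cwl/v1.0/consHLA WGS"
    # Assumption: other runners use the single CWL v1.2 workflow.
    return "./cwl/v1.2/consHLA.cwl"
```

For example, select_workflow("cromwell", has_rnaseq=False) points at the two-input WGS mode.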

Local

You will need a running Docker daemon.
Running a .cwl workflow requires a CWL runner; here we use cwltool. Install it following these instructions. cwltool usage is shown below:

cwltool --basedir . ./cwl/v1.2/consHLA.cwl ./sample_input.yml

You can run the whole consHLA workflow, or part of it, by specifying the relevant .cwl file and supplying the matching input.yml.
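If you prefer to launch the local run from a script, the cwltool command above can be wrapped as follows. This is a minimal sketch: the function names are hypothetical, and the check only confirms that the cwltool and docker executables are on PATH, not that the Docker daemon is actually running.

```python
import shutil
import subprocess
from typing import List


def cwltool_command(workflow: str = "./cwl/v1.2/consHLA.cwl",
                    inputs: str = "./sample_input.yml",
                    basedir: str = ".") -> List[str]:
    """Build the same cwltool invocation shown in this README."""
    return ["cwltool", "--basedir", basedir, workflow, inputs]


def run_conshla(workflow: str = "./cwl/v1.2/consHLA.cwl",
                inputs: str = "./sample_input.yml") -> None:
    """Run the workflow, failing early if a required executable is missing."""
    for tool in ("cwltool", "docker"):
        if shutil.which(tool) is None:
            raise RuntimeError(f"{tool} not found on PATH")
    # check=True raises CalledProcessError if cwltool exits with an error.
    subprocess.run(cwltool_command(workflow, inputs), check=True)
```

To run only part of the workflow, pass a different .cwl file and its matching input file to run_conshla.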

Output files

*_sample1_hla.json: HLA alleles typed from tumour WGS
*_sample2_hla.json: HLA alleles typed from germline WGS
*_sample3_hla.json: HLA alleles typed from tumour RNAseq (optional)
*_[three|two]Sample_hla.consensus.clinSig.[json|txt]: Consensus HLA alleles for clinically significant genes
*_[three|two]Sample_hla.consensus.[json|txt]: Consensus HLA alleles for all genes
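The naming scheme above can be used to label files in an output directory. The sketch below only matches file names against the listed patterns; it does not read the files, since their internal layout is not described here. The function name and the example prefix are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Optional

# Per-sample outputs, as listed above.
OUTPUT_PATTERNS = [
    ("*_sample1_hla.json", "HLA alleles typed from tumour WGS"),
    ("*_sample2_hla.json", "HLA alleles typed from germline WGS"),
    ("*_sample3_hla.json", "HLA alleles typed from tumour RNAseq (optional)"),
]
# Consensus outputs: expand the [three|two] and [json|txt] alternatives explicitly,
# because fnmatch treats square brackets as character classes.
OUTPUT_PATTERNS += [
    (f"*_{count}Sample_hla.consensus{kind}.{ext}", label)
    for count in ("two", "three")
    for kind, label in (
        (".clinSig", "Consensus HLA alleles for clinically significant genes"),
        ("", "Consensus HLA alleles for all genes"),
    )
    for ext in ("json", "txt")
]


def describe_output(filename: str) -> Optional[str]:
    """Return the description for a consHLA output file name, or None."""
    for pattern, label in OUTPUT_PATTERNS:
        if fnmatchcase(filename, pattern):
            return label
    return None
```

For instance, describe_output("COLO829_sample2_hla.json") returns the germline WGS description, while an unrelated file name returns None.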

Test samples

Publicly available NGS data for two cell lines, COLO829 and HCC1954, were used to demonstrate consHLA functionality. Download the files to validate your consHLA installation. The expected output is provided in ./sample_output.

  • COLO829 tumour WGS link
  • COLO829 germline WGS link
  • COLO829 tumour RNAseq link
  • HCC1954 tumour WGS link
  • HCC1954 germline WGS link

Runtime

Runtime was tested with 30x WGS and RNA-seq with 180M reads on an Amazon EC2 c5.4xlarge instance with 16 CPUs, 32 GB of RAM, and 1024 GB of attached storage.
[Figure: Runtime analysis]

Funding

We would like to acknowledge Luminesce Alliance – Innovation for Children’s Health for its contribution and support. Luminesce Alliance is a not-for-profit cooperative joint venture between the Sydney Children’s Hospitals Network, the Children’s Medical Research Institute, and the Children’s Cancer Institute. It has been established with the support of the NSW Government to coordinate and integrate paediatric research. Luminesce Alliance is also affiliated with the University of Sydney and the University of New South Wales Sydney.

LICENSE

consHLA is a wrapper around HLA-HD and is released under the MIT open-source software license. For commercial use of consHLA, please contact the author of HLA-HD to obtain a commercial license.
