
Using The Test Sets

Matt Ravenhall edited this page Jan 25, 2019 · 1 revision

Test sets are provided for both the Analysis and Visualisation modules so that you can test and try out both aspects of SV-Pop. Both datasets are sparse, to ensure fast run times, and biologically meaningless.

Analysis Module

Input Files

This test set is located within Analysis/TestSet/ and should contain:

  • input.txt: containing paths to the example VCF input files, found in Analysis/TestSet/ExampleVCFs
  • pheno.txt: containing sub-population metadata for each sample
  • annotation.txt: tab-separated (.tsv) annotation file
  • excluded.csv: containing regions to exclude
  • runTest.sh: Bash script that starts the test run (run it with bash runTest.sh)

Running

A Bash script, runTest.sh, is provided in Analysis/TestSet/. Run it with bash runTest.sh.
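
As a minimal sketch, the invocation can be wrapped so that it works from any directory. This assumes runTest.sh expects Analysis/TestSet/ as its working directory, which this page does not state. The function name run_analysis_test and its argument are hypothetical helpers, not part of SV-Pop.

```shell
# Hypothetical helper: run the Analysis test set from a local checkout.
# Assumption: runTest.sh should be started from inside Analysis/TestSet/.
run_analysis_test() {
    repo_root="${1:-.}"
    # Use a subshell so the caller's working directory is left unchanged.
    (
        cd "$repo_root/Analysis/TestSet" || exit 1
        bash runTest.sh
    )
}

# Example (the path is a placeholder for your own checkout):
#   run_analysis_test path/to/your/checkout
```

The function returns a non-zero status if the test set directory cannot be found, or if runTest.sh itself fails.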

Expected Output

  • TestRun_DEL_chrALL_variants_annotated_v1-0-1.csv containing 19 variants
  • TestRun_DEL_chrALL_windows_annotated_v1-0-1.csv containing 22 windows
  • SVPop_Logs/ containing log files for each chromosome, which indicate the variants removed for excessive homozygous and heterozygous reference calls, poor quality, or overlap with an excluded region.
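
One way to confirm the expected counts is to count the data rows in each output file. The sketch below assumes each CSV starts with a single header line and holds one variant or window per subsequent line, which this page does not confirm. The helper names count_rows and check_rows are hypothetical.

```shell
# Hypothetical helper: count the data rows in a CSV file.
# Assumption: line 1 is a header; every later non-empty line is one record.
count_rows() {
    awk 'NR > 1 && NF { n++ } END { print n + 0 }' "$1"
}

# Hypothetical helper: compare a file's row count with an expected value.
check_rows() {
    actual=$(count_rows "$1")
    if [ "$actual" -eq "$2" ]; then
        echo "OK: $1 has $actual rows"
    else
        echo "MISMATCH: $1 has $actual rows, expected $2" >&2
        return 1
    fi
}

# Example, run from the directory that holds the output files:
#   check_rows TestRun_DEL_chrALL_variants_annotated_v1-0-1.csv 19
#   check_rows TestRun_DEL_chrALL_windows_annotated_v1-0-1.csv 22
```

If the files turn out to have no header line, drop the NR > 1 condition.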

Visualisation Module

Input Files

This test set is located within Visualisation/Files/TestSet/ and should contain:

  • (DEL/DUP/INS/INV)_Variants.csv: containing variant frequencies
  • (DEL/DUP/INS/INV)_Windows.csv: containing window-based counts
  • (DEL/DUP/INS/INV)_AllIndex.csv: containing all variant locations
  • (DEL/DUP/INS/INV)_FrqIndex.csv: containing frequent variant locations
  • annotation.txt: containing feature annotations
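
The (DEL/DUP/INS/INV) notation above is read here as one file per structural variant type, for example DEL_Variants.csv and DUP_Variants.csv. Under that assumption, a short sketch can report any file missing from the test set. The helper name missing_vis_files is hypothetical.

```shell
# Hypothetical helper: list expected Visualisation test files that are absent.
# Assumption: each listed suffix exists once per SV type (DEL, DUP, INS, INV).
missing_vis_files() {
    dir="${1:-Visualisation/Files/TestSet}"
    missing=0
    for sv in DEL DUP INS INV; do
        for kind in Variants Windows AllIndex FrqIndex; do
            f="$dir/${sv}_${kind}.csv"
            [ -f "$f" ] || { echo "missing: $f"; missing=$((missing + 1)); }
        done
    done
    [ -f "$dir/annotation.txt" ] || { echo "missing: $dir/annotation.txt"; missing=$((missing + 1)); }
    # Succeed only when nothing is missing.
    [ "$missing" -eq 0 ]
}
```

Called without an argument from the repository root, it checks Visualisation/Files/TestSet/ and prints nothing when all seventeen files are present.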

Running

First copy all test files to Visualisation/Files/, then run Rscript easyRun.r.
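
The copy step can be sketched as a small function. The name stage_vis_test_files and its repository-root argument are hypothetical. The sketch copies regular files only and leaves the TestSet/ directory in place.

```shell
# Hypothetical helper: copy the Visualisation test files into Visualisation/Files/.
stage_vis_test_files() {
    repo_root="${1:-.}"
    files_dir="$repo_root/Visualisation/Files"
    for f in "$files_dir"/TestSet/*; do
        # Skip anything that is not a regular file (including an unmatched glob).
        if [ -f "$f" ]; then
            cp "$f" "$files_dir/" || return 1
        fi
    done
}

# Example (the path is a placeholder for your own checkout):
#   stage_vis_test_files path/to/your/checkout
```

After staging the files, run Rscript easyRun.r as described above. This page does not say which directory easyRun.r lives in, so the sketch leaves that step to you.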

Expected Output

Once the dependencies have been installed, the visualisation module should open automatically in your default browser.