mikado

Mikado code details are available on GitHub, and Mikado is also available via bioconda. However, I used the Singularity container created by the lead developer of the BIND pipeline.

ml singularity
singularity pull --name mikado.sif shub://aseetharam/mikado

Consolidate all the transcripts and predict potential protein-coding sequences with Mikado:

  • Step 1: Make a configuration file and prepare transcripts

    • Step 1.1 Prepare a tab-delimited file called list.txt (shown below) that lists the GTF path (1st column), a short label for the GTF (2nd column), and whether the data are strand-specific (3rd column):
     /path/to/file/genome_class.gtf	araip_class	False
     /path/to/file/genome_transcripts.gtf	araip_cufflinks	False
     /path/to/file/genome_stringtie.gtf	araip_stringtie	False
     /path/to/file/genome_strawberry.gtf	araip_strawberry	False
    
    • Step 1.2 Move, or create a symbolic link to, the portcullis_all.junctions.bed file and the genome file

    • Step 1.3 Run step1.sh script

     ./step1.sh <genome.fasta> 
    
    • Step 1.4 Edit the configure.yaml file created by running Mikado_step1.sh. See example_configuration.yaml.

      Edit configure.yaml manually to keep all ORFs. Mikado nosplit mode is selected in step1, and it is best to keep all ORFs when any ORFs overlap. Add these lines under 'subloci_out:' in configure.yaml:

       output_format:
         report_all_orfs: true 
      
  • Step 2: Generate mikado_prepared.fasta. Run step2.sh; this script generates the mikado_prepared.fasta file that is used for predicting ORFs in the next step.

  • Step 3: Predict potential CDS from transcripts. There are multiple ways to conduct this step. The output needed from this step is an ORF.bed file. Scripts available to use for this step:

  • Step 4: Pick the best transcripts for each locus and annotate them as genes

./step4.sh <ORFs.bed> <prefix> 

This script will generate a number of output files. The main output of interest will be labeled as prefix.loci.gff3
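
A quick way to get a feel for the main output is to count its features by type. The snippet below is only a sketch: the function name count_gff3_features is made up for illustration, and it assumes a standard nine-column, tab-delimited GFF3 file with the feature type in column 3.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Mikado or the step scripts):
# count GFF3 features by type (column 3), skipping comment lines
# and any line that does not have the nine standard GFF3 columns.
count_gff3_features() {
    awk -F'\t' '
        !/^#/ && NF >= 9 { n[$3]++ }
        END { for (t in n) print t "\t" n[t] }
    ' "$1" | sort
}

# Example call (assumes step4.sh was run with the prefix "prefix"):
#   count_gff3_features prefix.loci.gff3
```

Each output line is a feature type followed by a tab and its count, which makes it easy to compare runs before and after filtering.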

It is best to filter the mikado output (prefix.loci.gff3) before going to the next step. Please review filter_output.md.
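
A malformed list.txt (Step 1.1) is an easy mistake to make, for example when tabs are silently converted to spaces by an editor. The check below is a rough sketch rather than anything shipped with the pipeline: the function name check_list is hypothetical, and it assumes the layout described in Step 1.1 (at least three tab-separated fields, with True or False in the third). Mikado itself may accept additional optional columns, which this check deliberately does not reject.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Mikado or the step scripts):
# verify that each non-blank line of list.txt has at least three
# tab-separated fields and that the third one is True or False
# (compared case-insensitively, which is an assumption).
# Prints offending lines and returns non-zero if any are found.
check_list() {
    awk -F'\t' '
        /^[[:space:]]*$/ { next }    # ignore blank lines
        NF < 3 || (tolower($3) != "true" && tolower($3) != "false") {
            printf "line %d: unexpected entry: %s\n", NR, $0
            bad = 1
        }
        END { exit (bad ? 1 : 0) }
    ' "$1"
}

# Example call, run from the directory containing list.txt:
#   check_list list.txt && ./step1.sh <genome.fasta>
```

The check only looks at the shape of each line; it does not confirm that the GTF paths exist or that the labels in the second column are unique.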