IsoQuant extended annotation missing transcripts #151

Closed
jcmtnez opened this issue Feb 14, 2024 · 2 comments
Labels
bug Something isn't working fixed in dev Issue resolved but not released yet fixed in release Issue resolved and the fix is released, waiting for approval

Comments

@jcmtnez

jcmtnez commented Feb 14, 2024

Hello,

Thank you for making such a useful tool! When running IsoQuant with a reference GTF file, the final extended GTF contains fewer transcripts than the original reference, despite discovering novel transcripts. I just wanted to make sure this is not related to the parameters I use to run the program.

My code:
isoquant.py --reference /work/users/j/c/jcmtnez/index/GRCh38_gencode_v45/GRCh38.primary_assembly.genome.fa \
  --genedb /work/users/j/c/jcmtnez/index/GRCh38_gencode_v45/transcripts.gtf --complete_genedb \
  --data_type pacbio_ccs \
  --fastq /work/users/j/c/jcmtnez/RNAseq/SF_long/fastq/WT.fastq \
    /work/users/j/c/jcmtnez/RNAseq/SF_long/fastq/SF3B1.fastq \
    /work/users/j/c/jcmtnez/RNAseq/SF_long/fastq/SRSF2.fastq \
    /work/users/j/c/jcmtnez/RNAseq/SF_long/fastq/U2AF1.fastq \
  --sqanti_output \
  -o /work/users/j/c/jcmtnez/RNAseq/SF_long/isoquant/

The result from gffcompare:

gffcompare v0.10.4 | Command line was:

#gffcompare -r transcripts.gtf OUT.extended_annotation.gtf

#= Summary for dataset: OUT.extended_annotation.gtf

Query mRNAs : 206377 in 31488 loci (191408 multi-exon transcripts)

(16483 multi-transcript loci, ~6.6 transcripts per locus)

Reference mRNAs : 251242 in 57581 loci (225209 multi-exon)

Super-loci w/ reference transcripts: 25892

#-----------------| Sensitivity | Precision |
        Base level:     72.0    |   94.8    |
        Exon level:     77.1    |   97.3    |
      Intron level:     76.9    |   97.3    |
Intron chain level:     78.5    |   92.4    |
  Transcript level:     75.1    |   91.5    |
       Locus level:     52.9    |   96.0    |

 Matching intron chains:  176850
   Matching transcripts:  188791
          Matching loci:   30477

      Missed exons:  151359/666158	( 22.7%)
       Novel exons:    4550/535946	(  0.8%)
    Missed introns:   91276/402509	( 22.7%)
     Novel introns:    1336/318070	(  0.4%)
       Missed loci:   25270/57581	( 43.9%)
        Novel loci:     847/31488	(  2.7%)

Total union super-loci across all input datasets: 31483
206377 out of 206377 consensus transcripts written in gffcmp.annotated.gtf (0 discarded as redundant)

Thank you for your time!
Jose
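The transcript-level figures in the gffcompare summary above can be sanity-checked with a few lines of arithmetic. This is a minimal sketch that assumes transcript-level sensitivity is matching transcripts divided by reference transcripts, and precision is matching transcripts divided by query transcripts; the variable names and the `percent` helper are illustrative, and the counts are copied from the summary.

```python
# Counts copied from the gffcompare summary in this issue.
reference_transcripts = 251242   # "Reference mRNAs"
query_transcripts = 206377       # "Query mRNAs"
matching_transcripts = 188791    # "Matching transcripts"


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Assumed definitions (they reproduce the reported transcript-level row).
sensitivity = percent(matching_transcripts, reference_transcripts)
precision = percent(matching_transcripts, query_transcripts)

# How many fewer transcripts the extended annotation has than the reference.
shortfall = reference_transcripts - query_transcripts

print(f"sensitivity={sensitivity}%  precision={precision}%  shortfall={shortfall}")
```

Under these assumptions the result agrees with the reported transcript-level row (75.1 and 91.5), and the extended annotation holds 44865 fewer transcripts than the reference even though it is expected to be a superset of it.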

@andrewprzh
Collaborator

Dear @jcmtnez

Thank you for the positive feedback!

Thanks for the report. Yes, this is a known bug that affects only the extended annotation. It is now fixed, and the bug-fix release will be out soon.

Best
Andrey

@andrewprzh andrewprzh added bug Something isn't working fixed in dev Issue resolved but not released yet labels Feb 15, 2024
@andrewprzh andrewprzh added the fixed in release Issue resolved and the fix is released, waiting for approval label May 9, 2024
@andrewprzh
Collaborator

We have finally released version 3.4, which fixes this issue.
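To check whether an extended annotation still contains every reference transcript after upgrading, one option is to compare the `transcript_id` sets of the two GTF files directly. The sketch below is not part of IsoQuant; the function names are hypothetical, and it assumes that reference transcripts keep their original `transcript_id` values in the extended annotation.

```python
import re

# Matches the GTF attribute: transcript_id "ENST...";
_TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def transcript_ids(gtf_path):
    """Collect every transcript_id found in the attribute column of a GTF file."""
    ids = set()
    with open(gtf_path) as handle:
        for line in handle:
            if line.startswith("#"):
                continue  # skip header/comment lines
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 9:
                continue  # not a well-formed GTF record
            match = _TRANSCRIPT_ID.search(fields[8])
            if match:
                ids.add(match.group(1))
    return ids


def missing_reference_transcripts(reference_gtf, extended_gtf):
    """Return reference transcript IDs that are absent from the extended GTF."""
    return transcript_ids(reference_gtf) - transcript_ids(extended_gtf)
```

With the file names from this issue, `missing_reference_transcripts("transcripts.gtf", "OUT.extended_annotation.gtf")` would return the set of reference IDs that were dropped; an empty set would suggest the extended annotation is complete with respect to the reference.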
