minimap2 accepts invalid fastq records #255

Closed
mdkeehan opened this issue Oct 23, 2018 · 1 comment

@mdkeehan

I was rather horrified to discover that minimap2 will do a lot of processing on an invalid FASTQ file.
I was extracting a subset of reads from a large PacBio FASTQ file.
The grep command was wrong and produced some malformed FASTQ records, which minimap2 then processed.

someone@somehost:/somedir/git/minimap2/test$ cat t2-with-malformed.fq 
@t2-with-chopped-quals
GGACATCCCGATGGTGCAGgtGCTATTAAAGGTTCGTTTGTTCAACGATTAAagTCCTACCTGTACGAAAGGAC
+
++++++++++++
@t2-nothing-wrong
GGACATCCCGATGGTGCAGgtGCTATTAAAGGTTCGTTTGTTCAACGATTAAagTCCTACCTGTACGAAAGGAC
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
fred.fq:@t2-greppedfrom fred*.fq
fred.fq:GGACATCCCGATGGTGCAGgtGCTATTAAAGGTTCGTTTGTTCAACGATTAAagTCCTACCTGTACGAAAGGAC
fred.fq:+
fred.fq:++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@t2-with-extra-quals-from-grep-group-separator
GGACATCCCGATGGTGCAGgtGCTATTAAAGGTTCGTTTGTTCAACGATTAAagTCCTACCTGTACGAAAGGAC
+
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
--

minimap2 will happily parse these records.

I suggest that, at a minimum, minimap2 check that the quality string length equals the sequence length when reading each FASTQ record, and terminate as soon as possible on a mismatch so the user can investigate.

The default grep group separator ("--") is clearly a pathological case, because "-" is a valid and common PacBio quality score character.
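
For anyone wanting to catch this before alignment, the length check suggested above can be sketched in a few lines of Python. This is only an illustration, not what minimap2 does internally: the function name find_fastq_problems is hypothetical, and the sketch assumes unwrapped four-line records (header, sequence, separator, quality) like the ones in the test file above.

```python
from typing import Iterable, List, Tuple


def find_fastq_problems(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, message) for each malformed FASTQ record.

    Assumption: records are unwrapped, i.e. exactly four lines each
    (header, sequence, separator, quality). Wrapped FASTQ is not handled.
    """
    problems = []
    record = []  # (line_number, text) pairs for the record being collected
    for number, raw in enumerate(lines, start=1):
        record.append((number, raw.rstrip("\r\n")))
        if len(record) < 4:
            continue
        (h_no, header), (_, seq), (p_no, plus), (q_no, qual) = record
        record = []
        if not header.startswith("@"):
            problems.append((h_no, "header does not start with '@'"))
        elif not plus.startswith("+"):
            problems.append((p_no, "separator does not start with '+'"))
        elif len(seq) != len(qual):
            problems.append(
                (q_no, f"quality length {len(qual)} != sequence length {len(seq)}")
            )
    if record:
        # Leftover lines, e.g. a stray grep group separator at the end.
        problems.append((record[0][0], "truncated record at end of input"))
    return problems
```

Calling it as find_fastq_problems(open("t2-with-malformed.fq")) should flag the chopped quality string, the record with the filename prefix, and the trailing separator line. Because it counts lines in groups of four, a file with a missing or extra line mid-stream will report every later record as broken, which is arguably the right outcome for a file that needs investigating anyway.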

@lh3
Owner

lh3 commented Nov 5, 2018

Commit fd64dd2 warns about incorrect records.
