iterate ONLY over the unmapped reads #424

Closed
smangul1 opened this issue Mar 25, 2017 · 5 comments

Comments

@smangul1

Is there a way to iterate ONLY over the unmapped reads?

Thanks,
Serghei

@AndreasHeger
Contributor

Good question, but I don't think there is, other than iterating over the complete file and filtering by the 'unmapped' flag. The problem, I believe, is that an unmapped read may or may not have a contig assigned to it. Thus, unmapped reads might appear in several locations in a sorted BAM rather than as one contiguous block.
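
For reference, the full-scan approach described above might look like the following minimal sketch. It assumes pysam's `AlignmentFile.fetch(until_eof=True)` and the `is_unmapped` and `reference_id` attributes of each read; the helper names and the `path` argument are hypothetical and only for illustration. The count is split by whether a contig is assigned, since that is what decides where an unmapped read lands in a sorted BAM.

```python
from collections import Counter


def iter_unmapped(reads):
    """Yield only the reads whose 'unmapped' flag is set."""
    for read in reads:
        if read.is_unmapped:
            yield read


def count_unmapped_by_placement(reads):
    """Count unmapped reads, split by whether a contig is assigned.

    A reference_id of -1 means no contig is assigned ("unplaced").
    """
    counts = Counter()
    for read in iter_unmapped(reads):
        key = "unplaced" if read.reference_id < 0 else "placed"
        counts[key] += 1
    return counts


def count_unmapped_in_file(path):
    """Scan a whole BAM file (hypothetical path) in file order."""
    import pysam  # imported lazily so the helpers above work without it

    with pysam.AlignmentFile(path, "rb") as alignment_file:
        # until_eof=True walks every record and does not need an index.
        return count_unmapped_by_placement(alignment_file.fetch(until_eof=True))
```

The cost is a pass over every record in the file, which is the drawback this issue is asking about.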

@arogozhnikov

In my case, none of the unmapped alignments are assigned to a contig. Is there still no better way than iterating over the whole file?

@bw2

bw2 commented Jun 26, 2024

Same feature request: a way to iterate only over the unmapped read pairs at the end of a CRAM file (i.e., those with ref seq id = -1 in the CRAM index).

@jmarshall
Member

Have you tried the documented way of doing this, which uses the same syntax as the underlying samtools queries:

AlignmentFile.fetch('*')
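
Spelled out a little more, a minimal sketch of that call could look like this. It assumes the file is coordinate-sorted and indexed, as for any other region fetch; the helper names and the `path` argument are hypothetical.

```python
def iter_unplaced_unmapped(alignment_file):
    """Yield the unmapped reads that have no contig assigned.

    `alignment_file` is expected to behave like an open, indexed
    pysam.AlignmentFile; "*" is the samtools-style region for these reads.
    """
    yield from alignment_file.fetch("*")


def count_unplaced_unmapped(path):
    """Count unplaced unmapped reads in a BAM file (hypothetical path)."""
    import pysam  # imported lazily so the helper above works without it

    with pysam.AlignmentFile(path, "rb") as alignment_file:
        return sum(1 for _ in iter_unplaced_unmapped(alignment_file))
```

Unmapped reads that were placed alongside a mapped mate are not part of the "*" region; those come back from an ordinary fetch() over the contig they are placed on.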

@bw2

bw2 commented Jun 26, 2024

Wow, thanks! That's exactly what I was looking for!

The top Google hits currently talk about doing this using fetch(until_eof=True):
https://pysam.readthedocs.io/en/latest/faq.html#alignmentfile-fetch-does-not-show-unmapped-reads
https://www.biostars.org/p/361215/
https://www.biostars.org/p/9542602/
etc.
but this is a much better solution.

jmarshall added a commit that referenced this issue Oct 30, 2024
The default fetch() *does* return unmapped reads that have been placed
alongside their mapped mates. Add note about using fetch("*") to return
only unplaced unmapped reads. Fixes #424.