aligned chromosomes of some reads are not displayed correctly in pysam #1318

Closed
creaturemoon opened this issue Nov 18, 2024 · 1 comment


@creaturemoon

Hello,

I used pysam to process a BAM file of primary NGS alignments and found that, for some reads, the aligned chromosome was printed as a string starting with # (for example #0), whereas read.reference_name returned the correct chromosome. Here is one example of these reads.

for record in bam.fetch():
    # Keep the record whose name matches the read of interest
    if record.query_name == "c4e0190a-6474-4007-997b-d868cf363f4e":
        read = record
print(read)

c4e0190a-6474-4007-997b-d868cf363f4e 0 #0 2772945 0 360M1D45M3D495M1D5M4D10M1D14M1I1M3D62M1I419M1D721M2I175M1D835M2D2M1775N291M2I193M2S * 0 0GAGATCCAAAGAATAAAGTCGTGAAACTATTTCTCCTAAAAACTATTTTTTATTTCTTGGCGTTGTCCTTAGTCAACTGACGGGACATTAGTTCGACTCATAAATAAAACAACAATTTTACTGGCGCAGTCGGTAGGATACAATAGTATCCGAAAAAAAAAGAACCTTCGAGTGGAAAATAAGTTAAATTTTATAGTCCAGTGCTCGAAACATCTCCCAAAATAAATTCGTGAAAACTCTTCAACTTGGATTATAATTCCAATTCGGTTATCCAATAATAAGTGGAAGTGAAATACGAAACAAAAATATTAAGTCCAAAGGCAACTAAGTTTTAAAACCAACATATAAAAATAAAAAATTAAACAATATAGAATTT

print(read.reference_name)

2L

When I checked this read using samtools view, it displayed the correct chromosome:

c4e0190a-6474-4007-997b-d868cf363f4e 0 2L 2772945 0 360M1D45M3D495M1D5M4D10M1D14M1I1M3D62M1I419M1D721M2I175M1D835M2D2M1775N291M2I193M2S * 0 0 GAGATCCAAAGAATAAAGTCGTGAAACTATTTCTCCTAAAAACTATTTTTTATTTCTTGGCGTTGTCCTTAGTCAACTGACGGGACATTAGTTCGACTCATAAATAAAACAACAATTTTACTGGCGCAGTCGGTAGGATACAATAGTATCCGAAAAAAAAAGAACCTTCGAGTGGAAAATAAGTTAAATTTTATAGTCCAGTGCTCGAAACATCTCCCAAAATAAATTCGTGAAAACTCTTCAACTTGGATTATAATTCCAATTCGGTTATCCAATAATAAGTGGAAGTGAAATACGAAACAAAAATATTAAGTCCAAAGGCAACTAAGTTTTAAAACCAACATATAAAAATAAAAAATTAAACAATATAGA

Here is a SAM file containing the read c4e0190a-6474-4007-997b-d868cf363f4e: testsam.zip
Could you clarify why pysam uses # to denote chromosomes? Looking forward to your reply.

@jmarshall
Member

As noted in the documentation, a header is not necessarily available when evaluating str(read), so it prints the reference chromosome index instead. However, a header is usually available and this is indeed annoying, so I have changed it to print the reference name instead when that is available.
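
For anyone on an older pysam who already has text printed this way, here is a minimal sketch of mapping the placeholder back to a name. It assumes the placeholder has the form # followed by the zero-based reference index, as in the #0 shown above. The helper name resolve_reference and the example list are hypothetical and not part of pysam; with a real file the names would normally come from bam.references, and only the first entry below (2L at index 0) is taken from this thread.

```python
import re

# Placeholder as seen in the printed record above: '#' followed by the
# zero-based reference index, e.g. "#0".
_INDEX_PLACEHOLDER = re.compile(r"#(\d+)")


def resolve_reference(field, reference_names):
    """Return the reference name for a '#<index>' placeholder.

    Any other value (a real name such as '2L', or '*' for unmapped)
    is returned unchanged.
    """
    match = _INDEX_PLACEHOLDER.fullmatch(field)
    if match is None:
        return field
    index = int(match.group(1))
    if index >= len(reference_names):
        raise IndexError(f"reference index {index} is not in the header")
    return reference_names[index]


# Hypothetical header order: only '2L' at index 0 comes from the report;
# the remaining names are made up for illustration.
references = ["2L", "2R", "3L"]

print(resolve_reference("#0", references))
print(resolve_reference("2L", references))
```

In most scripts it is simpler to avoid parsing str(read) at all and to read read.reference_name directly, which already returned the correct chromosome in the report above.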
