bcftools norm <DEL> for very long variants #2029

Closed
christopher-schroeder opened this issue Oct 27, 2023 · 1 comment

christopher-schroeder commented Oct 27, 2023

Hi,
It is great that 1.18 supports normalization of the symbolic <DEL> notation. However, some of the deletions called in my data are longer than 20,000,000 bp (they are probably false positives, but such events can occur, e.g. after radiotherapy). Instead of keeping the symbolic representation, bcftools norm converts them to the explicit representation, so it writes the complete 20,000,000 bp into the record's REF field.
While technically correct, I would HIGHLY prefer the previous <DEL> instead. An optional flag or a length threshold for choosing between the explicit and the symbolic representation would be nice.
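
To make the requested threshold concrete, here is a minimal Python sketch of the idea. It is not bcftools code and does not correspond to any existing bcftools option; the function name `deletion_alleles`, the parameter `max_explicit_len`, its default of 50, and the `fetch_deleted_seq` callback are all hypothetical names chosen for illustration.

```python
SYMBOLIC_DEL = "<DEL>"


def deletion_alleles(anchor_base, del_len, fetch_deleted_seq, max_explicit_len=50):
    """Return (REF, ALT) for a deletion that starts after anchor_base.

    anchor_base       -- the single reference base preceding the deleted bases
    del_len           -- number of deleted bases
    fetch_deleted_seq -- hypothetical callback returning the deleted bases;
                         it is only called for short deletions
    max_explicit_len  -- hypothetical threshold, not a bcftools setting
    """
    if del_len > max_explicit_len:
        # Long deletion: keep the symbolic allele and a one-base REF.
        return anchor_base, SYMBOLIC_DEL
    # Short deletion: spell out the deleted bases in REF.
    return anchor_base + fetch_deleted_seq(del_len), anchor_base


# Short deletion of "CGT" after an "A" becomes explicit.
short = deletion_alleles("A", 3, lambda n: "CGT"[:n])
# A 20,000,000 bp deletion stays symbolic and never fetches the sequence.
long_ = deletion_alleles("A", 20_000_000, lambda n: "N" * n)
```

With a rule like this, the reference sequence of a very long deletion is never loaded or written, which is the behaviour asked for above.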

@christopher-schroeder christopher-schroeder changed the title bcftools norm <DEL> bcftools norm <DEL> for very long variants Oct 27, 2023
Member

pd3 commented Oct 31, 2023

This was actually not intended, thank you for reporting the problem. I believe the ALT column behaved correctly, the problem was in expanding the REF allele. It is now fixed.

@pd3 pd3 closed this as completed Oct 31, 2023
pd3 added a commit that referenced this issue Nov 1, 2023
Symbolic <DEL> alleles caused norm to expand REF to the full length of the deletion.
This was not intended and was problematic for long deletions, the REF allele should list
one base only.

Resolves #2029
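
For anyone who wants to check their own output for the problem described in the commit message, the following Python sketch flags records where a symbolic ALT allele is paired with a REF longer than one base. It is an illustrative check, not part of bcftools; the function name `symbolic_ref_is_single_base` is made up here, and it assumes tab-separated VCF data lines with REF in column 4 and ALT in column 5.

```python
def symbolic_ref_is_single_base(vcf_line):
    """Return True unless a symbolic ALT (e.g. <DEL>) has a multi-base REF.

    Assumes a tab-separated VCF data line (not a '#' header line).
    """
    fields = vcf_line.rstrip("\n").split("\t")
    ref = fields[3]
    alts = fields[4].split(",")
    # Symbolic alleles are written in angle brackets, e.g. <DEL>.
    has_symbolic = any(a.startswith("<") and a.endswith(">") for a in alts)
    return (not has_symbolic) or len(ref) == 1


# Hypothetical example records, made up for illustration.
ok_record = "chr1\t100\t.\tA\t<DEL>\t.\t.\tSVTYPE=DEL;END=200"
bad_record = "chr1\t100\t.\tACGT\t<DEL>\t.\t.\tSVTYPE=DEL;END=103"
explicit_record = "chr1\t100\t.\tACGT\tA\t.\t.\t."
```

Here `ok_record` and `explicit_record` pass the check, while `bad_record` shows the expanded-REF pattern reported in this issue.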