bcftools norm swapped phased haplotypes #1893

Closed
BinglanLi opened this issue Mar 27, 2023 · 1 comment

Comments

@BinglanLi

Hi,

I am wondering if this is a bug. bcftools norm (v1.17, but other versions as well) changes the phased haplotypes at some positions when merging biallelic records back into multiallelic loci.

Here is an example of the input:

#CHROM	POS	ID	REF	ALT	QUAL	FILTER	INFO	FORMAT	Sample
chr19	40991369	rs8192709	C	T	.	.	.	GT	1|0
chr19	41016810	rs3211371	C	A	.	.	.	GT	0|0
chr19	41016810	rs3211371	C	T	.	.	.	GT	0|1

I ran either of the following commands:

# on GRCh38
bcftools norm -m+ -c ws -f reference.fna.bgz input.vcf
# or the following just to collapse the multiallelic locus
bcftools norm -m+ -N input.vcf

I expected the following output:

#CHROM	POS	ID	REF	ALT	QUAL	FILTER	INFO	FORMAT	Sample
chr19	40991369	rs8192709	C	T	.	.	.	GT	1|0
chr19	41016810	rs3211371	C	A,T	.	.	.	GT	0|2

But the actual output looks like this:

#CHROM	POS	ID	REF	ALT	QUAL	FILTER	INFO	FORMAT	Sample
chr19	40991369	rs8192709	C	T	.	.	.	GT	1|0
chr19	41016810	rs3211371	C	A,T	.	.	.	GT	2|0

Why are the phased haplotypes swapped after merging into a multiallelic locus? The same thing happens when I split multiallelic loci into biallelic records and then merge them back into multiallelic records.
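To make the expected behavior concrete, here is a minimal Python sketch of a phase-aware merge for one site. It is not how bcftools implements `-m+`; the function name `merge_phased_gt` and the input shape (a list of ALT/GT pairs for one sample) are assumptions made for illustration only. The idea is that each haplotype keeps its slot, and a `1` in a biallelic record is renumbered to that ALT's index in the merged ALT list.

```python
def merge_phased_gt(records):
    """Merge biallelic records at one site into one multiallelic genotype.

    records: list of (alt_allele, gt_string) pairs for a single sample,
             e.g. [("A", "0|0"), ("T", "0|1")]. Hypothetical input shape.
    Returns (alts, merged_gt) with every haplotype kept in its original slot.
    """
    if not records:
        raise ValueError("no records to merge")
    alts = []
    merged = None  # allele index per haplotype in the merged record
    for alt, gt in records:
        if "|" not in gt:
            raise ValueError(f"genotype {gt!r} is not phased")
        alts.append(alt)
        new_index = len(alts)  # index of this ALT in the merged ALT list
        calls = gt.split("|")
        if merged is None:
            merged = [0] * len(calls)
        elif len(calls) != len(merged):
            raise ValueError("ploidy differs between records")
        for hap, call in enumerate(calls):
            if call == "1":
                if merged[hap] != 0:
                    raise ValueError(f"haplotype {hap} carries two ALT alleles")
                merged[hap] = new_index
            elif call != "0":
                # Missing calls and other values are out of scope for this sketch.
                raise ValueError(f"unsupported allele call {call!r}")
    return alts, "|".join(str(i) for i in merged)


alts, gt = merge_phased_gt([("A", "0|0"), ("T", "0|1")])
print(",".join(alts), gt)  # A,T 0|2
```

With the two chr19:41016810 records from the input, the T allele sits on the second haplotype, so the merged genotype is `0|2`, not `2|0`.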

Looking forward to hearing from you.

@pd3
Member

pd3 commented Mar 28, 2023

The merging disregarded phasing; I am adding support for it now. Please try it out. Thank you for the issue.
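For anyone trying out the fix, one way to check that phasing survives the merge is to compare, per site, which ALT bases each haplotype carries before and after `bcftools norm -m+`. The Python sketch below is only an illustration under stated assumptions: the helper name `haplotype_alleles` is hypothetical, and it assumes a single sample, a FORMAT column containing only `GT`, and fully phased genotypes, as in the example above.

```python
def haplotype_alleles(vcf_lines):
    """Map (chrom, pos) -> list of sets of ALT bases, one set per haplotype.

    Assumes tab-separated VCF data lines with one sample, FORMAT equal to
    GT only, and phased genotypes. Header lines starting with '#' are skipped.
    """
    sites = {}
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref = fields[0], int(fields[1]), fields[3]
        alleles = [ref] + fields[4].split(",")
        calls = fields[9].split("|")
        haps = sites.setdefault((chrom, pos), [set() for _ in calls])
        for hap, call in enumerate(calls):
            if call not in ("0", "."):
                haps[hap].add(alleles[int(call)])
    return sites


before = [
    "chr19\t40991369\trs8192709\tC\tT\t.\t.\t.\tGT\t1|0",
    "chr19\t41016810\trs3211371\tC\tA\t.\t.\t.\tGT\t0|0",
    "chr19\t41016810\trs3211371\tC\tT\t.\t.\t.\tGT\t0|1",
]
after = [
    "chr19\t40991369\trs8192709\tC\tT\t.\t.\t.\tGT\t1|0",
    "chr19\t41016810\trs3211371\tC\tA,T\t.\t.\t.\tGT\t0|2",
]
print(haplotype_alleles(before) == haplotype_alleles(after))  # True
```

Replacing the last genotype in `after` with the reported `2|0` makes the comparison return False, because T moves from the second haplotype to the first.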
