Support for indels in FastaVariant class #84

Open
danielecook opened this issue Dec 17, 2015 · 4 comments

Comments

@danielecook

Hello - I was wondering if it would be possible to recognize indels within the FastaVariant class? What are the challenges involved there?

@mdshw5
Owner

mdshw5 commented Dec 31, 2015

Hmmm... I haven't paid much attention to the FastaVariant class recently. I think indels shouldn't be too hard, but I omitted them originally as I wanted to maintain a 1-1 mapping with the original reference coordinates. Do you have a use case for this?

@danielecook
Author

Sure - well, I can describe my reason for wanting this implemented.

I am developing a number of utilities for working with VCF files. One of the tools helps validate variants within VCFs: it generates primers (using primer3) for Sanger sequencing or snip-SNP verification based on any variants provided as input. However, when generating primers or looking for restriction sites, I want to account for neighboring variation, both to increase the chances that primers work (by incorporating alternative alleles/indels) and to make sure predicted product sizes (which depend on differences in restriction sites) are accurate.

In terms of coordinates, I don't think it is an issue. I always intend to work from reference coordinates and account for differences afterwards. In other words, if I slice I:1-100 and this region contains an insertion at position 50, it should return the reference from 1-100 and THEN add the insertion. The resulting string will be longer than 100 bp. For a deletion, the string would be shorter than 100 bp. Are there any reasons why this might be an issue?
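
To make the proposed behavior concrete, here is a minimal sketch of "slice by reference coordinates, then apply variants". It is not pyfaidx code: `apply_variants` and the `(pos, ref, alt)` tuple layout are hypothetical names chosen for illustration, with VCF-style alleles assumed (indels carry the preceding anchor base). Variants that overlap an earlier one or run past the slice are simply skipped here, which a real implementation would need to handle more carefully.

```python
from typing import Iterable, Tuple

# Hypothetical variant record: (1-based reference position, REF allele, ALT allele)
Variant = Tuple[int, str, str]


def apply_variants(ref_seq: str, start: int, variants: Iterable[Variant]) -> str:
    """Apply variants to ref_seq, whose first base sits at 1-based position `start`.

    Coordinates always refer to the reference; the returned string may be
    longer (insertions) or shorter (deletions) than the input slice.
    """
    pieces = []
    cursor = 0  # offset in ref_seq of the next reference base not yet emitted
    for pos, ref, alt in sorted(variants):
        offset = pos - start
        # Skip variants outside the slice or overlapping an already applied one.
        if offset < cursor or offset + len(ref) > len(ref_seq):
            continue
        if ref_seq[offset:offset + len(ref)].upper() != ref.upper():
            raise ValueError(f"REF allele {ref!r} does not match reference at {pos}")
        pieces.append(ref_seq[cursor:offset])  # untouched reference bases
        pieces.append(alt)                     # substituted allele
        cursor = offset + len(ref)
    pieces.append(ref_seq[cursor:])
    return "".join(pieces)


reference = "ACGTACGTAC"  # pretend this is the slice I:1-10
with_insertion = apply_variants(reference, 1, [(5, "A", "ATT")])
with_deletion = apply_variants(reference, 1, [(2, "CGT", "C")])
print(len(reference), len(with_insertion), len(with_deletion))
```

The slice is always requested in reference coordinates, so callers never need to know about the indels in advance; only the length of the returned string changes.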

Thanks for your continual support of pyfaidx - it has been very useful.

@mdshw5 mdshw5 modified the milestone: v0.4.8 Feb 23, 2016
@Benja1972

It would be nice to have indels implemented the same way "bcftools consensus" works. For now, I run bcftools as a subprocess to incorporate all VCF records into FASTA regions.

Thank you in advance
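
For readers looking for the subprocess workaround described above, one possible sketch follows. It assumes `samtools` and `bcftools` are installed and on `PATH`, that the VCF is bgzip-compressed and indexed, and it mirrors the documented `samtools faidx ref.fa region | bcftools consensus in.vcf.gz` pipeline. The function names and file paths are hypothetical.

```python
import subprocess
from typing import List, Tuple


def consensus_commands(fasta: str, vcf_gz: str, region: str) -> Tuple[List[str], List[str]]:
    """Build the two argv lists for the faidx | consensus pipeline."""
    faidx_cmd = ["samtools", "faidx", fasta, region]   # extract the reference region
    consensus_cmd = ["bcftools", "consensus", vcf_gz]  # apply VCF records to FASTA on stdin
    return faidx_cmd, consensus_cmd


def region_consensus(fasta: str, vcf_gz: str, region: str) -> str:
    """Return FASTA text for `region` with the VCF's variants applied."""
    faidx_cmd, consensus_cmd = consensus_commands(fasta, vcf_gz, region)
    extracted = subprocess.run(faidx_cmd, check=True, capture_output=True, text=True)
    applied = subprocess.run(
        consensus_cmd, input=extracted.stdout,
        check=True, capture_output=True, text=True,
    )
    return applied.stdout


# Example call (paths are placeholders):
# print(region_consensus("ref.fa", "calls.vcf.gz", "I:1-100"))
```

Spawning two processes per region is slow for many small regions, which is part of why native indel support in FastaVariant would be useful.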

@mdshw5
Owner

mdshw5 commented Feb 20, 2019

Thanks for the feedback. I agree that the bcftools model is appropriate, and if I can get some time, or someone willing to help with the implementation, it will get done :).

@mdshw5 mdshw5 removed this from the v0.4.9 milestone Jan 24, 2022

3 participants