bcftools annotate --set-id and --remove combined can cause a segmentation fault #1540

Closed
freeseek opened this issue Jul 29, 2021 · 1 comment

@freeseek
Contributor

Generate a simple VCF:

(echo "##fileformat=VCFv4.2"
echo "##contig=<ID=1,length=249250621>"
echo "##INFO=<ID=rsID,Number=1,Type=String,Description=\"dbSNP rsID\">"
echo -e "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO"
echo -e "1\t564621\t.\tC\tT\t.\t.\trsID=rs10458597") > file.vcf

It should look like this:

$ cat file.vcf
##fileformat=VCFv4.2
##contig=<ID=1,length=249250621>
##INFO=<ID=rsID,Number=1,Type=String,Description="dbSNP rsID">
#CHROM	POS	ID	REF	ALT	QUAL	FILTER	INFO
1	564621	.	C	T	.	.	rsID=rs10458597

I can now use --set-id to copy the rsID value from the INFO field into the ID column:

$ bcftools annotate --no-version --set-id %rsID file.vcf
##fileformat=VCFv4.2
##FILTER=<ID=PASS,Description="All filters passed">
##contig=<ID=1,length=249250621>
##INFO=<ID=rsID,Number=1,Type=String,Description="dbSNP rsID">
#CHROM	POS	ID	REF	ALT	QUAL	FILTER	INFO
1	564621	rs10458597	C	T	.	.	rsID=rs10458597

But if I try to combine the --set-id with the --remove option:

$ bcftools annotate --no-version --set-id %rsID --remove INFO/rsID file.vcf
Segmentation fault (core dumped)

I think the problem originates in vcfannotate.c:

static void annotate(args_t *args, bcf1_t *line)
{
    int i, j;
    for (i=0; i<args->nrm; i++)
        args->rm[i].handler(args, line, &args->rm[i]);
...
    if ( args->set_ids )
    {
        args->tmpks.l = 0;
        convert_line(args->set_ids, line, &args->tmpks);
        if ( args->tmpks.l )
        {
            int replace = 0;
            if ( args->set_ids_replace ) replace = 1;
            else if ( !line->d.id || (line->d.id[0]=='.' && !line->d.id[1]) ) replace = 1;
            if ( replace )
                bcf_update_id(args->hdr_out,line,args->tmpks.s);
        }
    }
...
}

Here annotations are removed before IDs are set, so by the time --set-id runs, the rsID field it refers to is already gone. Would it be enough to swap the order? I suspect, though, that swapping the order would make --set-id see only the fields present before --annotations/--columns is resolved. At the very least, it should be clarified that --set-id operates on the fields present after --annotations/--columns and --remove are resolved, and an error should be generated if the field that --set-id wants to access is no longer present.
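In the meantime, a possible workaround is to split the two operations into separate passes, so that --set-id reads the tag before --remove drops it. The sketch below assumes that each option works on its own (as the --set-id example above does) and that bcftools is on the PATH. The helper name set_id_then_remove is made up for illustration, and I have not checked this against every affected build.

```shell
# Recreate the one-record test file from this report
# (printf is used instead of echo -e for portability).
{
  printf '##fileformat=VCFv4.2\n'
  printf '##contig=<ID=1,length=249250621>\n'
  printf '##INFO=<ID=rsID,Number=1,Type=String,Description="dbSNP rsID">\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
  printf '1\t564621\t.\tC\tT\t.\t.\trsID=rs10458597\n'
} > file.vcf

# Hypothetical helper (not part of bcftools):
# pass 1 copies INFO/rsID into the ID column,
# pass 2 reads that result from stdin ("-") and removes the INFO tag.
set_id_then_remove() {
  bcftools annotate --no-version --set-id '%rsID' "$1" |
    bcftools annotate --no-version --remove INFO/rsID -
}

# Only run the pipeline when bcftools is actually installed.
if command -v bcftools >/dev/null 2>&1; then
  set_id_then_remove file.vcf
fi
```

Because each pass is an independent invocation, the order in which a single annotate() call handles --remove and --set-id no longer matters, at the cost of parsing the file twice.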

@pd3 pd3 closed this as completed in b874aa9 Aug 14, 2021
@pd3
Member

pd3 commented Aug 14, 2021

This is now fixed. Thank you for the bug report!
