bcftools isec not producing a sites.txt file #1462

Closed
stevekm opened this issue Apr 8, 2021 · 5 comments

stevekm commented Apr 8, 2021

(posted here https://www.biostars.org/p/9464238/)

I am running a bcftools command like this:

bcftools isec -p /data --targets-file targets.bed Sample1.vcf.gz Sample2.vcf.gz

And I am getting output files like this:

0000.vcf
0001.vcf
0002.vcf
0003.vcf
README.txt

There is no sites.txt file output.

Normally, when I run other sets of samples through bcftools isec, I also get a sites.txt file, which lists just the genomic positions used for the intersection and which samples each variant was present in, like this:

$ head input/sites.txt
1       45799087        G       A       111
1       115256527       C       T       110
1       115256527       CTTG    TTTT    001
1       115256530       G       T       110
1       193099343       G       T       010
2       26101038        C       A       100
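
The last column of sites.txt can be read as one 0/1 flag per input file, in command-line order, so 110 marks a variant present in the first two inputs but not the third. A minimal sketch for filtering on those flags, assuming whitespace-separated columns as in the excerpt above; sites_in_input is a hypothetical helper name, not part of bcftools:

```shell
# Print the rows of a sites.txt-style table whose N-th presence flag is 1.
# usage: sites_in_input N FILE   (N is 1-based, in command-line order)
# Assumes the presence flags are the last whitespace-separated column.
sites_in_input() {
    awk -v n="$1" 'substr($NF, n, 1) == "1" { print }' "$2"
}
```

For example, sites_in_input 3 input/sites.txt would keep only the rows whose third flag is 1.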

Since I am not getting this file, it's difficult to parse the output with my existing workflow. Instead, the README.txt file has a description like this:

This file was produced by vcfisec.
The command line was:   bcftools isec -p /data --targets-file targets.bed Sample1.vcf.gz Sample2.vcf.gz

Using the following file names:
/data/0000.vcf     for records private to  Sample1.vcf.gz
/data/0001.vcf     for records private to  Sample2.vcf.gz
/data/0002.vcf     for records from Sample1.vcf.gz shared by both       Sample1.vcf.gz Sample2.vcf.gz
/data/0003.vcf     for records from Sample2.vcf.gz shared by both    Sample1.vcf.gz Sample2.vcf.gz

Any idea why I am not getting the sites.txt file?

Using

$ bcftools --version
bcftools 1.9
Using htslib 1.9

stevekm commented Apr 8, 2021

Looking at the source code, it seems like sites.txt is only created when >2 input files are used:

args->fh_sites = open_file(NULL, "w", "%s/sites.txt", args->prefix);

Is this a bug? It seems like sites.txt should be created no matter how many input files are used.

pd3 closed this as completed in 26aa4b2 on Apr 16, 2021

pd3 commented Apr 16, 2021

It is not exactly a bug, just an omission. But you are right that there is no reason not to create the file also for the Venn set type of output. This is now fixed. Thank you for reporting the issue.

By the way, you are way behind the latest version of bcftools, we are now at 1.12.

stevekm commented May 4, 2021

thanks, is this change included in the latest release?

stevekm commented May 5, 2021

On an unrelated note, it looks like the reason I was using bcftools version 1.9 is that version 1.12 appears to be incompatible with the other tools I also need:

$ conda install -y \
bioconda::bcftools \
bioconda::vcf2maf \
bioconda::bedops

...

The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    bcftools-1.9               |       ha228f0b_4         807 KB  bioconda
    bedops-2.4.39              |       hc9558a2_0         9.1 MB  bioconda
    bzip2-1.0.8                |       h7b6447c_0          78 KB
    ca-certificates-2021.4.13  |       h06a4308_1         114 KB
    certifi-2020.12.5          |   py37h06a4308_0         141 KB
    conda-4.10.1               |   py37h06a4308_1         2.9 MB
    curl-7.71.1                |       hbc83047_1         140 KB
    htslib-1.9                 |       ha228f0b_7         1.5 MB  bioconda
    krb5-1.18.2                |       h173b8e3_0         1.3 MB
    libcurl-7.71.1             |       h20c2e04_1         305 KB
    libdeflate-1.0             |       h14c3975_1          43 KB  bioconda
    libedit-3.1.20210216       |       h27cfd23_1         167 KB
    libgcc-7.2.0               |       h69d50b8_2         269 KB
    libssh2-1.9.0              |       h1ba5d50_1         269 KB
    ncurses-6.2                |       he6710b0_1         817 KB
    openssl-1.1.1k             |       h27cfd23_0         2.5 MB
    perl-5.26.2                |       h14c3975_0        10.5 MB
    samtools-1.7               |                1         1.0 MB  bioconda
    vcf2maf-1.6.21             |       hdfd78af_0          37 KB  bioconda
    ------------------------------------------------------------

^ conda chooses bcftools 1.9

Trying to install bcftools 1.12:

$ conda install -y \
bioconda::bcftools==1.12 \
bioconda::vcf2maf \
bioconda::bedops

...

Found conflicts! Looking for incompatible packages.                                                                                          failed

UnsatisfiableError: The following specifications were found to be incompatible with each other:

# a whole bunch of packages

pd3 commented Jun 17, 2021

This will be included in the next release, which is imminent.
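
Until a release with the fix is available, or where an environment is pinned to an older version such as the 1.9 shown above, a sites.txt-like table can be approximated from the Venn output described in the README. This is a rough sketch, not the bcftools implementation. It assumes plain uncompressed VCFs named as in the listing above, and the tab-separated CHROM, POS, REF, ALT, flags layout from the sites.txt excerpt. rebuild_sites and vcf_sites are hypothetical helper names.

```shell
# Emit CHROM, POS, REF, ALT plus a fixed flag string for every record
# in a plain-text VCF, skipping header lines.
# usage: vcf_sites FLAGS FILE
vcf_sites() {
    awk -v flags="$1" 'BEGIN { FS = OFS = "\t" } !/^#/ { print $1, $2, $4, $5, flags }' "$2"
}

# Rebuild a sites.txt-like table from the two-input Venn output of
# `bcftools isec -p DIR`.
# usage: rebuild_sites DIR
rebuild_sites() {
    dir=$1
    {
        vcf_sites 10 "$dir/0000.vcf"   # private to the first input
        vcf_sites 01 "$dir/0001.vcf"   # private to the second input
        vcf_sites 11 "$dir/0002.vcf"   # shared (0003.vcf holds the same sites)
    } | sort -t "$(printf '\t')" -k1,1 -k2,2n > "$dir/sites.txt"
}
```

Calling rebuild_sites /data would write /data/sites.txt. Note that sort orders chromosome names lexically, which may differ from the contig order bcftools itself would use, so downstream code should not rely on row order matching a native sites.txt.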
