Cannot import SV VCF #11

Closed
edawson opened this issue Apr 19, 2019 · 5 comments

@edawson

edawson commented Apr 19, 2019

I'm trying to read in a large VCF of structural variants from a public resource, but I'm receiving the following error:

300	22	573	Offset: 32980955-33035235 Position: 151831760-155258869 Bins: 82-21842
301	23	333	Offset: 33035235-33066063 Position: 2654318-10083699 Bins: 0-8216
tachyon: lib/containers/data_container.cpp:618: bool tachyon::yon1_dc_t::CheckUniformity(): Assertion `cumulative_position == this->data_uncompressed.size()' failed.

My intuition is this is due to something strange about the SVs - perhaps one that's too large? There is also no FORMAT or SAMPLE field in this VCF - would that be an issue?

Basic run info:
htslib-1.9

./tachyon/tachyon import -i sv.vcf.gz -o sv.sites.yon -c 50000
Program:   tachyon-beta-0.6.0 (Tools for querying and manipulating variant call data)
Libraries: tachyon-0.6.0; OpenSSL 3.0.0-dev xx XXX xxxx; ZSTD-1.4.0; htslib 1.9
Contact: Marcus D. R. Klarqvist <mk819@cam.ac.uk>
Documentation: https://github.com/mklarqvist/tachyon
License: MIT
----------
[2019-04-19 15:12:01,351][LOG] Calling import...
@mklarqvist
Owner

Thanks for reporting this, @edawson. Could you link me to the file in question? I will investigate. I have an idea of what the problem could be.

@mklarqvist
Owner

The problem occurs here
https://github.com/mklarqvist/tachyon/blob/master/lib/containers/data_container.cpp#L618
when the observed word size times the stride does not equal the length of the input data stream. This is an unrecoverable error: continuing would result in corruption, so the program terminates.

I cannot immediately see why this happens. I need the file in question to debug.
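
For readers unfamiliar with this kind of check, the consistency condition described above can be sketched roughly as follows. This is a minimal illustration, not tachyon's actual implementation: `ToyContainer`, its `strides` and `word_size` fields, and `StridesMatchBuffer` are hypothetical names, and the assumption is that each record contributes `stride * word_size` bytes to the uncompressed buffer. Only `cumulative_position` and `data_uncompressed` are taken from the assertion message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a data container; not tachyon's real type.
struct ToyContainer {
    std::vector<std::uint8_t> data_uncompressed;  // raw bytes of all records
    std::vector<std::uint32_t> strides;           // elements stored per record (assumed)
    std::size_t word_size;                        // bytes per element (assumed)
};

// Returns true when the bytes implied by the strides account for the
// whole uncompressed buffer, with nothing missing and nothing left over.
bool StridesMatchBuffer(const ToyContainer& c) {
    std::size_t cumulative_position = 0;
    for (std::uint32_t stride : c.strides) {
        // Each record is assumed to occupy stride * word_size bytes.
        cumulative_position += static_cast<std::size_t>(stride) * c.word_size;
    }
    return cumulative_position == c.data_uncompressed.size();
}
```

Under these assumptions, a 12-byte buffer with strides {2, 1} and a 4-byte word size passes the check, while a 10-byte buffer with the same strides fails it. A mismatch means the recorded strides and the stored bytes disagree, so aborting is safer than writing a block that cannot be decoded.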

@edawson
Author

edawson commented Apr 25, 2019

Got it - the public data is here: https://storage.googleapis.com/gnomad-public/papers/2019-sv/gnomad_v2_sv.sites.vcf.gz

Link comes from the bottom of this page (SV sites VCF): https://gnomad.broadinstitute.org/downloads
where the .tbi can also be found.

@mklarqvist
Owner

I can reproduce this error. I am investigating.

@mklarqvist mklarqvist added the bug label Apr 26, 2019
@mklarqvist mklarqvist self-assigned this Apr 26, 2019
@mklarqvist
Owner

This has now been fixed.
