granularity #209

Open
parlar opened this issue Jan 28, 2020 · 3 comments

parlar commented Jan 28, 2020

Hi,

I have not actually used chanjo for coverage reports, but as I recall it reports "completeness" at the transcript or gene level.

Storing coverage data is tricky, since including overly detailed (per-base) information would quickly eat up a lot of space. However, somewhat finer granularity might still be useful for assessing sequencing quality in different regions.

I have two questions.

  1. Do you think it would be feasible to provide completeness info at the exon level? I made some quick tests with a WGS dataset using all exons for all Ensembl transcripts and 4 completeness levels. The resulting data table amounted to 12 MB compressed and 63 MB uncompressed. Admittedly quite a lot, but it could be reduced significantly if, for example, CCDS was used instead. The size would also be reduced by using an SQL database, provided the data is sufficiently normalized.

  2. In my mind, however, it would be a good thing if coverage data could be included directly in the scout system. In that case it would also be convenient if the data were stored in MongoDB, although that prevents the use of JOINs and normalized data.
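
To make the exon-level idea in point 1 concrete, here is a minimal Python sketch. It assumes "completeness" means the fraction of bases in an interval whose read depth is at or above a cutoff. The four cutoff values and the function name are hypothetical, chosen only for illustration; the test described above does not state which levels were used, and this is not chanjo's own code.

```python
from typing import Dict, Sequence

# Hypothetical cutoffs: the thread mentions 4 completeness levels
# but does not say which depths were used.
THRESHOLDS = (10, 15, 20, 50)


def exon_completeness(
    depths: Sequence[int],
    thresholds: Sequence[int] = THRESHOLDS,
) -> Dict[int, float]:
    """For each threshold, return the fraction of bases with depth >= threshold.

    `depths` holds one read-depth value per base of a single exon.
    """
    n = len(depths)
    if n == 0:
        # An empty interval has no covered bases; report 0.0 rather than divide by zero.
        return {t: 0.0 for t in thresholds}
    return {t: sum(d >= t for d in depths) / n for t in thresholds}


# Toy exon of 8 bases with made-up depths.
example_depths = [8, 12, 18, 25, 60, 60, 30, 9]
print(exon_completeness(example_depths))
# {10: 0.75, 15: 0.625, 20: 0.5, 50: 0.25}
```

Storing only these four numbers per exon and sample, instead of one depth per base, is what keeps the table small.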

Do you have any thoughts on this?

@adrosenbaum
Contributor

Hello,

Our plan during the spring is to do some development on chanjo, and one goal would be to facilitate storage of more granular data, e.g. exons.

Did you run the test using MongoDB or an SQL database? We have also been thinking of using MongoDB as a backend; however, we would first need to assess its performance against SQL.
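
For reference when comparing the two backends, the normalized SQL option could look roughly like the sketch below. It uses Python's built-in `sqlite3` module; the table names, column names, and sample values are all hypothetical and do not reflect chanjo's actual schema. Exon coordinates are stored once, and each completeness row refers to them by id.

```python
import sqlite3

# Hypothetical schema for illustration only; not chanjo's actual tables.
SCHEMA = """
CREATE TABLE exon (
    id INTEGER PRIMARY KEY,
    chrom TEXT NOT NULL,
    start_pos INTEGER NOT NULL,
    end_pos INTEGER NOT NULL
);
CREATE TABLE exon_completeness (
    sample_id TEXT NOT NULL,
    exon_id INTEGER NOT NULL REFERENCES exon(id),
    threshold INTEGER NOT NULL,
    completeness REAL NOT NULL,
    PRIMARY KEY (sample_id, exon_id, threshold)
);
"""


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with two exons and one sample."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO exon VALUES (?, ?, ?, ?)",
        [(1, "chr1", 100, 200), (2, "chr1", 300, 450)],
    )
    conn.executemany(
        "INSERT INTO exon_completeness VALUES (?, ?, ?, ?)",
        [("s1", 1, 10, 1.0), ("s1", 1, 20, 0.8),
         ("s1", 2, 10, 0.9), ("s1", 2, 20, 0.4)],
    )
    return conn


def poorly_covered(conn, sample_id, threshold, min_completeness):
    """Exons whose completeness at `threshold` is below `min_completeness`."""
    query = """
        SELECT e.chrom, e.start_pos, e.end_pos, c.completeness
        FROM exon_completeness AS c
        JOIN exon AS e ON e.id = c.exon_id
        WHERE c.sample_id = ? AND c.threshold = ? AND c.completeness < ?
        ORDER BY e.id
    """
    return conn.execute(query, (sample_id, threshold, min_completeness)).fetchall()


conn = build_demo_db()
print(poorly_covered(conn, "s1", 20, 0.5))
# [('chr1', 300, 450, 0.4)]
```

A document-store layout would more likely embed the exon coordinates in every per-sample record, which avoids the JOIN but repeats the coordinates for each sample.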

@moonso
Contributor

moonso commented Apr 20, 2020

Hi @parlar, there is an open PR for using a MongoDB backend in #202. However, it is hard to say at the moment when we will have time to make progress on it. We are hiring some people now and hope to start developing chanjo again soon. Unfortunately, we cannot give any time frame right now.

@parlar
Author

parlar commented Apr 21, 2020 via email
