Enable JSON <-> YAML, JSON <-> binary conversion? #16
Comments
@julesjacobsen I'm not sure this is absolutely needed. If we stick with JSON, will that mean we encourage people to use JSON as the primary format?
@julesjacobsen see the new class DefaultPhenopacketIngestor. We could add some methods to this class, such as public fromYamlFile(...) and a constructor DefaultPhenopacketIngestor(Message message). Thoughts?
@ielis is this issue closable? I think this is supported for some operations.
In principle, yes. Each command that reads or writes a phenopacket accepts or produces a phenopacket, family, or cohort in any of these formats. We do not have a command solely for format conversion (something similar to
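One way a command could accept any of the three formats without a dedicated flag is to inspect the payload before parsing it. The sketch below is only a heuristic and makes no claim about how the CLI actually does it. The names FormatSniffSketch, Format, sniff and looksLikeText are hypothetical and are not part of the phenopacket library. It assumes that JSON input starts with '{', that any other valid UTF-8 text is YAML, and that everything else is protobuf binary.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the phenopacket library.
public class FormatSniffSketch {

    enum Format { JSON, YAML, PROTOBUF }

    // Guess the format of a payload from its bytes (heuristic only).
    static Format sniff(byte[] payload) {
        if (payload.length == 0) {
            throw new IllegalArgumentException("Cannot sniff an empty payload");
        }
        // Assumption: anything that is not plain UTF-8 text is protobuf binary.
        if (!looksLikeText(payload)) {
            return Format.PROTOBUF;
        }
        // Assumption: a JSON document is an object, so its first
        // non-whitespace byte is '{'. Any other text is treated as YAML.
        for (byte b : payload) {
            if (b == ' ' || b == '\t' || b == '\r' || b == '\n') {
                continue;
            }
            return b == '{' ? Format.JSON : Format.YAML;
        }
        return Format.YAML; // whitespace only
    }

    // True if the payload is valid UTF-8 and contains no control bytes
    // other than tab, line feed and carriage return.
    static boolean looksLikeText(byte[] payload) {
        for (byte b : payload) {
            int c = b & 0xFF;
            if (c < 0x20 && c != '\t' && c != '\n' && c != '\r') {
                return false;
            }
        }
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(payload));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] json = "{\"id\": \"example\"}".getBytes(StandardCharsets.UTF_8);
        byte[] yaml = "id: example\n".getBytes(StandardCharsets.UTF_8);
        // Protobuf wire encoding of field 1 (length-delimited) holding "abc".
        byte[] binary = {0x0A, 0x03, 'a', 'b', 'c'};

        System.out.println(sniff(json));
        System.out.println(sniff(yaml));
        System.out.println(sniff(binary));
    }
}
```

A binary message that happens to contain only printable bytes would be misclassified as text, so a real tool would probably still want an explicit format option as a fallback.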
Just revisiting this - is JSON the primary format for phenopackets? Is this written down somewhere? I am trying to do some dataset sharing (à la EGA) and was considering placing a phenopacket alongside each individual's genomic artifacts. But I was assuming I needed that to be a protobuf file with some sort of known file suffix like e.g.
To that end, I was going to store some v2 JSON or YAML phenopackets for ease of editing and then convert them to protobuf using the CLI tool. So this is my +1 for the general feature of converting between formats with just the CLI tool, which is currently not possible because convert requires the input to be in v1 format. But if JSON is the primary way we think phenopackets are to be exchanged in the wild, then I can skip protobuf entirely. Are there any suggested file naming conventions to let people know a file is a phenopacket (in JSON)?
I should add that I am starting by hand-crafting some examples to demonstrate how this would all work, hence the hand editing of JSON or YAML. For a real system I would obviously be translating from some clinical source such as an EHR or REDCap, so I guess I would do that with the Java library and output whichever format I wanted. I think the broader question still stands: if I have unlimited choice here, what is the primary "phenopacket" file format, and how should I name the files to make this clear?
Hi Andrew, there could be a lossless conversion between protobuf (binary), JSON, YAML, XML, SQL ... so there really isn't a primary format. My guess is that almost everybody would prefer JSON because of the tooling available for it.
In which case, having a tool that seamlessly converts between the formats might be useful. If I get a batch of phenopackets in protobuf but would prefer them in JSON, I can just run the CLI tool to convert them, rather than dusting off my Java and writing a small snippet that uses the library to do the same.
Currently the converter only handles JSON. Might be an idea to offer conversion of other formats too.