Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Locus spec improvements #75

Merged
merged 4 commits into from
Oct 30, 2019
Merged

Locus spec improvements #75

merged 4 commits into from
Oct 30, 2019

Conversation

egor-dolzhenko
Copy link
Contributor

This pull request aims to streamline locus description and catalog loading process.

@egor-dolzhenko egor-dolzhenko changed the title [WIP] Better locus description and catalog loading Locus spec improvements Oct 29, 2019
Copy link
Contributor

@xiao-chen-xc xiao-chen-xc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@egor-dolzhenko I have completed my review. Besides a few comments below, one last question I have is that: this doesn't seem to be backwards compatible with the previous catalog format, does it? It requires a "LocusType" field to be present in the catalog.

{
public:
CnvLocusSpecification(
RegionId locusId, LocusType locusType, CnvLocusSubtype locusSubtype, ContigCopyNumber contigCopyNumber,
GenomicRegion locusLocation, GenotyperParameters genotyperParams)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to keep locusLocation, at least for cnvLocusSpec. This is because the coordinates we give for the target and baseline variants are the coordinates used for counting. The true coordinates for the CNV variant (encoded as the locus here) can be different from the target or the baseline (say, in the example of an overlapping CNV). So we will need to specify them in the user input.

@@ -194,7 +194,7 @@ void CnvVariantJsonWriter::visit(CnvVariantFindings& cnvFindings)
record_.clear();
record_["VariantId"] = locusSpec_.locusId();
record_["VariantType"] = "CNV";
record_["ReferenceRegion"] = encode(contigInfo_, locusSpec_.locusLocation());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is where we need the true coordinates of the CNV, which is taken from the catalog input.


const CnvLocusSubtype& locusSubtype() const { return locusSubtype_; }
std::vector<GenomicRegion> regionsWithReads() const;
const CnvType& locusSubtype() const { return cnvType_; }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rename locusSubtype to cnvType?

{
using VariantTuple = std::tuple<int32_t, int64_t, int64_t, LocusIdAndVariantId>;
std::vector<VariantTuple> tuples;

for (const auto& locusIdAndFindings : sampleFindings_)
{
const string& locusId = locusIdAndFindings.first;
const LocusSpecification& locusSpec = *(regionCatalog_.at(locusId));
// TO DO: add CNVs
if (locusSpec.locusType() == LocusType::kGraph)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CNVVariantFindings are actually one per locus and go by locus id, so looking for locus id in variant ids will not be successful. I guess we can return a locusId-locusId pair for CNVs here, since by now we only need locus-level information, not variant-level information.

@xiao-chen-xc
Copy link
Contributor

After discussion with @egor-dolzhenko, we decided to add an "OutputVariants" field to the catalog so variant findings can still go by variant instead of locus. Will merge this branch now.

@xiao-chen-xc xiao-chen-xc merged commit a942631 into Parse-Catalog Oct 30, 2019
@egor-dolzhenko egor-dolzhenko deleted the locus-spec-update branch October 31, 2019 14:23
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants