[R-package] adjust csv-parser to allow comment lines #40

Closed
jweile opened this issue Dec 5, 2018 · 5 comments
Labels
enhancement New feature or request

Comments


jweile commented Dec 5, 2018

The MaveDB API will soon include comment lines carrying licensing information in its CSV output. The parser must be updated to handle them.

This will need to be changed in the main mavevis.R and in the background-sync script.
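
A minimal base-R sketch of comment-tolerant parsing follows. It assumes the licensing lines start with `#` (the issue does not name the comment character); `parse.commented.csv` and the placeholder licence line are hypothetical and not part of rapimave or mavevis:

```r
# Hypothetical helper: parse CSV text while skipping lines whose first
# non-blank character is '#'. The '#' prefix is an assumption.
parse.commented.csv <- function(csv.text) {
  lines <- unlist(strsplit(csv.text, "\r?\n"))
  # Drop only lines that *start* with '#'; a '#' inside a field is kept.
  data.lines <- lines[!grepl("^\\s*#", lines)]
  read.csv(text = paste(data.lines, collapse = "\n"),
           stringsAsFactors = FALSE, comment.char = "")
}

# Placeholder comment line plus two rows taken from this thread.
example <- paste(
  "# Licence: placeholder text",
  "urn,hgvs_nt,hgvs_pro,score",
  "urn:mavedb:00000003-b-1#20851,c.877A>T,p.Lys293Ter,1.1817745",
  "urn:mavedb:00000003-b-1#20756,c.372C>T,p.=,0.40005867",
  sep = "\n"
)
scores <- parse.commented.csv(example)
```

Filtering on a leading `#` rather than passing `comment.char = "#"` to `read.csv` is deliberate: the variant URNs in this thread contain `#` themselves, and `comment.char` would cut each unquoted line at that character.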

jweile added the enhancement label Dec 5, 2018

jweile commented Dec 19, 2018

This will actually have to be implemented in rapimave.


jweile commented Dec 19, 2018

Hm. This seems to cause some downstream problems with the BRCA1 dataset:

job e2950d82-a0b1-426f-bfe9-27b1950d9036 with parameters:
ssid = urn:mavedb:00000003-a-2
uniprotId = P38398
pdbs = 1JM7
mainChains = A
wt.seq = GATTTATCTGCTCTTCGCGTTGAAGAAGTACAAAATGTCATTAATGCTATGCAGAAAATCTTAGAGTGTCCCATCTGCCTGGAGTTGATCAAGGAACCTGTCTCCACAAAGTGTGACCACATATTTTGCAAATTTTGCATGCTGAAACTTCTCAACCAGAAGAAAGGGCCTTCACAGTGTCCTTTATGTAAGAATGATATAACCAAAAGGAGCCTACAAGAAAGTACGAGATTTAGTCAACTTGTTGAAGAGCTATTGAAAATCATTTGTGCTTTTCAGCTTGACACAGGTTTGGAGTATGCAAACAGCTATAATTTTGCAAAAAAGGAAAATAACTCTCCTGAACATCTAAAAGATGAAGTTTCTATCATCCAAAGTATGGGCTACAGAAACCGTGCCAAAAGACTTCTACAGAGTGAACCCGAAAATCCTTCCTTGCAGGAAACCAGTCTCAGTGTCCAACTCTCTAACCTTGGAACTGTGAGAACTCTGAGGACAAAGCAGCGGATACAACCTCAAAGGACGTCTGTCTACATTGAATTGGGATCTGATTCTTCTGAAGATACCGTTAATAAGGCAACTTATTGCAGTGTGGGAGATCAAGAATTGTTACAAATCACCCCTCAAGGAACCAGGGATGAAATCAGTTTGGATTCTGCAAAAAAGGCTGCTTGTGAATTTTCTGAGACGGATGTAACAAATACTGAACATCATCAACCCAGTAATAATGATTTGAACACCACTGAGAAGCGTGCAGCTGAGAGGCATCCAGAAAAGTATCAGGGTAGTTCTGTTTCAAACTTGCATGTGGAGCCATGTGGCACAAATACTCATGCCAGCTCATTACAGCATGAGAACAGCAGTTTATTACTCACTAAAGACAGAATGAATGTAGAAAAGGCTGAGTTC
seq.offset = 0
syn.med = 1
stop.med =
overrideCache = FALSE
outFormats = pdf png svg
pngRes = 80

Plotting....Error in if (x <= valStops[[1]]) 0 else if (x >= valStops[[length(valStops)]]) 1 else { :
missing value where TRUE/FALSE needed
Calls: dashboard -> genophenogram -> cm -> sapply -> lapply -> FUN
In addition: Warning messages:
1: NAs introduced by coercion
2: NAs introduced by coercion
Execution halted
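
The two messages in this log are consistent with a non-numeric value being coerced to `NA` and that `NA` later reaching an `if` condition. A minimal base-R sketch of the mechanism, which assumes nothing about the actual `cm` code; the guard `is.at.most` is hypothetical:

```r
# as.numeric() on non-numeric text yields NA and warns
# "NAs introduced by coercion" (suppressed here to keep the demo quiet).
x <- suppressWarnings(as.numeric("not-a-number"))

# if() cannot branch on NA; capture the resulting error message.
result <- tryCatch(
  if (x <= 0) "low" else "high",
  error = function(e) conditionMessage(e)
)

# Hypothetical guard: treat a missing value on either side as FALSE
# instead of letting NA reach if().
is.at.most <- function(value, threshold) {
  !is.na(value) && !is.na(threshold) && value <= threshold
}
guarded <- if (is.at.most(x, 0)) "low" else "high"
```

Here `result` holds the text of the error rather than a branch label, while the guarded comparison falls through to the `else` branch.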


jweile commented Dec 19, 2018

Now there seems to be a regression of issue #32, but for the BRCA1 dataset instead of the WW dataset:

image

Starting job 7ea011fe-336e-4fac-b73e-f1aead0e1fe8 with parameters:
ssid = urn:mavedb:00000003-b-1
uniprotId = P38398
pdbs = 1JM7
mainChains = A
wt.seq = GATTTATCTGCTCTTCGCGTTGAAGAAGTACAAAATGTCATTAATGCTATGCAGAAAATCTTAGAGTGTCCCATCTGCCTGGAGTTGATCAAGGAACCTGTCTCCACAAAGTGTGACCACATATTTTGCAAATTTTGCATGCTGAAACTTCTCAACCAGAAGAAAGGGCCTTCACAGTGTCCTTTATGTAAGAATGATATAACCAAAAGGAGCCTACAAGAAAGTACGAGATTTAGTCAACTTGTTGAAGAGCTATTGAAAATCATTTGTGCTTTTCAGCTTGACACAGGTTTGGAGTATGCAAACAGCTATAATTTTGCAAAAAAGGAAAATAACTCTCCTGAACATCTAAAAGATGAAGTTTCTATCATCCAAAGTATGGGCTACAGAAACCGTGCCAAAAGACTTCTACAGAGTGAACCCGAAAATCCTTCCTTGCAGGAAACCAGTCTCAGTGTCCAACTCTCTAACCTTGGAACTGTGAGAACTCTGAGGACAAAGCAGCGGATACAACCTCAAAGGACGTCTGTCTACATTGAATTGGGATCTGATTCTTCTGAAGATACCGTTAATAAGGCAACTTATTGCAGTGTGGGAGATCAAGAATTGTTACAAATCACCCCTCAAGGAACCAGGGATGAAATCAGTTTGGATTCTGCAAAAAAGGCTGCTTGTGAATTTTCTGAGACGGATGTAACAAATACTGAACATCATCAACCCAGTAATAATGATTTGAACACCACTGAGAAGCGTGCAGCTGAGAGGCATCCAGAAAAGTATCAGGGTAGTTCTGTTTCAAACTTGCATGTGGAGCCATGTGGCACAAATACTCATGCCAGCTCATTACAGCATGAGAACAGCAGTTTATTACTCACTAAAGACAGAATGAATGTAGAAAAGGCTGAGTTC
seq.offset = 0
syn.med =
stop.med =
overrideCache = FALSE
outFormats = pdf png svg
pngRes = 80
Loading required package: methods
hash-2.2.6 provided by Decision Patterns

Translating WT sequence to Protein...
Retrieving scoreset from local cache...
Retrieving parsed variants from local cache...
Filtering for single mutant variants...
Obtaining conservation information...
Querying UniRef...success
Retrieving sequences...success
Aligning sequences...success
Reading structural features...
Querying PDB...
Parsing PDB file...
Splitting protein complex...
Read 44604 items
Calculating surface area...
Calculating interface areas...
Calculating secondary structures...
Compiling results...
Plotting......Done!
Warning messages:
1: NAs introduced by coercion
2: NAs introduced by coercion

Job completed successfully!


jweile commented Dec 19, 2018

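Hm. This may actually be a problem with the dataset. The scores of the synonymous and stop variants just seem to be bad, with the stop variants showing a higher median (0.4375) than the synonymous ones (-0.01884):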

> summary(sm.data$score[which(sm.mut$type == "synonymous")])
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
-2.34200 -0.35260 -0.01884 -0.13260  0.24790  1.08200 
> summary(sm.data$score[which(sm.mut$variant=="*")])
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
-5.5870 -0.5913  0.4375 -0.1150  1.1710  2.0520 
> head(sm.data[which(sm.mut$variant == "*"),1:4])
                               urn  hgvs_nt    hgvs_pro      score
127  urn:mavedb:00000003-b-1#20851 c.877A>T p.Lys293Ter  1.1817745
557  urn:mavedb:00000003-b-1#21281 c.388A>T p.Arg130Ter -0.4453293
813  urn:mavedb:00000003-b-1#21537 c.625C>T p.Gln209Ter  0.3754994
1184 urn:mavedb:00000003-b-1#21908 c.675T>A p.Cys225Ter  2.0521364
1312 urn:mavedb:00000003-b-1#22036 c.904G>T p.Glu302Ter -0.4542513
1368 urn:mavedb:00000003-b-1#22092 c.772G>T p.Glu258Ter  0.2100059
> head(sm.data[which(sm.mut$type == "synonymous"),1:4])
                              urn  hgvs_nt hgvs_pro       score
32  urn:mavedb:00000003-b-1#20756 c.372C>T      p.=  0.40005867
155 urn:mavedb:00000003-b-1#20879 c.786T>C      p.= -1.78391714
391 urn:mavedb:00000003-b-1#21115 c.624T>C      p.=  0.04629578
405 urn:mavedb:00000003-b-1#21129 c.192G>A      p.=  0.02823471
425 urn:mavedb:00000003-b-1#21149 c.669T>C      p.= -1.97649415
541 urn:mavedb:00000003-b-1#21265 c.453C>G      p.= -0.35483822


jweile commented Dec 19, 2018

Will open a new issue to add a warning instead.
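
One possible shape for such a warning, as a sketch only: it assumes stop variants are expected to score below synonymous ones, and `check.calibration` is a hypothetical name rather than an existing mavevis function. The example scores are the six stop and six synonymous rows printed above.

```r
# Hypothetical check: warn when the stop median is not below the
# synonymous median (the expected ordering is an assumption).
check.calibration <- function(scores, is.syn, is.stop) {
  syn.med <- median(scores[is.syn], na.rm = TRUE)
  stop.med <- median(scores[is.stop], na.rm = TRUE)
  ok <- is.finite(syn.med) && is.finite(stop.med) && stop.med < syn.med
  if (!ok) {
    warning("Stop median (", signif(stop.med, 3),
            ") is not below synonymous median (", signif(syn.med, 3),
            "); scores may be unsuitable for normalization.")
  }
  invisible(ok)
}

# Scores from the head() output in this thread: stop rows first, then synonymous.
scores <- c(1.1817745, -0.4453293, 0.3754994, 2.0521364, -0.4542513, 0.2100059,
            0.40005867, -1.78391714, 0.04629578, 0.02823471, -1.97649415, -0.35483822)
is.stop <- rep(c(TRUE, FALSE), each = 6)
is.syn <- !is.stop

# Capture the warning text instead of printing it.
msg <- tryCatch(check.calibration(scores, is.syn, is.stop),
                warning = function(w) conditionMessage(w))
```

Returning the result invisibly lets a caller either ignore the check or branch on it, for example to skip plotting when the medians are in the wrong order.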

jweile closed this as completed Dec 19, 2018