CNV-induced inversions #340
Comments
@adf-ncgr I'm sure you've seen this but I just noticed a very nice example of the inversion algorithm doing a great job:
Thanks @alancleary, that's an interesting example, although technically I think it is not an inversion but rather a segmental duplication. This is perhaps clearer when looking at the dotplot, although it is somewhat puzzling to me why we see more copies of some of the genes in the dotplot than appear to exist in the aligned track. Maybe some segments are getting ignored due to their scores (trying to fiddle with the params a bit and having trouble getting it to bend to my will; must be getting old!).
BTW, in the dotplot when you hover over a circle you only get info about the query track gene. It would be nice to know what the pair represents! Let me know if this is issue-worthy.
Oh, I see. I guess I'll take a look at it while I'm working on this issue.
I actually had that same thought when playing with the dot plots just now. Definitely issue-worthy!
Just want to bump this issue slightly (however symbolic an act that may be) after having had to convince myself (and a collaborator) that the "inversion" seen in this example: https://medicago.legumeinfo.org/tools/gcv/gene;medicago=medtr.A17.gnm5.ann1_6.MtrunA17Chr4g0064841?q=medtr.A17.gnm5.MtrunA17Chr4:55872871-56368598&sources=medicago&algorithm=repeat&match=10&mismatch=-1&gap=-1&score=30&threshold=25&bmatched=20&bintermediate=10&bmask=10&bchrgenes=1&bchrlength=100000&linkage=average&cthreshold=20&neighbors=35&matched=4&intermediate=5&bregexp=&border=distance&regexp=&order=distance
Despite #262, we are still seeing some false inversions in regions of CNV, e.g.:
https://legumeinfo.org/gcv2/gene;lis=phavu.Phvul.003G002400?algorithm=repeat&match=10&mismatch=-1&gap=-1&score=30&threshold=25&bmatched=20&bintermediate=10&bmask=10&linkage=average&cthreshold=20&neighbors=10&matched=4&intermediate=5&sources=lis&bregexp=&border=chromosome&regexp=&order=distance
per @alancleary :
A couple of things to note: in this case the orientation of the genes seems like it would be helpful (see #140 for some discussion of this in another context; not exactly sure it applies here). Also, we could consider something akin to "homopolymer compression" to minimize the impact of CNV in such situations. If nothing else, perhaps introducing an inversion penalty would make sense.
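To make the "homopolymer compression" idea a bit more concrete, here is a minimal sketch. It assumes a track can be treated as an ordered list of gene family IDs; the `compress_runs` name and the `famA`/`famB` labels are made up for illustration and are not part of GCV. Consecutive copies of the same family are collapsed to a single entry before alignment, and the original index ranges are kept so the alignment could be mapped back onto the uncompressed track afterwards.

```python
from itertools import groupby


def compress_runs(families):
    """Collapse consecutive repeats of the same gene family into one entry.

    Hypothetical helper, not GCV code. Returns the compressed list and, for
    each kept entry, the (start, end) index range it covers in the original
    track (both ends inclusive).
    """
    compressed = []
    spans = []
    index = 0
    for family, run in groupby(families):
        length = sum(1 for _ in run)
        compressed.append(family)
        spans.append((index, index + length - 1))
        index += length
    return compressed, spans


# A toy track with a tandem expansion of famB (illustrative data only).
track = ["famA", "famB", "famB", "famB", "famC", "famB", "famD"]
compressed, spans = compress_runs(track)
print(compressed)
print(spans)
```

Only adjacent copies are merged, so the later, non-adjacent `famB` survives as its own entry. Whether this would actually suppress the false inversions seen above is untested; it is just one way the suggestion could be prototyped.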