Add option to hard clip read in CigarUtils (replacement for #1453) #1461
…o mf_hard_clip_adapter
…ing a previously soft clipped read.
…hard_clip_adapter
Codecov Report
@@              Coverage Diff               @@
##              master     #1461      +/-   ##
==============================================
+ Coverage    68.41%   69.197%   +0.787%
- Complexity    8499      8700      +201
==============================================
  Files          583       587        +4
  Lines        34413     34571      +158
  Branches      5729      5776       +47
==============================================
+ Hits         23542     23922      +380
+ Misses        8650      8370      -280
- Partials      2221      2279       +58
@kachulis a few comments. Do we want to think about invalidating tags as part of the clipping operation? Things like NM will not be correct anymore.
final int clipFrom, final String expectedReadString, final String expectedCigar) throws IOException {

final SAMRecord rec = createTestSamRec(initialReadString, initialCigar, negativeStrand);
// Assert.assertEquals(rec.getCigarString(), initialCigar, testName);
either delete this or uncomment it
was already removed in commit 696541f
* @param clippingOperator clipping operator to be merged
* @param trailingHardClippedBases number of hardClippedBases which were on the end of the original cigar
*/
static private void mergeClippingCigarElement(List<CigarElement> newCigar, CigarElement c,
Can we rename some of the parameters and variables here? They're confusing as hell.
Maybe:
c -> originalElement
op -> originalOperator
clippingOperator -> newClippingOperator
done
@@ -180,11 +180,14 @@ public static void clip3PrimeEndOfRead(SAMRecord rec, final int clipFrom, final

  // If hard-clipping, remove the hard-clipped bases from the read
  if(clippingOperator == CigarOperator.HARD_CLIP) {
- byte[] bases = rec.getReadBases();
+ final byte[] bases = rec.getReadBases();
+ final byte[] baseQualities = rec.getBaseQualities();
How did we miss this on the first pass?
Let's make sure bases.length and baseQualities.length are the same, and then use the same start/stop values for each.
done
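For readers following the thread, the length check requested above can be sketched roughly as follows. This is a minimal illustration and not the code merged in this PR: the class `HardClipSketch`, the holder `Trimmed`, and the method `hardClipFrom` are hypothetical names. It assumes `clipFrom` is a 1-based position and ignores strand handling.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; it is not part of htsjdk.
final class HardClipSketch {

    /** Simple holder for the trimmed arrays. */
    static final class Trimmed {
        final byte[] bases;
        final byte[] qualities;

        Trimmed(final byte[] bases, final byte[] qualities) {
            this.bases = bases;
            this.qualities = qualities;
        }
    }

    /**
     * Drops everything from clipFrom (assumed 1-based) to the end of the read,
     * using one shared bound for both arrays so they cannot drift apart.
     */
    static Trimmed hardClipFrom(final byte[] bases, final byte[] qualities, final int clipFrom) {
        // Refuse to clip when the two arrays disagree on the read length.
        if (bases.length != qualities.length) {
            throw new IllegalStateException(
                    "bases.length (" + bases.length + ") != qualities.length (" + qualities.length + ")");
        }
        if (clipFrom < 1 || clipFrom > bases.length + 1) {
            throw new IllegalArgumentException("clipFrom out of range: " + clipFrom);
        }
        final int keep = clipFrom - 1; // number of leading elements to retain
        return new Trimmed(Arrays.copyOfRange(bases, 0, keep),
                           Arrays.copyOfRange(qualities, 0, keep));
    }
}
```

Computing `keep` once and reusing it for both copies is the point of the suggestion: a mismatch is reported up front instead of silently producing arrays of different lengths.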
Thanks for the quick review @lbergelson. I have added code to invalidate NM, MD, and UQ tags when clipping changes the length of the read. Note this will change behavior (to be more correct) for soft-clipping as well. Also added more validation of the things that are changes to
@kachulis Two comments. I think we're good after that. Clipping always has more weird edges though...
@@ -201,6 +212,13 @@ public static void clip3PrimeEndOfRead(SAMRecord rec, final int clipFrom, final
  }
  }

+ if (rec.getReadLength() != originalReadLength) {
Could you add a method comment saying that this will remove these tags?
done
@@ -201,6 +212,13 @@ public static void clip3PrimeEndOfRead(SAMRecord rec, final int clipFrom, final
  }
  }

+ if (rec.getReadLength() != originalReadLength) {
+ //invalidate NM, UQ, MD tags if we have changed the length of the read.
+ rec.setAttribute(SAMTag.NM.name(), null);
Add a test that shows these get invalidated.
done
@lbergelson I added documentation and tests for the tag invalidations. Also adjusted the logic to only invalidate if the number of reference bases the read aligns to changes (no need to invalidate these tags if we are hard-clipping bases that were already soft-clipped). I don't think I have merge permissions, so I'm happy with you merging this at this point if you approve.
that seems like a good improvement
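The adjusted rule described above (drop NM, MD, and UQ only when the aligned reference span changes) can be illustrated with a small standalone sketch. It is not the code from this PR: `TagInvalidationSketch` and both of its methods are hypothetical, tags are modelled as a plain `Map` rather than a `SAMRecord`, and CIGARs are parsed from strings. The one fact it relies on is from the SAM specification: the operators M, D, N, = and X consume reference bases.

```java
import java.util.Map;

// Hypothetical illustration only; it models tags as a Map and does not use htsjdk.
final class TagInvalidationSketch {

    /** Sums the lengths of CIGAR operators that consume reference bases (M, D, N, =, X). */
    static int referenceLength(final String cigar) {
        int total = 0;
        int count = 0;
        for (final char ch : cigar.toCharArray()) {
            if (Character.isDigit(ch)) {
                count = count * 10 + (ch - '0'); // accumulate the operator length
            } else {
                if ("MDN=X".indexOf(ch) >= 0) {
                    total += count;
                }
                count = 0; // reset for the next element
            }
        }
        return total;
    }

    /** Removes NM, MD and UQ only if clipping changed the number of aligned reference bases. */
    static void invalidateIfReferenceSpanChanged(final Map<String, Object> tags,
                                                 final String originalCigar,
                                                 final String newCigar) {
        if (referenceLength(originalCigar) != referenceLength(newCigar)) {
            tags.remove("NM");
            tags.remove("MD");
            tags.remove("UQ");
        }
    }
}
```

Under this sketch, turning `8M2S` into `8M2H` leaves the tags alone because the reference span is still 8, while turning `10M` into `8M2H` removes them.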
@lbergelson here is the replacement PR for #1453 we discussed earlier today.
@fleharty, @lbergelson and I talked about how to get hard clipping merged ASAP, given that I do not have permissions to push directly to htsjdk, and so cannot make modifications directly to your branch. He suggested opening up a new PR to replace the old one, which is what this PR is.