ReadMappers
The choice of read mapper is of paramount importance in a Hagfish analysis. The read mapper must be precise, sensitive, and able to map read pairs.
Hagfish relies on read pairs that align at anomalous distances, so it
is important that the read mapper does not filter these pairs out. The
read mapper must therefore allow the minimum and maximum distance at
which a pair may be aligned to be set manually. A tool like BWA
estimates the insert size and does not (seem to) allow a manual
override. Bowtie is commonly used with success (set the insert size
with -I/-X). The exact definition of the allowed insert size affects
which anomalies can be identified. Typical values are a minimal insert
size of 1 and a maximal insert size of at least a few times the
expected insert size, but high values (100000 and up) work as well. For
Bowtie, the --best parameter is also advisable.
All parameters used while mapping affect the output of Hagfish. Many mappers let you specify how the software should handle pairs that align at multiple locations. The choices are to ignore such pairs (1), to return one candidate at random (2), or to return several or all candidates (3). The right choice depends on the project in which Hagfish is used. The first option is the safest, but gives insight only into regions that are unique in the genome. The second option yields more information on repetitive regions, whereas the third option highlights repetitive regions by showing coverage peaks.
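As a rough illustration, the three strategies above could be mapped onto Bowtie 1 reporting options along the following lines. The helper name multimap_flags and the strategy labels are hypothetical. The flag choices (-m, -M, -k, -a) are an assumption based on the Bowtie 1 manual and should be checked against bowtie --help for the installed version, particularly for paired-end input. The snippet only prints flags and does not run Bowtie.

```shell
#!/bin/sh
# Hypothetical helper: print Bowtie 1 reporting flags for a multi-mapping strategy.
# Assumption: the flags behave as described in the Bowtie 1 manual.
multimap_flags() {
    case "$1" in
        # (1) suppress every pair that has more than one reportable alignment
        unique) echo "-m 1" ;;
        # (2) report one alignment at random for multi-mapping pairs (-M needs --best)
        random) echo "-M 1 --best" ;;
        # (3a) report up to 5 alignments per pair, best stratum only
        several) echo "-k 5 --best --strata" ;;
        # (3b) report all valid alignments
        all) echo "-a" ;;
        *) echo "unknown strategy: $1" >&2; return 1 ;;
    esac
}

# Print the flags for the safest strategy
multimap_flags unique
```

The printed flags would be combined with the insert size options (-I/-X) in the final bowtie call.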
A common combination of Bowtie parameters used for Hagfish:
bowtie -k 5 --best --strata -I 1 -X 100000
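The parameters above leave out the index, the read files and the output. A minimal sketch of a complete paired-end call might look like the following. The index basename, file names and the choice of SAM output (-S) are placeholders and assumptions, not taken from the Hagfish documentation. The script only prints the command so that it can be inspected before running it.

```shell
#!/bin/sh
# Placeholder names (assumptions for illustration only)
INDEX="genome_index"      # basename of an index built with bowtie-build
READS_1="reads_1.fastq"   # first mates
READS_2="reads_2.fastq"   # second mates
OUT="aligned.sam"         # output file; -S requests SAM format

MIN_INSERT=1              # minimal insert size (-I)
MAX_INSERT=100000         # maximal insert size (-X)

# Assemble the command: the parameters above plus index, paired input and output
CMD="bowtie -k 5 --best --strata -I $MIN_INSERT -X $MAX_INSERT -S $INDEX -1 $READS_1 -2 $READS_2 $OUT"

# Dry run: show the command instead of executing it
echo "$CMD"
```

Once the printed command looks right, it can be pasted into a shell to run the mapping. Newer Bowtie 1 releases may expect the index to be given with -x, so check bowtie --help first.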