Mark Fiers edited this page Apr 12, 2012 · 4 revisions

Read Mappers

The choice of read mapper is of paramount importance in a Hagfish analysis. The read mapper must be precise, sensitive, and able to map read pairs.

Hagfish relies on read pairs that align at anomalous distances, so it is important that the read mapper does not filter these pairs out. The mapper must therefore allow the minimum and maximum distance at which a pair may align to be set manually. A tool like BWA estimates the insert size and does not (seem to) allow a manual override. Bowtie is commonly used with success (set the insert size bounds with -I/-X). The exact definition of the allowed insert size affects which anomalies can be identified. Typical values are a minimal insert size of 1 and a maximal insert size of at least a few times the expected insert size, but high values (10e5 and up) work as well. In the case of Bowtie, the --best parameter is also advisable.
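As an illustration of the rule of thumb above, the sketch below derives the -I/-X bounds from an expected insert size. The values 500 and 4 are assumptions chosen for the example, not recommendations from Hagfish; only the -I and -X option names come from the text.

```shell
# Hypothetical library parameters: adjust to your own data.
expected_insert=500   # expected insert size in bp (assumed value)
factor=4              # "a few times" the expected size (assumed value)

min_insert=1
max_insert=$((expected_insert * factor))

# Bowtie options bounding the distance at which a pair may align.
insert_opts="-I $min_insert -X $max_insert"
echo "$insert_opts"
```

A wider window keeps more anomalous pairs available to Hagfish, which is why the upper bound is set well above the expected insert size.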

All parameters used while mapping affect the output of Hagfish. Many mappers let you specify how pairs that align at multiple locations are handled. The choices are to ignore such pairs (1), to return one candidate at random (2), or to return a fixed number of candidates or all of them (3). The choice depends on the project in which Hagfish is used. The first option is the safest, but gives insight only into those regions that are unique in the genome. The second option yields more information on repetitive regions, whereas the third option highlights repetitive regions by showing coverage peaks.
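The three strategies could map onto Bowtie 1 reporting options roughly as sketched below. The mapping is an assumption based on a reading of the Bowtie 1 manual (-m suppresses reads with too many alignments, -k caps the number reported, -a reports all), and the strategy names are made up for this example. Note that -k 1 returns a single alignment chosen by Bowtie, which only approximates "one candidate at random".

```shell
# Hypothetical strategy names; pick one of: unique, single, multi, all.
strategy="unique"

case "$strategy" in
  unique) report_opts="-m 1" ;;   # (1) drop pairs with more than one alignment
  single) report_opts="-k 1" ;;   # (2) report a single alignment per pair
  multi)  report_opts="-k 5" ;;   # (3) report up to five alignments per pair
  all)    report_opts="-a" ;;     # (3) report every valid alignment
  *)      echo "unknown strategy: $strategy" >&2; report_opts="" ;;
esac

echo "$report_opts"
```

Check the options against the manual of the Bowtie version in use before relying on them.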

A common combination of Bowtie parameters used for Hagfish:

bowtie -k 5 --best --strata -I 1 -X 100000
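The options above do not name an index or any read files. A complete paired-end invocation might look like the sketch below, which only assembles and prints the command. The index name, read file names, and output file are placeholders, and the -S (SAM output), -1, and -2 options are assumed from the Bowtie 1 command-line syntax rather than taken from this page.

```shell
# Placeholder names: replace with your own index and read files.
index="genome_index"
reads_1="reads_1.fastq"
reads_2="reads_2.fastq"
out="mapped.sam"

# Assemble the command; it is printed here, not executed.
cmd="bowtie -k 5 --best --strata -I 1 -X 100000 -S $index -1 $reads_1 -2 $reads_2 $out"
echo "$cmd"
```

Printing the command first makes it easy to inspect the parameters before starting a long mapping run.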