- q is an upper bound of the total quantity of MDS (lenght/2)
-
=
is string equivalence -
MIC[i,j]
(MAC[i,j]
) is the substring starting ati
and finishing atj
(i
,j
being positions) of the MIC(MAC). Can be trivially defined using string concatenation and MIC(i,c) (MAC(i,c)). -
reverse_complement(String)
is the Watson-Crick reverse complement ofString
- Size of the Oxytricha Input genome: MIC is fragmented into ~750 000 MDSs, MAC into 300 000.
- Variables marked with
$*$ are populated during the preprocessing phase.
$*Eq(i,j,h,l) = \begin{cases} 0 \ 1, & \mbox{if } \code{MIC[i:j]} = \code{MAC[h:l]} \end{cases}$
$*cwc(i,j,h,l) = \begin{cases} 0 \ 1, & \mbox{if \code{MIC[i:j]} is the reverse complement of \code{MAC[h:l]}} \end{cases}$
$*Possible_{MDSMAC}(i,a,b) = \begin{cases} 0 \ 1, & \mbox{if MDS } i\mbox{ can start at } a \mbox{ and finish at } b \mbox{ in the MAC} \end{cases}$
$*Possible_{MDSMIC}(i,a,b) = \begin{cases} 0 \ 1, & \mbox{if MDS } i\mbox{ can start at } a \mbox{ and finish at } b \mbox{ in the MIC} \end{cases}$
$MDS_{MICstart}(i,j) = \begin{cases} 0 \ 1, & \mbox{if MDS } i\mbox{ starts at position } j \mbox{ in the MIC} \end{cases}$
$MDS_{MICend}(i,j) = \begin{cases} 0 \ 1, & \mbox{if MDS } i\mbox{ ends at position } j \mbox{ in the MIC} \end{cases}$
$MDS_{MACstart}(i,j) = \begin{cases} 0 \ 1, & \mbox{if MDS } i\mbox{ starts at position } j \mbox{ in the MAC} \end{cases}$
$MDS_{MACend}(i,j) = \begin{cases} 0 \ 1, & \mbox{if MDS } i\mbox{ ends at position } j \mbox{ in the MAC} \end{cases}$
$Inv(i) = \begin{cases} 0 \ 1, & \mbox{if MDS } i\mbox{ is inverted in the MAC } \end{cases}$
$P_{start}(i,j) = \begin{cases} 0 \ 1, & \mbox{if } MDS_{MACstart}(i,j) = 1 \mbox{, Pointer } i \mbox{ starts at position } j \mbox{ in the MAC} \end{cases}$
$P_{end}(i,j) = \begin{cases} 0 \ 1, & \mbox{if } MDS_{MACend}(i-1,j) = 1 \mbox{, Pointer } i \mbox{ ends at position } j \mbox{ in the MAC} \end{cases}$
$*MAC(i,c) = \begin{cases} 0 \ 1, & \mbox{if } c\mbox{ is the character at position } i \mbox{ in the MAC} \end{cases}$
$*MIC(i,c) = \begin{cases} 0 \ 1, & \mbox{if } c\mbox{ is the character at position } i \mbox{ in the MIC} \end{cases}$
$Cov_{MIC}(i,j) = \begin{cases} 0 \ 1, & \mbox{if MDS } i\mbox{ covers the position } j \mbox{ in the MIC} \end{cases}$
$Cov_{MAC}(i,j) = \begin{cases} 0 \ 1, & \mbox{if MDS } i\mbox{ covers the position } j \mbox{ in the MAC} \end{cases}$
Internally Eliminated Sequences
$IES(j) = \begin{cases} 0 \ 1, & \mbox{if } i \mbox{ is part of an IES:} \mathlarger{\sum_{0 \leq i \leq q} Cov_{MIC}(i,j) = 0} \end{cases}$
MDSs must correspond to identical or reverse and complemented substrings of MIC and MAC. The following constraints enforce this fact:
Coverage