Make a standing XSLT script that converts tokenized corpus documents to non-tokenized ones, copying only their contents.
This can then be the basis for a third main variety of every corpus document, in which we will store and edit IGT (interlinear glossed text) glosses.
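A minimal sketch of what such a standing stylesheet could look like, not a finished implementation. It assumes that the tokenized documents wrap each token in child elements of `<seg type="S">` (for example `<w>`), that those tokens are separated by whitespace, and that matching on local names is acceptable so the sketch works with or without the TEI namespace. None of these assumptions come from the corpus itself.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: by default, copy every node and attribute unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- ASSUMPTION: sentence-level segments are <seg type="S"> and hold
       whitespace-separated token elements. Keep the <seg> and its
       attributes, but replace the token markup with its plain text. -->
  <xsl:template match="*[local-name()='seg'][@type='S']">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <!-- The string value of <seg> is the concatenation of all descendant
           text nodes; normalize-space() collapses whitespace runs and
           trims both ends. -->
      <xsl:value-of select="normalize-space(.)"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

With an XSLT 1.0 processor such as `xsltproc`, this could be run as `xsltproc detokenize.xsl tokenized.xml > plain.xml` (both file names are hypothetical). If punctuation tokens are also whitespace-separated in the source, the output would contain a space before each punctuation mark and would need an extra rule.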
The following can be the other outputs:
1) Basic structure with the original Mixtec (non-tokenized) and English and Spanish sentence translations
<seg xml:id="d1e140a" n="3" xml:lang="mix" resp="#TS" type="S">Nikitsi Shanty ka tsi mee ncha aueroperto S.F </seg>
<spanGrp type="annotations">
<span ana="#S" target="#d1e140a" xml:lang="en" type="translation">Shanty came with me to the S.F airport.</span>
<span ana="#S" target="#d1e140a" xml:lang="es" type="translation">Shanty vino conmigo al aeropuerto de S.F.</span>
</spanGrp>
This can then be copied (in a slightly modified XSLT) and (mostly) manually edited to become:
2) IGT-centered data structure
<seg xml:id="d1e140igt" n="3" xml:lang="mix" resp="#TS" type="IGT">Ni-kits-i Shanty=ka tsi mee ncha aueroperto S.F.</seg>
<spanGrp type="annotations">
<span ana="#S" target="#d1e140igt" xml:lang="en" type="IGT">PFV-come-3s Shanty=TPC with PRON-EMPH.1s ADPOS.until S.F airport</span> <!-- DECIDE ON TYPOLOGY -->
<span ana="#S" target="#d1e140igt" xml:lang="en" type="translation">Shanty came with me to the S.F airport.</span>
<span ana="#S" target="#d1e140igt" xml:lang="es" type="translation">Shanty vino conmigo al aeropuerto de S.F </span>
</spanGrp>
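The "slightly modified XSLT" step could be sketched as follows. This is only an illustration under loud assumptions: the ID rule (drop the last character of the `xml:id` and append `igt`, as in `d1e140a` to `d1e140igt`) is inferred from the single example above, every `span/@target` is assumed to point at such a segment, and the placeholder gloss span is hypothetical. The morpheme segmentation and the glosses themselves still have to be entered by hand.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copy everything unchanged unless overridden below. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Relabel sentence segments as IGT segments. -->
  <xsl:template match="*[local-name()='seg']/@type[.='S']">
    <xsl:attribute name="type">IGT</xsl:attribute>
  </xsl:template>

  <!-- ASSUMPTION: the new id replaces the last character with 'igt'
       (d1e140a becomes d1e140igt), inferred from one example only. -->
  <xsl:template match="*[local-name()='seg'][@type='S']/@xml:id">
    <xsl:attribute name="xml:id">
      <xsl:value-of select="concat(substring(., 1, string-length(.) - 1), 'igt')"/>
    </xsl:attribute>
  </xsl:template>

  <!-- Apply the same rule to span targets so they keep pointing at the seg. -->
  <xsl:template match="*[local-name()='span']/@target">
    <xsl:attribute name="target">
      <xsl:value-of select="concat(substring(., 1, string-length(.) - 1), 'igt')"/>
    </xsl:attribute>
  </xsl:template>

  <!-- Put an empty gloss span (hypothetical placeholder) before the
       translations, reusing @ana and the rewritten @target of the
       first existing span. -->
  <xsl:template match="*[local-name()='spanGrp'][@type='annotations']">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:for-each select="*[local-name()='span'][1]">
        <xsl:copy>
          <xsl:apply-templates select="@ana|@target"/>
          <xsl:attribute name="xml:lang">en</xsl:attribute>
          <xsl:attribute name="type">IGT</xsl:attribute>
          <xsl:comment> gloss to be entered manually </xsl:comment>
        </xsl:copy>
      </xsl:for-each>
      <xsl:apply-templates select="node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

The values `IGT` used for `@type` here simply mirror the example above and would change once the typology is decided.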
Important note: I will need to create a proper typology for two things: the values of //seg, to express that it is both the sentence (#S) and segmented as an interlinear glossed text (#IGT), and the value of the //span that contains the interlinear glosses corresponding to that //seg.