Show morphological breakdown structure (inflectional morpheme boundaries first) #397
Comments
@kobexamoh Note: I updated the mock-up visualizations above. These two are different types of information; keeping them separate clarifies things, as does moving the (Verb/Noun - ...) information next to the dictionary entry rather than the word form.
As discussed in our meeting this last week of July, we might want to have multiple forms of information available for each paradigm layout cell. For instance, the following:
@nienna73 Yes, it looks like what we were expecting visually. I think we thought that the middle-dot would be a good way to indicate the morpheme boundaries. I think we might want to show the middle-dot also in conjunction with hyphens (to the right of the hyphen, where the FST outputs the prefix boundary marker |
Great progress! Looks good! The reason for the latter case, i.e. wâpamêw, is that the word-form comes out of the lexical database, which is statically defined and doesn't contain morpheme boundaries. We'd have to add those as a separate computational step -- generally not too difficult, but I would not be surprised by edge cases that cause some extra headaches.
EDIT (29.2.2022): removed exception for morpheme boundaries in conjunction with hyphens
Subtasks (added July 20):
< and > marks for inflectional boundaries, and / for derivational boundaries): nôhkom+N+A+D+Px1Pl+Pl --> ni4<ohkom>i2nân>ak --> n<ôhkom>inân>ak
middle-dot). For now, this could even be implemented as not showing anything, until we decide how best to represent morpheme boundaries.

Since our FST already outputs morpheme boundaries (primarily inflectional ones), there are many circumstances in which it would be advantageous to show them, both in the standardized version of the search string and in the generated inflectional paradigms:
One way to achieve this would be to represent the inflectional morpheme boundaries, which the FST outputs as < and >, with a middle-dot (·), something like the following:

Ideally, that middle dot (or any other character) would not be copyable, so that when one selects and copies any wordform, one gets only the actual characters.
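A minimal sketch of the middle-dot idea, assuming the segmented form arrives as a plain string such as n<ôhkom>inân>ak and that < and > are the only inflectional markers in it. All function names and CSS class names here are hypothetical, not part of the existing code base. The HTML variant keeps the dot out of the markup entirely and leaves it to CSS generated content, which browsers generally do not include in copied text.

```python
import re
from html import escape

# Assumption: "<" and ">" are the only inflectional boundary markers present.
INFLECTIONAL_BOUNDARY = re.compile(r"[<>]")


def split_morphemes(segmented: str) -> list[str]:
    """Split a segmented form such as 'n<ôhkom>inân>ak' at inflectional boundaries."""
    return [part for part in INFLECTIONAL_BOUNDARY.split(segmented) if part]


def with_middle_dots(segmented: str) -> str:
    """Plain-text display form: morphemes joined by a middle dot (U+00B7)."""
    return "\u00b7".join(split_morphemes(segmented))


def plain_wordform(segmented: str) -> str:
    """The word form with all boundary markers dropped."""
    return "".join(split_morphemes(segmented))


def to_html(segmented: str) -> str:
    """Wrap each morpheme in a span; the separator is NOT part of the text.

    A (hypothetical) stylesheet rule would draw the dot instead:
        .morpheme + .morpheme::before { content: "\\00B7"; }
    """
    spans = [
        '<span class="morpheme">' + escape(part) + "</span>"
        for part in split_morphemes(segmented)
    ]
    return '<span class="wordform">' + "".join(spans) + "</span>"


if __name__ == "__main__":
    print(with_middle_dots("n<ôhkom>inân>ak"))
    print(plain_wordform("n<ôhkom>inân>ak"))
    print(to_html("n<ôhkom>inân>ak"))
```

Keeping the separator out of the text nodes means the copied string and the displayed string can differ without any clipboard handling in JavaScript; whether that holds in every target browser would need checking.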
Alternatives could be using different colors or shading to differentiate the morphemes, or some visual animation effects such as slight magnification when hovering over individual morphemes. On the other hand, having the morpheme boundaries immediately but non-intrusively available might be the simpler solution. Alternatively, the morpheme-boundary output could be an output setting that can be toggled, similar to the selection of orthography. Also, we might want to keep magnification or pop-ups until later, for giving the plain-language definition of each morpheme. Finally, one might provide such a breakdown explicitly when going after the full paradigms.
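The output-setting option could be sketched as follows, assuming the word form has already been split into a list of morphemes. The enum, its member names, and the CSS class names are hypothetical and chosen only for illustration; the shading mode simply alternates two classes so that a stylesheet can colour neighbouring morphemes differently.

```python
from enum import Enum
from html import escape


class BoundaryDisplay(Enum):
    """Hypothetical user setting, analogous to the orthography selection."""

    HIDDEN = "hidden"
    MIDDLE_DOT = "middle-dot"
    SHADING = "shading"


def render(morphemes: list[str], mode: BoundaryDisplay) -> str:
    """Render a list of morphemes as an HTML-safe string for the chosen mode."""
    if mode is BoundaryDisplay.HIDDEN:
        return escape("".join(morphemes))
    if mode is BoundaryDisplay.MIDDLE_DOT:
        return escape("\u00b7".join(morphemes))
    # SHADING: alternate two assumed CSS classes between neighbouring morphemes.
    pieces = []
    for index, morpheme in enumerate(morphemes):
        css_class = "morpheme-even" if index % 2 == 0 else "morpheme-odd"
        pieces.append('<span class="' + css_class + '">' + escape(morpheme) + "</span>")
    return "".join(pieces)


if __name__ == "__main__":
    parts = ["n", "ôhkom", "inân", "ak"]
    for mode in BoundaryDisplay:
        print(mode.value, "->", render(parts, mode))
```

Starting with HIDDEN as the default would match the suggestion of showing nothing until the representation is decided.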
First, we could implement this for inflectional morpheme boundaries, and later on for derivational boundaries as well.
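The staged rollout could look roughly like this, assuming < and > mark inflectional boundaries and / marks derivational ones, as in the subtasks above. The function name and its parameters are hypothetical, and the second example string in the usage section is an ASCII placeholder rather than a real word form.

```python
INFLECTIONAL_MARKERS = "<>"
DERIVATIONAL_MARKERS = "/"


def mark_boundaries(
    segmented: str,
    show_derivational: bool = False,
    separator: str = "\u00b7",
) -> str:
    """Replace boundary markers with a visible separator.

    Inflectional boundaries are always shown. Derivational boundaries are
    dropped silently unless show_derivational is True, so the second stage
    can be switched on later without changing callers.
    """
    output = []
    for char in segmented:
        if char in INFLECTIONAL_MARKERS:
            output.append(separator)
        elif char in DERIVATIONAL_MARKERS:
            if show_derivational:
                output.append(separator)
        else:
            output.append(char)
    return "".join(output)


if __name__ == "__main__":
    print(mark_boundaries("n<ôhkom>inân>ak"))
    # Placeholder string, not a real word form:
    print(mark_boundaries("x<ab/cd>y"))
    print(mark_boundaries("x<ab/cd>y", show_derivational=True))
```

Adjacent markers would produce doubled separators in this sketch; whether the FST can emit such sequences is not something the sketch tries to settle.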