
Applying a rules-based approach for modifying words with affixes to expand the Chamorro language lexicon. (In Progress)


schyuler/Chamorro-Lexicon-Expander


Chamorro-Lexicon-Expander

Chamorro Lexicon Expander is a Python project that expands a Chamorro-English dictionary by generating all possible affixed variations of Chamorro root words. It automates the creation of word forms by applying common Chamorro prefixes, suffixes, and infixes according to linguistic rules. The goal is to support a more comprehensive representation of Chamorro vocabulary for language learners and dictionary development, and to provide a labelled dataset for use in other machine learning projects. (In Progress)
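As a rough illustration of what rules-based affixation can look like, here is a minimal sketch. It is not the project's actual code: the function names, the vowel set, and the rule that an infix goes immediately before the first vowel of the root are simplifying assumptions made for this example, and the sketch ignores any sound changes an affix may trigger.

```python
# Hypothetical helpers for illustration only; not taken from the repository.
# Assumed vowel inventory used to locate the infix position.
VOWELS = set("aeiouåAEIOUÅ")


def add_prefix(root, prefix):
    """Attach a prefix to the front of the root."""
    return prefix + root


def add_suffix(root, suffix):
    """Attach a suffix to the end of the root."""
    return root + suffix


def add_infix(root, infix):
    """Insert an infix immediately before the first vowel of the root.

    Simplifying assumption: no vowel changes or other adjustments are made.
    """
    for i, ch in enumerate(root):
        if ch in VOWELS:
            return root[:i] + infix + root[i:]
    # No vowel found: fall back to treating the infix as a prefix.
    return infix + root


def expand(root, prefixes=(), suffixes=(), infixes=()):
    """Return every single-affix form of the root, in a fixed order."""
    forms = [add_prefix(root, p) for p in prefixes]
    forms += [add_infix(root, i) for i in infixes]
    forms += [add_suffix(root, s) for s in suffixes]
    return forms


forms = expand("kanta", prefixes=("ma",), infixes=("um",), suffixes=("i",))
print(forms)
```

Keeping each affix type in its own small function makes it easy to swap in stricter rules later without touching the code that loops over the dictionary.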

Important Note: This project tests applying affixes to words algorithmically, according to linguistic rules, as a way to speed up the creation of word lists with known affixes. It is also meant to produce a training set for future machine learning projects, such as a model that predicts the root word (lemma) of a given Chamorro word. Although the generated words may follow linguistic rules accurately, they may or may not reflect actual, natural speech patterns in Chamorro. Always verify word usage against a reliable corpus and/or with native speakers.

Contributors

Schyuler Lujan

Reasoning for this project

Benefits of this project

Features

  • Provides a dataset of Chamorro words, definitions, and part-of-speech tags
  • Transforms words with a rules-based approach that follows linguistic rules
  • Exports the output to a CSV file containing the new word, the original word, and the original word's definition
  • Generates an expanded dataset of affixed words labelled with their root words and root word definitions
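The CSV export described in the features above could be sketched as follows. This is only an illustration under stated assumptions: the column names, the helper functions `build_rows` and `write_rows`, and the sample entry are hypothetical and do not come from the repository. The prefix is attached by plain concatenation, so the output is a mechanical form that would still need checking against real usage.

```python
import csv
import io

# Assumed column names, mirroring the three fields listed in the features.
FIELDNAMES = ["new_word", "original_word", "original_definition"]


def build_rows(entries, prefixes):
    """Pair each generated form with its root word and the root's definition."""
    rows = []
    for root, definition in entries:
        for prefix in prefixes:
            rows.append(
                {
                    "new_word": prefix + root,  # plain concatenation, no sound changes
                    "original_word": root,
                    "original_definition": definition,
                }
            )
    return rows


def write_rows(rows, out):
    """Write the labelled rows as CSV to any text file-like object."""
    writer = csv.DictWriter(out, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


# Illustrative sample entry; a real run would read the full dictionary instead.
entries = [("kanta", "to sing")]
buffer = io.StringIO()
write_rows(build_rows(entries, ["man"]), buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write to disk instead of an in-memory buffer, pass a file opened with `open(path, "w", newline="", encoding="utf-8")` as `out`. Because each row keeps its root word, the same file can serve as labelled data for a lemma-prediction model.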

License and Copyright

Schyuler Lujan
