This repository has been archived by the owner on Jan 14, 2021. It is now read-only.
PR parameters
Carmen-digitalPebble edited this page Jul 12, 2012
Init params (TrainingCorpusCreator)
- directory : directory where the vector and lexicon files will be generated
- reinitCorpus : delete existing files in the directory when reinitialising the PR
Runtime params
- inputAnnotationSet : annotation set where the label and attribute annotations will be taken from
- labelAnnotationType : annotation type to use as a training unit, e.g. sentence or paragraph
- labelAnnotationValue : feature of the label annotation that holds the label to use for training, e.g. language
- attributeAnnotationType : annotation type to use for generating attributes e.g. Token
- attributeAnnotationValue : feature to use for generating attributes e.g. string
Note: the TrainingCorpusCreator takes a single value for each of the parameters above. Some preprocessing might be needed to combine different annotation types or features (e.g. Token.string + Token.pos) into a single annotation. Also, the TrainingCorpusCreator does not generate the model directly: this must be done separately using a number of manual commands; see https://github.com/DigitalPebble/TextClassification/wiki/HOWTO for reference.
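As an illustration of the preprocessing mentioned in the note, here is a minimal sketch of merging two feature values into one. It is not part of this project or of GATE: the class `CombineTokenFeatures`, the method `combine`, the separator and the feature name `string_pos` are all hypothetical, and a plain `Map` stands in for the feature map of a Token annotation.

```java
import java.util.HashMap;
import java.util.Map;

public class CombineTokenFeatures {

    // Joins the values of the named features into a single string,
    // e.g. string="dog" and pos="NN" become "dog_NN".
    // A missing feature contributes an empty string so positions stay aligned.
    public static String combine(Map<String, String> features,
                                 String separator, String... names) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < names.length; i++) {
            if (i > 0) {
                sb.append(separator);
            }
            String value = features.get(names[i]);
            sb.append(value == null ? "" : value);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in for the features of one Token annotation (assumption)
        Map<String, String> token = new HashMap<>();
        token.put("string", "dog");
        token.put("pos", "NN");

        // Store the combined value under a new, hypothetical feature name
        token.put("string_pos", combine(token, "_", "string", "pos"));
        System.out.println(token.get("string_pos"));
    }
}
```

With something along these lines applied before the TrainingCorpusCreator, attributeAnnotationValue could then point at the single combined feature (here `string_pos`) instead of `string`.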
Init params (classification PR)
- modelDir : directory containing the model and lexicon files
Runtime params
- inputAnnotationSet : annotation set where the label and attribute annotations will be taken from
- labelAnnotationType : annotation type to use as a unit for classification, e.g. sentence or paragraph
- labelAnnotationValue : feature in which the label produced by the classification is stored, e.g. language; this will override any preexisting feature with the same name
- attributeAnnotationType : annotation type to use for generating attributes e.g. Token
- attributeAnnotationValue : feature to use for generating attributes e.g. string