PR parameters

TrainingCorpusCreator PR

Init params

directory : directory where the vector and lexicon files will be generated
reinitCorpus : delete existing files in the directory when reinitialising the PR

Runtime params

inputAnnotationSet : annotation set where the label and attribute annotations will be taken from
labelAnnotationType : annotation type to use as a training unit, e.g. sentence or paragraph
labelAnnotationValue : label to use for the training, e.g. language
attributeAnnotationType : annotation type to use for generating attributes e.g. Token
attributeAnnotationValue : feature to use for generating attributes e.g. string

Note : the TrainingCorpusCreator takes a single value for each of the parameters above. Some preprocessing might be needed to combine different attributes (e.g. Token.string + Token.pos) into a single annotation. Also, the TrainingCorpusCreator does not generate the model itself. The model must be built separately with a number of manual commands; see https://github.com/DigitalPebble/TextClassification/wiki/HOWTO for reference.
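The preprocessing mentioned in the note could look roughly like the sketch below. It is only an illustration: a plain java.util.Map stands in for the feature map of a Token annotation, and the FeatureCombiner class, its combine method and the string_pos feature name are all hypothetical and not part of the plugin. The idea is to write the merged value to one new feature, which can then be given as attributeAnnotationValue.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: merges several features of one annotation into a single value. */
final class FeatureCombiner {

    /**
     * Joins the values of the given feature names with a separator.
     * Features that are absent are skipped rather than written as "null".
     */
    static String combine(Map<String, Object> features, String separator, String... names) {
        List<String> parts = new ArrayList<>();
        for (String name : names) {
            Object value = features.get(name);
            if (value != null) {
                parts.add(value.toString());
            }
        }
        return String.join(separator, parts);
    }

    public static void main(String[] args) {
        // Stand-in for the feature map of a single Token annotation.
        Map<String, Object> token = new LinkedHashMap<>();
        token.put("string", "dog");
        token.put("pos", "NN");

        // Store the merged value under a new, hypothetical feature name.
        token.put("string_pos", combine(token, "_", "string", "pos"));
        System.out.println(token.get("string_pos")); // prints "dog_NN"
    }
}
```

In a real pipeline the same loop would run over the Token annotations of the input annotation set before the TrainingCorpusCreator, and again before the Classifier PR, so that both see the same combined feature.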

Classifier PR

Init params

modelDir : directory containing the model and lexicon files

Runtime params

inputAnnotationSet : annotation set where the label and attribute annotations will be taken from
labelAnnotationType : annotation type to use as a unit for classification, e.g. sentence or paragraph
labelAnnotationValue : label to use for the classification, e.g. language. This will override any preexisting feature with the same name.
attributeAnnotationType : annotation type to use for generating attributes e.g. Token
attributeAnnotationValue : feature to use for generating attributes e.g. string
