0.0.17
(2020/07/26)
- Added a way to tag texts where words are already tokenized: new lines are word separators,
double new lines are sentence separators.
- Reworked the way preprocessing of special chars is done prior to sentence tokenization and after it.
Created the class Excluder (pie_extended.pipeline.tokenizers.utils.excluder), which allows for more code sharing across models.
- Fixed a typo that prevented tagging with FREEM (and nobody saw that! ;) )
- (Unseen in CHANGES.md) New excluder: [REF:1.a.b] will be ignored with the LASLA model.
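
The pre-tokenized input layout described in this release (one word per line, a blank line between sentences) can be illustrated with a minimal sketch. This is not pie_extended's own reader: the function name `parse_pretokenized` and the sample text are hypothetical, and the code only shows how such a file could be split into sentences of tokens.

```python
from typing import List


def parse_pretokenized(text: str) -> List[List[str]]:
    """Split pre-tokenized text into sentences, each a list of tokens.

    Hypothetical helper for illustration only, not part of pie_extended.
    Assumes one token per line and a blank line between sentences.
    """
    sentences: List[List[str]] = []
    # Normalize Windows line endings so "\n\n" reliably marks a sentence break.
    normalized = text.replace("\r\n", "\n")
    for block in normalized.split("\n\n"):
        # Each non-empty line inside a block is one token.
        tokens = [line.strip() for line in block.split("\n") if line.strip()]
        if tokens:
            sentences.append(tokens)
    return sentences


sample = "Gallia\nest\nomnis\ndivisa\n\nQuarum\nunam\nincolunt\nBelgae\n"
for sentence in parse_pretokenized(sample):
    print(" ".join(sentence))
```

Empty blocks and trailing newlines are skipped, so extra blank lines between sentences do not produce empty sentences.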