0.0.17

@PonteIneptique PonteIneptique released this 27 Jul 07:34
· 71 commits to master since this release

(2020/07/26)

  • Added a way to tag texts where words are already tokenized: new lines are word separators,
    double new lines are sentence separators
  • Reworked how special characters are preprocessed before and after sentence tokenization.
    Created the class Excluder (pie_extended.pipeline.tokenizers.utils.excluder).
    • Allows for more code sharing across models.
  • Fixed a typo that prevented tagging with FREEM (and nobody saw that ! ;) )
  • (Unseen in CHANGES.md) New excluder: [REF:1.a.b] will be ignored with the LASLA model.
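
To make the pre-tokenized input format from the first item concrete, here is a minimal sketch of how such a file could be read: one word per line, with a blank line between sentences. The function name `read_pretokenized` is hypothetical and is not part of pie_extended; it only illustrates the layout of the input, not how the library parses it internally.

```python
from typing import List


def read_pretokenized(text: str) -> List[List[str]]:
    """Split pre-tokenized text into sentences, each a list of words.

    Hypothetical helper: a single new line separates words,
    a double new line separates sentences.
    """
    sentences = []
    # Normalize Windows line endings so "\r\n\r\n" also counts as a sentence break.
    normalized = text.replace("\r\n", "\n")
    for block in normalized.split("\n\n"):
        # Each non-empty line in a block is one already-tokenized word.
        words = [line.strip() for line in block.split("\n") if line.strip()]
        if words:
            sentences.append(words)
    return sentences


sample = "Lorem\nipsum\ndolor\n\nsit\namet\n"
print(read_pretokenized(sample))
# [['Lorem', 'ipsum', 'dolor'], ['sit', 'amet']]
```

Empty blocks are skipped, so trailing new lines or extra blank lines do not produce empty sentences.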