Releases: jftuga/deidentification

v1.3.0

10 Jan 21:06
72a5a7d

add --exclude option

Add ability to exclude entities from de-identification with -x, --exclude.

  • Multiple entities are separated by a comma, which is the default delimiter.
  • The delimiter can be overridden by setting the DEIDENTIFY_EXCLUDE_DELIM environment variable.

The Python API can also use this option by setting DeidentificationConfig.excluded_entities to a Python set.
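
As a rough illustration of the delimiter handling described above, the following sketch splits an exclusion string into a lower-cased set. The function name parse_excluded and its env parameter are hypothetical and are not part of the package; only the DEIDENTIFY_EXCLUDE_DELIM variable name and the comma default come from these notes.

```python
import os


def parse_excluded(raw: str, env=None) -> set[str]:
    """Split an --exclude style string into a set of lower-cased entities.

    Hypothetical helper for illustration only; not the package's own code.
    """
    env = os.environ if env is None else env
    # Fall back to a comma when the variable is unset or empty.
    delim = env.get("DEIDENTIFY_EXCLUDE_DELIM") or ","
    # Strip whitespace, drop empty items, and lower-case each entity.
    return {item.strip().lower() for item in raw.split(delim) if item.strip()}
```

Overriding the delimiter is useful when an entity itself contains a comma; with the delimiter set to "|", the string "Smith, John|Acme" would yield two entities rather than three.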


Improve Python API

  • reset all internal variables at the beginning of the deidentify method
  • lower-case all config.excluded_entities
  • added API testing with api_test.py

v1.2.1

04 Jan 22:55
1a3fd96

prepare for PyPI deployment

  • create and/or update files for PyPI
  • Created Makefile and get_project_name.py to deploy to test and prod PyPI servers
  • updated install instructions in README.md
  • set minimum Python version to 3.10

allow for multiple languages

  • allow for multiple languages in the future by making GENDER_PRONOUNS a dict which uses the DeidentificationLanguages Enum-style class as keys
  • moved helper classes to deidentification_constants.py to avoid a circular dependency
  • DeidentificationLanguages now maps the default DeidentificationConfig.replacement word to a language-specific noun, such as PERSON
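
The language-keyed layout described above might look roughly like the sketch below. The names Languages, PRONOUNS, and pronoun_replacement, as well as the pronoun entries themselves, are invented for illustration; only the idea of an Enum-style key and the PERSON replacement noun are taken from these notes.

```python
from enum import Enum


class Languages(Enum):
    # Assumption: each member's value doubles as the default replacement noun.
    ENGLISH = "PERSON"


# Hypothetical pronoun table keyed by language; the entries are illustrative.
PRONOUNS = {
    Languages.ENGLISH: {
        "he": "HE/SHE",
        "she": "HE/SHE",
        "him": "HIM/HER",
        "his": "HIS/HER",
    },
}


def pronoun_replacement(word: str, language: Languages = Languages.ENGLISH):
    """Return the neutral replacement for a pronoun, or None if not listed."""
    return PRONOUNS[language].get(word.lower())
```

Under this layout, supporting another language would mean adding one Enum member and one dictionary entry, without touching the lookup logic.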

v1.2.0

03 Jan 22:13
26c0fa7
Compare
Choose a tag to compare

Model Download

  • When a spaCy model has not been downloaded, advise the user on how to manually download it.
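
One way to produce this kind of advice is sketched below. spaCy's spacy.load raises OSError when a model package cannot be found, and models can be installed with "python -m spacy download". The helper name load_or_advise, the injected loader argument, and the message wording are assumptions made for this example, not the package's actual implementation.

```python
def load_or_advise(model_name: str, loader):
    """Load a model via loader, or exit with manual download instructions.

    loader is expected to behave like spacy.load (hypothetical wiring).
    """
    try:
        return loader(model_name)
    except OSError:
        # The model is not installed: tell the user how to fetch it.
        raise SystemExit(
            f"spaCy model '{model_name}' is not installed. "
            f"Download it with: python -m spacy download {model_name}"
        )
```

In real use the call would be something like load_or_advise("en_core_web_sm", spacy.load), where the model name is only an example.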

v1.1.2

03 Jan 03:19
afb0cd1

Small Bug Fixes

  • get_identified_elements() will now always return pronouns
    • Previously, if multiple passes were needed in deidentify(), get_identified_elements() did not return any pronouns.
  • use self.text instead of self.replaced_text in get_identified_elements()
  • Include small refinements to README.md

v1.1.0

02 Jan 13:41
b2d8f1e

CLI Improvements

  • added third-party VeryPrettyTable module as a dependency
  • documented the CLI program, deidentify, in README.md
  • added -t to the CLI to save detected entities to a JSON file
  • added -d for debug mode to the CLI
  • use the third-party chardet module to detect file character encodings for input files
  • updated Deidentification class to accommodate these CLI options
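
For the encoding detection mentioned in the list above, a minimal sketch using chardet.detect could look like this. The function name decode_guessing, the UTF-8 fallback, and the guarded import are assumptions for illustration; the CLI's actual handling may differ.

```python
try:
    import chardet  # third-party: pip install chardet
except ImportError:
    # Assumption for this sketch: fall back to UTF-8 when chardet is absent.
    chardet = None


def decode_guessing(raw: bytes, fallback: str = "utf-8") -> str:
    """Decode bytes using chardet's best guess, else the fallback encoding."""
    # chardet.detect returns a dict whose "encoding" entry may be None.
    guess = chardet.detect(raw)["encoding"] if chardet else None
    # Undecodable bytes are replaced rather than raising an error.
    return raw.decode(guess or fallback, errors="replace")
```

Reading the input file in binary mode and passing its bytes to a helper like this avoids guessing the encoding at open time.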

v1.0.0

02 Jan 01:40
1c10533
1.0.0