Releases: jftuga/deidentification
v1.3.0
add --exclude option
Add the ability to exclude entities from de-identification with `-x`, `--exclude`.
- This uses a comma as the delimiter to allow for multiple entities.
- The comma can be overridden by setting the `DEIDENTIFY_EXCLUDE_DELIM` environment variable.
The Python API can also use this option by setting `DeidentificationConfig.excluded_entities` to a Python `set`.
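As a rough illustration of the delimiter behavior described above, the sketch below splits an exclusion string on a comma unless `DEIDENTIFY_EXCLUDE_DELIM` is set. The helper name `parse_excluded_entities` is hypothetical and not part of the package; only the environment variable name comes from these notes, and the whitespace trimming is an assumption.

```python
import os


def parse_excluded_entities(raw: str) -> set[str]:
    """Split a delimited string of entities into a set (hypothetical helper)."""
    # The variable name comes from the release notes; the default of ","
    # mirrors the documented comma delimiter.
    delim = os.environ.get("DEIDENTIFY_EXCLUDE_DELIM") or ","
    # Assumption: trim surrounding whitespace and skip empty pieces ("a,,b").
    return {part.strip() for part in raw.split(delim) if part.strip()}
```

A set built this way is the kind of value that could be assigned to `DeidentificationConfig.excluded_entities`. Overriding the delimiter matters when an entity itself contains a comma, such as `Smith, John`.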
Improve Python API
- reset all internal variables at the beginning of the `deidentify` method
- lower-case all `config.excluded_entities`
- added API testing with `api_test.py`
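The first two items can be pictured with a toy model. Everything in this sketch (`ToyConfig`, `ToyDeidentifier`, the `names` argument standing in for entity detection) is hypothetical and does not reflect the package's internals. It only shows why per-call state is reset and how lower-casing the exclusions would make matching case-insensitive, which is an assumption about the intent.

```python
from dataclasses import dataclass, field


@dataclass
class ToyConfig:
    # Stand-in for DeidentificationConfig; only this one field is modeled.
    excluded_entities: set[str] = field(default_factory=set)


class ToyDeidentifier:
    """Hypothetical stand-in, not the package's Deidentification class."""

    def __init__(self, config: ToyConfig) -> None:
        self.config = config
        self.found: list[str] = []

    def deidentify(self, text: str, names: list[str]) -> str:
        # Reset per-call state so results from an earlier call do not leak.
        self.found = []
        # Lower-case the exclusions so the comparison ignores case.
        excluded = {e.lower() for e in self.config.excluded_entities}
        for name in names:
            if name.lower() in excluded or name not in text:
                continue
            self.found.append(name)
            text = text.replace(name, "PERSON")
        return text
```

Without the reset, reusing one instance for a second document would report entities from the first.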
v1.2.1
prepare for PyPI deployment
- create and/or update files for PyPI
- created `Makefile` and `get_project_name.py` to deploy to `test` and `prod` PyPI servers
- updated install instructions in `README.md`
- set minimum Python version to `3.10`
allow for multiple languages
- allow for multiple languages in the future by making `GENDER_PRONOUNS` a dict which uses the `DeidentificationLanguages` Enum-style class as keys
- moved helper classes to `deidentification_constants.py` to avoid a circular dependency
- `DeidentificationLanguages` now maps the default `DeidentificationConfig.replacement` word to a language-specific noun, such as `PERSON`
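The layout described above might look roughly like the following. This is a sketch under assumptions: it uses the standard `enum.Enum`, whereas the notes only say "Enum-style class". The `Languages` name, the `SPANISH` member, and the pronoun entries are illustrative and not taken from the package. Only `GENDER_PRONOUNS` and the `PERSON` noun appear in the notes.

```python
from enum import Enum


class Languages(Enum):
    # Hypothetical members; each value doubles as the replacement noun.
    ENGLISH = "PERSON"
    SPANISH = "PERSONA"


# Pronoun tables keyed by language (entries here are illustrative only).
GENDER_PRONOUNS: dict[Languages, dict[str, str]] = {
    Languages.ENGLISH: {"he": "HE/SHE", "she": "HE/SHE"},
    Languages.SPANISH: {"él": "ÉL/ELLA", "ella": "ÉL/ELLA"},
}


def replacement_word(language: Languages) -> str:
    """Return the language-specific noun used in place of a name."""
    return language.value


def pronoun_map(language: Languages) -> dict[str, str]:
    """Look up the pronoun table for one language."""
    return GENDER_PRONOUNS[language]
```

Keying the dict by enum member rather than by string means adding a language is a matter of adding one member and one table entry, and a misspelled language fails at attribute lookup instead of silently missing.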
v1.2.0
v1.1.2
Small Bug Fixes
- `get_identified_elements()` will now always return pronouns
  - If multiple passes were needed in `deidentify()`, then `get_identified_elements()` would not have returned any pronouns.
- use `self.text` instead of `self.replaced_text` in `get_identified_elements()`
- include small refinements to `README.md`
v1.1.0
CLI Improvements
- added third-party `VeryPrettyTable` module as a dependency
- documented the CLI program, `deidentify`, in `README.md`
- added `-t` to the CLI to save detected entities to a JSON file
- added `-d` to the CLI for debug mode
- use the third-party `chardet` module to detect file character encodings for input files
- updated `Deidentification` class to accommodate these CLI options
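The encoding detection mentioned above could be approached along these lines. This is a minimal sketch, not the CLI's actual code: the function name `read_text_detecting_encoding` is hypothetical, and both the UTF-8 fallback and the optional import are assumptions made so the example runs even where `chardet` is not installed.

```python
from pathlib import Path

try:
    import chardet  # third-party: pip install chardet
except ImportError:  # keep the sketch usable without the dependency
    chardet = None


def read_text_detecting_encoding(path: str) -> str:
    """Read a file, guessing its encoding with chardet when available."""
    raw = Path(path).read_bytes()
    encoding = None
    if chardet is not None:
        # chardet.detect returns a dict whose "encoding" may be None.
        encoding = chardet.detect(raw)["encoding"]
    # Assumed fallback: treat undetected input as UTF-8.
    return raw.decode(encoding or "utf-8")
```

Detection is a statistical guess, so short or mixed-encoding inputs can be misidentified. A real tool may want to surface the detected encoding in debug output.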