This library contains the implementation of PrivaTree and several other methods for training differentially private decision trees for binary classification. These models limit how much information about the training data can leak from the trained model, in exchange for a potential drop in utility. This privacy-utility trade-off is controlled by a parameter.
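To make the privacy-utility trade-off concrete, here is a minimal sketch of the Laplace mechanism applied to a single count. It is not taken from this library; it assumes the parameter is the usual privacy budget of differential privacy, called `epsilon` here, and the helper names `laplace_noise`, `private_count`, and `mean_abs_error` are made up for illustration.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(true_count, epsilon, rng):
    """Release a count with Laplace noise (illustrative helper, not library API).

    Adding or removing one record changes a count by at most 1, so the
    sensitivity is 1 and noise with scale 1 / epsilon is used.
    """
    return true_count + laplace_noise(1.0 / epsilon, rng)


def mean_abs_error(true_count, epsilon, trials=2000, seed=0):
    """Average absolute error of the noisy count over repeated releases."""
    rng = random.Random(seed)
    errors = [
        abs(private_count(true_count, epsilon, rng) - true_count)
        for _ in range(trials)
    ]
    return sum(errors) / trials


# Smaller epsilon means stronger privacy and a noisier (less useful) count.
for eps in (0.1, 1.0, 10.0):
    print(f"epsilon={eps}: mean absolute error ~ {mean_abs_error(100, eps):.2f}")
```

The expected absolute error of this mechanism equals the noise scale `1 / epsilon`, so lowering `epsilon` by a factor of ten makes the released count roughly ten times noisier.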
PrivaTree was developed and implemented by Daniël Vos with the help of Jelle Vos, Tianyu Li, Zekeriya Erkin, and Sicco Verwer. The accompanying paper can be found on arXiv.
To install the required dependencies, run (preferably in a virtual environment):
pip install -r requirements.txt
PrivaTree requires Python 3.7 or later.
Experiment code can be found in the base directory of the repository.
In the figure above, PrivaTree outperforms the other methods on the UCI Adult dataset, with results averaged over 50 iterations. 'Decision tree' refers to a non-private decision tree, and 'Diffprivlib' refers to IBM's differential privacy library. The other methods are also implemented in this repository.
The API mostly reflects that of scikit-learn, but note that this library is still under development.
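As a rough illustration of what a scikit-learn-style interface looks like, the sketch below defines a toy estimator with the conventional `fit` and `predict` methods. `NoisyMajorityClassifier` is a hypothetical stand-in written for this example, not a class from this library, and its `epsilon` and `random_state` arguments are assumptions; consult the source for the actual class names and parameters.

```python
import random


class NoisyMajorityClassifier:
    """Toy single-leaf 'tree' (hypothetical, not part of this library).

    It predicts whichever class has the larger noisy count. Each record
    contributes to exactly one of the two counts, so under add/remove-one
    neighbouring datasets Laplace noise of scale 1 / epsilon per count is used.
    """

    def __init__(self, epsilon=1.0, random_state=None):
        self.epsilon = epsilon
        self.random_state = random_state

    def fit(self, X, y):
        rng = random.Random(self.random_state)
        scale = 1.0 / self.epsilon
        noisy_counts = []
        for label in (0, 1):
            count = sum(1 for value in y if value == label)
            # Laplace(0, scale) as the difference of two exponential draws.
            noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
            noisy_counts.append(count + noise)
        # Trailing underscore marks a fitted attribute, as in scikit-learn.
        self.prediction_ = 0 if noisy_counts[0] >= noisy_counts[1] else 1
        return self

    def predict(self, X):
        return [self.prediction_ for _ in X]


# Binary labels: 90 samples of class 1 and 10 samples of class 0.
X_train = [[i] for i in range(100)]
y_train = [1] * 90 + [0] * 10

clf = NoisyMajorityClassifier(epsilon=10.0, random_state=0).fit(X_train, y_train)
predictions = clf.predict([[3], [42]])
print(predictions)
```

The pattern to take away is the construct, `fit`, `predict` sequence with hyperparameters passed to the constructor; the toy model itself ignores the features entirely.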
The main implementation can be found under privatree/privatree.py.