Retrieve (also) the CAP measure for each instance rather than just the overall score #531

Open
yleniarotalinti opened this issue Dec 12, 2023 · 2 comments
Labels
feature request Request for a new feature

Comments

@yleniarotalinti

Problem Description

I would like to know the privacy risk for each instance rather than just the overall CAP score.

Expected behavior

Implement a function that retrieves the privacy risk for each instance rather than just the overall CAP score, or expose that information as an attribute of the class that the user can store and use.

@yleniarotalinti added the "feature request" and "new" labels on Dec 12, 2023
Contributor

npatki commented Dec 12, 2023

Hi @yleniarotalinti, thanks for filing this feature request. We can keep this open and use it for tracking purposes whenever we make progress.

In your feature request, does "instance" refer to a row?

If so, you can achieve this by passing only one row of real data into the CategoricalCAP metric. You can then inspect and save a score for each row separately. (The average score of all rows is the overall score you are receiving with the full dataset.)

from sdmetrics.single_table import CategoricalZeroCAP

ROW_NUMBER = 0
# Double brackets keep the selection a one-row DataFrame rather than a Series
real_row = real_data.iloc[[ROW_NUMBER]]

row_score = CategoricalZeroCAP.compute(
    real_data=real_row,
    synthetic_data=synthetic_data,
    key_fields=<your list of key fields>,
    sensitive_fields=<your list of sensitive fields>
)

# TODO: loop through all possible row numbers
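
The TODO above could be filled in along these lines. This is only a minimal sketch: `per_row_scores` is a hypothetical helper name, not part of SDMetrics, and it assumes the metric's `compute` callable accepts the same keyword arguments used in the snippet above.

```python
import pandas as pd


def per_row_scores(compute, real_data, synthetic_data, key_fields, sensitive_fields):
    """Score each real row on its own (hypothetical helper, not an SDMetrics API)."""
    scores = []
    for position in range(len(real_data)):
        # Double brackets keep a one-row DataFrame instead of a Series.
        real_row = real_data.iloc[[position]]
        scores.append(
            compute(
                real_data=real_row,
                synthetic_data=synthetic_data,
                key_fields=key_fields,
                sensitive_fields=sensitive_fields,
            )
        )
    # Reuse the original index so each score can be traced back to its row.
    return pd.Series(scores, index=real_data.index, name="cap_score")
```

With SDMetrics installed, `CategoricalZeroCAP.compute` can be passed as `compute`, together with your own key and sensitive field lists. The returned Series can then be stored next to the real data. Calling the metric once per row may be slow on large tables.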

Let me know if this is an acceptable workaround or if there is some other measure you had in mind.

@npatki added the "under discussion" label and removed the "new" label on Dec 12, 2023
@yleniarotalinti
Author

Hi Neha,
it is actually what I was looking for.

Thanks!

@npatki removed the "under discussion" label on Jun 5, 2024