Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

calculate_risk_score() seems to have wrong logic #182

Closed
snickell opened this issue Aug 27, 2024 · 3 comments
Closed

calculate_risk_score() seems to have wrong logic #182

snickell opened this issue Aug 27, 2024 · 3 comments

Comments

@snickell
Copy link

snickell commented Aug 27, 2024

In #138 @asofter says:

We use threshold configured and only calculate the risk score if it's above that threshold. Then the risk score is basically how far above the confidence score from the threshold.

At least in the context of ban_topics.py and toxicity.py I don't think it behaves as described above, but of course I could be very wrong in my understanding.

Here's what happens with the current implementation of calculate_risk_score():

In [1]: def calculate_risk_score(score: float, threshold: float) -> float:
   ...:     if score > threshold:
   ...:         return 1.0
   ...: 
   ...:     risk_score = round(abs(score - threshold) / threshold, 1)
   ...:     # Ensure risk score is between 0 and 1
   ...:     return min(max(risk_score, 0), 1)
   ...: 

In [20]: calculate_risk_score(score=0.2, threshold=0.5)
Out[20]: 0.6

In [21]: calculate_risk_score(score=0.4, threshold=0.5)
Out[21]: 0.2

In [22]: calculate_risk_score(score=0.6, threshold=0.5)
Out[22]: 1.0

In [22]: calculate_risk_score(score=0.8, threshold=0.5)
Out[22]: 1.0

Notice that as you increase score, approaching the threshold, risk_score actually decreases from 1.0 to 0.0. Then once score is greater than the threshold, risk_score is pinned to 1.0.

@snickell
Copy link
Author

snickell commented Aug 27, 2024

Here's a little script showing the current/original implementation of calculate_risk_score(), and 3 alternate implementations that might be the desired logic:

image

I believe that calculate_risk_score_one() (not orig) matches the intended logic described by @asofter in #138.

I personally prefer the semantics of three, but my opinion is not very important.

# Current implementation of calculate_risk_score()
# Before threshold: scaled from 1 to 0
# After thresold: 1.0
def calculate_risk_score_orig(score: float, threshold: float) -> float:
    if score > threshold:
        return 1.0
    risk_score = round(abs(score - threshold) / threshold, 1)
    return min(max(risk_score, 0), 1)

# Before threshold: 0
# After threshold: scaled from 0 to 1
def calculate_risk_score_one(score: float, threshold: float) -> float:
    if score < threshold:
        return 0.0
    risk_score = round(abs(score - threshold) / (1 - threshold), 1)
    return min(max(risk_score, 0), 1)

# Before threshold: scaled from 0 to 1
# After threshold: 1
def calculate_risk_score_two(score: float, threshold: float) -> float:
    if score > threshold:
        return 1.0
    risk_score = round(1 - abs(score - threshold) / threshold, 1)
    return min(max(risk_score, 0), 1)

# Before threshold: scaled from -1 to 0
# After threhsold: scales from 0 to 1
def calculate_risk_score_three(score: float, threshold: float) -> float:
    if score > threshold:
      risk_score = round((score - threshold) / (1 - threshold), 1)
    else:
      risk_score = round((score - threshold) / threshold, 1)
    
    return min(max(risk_score, -1), 1)

def analyze_calculate_risk_score(calculate_risk_score_func, threshold=0.59):
  for score in [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]:
    risk_score = calculate_risk_score_func(score=score, threshold=threshold)
    print(f"{calculate_risk_score_func.__name__}({score}, {threshold}) => {risk_score}")
  print()

analyze_calculate_risk_score(calculate_risk_score_orig)
analyze_calculate_risk_score(calculate_risk_score_one)
analyze_calculate_risk_score(calculate_risk_score_two)
analyze_calculate_risk_score(calculate_risk_score_three)

Output:

calculate_risk_score_orig(0.0, 0.59) => 1.0
calculate_risk_score_orig(0.1, 0.59) => 0.8
calculate_risk_score_orig(0.2, 0.59) => 0.7
calculate_risk_score_orig(0.3, 0.59) => 0.5
calculate_risk_score_orig(0.4, 0.59) => 0.3
calculate_risk_score_orig(0.5, 0.59) => 0.2
calculate_risk_score_orig(0.6, 0.59) => 1.0
calculate_risk_score_orig(0.7, 0.59) => 1.0
calculate_risk_score_orig(0.8, 0.59) => 1.0
calculate_risk_score_orig(0.9, 0.59) => 1.0
calculate_risk_score_orig(1.0, 0.59) => 1.0

calculate_risk_score_one(0.0, 0.59) => 0.0
calculate_risk_score_one(0.1, 0.59) => 0.0
calculate_risk_score_one(0.2, 0.59) => 0.0
calculate_risk_score_one(0.3, 0.59) => 0.0
calculate_risk_score_one(0.4, 0.59) => 0.0
calculate_risk_score_one(0.5, 0.59) => 0.0
calculate_risk_score_one(0.6, 0.59) => 0.0
calculate_risk_score_one(0.7, 0.59) => 0.3
calculate_risk_score_one(0.8, 0.59) => 0.5
calculate_risk_score_one(0.9, 0.59) => 0.8
calculate_risk_score_one(1.0, 0.59) => 1.0

calculate_risk_score_two(0.0, 0.59) => 0.0
calculate_risk_score_two(0.1, 0.59) => 0.2
calculate_risk_score_two(0.2, 0.59) => 0.3
calculate_risk_score_two(0.3, 0.59) => 0.5
calculate_risk_score_two(0.4, 0.59) => 0.7
calculate_risk_score_two(0.5, 0.59) => 0.8
calculate_risk_score_two(0.6, 0.59) => 1.0
calculate_risk_score_two(0.7, 0.59) => 1.0
calculate_risk_score_two(0.8, 0.59) => 1.0
calculate_risk_score_two(0.9, 0.59) => 1.0
calculate_risk_score_two(1.0, 0.59) => 1.0

calculate_risk_score_three(0.0, 0.59) => -1.0
calculate_risk_score_three(0.1, 0.59) => -0.8
calculate_risk_score_three(0.2, 0.59) => -0.7
calculate_risk_score_three(0.3, 0.59) => -0.5
calculate_risk_score_three(0.4, 0.59) => -0.3
calculate_risk_score_three(0.5, 0.59) => -0.2
calculate_risk_score_three(0.6, 0.59) => 0.0
calculate_risk_score_three(0.7, 0.59) => 0.3
calculate_risk_score_three(0.8, 0.59) => 0.5
calculate_risk_score_three(0.9, 0.59) => 0.8
calculate_risk_score_three(1.0, 0.59) => 1.0

@asofter
Copy link
Collaborator

asofter commented Aug 29, 2024

Hey @snickell , thank you for sharing this information including the options and the code.

Indeed, there is a bug in our calculation. I like your option three, which provides more context, and I will use it.

@asofter
Copy link
Collaborator

asofter commented Aug 29, 2024

The change was done

@asofter asofter closed this as completed Sep 7, 2024
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants