calculate_risk_score() seems to have wrong logic #182

snickell · 2024-08-27T05:41:03Z

We use threshold configured and only calculate the risk score if it's above that threshold. Then the risk score is basically how far above the confidence score from the threshold.

At least in the context of ban_topics.py and toxicity.py I don't think it behaves as described above, but of course I could be very wrong in my understanding.

Here's what happens with the current implementation of calculate_risk_score():

In [1]: def calculate_risk_score(score: float, threshold: float) -> float:
   ...:     if score > threshold:
   ...:         return 1.0
   ...: 
   ...:     risk_score = round(abs(score - threshold) / threshold, 1)
   ...:     # Ensure risk score is between 0 and 1
   ...:     return min(max(risk_score, 0), 1)
   ...: 

In [20]: calculate_risk_score(score=0.2, threshold=0.5)
Out[20]: 0.6

In [21]: calculate_risk_score(score=0.4, threshold=0.5)
Out[21]: 0.2

In [22]: calculate_risk_score(score=0.6, threshold=0.5)
Out[22]: 1.0

In [22]: calculate_risk_score(score=0.8, threshold=0.5)
Out[22]: 1.0

Notice that as you increase score, approaching the threshold, risk_score actually decreases from 1.0 to 0.0. Then once score is greater than the threshold, risk_score is pinned to 1.0.

The text was updated successfully, but these errors were encountered:

snickell · 2024-08-27T06:48:20Z

Here's a little script showing the current/original implementation of calculate_risk_score(), and 3 alternate implementations that might be the desired logic:

I believe that calculate_risk_score_one() (not orig) matches the intended logic described by @asofter in #138.

I personally prefer the semantics of three, but my opinion is not very important.

# Current implementation of calculate_risk_score()
# Before threshold: scaled from 1 to 0
# After thresold: 1.0
def calculate_risk_score_orig(score: float, threshold: float) -> float:
    if score > threshold:
        return 1.0
    risk_score = round(abs(score - threshold) / threshold, 1)
    return min(max(risk_score, 0), 1)

# Before threshold: 0
# After threshold: scaled from 0 to 1
def calculate_risk_score_one(score: float, threshold: float) -> float:
    if score < threshold:
        return 0.0
    risk_score = round(abs(score - threshold) / (1 - threshold), 1)
    return min(max(risk_score, 0), 1)

# Before threshold: scaled from 0 to 1
# After threshold: 1
def calculate_risk_score_two(score: float, threshold: float) -> float:
    if score > threshold:
        return 1.0
    risk_score = round(1 - abs(score - threshold) / threshold, 1)
    return min(max(risk_score, 0), 1)

# Before threshold: scaled from -1 to 0
# After threhsold: scales from 0 to 1
def calculate_risk_score_three(score: float, threshold: float) -> float:
    if score > threshold:
      risk_score = round((score - threshold) / (1 - threshold), 1)
    else:
      risk_score = round((score - threshold) / threshold, 1)
    
    return min(max(risk_score, -1), 1)

def analyze_calculate_risk_score(calculate_risk_score_func, threshold=0.59):
  for score in [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]:
    risk_score = calculate_risk_score_func(score=score, threshold=threshold)
    print(f"{calculate_risk_score_func.__name__}({score}, {threshold}) => {risk_score}")
  print()

analyze_calculate_risk_score(calculate_risk_score_orig)
analyze_calculate_risk_score(calculate_risk_score_one)
analyze_calculate_risk_score(calculate_risk_score_two)
analyze_calculate_risk_score(calculate_risk_score_three)

Output:

calculate_risk_score_orig(0.0, 0.59) => 1.0
calculate_risk_score_orig(0.1, 0.59) => 0.8
calculate_risk_score_orig(0.2, 0.59) => 0.7
calculate_risk_score_orig(0.3, 0.59) => 0.5
calculate_risk_score_orig(0.4, 0.59) => 0.3
calculate_risk_score_orig(0.5, 0.59) => 0.2
calculate_risk_score_orig(0.6, 0.59) => 1.0
calculate_risk_score_orig(0.7, 0.59) => 1.0
calculate_risk_score_orig(0.8, 0.59) => 1.0
calculate_risk_score_orig(0.9, 0.59) => 1.0
calculate_risk_score_orig(1.0, 0.59) => 1.0

calculate_risk_score_one(0.0, 0.59) => 0.0
calculate_risk_score_one(0.1, 0.59) => 0.0
calculate_risk_score_one(0.2, 0.59) => 0.0
calculate_risk_score_one(0.3, 0.59) => 0.0
calculate_risk_score_one(0.4, 0.59) => 0.0
calculate_risk_score_one(0.5, 0.59) => 0.0
calculate_risk_score_one(0.6, 0.59) => 0.0
calculate_risk_score_one(0.7, 0.59) => 0.3
calculate_risk_score_one(0.8, 0.59) => 0.5
calculate_risk_score_one(0.9, 0.59) => 0.8
calculate_risk_score_one(1.0, 0.59) => 1.0

calculate_risk_score_two(0.0, 0.59) => 0.0
calculate_risk_score_two(0.1, 0.59) => 0.2
calculate_risk_score_two(0.2, 0.59) => 0.3
calculate_risk_score_two(0.3, 0.59) => 0.5
calculate_risk_score_two(0.4, 0.59) => 0.7
calculate_risk_score_two(0.5, 0.59) => 0.8
calculate_risk_score_two(0.6, 0.59) => 1.0
calculate_risk_score_two(0.7, 0.59) => 1.0
calculate_risk_score_two(0.8, 0.59) => 1.0
calculate_risk_score_two(0.9, 0.59) => 1.0
calculate_risk_score_two(1.0, 0.59) => 1.0

calculate_risk_score_three(0.0, 0.59) => -1.0
calculate_risk_score_three(0.1, 0.59) => -0.8
calculate_risk_score_three(0.2, 0.59) => -0.7
calculate_risk_score_three(0.3, 0.59) => -0.5
calculate_risk_score_three(0.4, 0.59) => -0.3
calculate_risk_score_three(0.5, 0.59) => -0.2
calculate_risk_score_three(0.6, 0.59) => 0.0
calculate_risk_score_three(0.7, 0.59) => 0.3
calculate_risk_score_three(0.8, 0.59) => 0.5
calculate_risk_score_three(0.9, 0.59) => 0.8
calculate_risk_score_three(1.0, 0.59) => 1.0

asofter · 2024-08-29T07:58:47Z

Hey @snickell , thank you for sharing this information including the options and the code.

Indeed, there is a bug in our calculation. I like your option three, which provides more context, and I will use it.

asofter · 2024-08-29T08:48:58Z

The change was done

asofter closed this as completed Sep 7, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

calculate_risk_score() seems to have wrong logic #182

calculate_risk_score() seems to have wrong logic #182

snickell commented Aug 27, 2024 •

edited

Loading

snickell commented Aug 27, 2024 •

edited

Loading

asofter commented Aug 29, 2024

asofter commented Aug 29, 2024

calculate_risk_score() seems to have wrong logic #182

calculate_risk_score() seems to have wrong logic #182

Comments

snickell commented Aug 27, 2024 • edited Loading

snickell commented Aug 27, 2024 • edited Loading

asofter commented Aug 29, 2024

asofter commented Aug 29, 2024

snickell commented Aug 27, 2024 •

edited

Loading

snickell commented Aug 27, 2024 •

edited

Loading