Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

[Bug]: weightedMean aggregation of percent change in DataTables doesn't handle nulls and zeros correctly #2852

Open
nickfanion opened this issue Nov 25, 2024 · 1 comment
Labels
bug Something isn't working to-review Evidence team to review

Comments

@nickfanion
Copy link
Contributor

nickfanion commented Nov 25, 2024

Describe the bug

When using totalAgg=weightedMean in DataTables to aggregate percent change, null values and zeroes in the weightCol cause the entire row to be ignored, which leads to incorrect aggregates.

Steps to Reproduce

```sql example_data
select 'Peanut Butter' AS category, 'Chunky Peanut Butter' AS item, 6678 AS sales, 1189 AS sales_ya
union
select 'Peanut Butter', 'Smooth Peanut Butter', 7155, 7651
union
select 'Peanut Butter', 'Powder Peanut Butter', 522, 9928
union
select 'Salad Dressing', 'Thousand Island Dressing', 3210, 0
union
select 'Salad Dressing', 'Ranch Dressing', 9493, 7557
union
select 'Salad Dressing', 'Italian Dressing', 9212, 8051
union
select 'Salad Dressing', 'Caeser Dressing', 2489, 9643
union
select 'Plant-Based Milk', 'Almond Milk', 2867, 5212
union
select 'Plant-Based Milk', 'Oat Milk', 7982, 2430
union
select 'Plant-Based Milk', 'Soy Milk', 9069, 5872
```

```sql table_data
select
  category,
  item,
  sum(sales) as sales,
  sum(sales_ya) as sales_ya,
  ((sum(sales) - sum(sales_ya)) / sum(sales_ya)) as sales_yoy
from ${example_data}
group by all
order by sales DESC NULLS LAST
```

```sql sales_yoy_agg
select
  ((sum(sales) - sum(sales_ya)) / sum(sales_ya)) as sales_yoy
from ${table_data}
group by all
```

<DataTable data={table_data} groupBy=category subtotals=true totalRow=true>
<Column id=item />
<Column id=sales fmt='$#,##0' />
<Column id=sales_ya fmt='$#,##0' />
<Column id=sales_yoy totalFmt='pct1' contentType=delta fmt='pct1' totalAgg=weightedMean weightCol=sales_ya />
<Column id=sales_yoy totalFmt='pct1' contentType=delta fmt='pct1' totalAgg={sales_yoy_agg[0].sales_yoy} />
</DataTable>

The 'Thousand Island Dressing' row has zero sales_ya. Because it's zero, the weightedMean aggregation ignores the row, despite there being being a value in sales.

Logs

N/A

System Info

N/A

Severity

serious, but I can work around it

Additional Information, or Workarounds

This may be an inherent limitation to weightedMean, but percent change is a common calculation, so I think there should be a way of aggregating it in all cases. I can use a custom aggregation value as a workaround, but then that breaks groupBy aggregates in a DataTable.

Another workaround is to create a weightCol with the zeros set to a very small number like 0.001.

@nickfanion nickfanion added bug Something isn't working to-review Evidence team to review labels Nov 25, 2024
@nickfanion
Copy link
Contributor Author

Seems to be the reverse issue of #2081 but I understand why handling of zeros/nulls was changed.

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
bug Something isn't working to-review Evidence team to review
Projects
None yet
Development

No branches or pull requests

1 participant