Many DEBUG datafusion_functions_array] Overwrite existing UDF: array_to_string messages in log #10658

Closed
alamb opened this issue May 24, 2024 · 2 comments · Fixed by #10661
alamb commented May 24, 2024

Describe the bug

We noticed some additional unexpected log messages upstream in InfluxDB. I found that the same messages are present in datafusion-cli.

To Reproduce

andrewlamb@Andrews-MacBook-Pro-2:~/Software/influxdb_iox$ RUST_LOG=DEBUG datafusion-cli
DataFusion CLI v38.0.0
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_to_string
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: string_to_array
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: range
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: generate_series
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_dims
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: cardinality
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_ndims
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_append
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_prepend
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_concat
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_except
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_element
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_pop_back
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_pop_front
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_slice
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_has
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_has_all
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_has_any
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: empty
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_length
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: flatten
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_sort
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_repeat
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_resize
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_reverse
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_distinct
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_intersect
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_union
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_position
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_positions
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_remove
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_remove_all
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_remove_n
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_replace_n
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_replace_all
[2024-05-24T19:45:36Z DEBUG datafusion_functions_array] Overwrite existing UDF: array_replace

Expected behavior

I would expect that we aren't re-registering the same UDF multiple times 🤔

Additional context

No response

alamb added the bug label May 24, 2024

jayzhan211 commented May 25, 2024

We just need to remove the duplicate name from the aliases of those functions.

For example,

String::from("array_has"),
String::from("list_has"),
String::from("array_contains"),
String::from("list_contains"),

Remove the one name in the aliases that duplicates the function's own name (`array_has` in this example).
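
To illustrate why a function name repeated in its own alias list leads to an "Overwrite existing UDF" message, here is a minimal sketch. It is not DataFusion's registration code: `Udf`, `Registry`, `register`, and `overwrites_for` are hypothetical names, and the sketch assumes a registry that stores each function under its name and then under every alias.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a UDF: a canonical name plus alternative names.
pub struct Udf {
    pub name: String,
    pub aliases: Vec<String>,
}

/// Hypothetical registry keyed by every name a function can be called by.
#[derive(Default)]
pub struct Registry {
    by_name: HashMap<String, String>,
}

impl Registry {
    /// Registers `udf` under its name and then under each alias.
    /// Returns the keys that were already present, i.e. the registrations
    /// that would each trigger an "Overwrite existing UDF" style message.
    pub fn register(&mut self, udf: &Udf) -> Vec<String> {
        let mut overwritten = Vec::new();
        let keys = std::iter::once(&udf.name).chain(udf.aliases.iter());
        for key in keys {
            if self.by_name.insert(key.clone(), udf.name.clone()).is_some() {
                overwritten.push(key.clone());
            }
        }
        overwritten
    }
}

/// Registers a function named `array_has` with the given aliases in an
/// empty registry and reports which keys were overwritten.
pub fn overwrites_for(aliases: &[&str]) -> Vec<String> {
    let udf = Udf {
        name: "array_has".to_string(),
        aliases: aliases.iter().map(|a| a.to_string()).collect(),
    };
    Registry::default().register(&udf)
}
```

With the alias list quoted above, `overwrites_for` reports `array_has` as overwritten; with that one entry removed, it reports nothing.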

jayzhan211 added the good first issue label May 25, 2024

goldmedal commented

take

Michael-J-Ward added a commit to Michael-J-Ward/datafusion-python that referenced this issue Jul 25, 2024
The alias list no longer includes the name of the function.

Ref: apache/datafusion#10658
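
Because the alias list no longer contains the function's own name, any lookup that previously scanned only the aliases has to check the name as well. A minimal sketch of such a lookup follows; `FunctionEntry`, `find_function`, and `sample_functions` are hypothetical and are not taken from datafusion-python, and the `avg`/`mean` pair is used only as sample data.

```rust
/// Hypothetical description of a function: canonical name plus aliases.
/// The alias list is assumed NOT to repeat the canonical name.
pub struct FunctionEntry {
    pub name: &'static str,
    pub aliases: &'static [&'static str],
}

/// Finds a function by its canonical name or by any of its aliases.
pub fn find_function<'a>(
    funcs: &'a [FunctionEntry],
    wanted: &str,
) -> Option<&'a FunctionEntry> {
    funcs
        .iter()
        .find(|f| f.name == wanted || f.aliases.iter().any(|a| *a == wanted))
}

/// Sample data for illustration only.
pub fn sample_functions() -> Vec<FunctionEntry> {
    vec![
        FunctionEntry { name: "avg", aliases: &["mean"] },
        FunctionEntry { name: "array_has", aliases: &["list_has", "array_contains"] },
    ]
}
```

Searching for either `avg` or `mean` in the sample data resolves to the `avg` entry, while an unknown name yields `None`.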
andygrove pushed a commit to apache/datafusion-python that referenced this issue Jul 31, 2024
* chore: update datafusion deps

* feat: impl ExecutionPlan::static_name() for DatasetExec

This required trait method was added upstream [0], which recommends simply forwarding to `static_name`.

[0]: apache/datafusion#10266

* feat: update first_value and last_value wrappers.

Upstream signatures were changed for the new `AggregateBuilder` API [0].

This simply gets the code to work. We should better incorporate that API into `datafusion-python`.

[0] apache/datafusion#10560

* migrate count to UDAF

Builtin Count was removed upstream.

TBD whether we want to re-implement `count_star` with new API.

Ref: apache/datafusion#10893

* migrate approx_percentile_cont, approx_distinct, and approx_median to UDAF

Ref: approx_distinct apache/datafusion#10851
Ref: approx_median apache/datafusion#10840
Ref: approx_percentile_cont and _with_weight apache/datafusion#10917

* migrate avg to UDAF

Ref: apache/datafusion#10964

* migrate corr to UDAF

Ref: apache/datafusion#10884

* migrate grouping to UDAF

Ref: apache/datafusion#10906

* add alias `mean` for UDAF `avg`

* migrate stddev to UDAF

Ref: apache/datafusion#10827

* remove rust alias for stddev

The python wrapper now provides stddev_samp alias.

* migrate var_pop to UDAF

Ref: apache/datafusion#10836

* migrate regr_* functions to UDAF

Ref: apache/datafusion#10898

* migrate bitwise functions to UDAF

The functions now take a single expression instead of a Vec<_>.

Ref: apache/datafusion#10930

* add missing variants for ScalarValue with todo

* fix typo in approx_percentile_cont

* add distinct arg to count

* comment out failing test

`approx_percentile_cont` is now returning a DoubleArray instead of an IntArray.

This may be a bug upstream; it requires further investigation.

* update tests to expect lowercase `sum` in query plans

This was changed upstream.

Ref: apache/datafusion#10831

* update ScalarType data_type map

* add docs dependency pickleshare

* re-implement count_star

* lint: ruff python lint

* lint: rust cargo fmt

* include name of window function in error for find_window_fn

* refactor `find_window_fn` for debug clarity

* search default aggregate functions by both name and aliases

The alias list no longer includes the name of the function.

Ref: apache/datafusion#10658

* fix markdown in find_window_fn docs

* parameterize test_window_functions

`first_value` and `last_value` are currently failing and marked as xfail.

* add test ids to test_simple_select tests marked xfail

* update find_window_fn to search built-ins first

The behavior of `first_value` and `last_value` UDAFs currently does not match the built-in behavior.
This allowed me to remove `marks=pytest.xfail` from the window tests.

* improve first_call and last_call use of the builder API

* remove trailing todos

* fix examples/substrait.py

* chore: remove explicit aliases from functions.rs

Ref: #779

* remove `array_fn!` aliases

* remove alias rules for `expr_fn_vec!`

* remove alias rules from `expr_fn!` macro

* remove unnecessary pyo3 var-arg signatures in functions.rs

* remove pyo3 signatures that provided defaults for first_value and last_value

* parametrize test_string_functions

* test regr_ function wrappers

Closes #778