Make it easier to use rust DataFusion UDFs in datafusion-python #1017

Open
@timsaucer

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

Suppose someone wants to build a library that is usable by both Rust and Python DataFusion users. They have written a UDF in Rust that implements the corresponding DataFusion trait (scalar, aggregate, or window). Right now, to use that UDF in datafusion-python, they need to expose a variety of methods that basically mimic the trait functions of the Rust code. For scalar UDFs, the interface also requires a bit of wrangling to convert ColumnarValue to PyArrow objects.

While it is possible to do this, it is likely to be error-prone and tedious for implementers to write and maintain this code.

Describe the solution you'd like

We have an established pattern of adding foreign table providers via the FFI interface using PyCapsule. This makes adding a TableProvider very easy. In our example code, the function that exposes a table provider is only 6 lines of code and will likely require minimal maintenance.

It would be nice to expose all varieties of user-defined functions via FFI as well, so that they follow the established pattern and users can easily reuse their code.

Describe alternatives you've considered

I did a brief proof of concept that used Python calls to the required functions. It worked, but it took quite a bit of code, and I suspect it will be difficult to maintain.

Additional context

This may provide additional value in that it would get us much closer to being able to expose a SessionContext via FFI, which would benefit both the datafusion-ray and ballista projects.

Labels: enhancement
