-
Notifications
You must be signed in to change notification settings - Fork 1.5k
Add example for writing an AnalyzerRule
#10855
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Comments
Perhaps @goldmedal is interested in this / has some example code to share |
I think we have an existing example showing how to create an AnalyzerRule in
However, it only works with the analyzer and the optimizer. Maybe we can enhance it after #10849 to show how to apply the custom rules for the end-to-end DataFusion query flow. |
Currently, in my personal work, I just use it like let ctx = SessionContext::new();
let new_state = ctx
.state()
.add_analyzer_rule(Arc::new(ModelAnalyzeRule::new()))
.add_analyzer_rule(Arc::new(ModelGenerationRule::new()));
let new_ctx = SessionContext::new_with_state(new_state);
// create a plan to run a SQL query
let df = new_ctx.sql("SELECT * FROM datafusion.default.orders").await?; After the API updated, I guess it would be let ctx = SessionContext::new();
ctx.add_analyzer_rule(Arc::new(ModelAnalyzeRule::new()))
.add_analyzer_rule(Arc::new(ModelGenerationRule::new()));
// create a plan to run a SQL query
let df = ctx.sql("SELECT * FROM datafusion.default.orders").await?; After #10849 is done, if no one else is working on it, I think I can help with it. |
AnalyzerRule
I have some time on a plane today that I may use to try and write up this example. I was inspired by some discussion I had this week |
Hi @alamb, I just want to share how I use user-defined AnalyzerRule. As you know, I'm working on reimplementing the semantic layer engine, wren-engine, for LLM using DataFusion. The project is still a work in progress. However, I think I have finished the part of integration with DataFusion. I think it could be a nice use case for The basic concept of wren-engine is that user can define a virtual modeling layer to apply on their physical data. select * from wrenai.default.customers_model Then , wren engine translate it to a physical query: SELECT
"customers_model"."city",
"customers_model"."id",
"customers_model"."state"
FROM
(
SELECT
"customers_model"."city",
"customers_model"."id",
"customers_model"."state"
FROM
(
SELECT
"datafusion"."public"."customers"."city" AS "city",
"datafusion"."public"."customers"."id" AS "id",
"datafusion"."public"."customers"."state" AS "state"
FROM
"public"."customers"
) AS "customers_model"
) AS "customers_model" It's a simple use case to show how it works. We have other features, such as The related PR, Canner/wren-engine#613, is still under review. However, you can find the usage of Thanks to DataFusion for providing such amazing features, allowing me to implement a more stable and structured converter for the semantic layer. |
Thank you @goldmedal - this is a great hint. I have some basic analyzer rule coded up but maybe your example is a better idea. I have a PR in progress that I need to polish up and then will ping you for review |
Here is my proposed analyzer rule example: #11089 It isn't quite as fancy as the wren example, but I think it is an understandable enough example |
Is your feature request related to a problem or challenge?
We have an example for writing a user defined optimizer rule in
datafusion/datafusion/core/tests/user_defined/user_defined_plan.rs
Line 251 in 3773fb7
However, we don't have a corresponding example for writing a user defined AnalyzerRule which also means the APIs for using them are complicated (see #10849 for example)
Describe the solution you'd like
The idea example I think would be ti Add a file to https://github.com/apache/datafusion/tree/3773fb7fb54419f889e7d18b73e9eb48069eb08e/datafusion-examples
user_defined_analyzer.rs
Perhaps the example could show how to create an
AnalyzerRule
that replaced an experssion likea / b
with a function callsafe_div(a, b)
wheresave_div
Describe alternatives you've considered
No response
Additional context
No response
The text was updated successfully, but these errors were encountered: