Implement SQLancer (an end-to-end SQL fuzz testing library) #11030
Comments
Thank you @2010YOUY01. Sounds like a great idea to me -- I have created a datafusion_contrib repo for this work in case you would like to put it there: https://github.com/datafusion-contrib/datafusion-sqllancer
This is the first interesting bug found: #11248
Nice!
The initial implementation is done (with ~10 bugs found 👀). There is a lot of work that can be done to find more bugs; any contributions are welcome!
This is really nice work @2010YOUY01 -- thank you so much.
Filed #11430 to note this on the docs. Also posted on twitter: https://twitter.com/andrewlamb1111/status/1811725290801963475 Thanks again @2010YOUY01
PARTITION BY caused panic in 'tokio-runtime-worker' (SQLancer) #12057
I would like to help with this, I'm big into tests
@rluvaton Thank you! I'm still interested in this project (though I haven’t been working on it for a few months 😅) and I'm happy to help with any contributions. I think we can start with a few easier tasks:
I still have some local changes that haven’t been pushed to https://github.com/datafusion-contrib/datafusion-sqlancer. I will update it and ensure it works with the latest version of DataFusion next week.
This would be great. We currently have some tests running only on commits to main that we could potentially extend: https://github.com/apache/datafusion/blob/main/.github/workflows/extended.yml
Is your feature request related to a problem or challenge?
I noticed that SQLancer, an awesome SQL fuzzing framework, can be implemented on DataFusion. It has been able to detect many bugs even in PostgreSQL and SQLite.
Update:
Implementation is now at datafusion-sqlancer
Supported SQL Features
JOINs, ORDER BY, WHERE
HAVING clause
target_partition, prefer_hash_join, etc.
Supported Test Oracles
Note: most oracles only apply to a subset of the available query types. For advanced SQL features like window functions, we can only generate random queries and report crashes.
More context on these test oracles is at https://github.com/sqlancer/sqlancer/tree/main
How SQLancer works in short
SQLancer uses JDBC to do SQL-level testing.
SQLancer has 5 logic check oracles, one of them works like:
The consistency check generates Q1 (very likely to be optimized by predicate pushdown) and Q2 (hard to optimize) and compares their results, so this kind of test focuses on the correctness of the optimizer. There are 5 similar test oracles available to implement; these carefully designed checks make this testing framework really powerful.
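To make the Q1/Q2 idea concrete, here is a minimal sketch of a NoREC-style check. It is only an illustration under some assumptions: it uses SQLite through Python's built-in sqlite3 module as a stand-in engine (not DataFusion, and not SQLancer's own code), and the table t0, its columns, the predicates, and the norec_check helper are all hypothetical names made up for this example.

```python
import sqlite3


def norec_check(conn, table, predicate):
    """Run one NoREC-style comparison and return (q1_count, q2_count)."""
    cur = conn.cursor()

    # Q1: the predicate sits in WHERE, so the optimizer is free to
    # push it down or otherwise rewrite the plan.
    q1 = f"SELECT COUNT(*) FROM {table} WHERE {predicate}"
    q1_count = cur.execute(q1).fetchone()[0]

    # Q2: the predicate is projected as a plain expression for every row,
    # which leaves little room for filter optimizations. We count the rows
    # where it evaluated to true (NULL and 0 both mean "not selected").
    q2 = f"SELECT ({predicate}) FROM {table}"
    q2_count = sum(
        1 for (value,) in cur.execute(q2) if value is not None and value != 0
    )
    return q1_count, q2_count


# Hypothetical test data, including NULLs to exercise three-valued logic.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t0 (c0 INTEGER, c1 TEXT)")
conn.executemany(
    "INSERT INTO t0 VALUES (?, ?)",
    [(1, "a"), (2, None), (None, "b"), (3, "c")],
)

# A real fuzzer would generate these predicates randomly.
for predicate in ["c0 > 1", "c1 IS NULL", "c0 > 1 AND c1 <> 'a'"]:
    q1_count, q2_count = norec_check(conn, "t0", predicate)
    # A mismatch would point at a possible logic bug in the engine.
    assert q1_count == q2_count, f"possible logic bug for: {predicate}"
```

If the two counts ever disagree for some generated predicate, the pair of queries is a small reproducer that can be reported as a suspected optimizer bug.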
Describe the solution you'd like
I plan to implement SQLancer on DataFusion (starting with a specific test oracle, NoREC, which requires less engineering effort).
For now, a minimal subset of SQL features is implemented. It hasn't detected any logic bugs yet; only 2 bad-input bugs in some scalar functions have shown up.
(Will share the code once it is cleaned up)
If there are any features (SQL clauses / data types / specific functions) you would like to see tested further, I can implement them first :)
Describe alternatives you've considered
SQLsmith looks like another popular choice; I haven't looked into it carefully yet. But if it only generates random SQL to test whether the system crashes, then SQLancer should be a more comprehensive tool.
Additional context
SQLancer's page has several papers and YouTube talk recordings available.