-
Notifications
You must be signed in to change notification settings - Fork 1.5k
refactor!: consistent null handling in coercible signatures #15404
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
base: main
Are you sure you want to change the base?
refactor!: consistent null handling in coercible signatures #15404
Conversation
repeat Int64 2 IN 3 | ||
repeat LargeUtf8 1 IN 1 | ||
repeat LargeUtf8 1 OUT 1 | ||
repeat Null 1 IN 3 | ||
repeat Null 1 OUT 3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure why this was changed, as there seems to already be a test case for SELECT repeat(NULL, 4)
in the original version, so the information schema shouldn't need to be modified?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this makes sense? Since it was previously accepted by the matches_native_type
without needing to be listed as one of the possible signatures. Now we have added to the possible cartesian products per here.
repeat Int64 2 IN 3 | ||
repeat LargeUtf8 1 IN 1 | ||
repeat LargeUtf8 1 OUT 1 | ||
repeat Null 1 IN 3 | ||
repeat Null 1 OUT 3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this makes sense? Since it was previously accepted by the matches_native_type
without needing to be listed as one of the possible signatures. Now we have added to the possible cartesian products per here.
e2dba0b
to
680383b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @alan910127 and @wiedld
I am worried about the potential subtle semantic issues in downstream projects that would result from this change (see comment). I worry about such changes especially when the benefits of the change are nicer internal datafusion code, but no benefit to downstream crates, although they will pay the cost
I know @shehabgamin was thinking about coercion recently -- maybe he would have some time to review this PR / comment about its potential implications and how it would fit into a broader story about coercion improvements
@@ -263,10 +263,6 @@ impl TypeSignatureClass { | |||
self: &TypeSignatureClass, | |||
logical_type: &NativeType, | |||
) -> bool { | |||
if logical_type == &NativeType::Null { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR description says
Are there any user-facing changes?
No.
However, doesn't this change means that existing TypeSignature::Coercable that handle Nulls will need to add Coercion::new_implicit
to retain the same behavior?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, that's right, I forgot about that. PR description updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the late reply, busy week. LGTM!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @alan910127 @wiedld and @shehabgamin
Can you please add a brief note in the upgrade guide for 47.0.0 (eta next week) in https://github.com/apache/datafusion/blob/main/docs/source/library-user-guide/upgrading.md
About what a user will need to do when upgrading to 47.0.0? Specifically,
- What kind of UDFs will need to be updated
- What changes need to be done? (you could use examples from this PR)
Hi @alamb, I’ve updated |
Thanks @alan910127 -- I'll check it out in a few days |
Which issue does this PR close?
Null
type inTypeSignatureClass
more consistently #15013.Rationale for this change
Currently, there is a special case for null handling in
TypeSignatureClass::matches_native_type
, which causes inconsistencies and issues betweenTypeSignatureClass::matches_native_type
andTypeSignatureClass::default_casted_type
(it will hit the else match arm which returns an error) whenself
is not aTypeSignatureClass::Native
.What changes are included in this PR?
TypeSignatureClass::matches_native_type
TypeSignatureClass::Native(logical_null())
where the argument is nullable and it was not originally supplied.Are these changes tested?
These changes should be covered in existing test.
Are there any user-facing changes?
Yes, if the user rely on the special null case handling behavior of 'TypeSignatureClass::Native
, they will need to transform the usage to
Coercible::new_implicitand add a
TypeSignatureClass::Native(logical_null())`.