WIP(iox-10577): patched df upgrade 202-04-14 #12

Closed
wants to merge 22 commits into from


@appletreeisyellow appletreeisyellow commented Apr 24, 2024

⚠️ This will not be merged. ⚠️

All the patches below are included in the last commit of April 13, 2024 in DataFusion!

This PR is based on #5 and #10, which include the following patches:

  1. Bringing us up to DataFusion as of 2024-04-14

  2. PATCH: add the named struct patch

    commit 66f4fcb4664fc797ffb046d5b2ebcfca65ba4cd7
    Author: Andrew Lamb <andrew@nerdnetworks.org>
    Date:   Tue Apr 2 17:21:02 2024 -0400
    
        Use `struct` instead of `named_struct` when there are no aliases (#9897)
    
  3. PATCH: include the patch request (per slack) for the upstream coalesce bug.
    apache@4d85979 / coercion vec[Dictionary, Utf8] to Dictionary for coalesce function apache/datafusion#9958

    commit f0eec349a1abed14bcb2ee8a9fbf98bbb19b8f9a (HEAD -> iox-10350/df-upgrade-2024-03-31)
    Author: Lordworms <48054792+Lordworms@users.noreply.github.com>
    Date:   Fri Apr 5 15:57:48 2024 -0500
    
        coercion vec[Dictionary, Utf8] to Dictionary for coalesce function (#9958)
    
  4. PATCH: patch for the function re-writer, visiting subqueries within expressions.
    apache@e161cd6 / fix NamedStructField should be rewritten in OperatorToFunction in subquery regression (change ApplyFunctionRewrites to use TreeNode API) apache/datafusion#10032 (merged in DF on April 12, 2024)

    commit e8de1c612a986ae4b0348ce0a9d92f08d93c258c
    Author: Andrew Lamb <andrew@nerdnetworks.org>
    Date:   Wed Apr 10 11:14:02 2024 -0400
    
        fix NamedStructField should be rewritten in OperatorToFunction in subquery
    
  5. PATCH: cherry-picked apache@671cef8 / Prune pages are all null in ParquetExec by row_counts and fix NOT NULL prune apache/datafusion#10051 (merged in DF on April 13, 2024)

alamb and others added 16 commits April 5, 2024 12:40
…che#9897)

* Revert "use alias (apache#9894)"

This reverts commit 9487ca0.

* Use `struct` instead of `named_struct` when there are no aliases

* Update docs

* fmt
…pache#9958)

* for debug

finish

remove print

add space

* fix clippy

* finish

* fix clippy
…L prune (apache#10051)

* Prune pages are all null in ParquetExec by row_counts
and fix NOT NULL prune

* fix clippy

* Update datafusion/core/src/physical_optimizer/pruning.rs

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* Update datafusion/core/tests/parquet/page_pruning.rs

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* Update datafusion/core/tests/parquet/page_pruning.rs

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* Update datafusion/core/tests/parquet/page_pruning.rs

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* Update datafusion/core/tests/parquet/page_pruning.rs

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* remove allocate vec

* better way avoid allocate vec

* simply expr

---------

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
@github-actions github-actions bot added the core label Apr 24, 2024
@appletreeisyellow appletreeisyellow changed the title WIP(iox-10577): patched df upgrade 202-04-TBD WIP(iox-10577): patched df upgrade 202-04-14 Apr 24, 2024
@crepererum

Is the patch list still up-to-date? Based on the PR dates, many of them should be merged in the 2024-04-14 main version of DataFusion already.

@appletreeisyellow
Author

> Is the patch list still up-to-date? Based on the PR dates, many of them should be merged in the 2024-04-14 main version of DataFusion already.

@crepererum This patch list keeps a history of how we got to 2024-04-14 without breaking any CI pipeline. You are right! All the patches are already merged in the 2024-04-24 main version of DataFusion 🎉

I think it works the same whether the DataFusion version in the IOx Cargo.toml points to a branch of this forked influxdata/arrow-datafusion repo or to main of the apache/datafusion repo. Since I plan to update the next patch for @wiedld, I will keep pointing to this forked repo.
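
For illustration, the two options could look roughly like this in a Cargo.toml. This is a sketch, not the actual IOx manifest: the `<patched-branch>` and `<upstream-commit-sha>` placeholders are hypothetical and would need to be filled in with the real branch name and commit.

```toml
[dependencies]
# Option A (assumed layout): track a branch of the fork that carries the patches.
# "<patched-branch>" is a placeholder, not a real branch name.
datafusion = { git = "https://github.com/influxdata/arrow-datafusion.git", branch = "<patched-branch>" }

# Option B (assumed layout): pin upstream to a commit that already contains the patches.
# Only one `datafusion` entry may be active at a time, so this one is commented out.
# datafusion = { git = "https://github.com/apache/datafusion.git", rev = "<upstream-commit-sha>" }
```

Pinning with `rev` keeps builds reproducible, while `branch` picks up new commits on that branch whenever the lockfile is updated.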

@appletreeisyellow
Author

The upgrade is done. Closing

@appletreeisyellow appletreeisyellow deleted the chunchun/update-df-apr-week-2 branch April 29, 2024 16:41
wiedld pushed a commit that referenced this pull request Jul 17, 2024
* Add dialect param to use CHAR instead of TEXT for Utf8 unparsing for MySQL (#12)

* Configurable data type instead of flag for Utf8 unparsing

* Fix type in comment
wiedld pushed a commit that referenced this pull request Jul 31, 2024
* Add dialect param to use CHAR instead of TEXT for Utf8 unparsing for MySQL (#12)

* Configurable data type instead of flag for Utf8 unparsing

* Fix type in comment
5 participants