Skip to content

matt/feat/recursive ctes/config flag #3

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Closed
wants to merge 572 commits into from

Conversation

matthewgapp
Copy link
Owner

@matthewgapp matthewgapp commented Jan 11, 2024

Veeupup and others added 30 commits November 28, 2023 10:02
* move array function unit_tests to sqllogictest

Signed-off-by: veeupup <code@tanweime.com>

* add comment for array_expression internal test

---------

Signed-off-by: veeupup <code@tanweime.com>
Co-authored-by: Mehmet Ozan Kabak <ozankabak@gmail.com>
…che#8351)

* Minor: Improve the document format of JoinHashMap

* sql csv_with_quote_escape

* fix
* Minor: restore DataFrame test

* Move test to a better location

* simplify test
…pache#8354)

These utils manipulate `LogicalPlan`s and `Expr`s and may be useful in
projects that only depend on `datafusion-expr`
* Extract parquet statistics to its own module, add tests

* Update datafusion/core/src/datasource/physical_plan/parquet/statistics.rs

Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>

* rename enum

* Improve API

* Add test for reading struct array statistics

* Add test for column after statistics

* improve tests

* simplify

* clippy

* Update datafusion/core/src/datasource/physical_plan/parquet/statistics.rs

* Update datafusion/core/src/datasource/physical_plan/parquet/statistics.rs

* Add test showing incorrect statistics

* Rework statistics

* Fix clippy

* Update documentation and make it clear the statistics are not publically accessable

* Add link to upstream arrow ticket

---------

Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>
Co-authored-by: Raphael Taylor-Davies <r.taylordavies@googlemail.com>
* feat:implement sql style 'find_in_set' string function

* format code

* modify test case
Signed-off-by: jayzhan211 <jayzhan211@gmail.com>
* Refactor aggregate function handling

* fix ci

* update comment

* fix ci

* simplify the code

* fix fmt

* fix ci

* fix clippy
* Implement Aliases for ScalarUDF

Signed-off-by: veeupup <code@tanweime.com>

* fix comments

Signed-off-by: veeupup <code@tanweime.com>

---------

Signed-off-by: veeupup <code@tanweime.com>
* support LargeList in array_empty

* update err info
* feat: test queries for to_timestamp(float) WIP

* feat: Float64 input for to_timestamp

* cargo fmt

* clippy

* docs: double input type for to_timestamp

* feat: cast floats to timestamp

* style: cargo fmt

* fix: float64 cast for timestamp nanos only
* Support User Defined Table Function

Signed-off-by: veeupup <code@tanweime.com>

* fix comments

Signed-off-by: veeupup <code@tanweime.com>

* add udtf test

Signed-off-by: veeupup <code@tanweime.com>

* add file header

* Simply table function example, add some comments

* Simplfy exprs

* make clippy happy

* Update datafusion/core/tests/user_defined/user_defined_table_functions.rs

---------

Signed-off-by: veeupup <code@tanweime.com>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* document timestamp input limis

* fix text

* prettier

* remove doc for nanoseconds

* Update datafusion/physical-expr/src/datetime_expressions.rs

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

---------

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* fix: make ntile work in some corner cases

* fix comments

* minor

* Update datafusion/sqllogictest/test_files/window.slt

Co-authored-by: Mustafa Akur <106137913+mustafasrepo@users.noreply.github.com>

---------

Co-authored-by: Mustafa Akur <106137913+mustafasrepo@users.noreply.github.com>
Given that group keys inherently have few repeated values, especially
when grouping on a single column, the use of dictionary encoding is
unlikely to be yielding significant returns
* done

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

* add more test

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

* cleanup

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

---------

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* Minor: Improve the documentation on `ScalarValue`

* Update datafusion/common/src/scalar.rs

Co-authored-by: Liang-Chi Hsieh <viirya@gmail.com>

* Update datafusion/common/src/scalar.rs

Co-authored-by: Liang-Chi Hsieh <viirya@gmail.com>

---------

Co-authored-by: Liang-Chi Hsieh <viirya@gmail.com>
* add benchmark

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

* fmt

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

* address clippy

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

* cleanup

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

* fix comment

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

---------

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* minor changes

* PipelineStatePropagator tree refactor

* Remove duplications by children_unbounded()

* Remove on-the-fly tree construction

* Minor changes

---------

Co-authored-by: Mustafa Akur <mustafa.akur@synnada.ai>
…8121)

* feat: support  LargeList in make_array and
array_length

* chore: add tests

* fix: update tests for nested array

* use usise_as

* add new_large_list

* refactor array_length

* add comment

* update test in sqllogictest

* fix ci

* fix macro

* use usize_as

* update comment

* return based on data_type in make_array
Ted-Jiang and others added 29 commits January 3, 2024 17:02
…apache#8737)

* Add primary key support for row_number window function

* Add comments, minor changes

* Add new test

* Review

---------

Co-authored-by: Mehmet Ozan Kabak <ozankabak@gmail.com>
…pache#8721)

* DistinctCountGroupsAccumulator

* test coverage

* clippy warnings

* count distinct for primitive types

* revert hashset to std

* fixed accumulator size estimation
* support LargeList in cardinality
…nforceDistribution` rule (apache#8731)

* Cleanup

* More

* Restore add_roundrobin_on_top

* Restore test files

* More

* Restore

* More

* More

* Make test stable

* For review

* Add test
* Clean internal implementation of WindowUDF

* fix doc
* support largelist in array_to_string

* reduce code duplication
…, `array_append` and `array_prepend` (apache#8636)

* reuse function for string concat

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

* remove casting in string concat

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

* add test

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

* operator to function rewrite

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

* fix explain

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

* add more test

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

* add column cases

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

* cleanup

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

* presever name

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

* Update datafusion/optimizer/src/analyzer/rewrite_expr.rs

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* rename

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

---------

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* fix bug

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

* fmt

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

* add rowsort

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>

---------

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>
…ingPredicate and cp_solver (apache#8749)

* Minor: Improve library docs to mention TreeNode, ExprSimplifier, PruningPredicate and cp_solver

* fix link
* Add logo source files

* add another file
* Add `schema_err!` error macros with optional backtrace
…pache#8740)

* revert eb8aff7 / Materialize dictionaries in group keys

* Update tests

* Update tests
* fix: struct don't push down to TableScan

* add similar to test and apply comment

* remove catch all in outer_columns_helper

* minor

* fix clippy

---------

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* Fix error messages in array expressions

* fix fmt
* move tests from  to sqllogictests part1

* Update datafusion/sqllogictest/test_files/expr.slt

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* Update datafusion/sqllogictest/test_files/expr.slt

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

---------

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

non-null sub-field on nullable struct-field has wrong nullity. Parallel NDSON file reading