Feature: read_parquet_mergetree #13
# Conflicts:
#   CMakeLists.txt
#   chsql/src/default_table_functions.cpp
#   src/chsql_extension.cpp
Hey @carlopi, any chance you or someone on the team knows how to get around the Windows build error? 🙏
Can you try to reduce the diff? Or try to copy the setup of extensions like duckdb_delta.
@carlopi Is it enough if I tell you that the real change is only in the file https://github.com/lmangani/duckdb-extension-clickhouse-sql/pull/13/files#diff-c5bffd6b887e2ced50224f44652dab784c9c7f7ab8c46a390410cc58490391ed? The other changes are just insignificant internal file moves. Or do you need a separate PR with the function implementation?
Then it's likely either a
@carlopi Aaah. It's about the Windows build problem. From the MSVC++ linker logs I see that somehow the linker wants to link
Have no idea why it wants to link the same
screenshot of @akvlad kicking the Windows builder where it hurts 😄
amazing work @akvlad, let's merge and proceed with some field testing 🎉
read_parquet_mergetree
Description
The read_parquet_mergetree chsql function provides a familiar interface for ClickHouse users by emulating aspects of the MergeTree engine strategy. Its primary purpose is to efficiently merge multiple Parquet files using a specified primary SORT key, without consuming excessive memory, while facilitating fast range queries on the resulting file.
Syntax
Features
Parameters
FILE_ARRAY[]: An array of file paths to merge
PRIMARY_SORT_KEY: Specifies the column(s) used as the primary sort key for merging and ordering data
Benchmark
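As a rough illustration of the merge strategy in the Description (not the extension's actual C++ code), the sketch below shows a streaming k-way merge in Python. It assumes each input is already sorted on the primary sort key, and it uses in-memory lists of dicts as stand-ins for Parquet files. The helper name `merge_sorted_runs` and the sample columns `ts` and `v` are hypothetical. Memory stays low because only one pending row per input is held at a time.

```python
import heapq
from operator import itemgetter
from typing import Dict, Iterable, Iterator, Sequence

Row = Dict[str, object]


def merge_sorted_runs(runs: Sequence[Iterable[Row]],
                      sort_key: Sequence[str]) -> Iterator[Row]:
    """Lazily merge inputs that are each already sorted on sort_key.

    Hypothetical helper for illustration only. heapq.merge keeps a single
    pending row per input in its heap, so memory use grows with the number
    of inputs rather than with the total number of rows.
    """
    key = itemgetter(*sort_key)  # scalar for one column, tuple for several
    return heapq.merge(*runs, key=key)


# Stand-ins for two Parquet files, each pre-sorted on "ts" (assumed data).
file_a = [{"ts": 1, "v": "a"}, {"ts": 4, "v": "b"}, {"ts": 9, "v": "c"}]
file_b = [{"ts": 2, "v": "x"}, {"ts": 4, "v": "y"}]

# Passing iterators means rows are pulled one at a time from each input.
merged = list(merge_sorted_runs([iter(file_a), iter(file_b)], ["ts"]))

for row in merged:
    print(row["ts"], row["v"])
```

Because the output is globally ordered on the sort key, a reader of the merged file can skip any block of rows whose key range falls outside a query's range, which is the property the Description refers to as fast range queries.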