Add http(s) support to the command line #8753
f05534a to b26dfce
Nice addition 👍
Regarding docs, you can update them under here:
Regarding testing, I see in the file there are some unit tests, e.g.
Is it possible to create a version of this but for http/https? Failing that, maybe you can comment an example usage to demonstrate that this new feature functions properly.
datafusion-cli/src/exec.rs
Outdated
"http" | "https" => {
    Arc::new(HttpBuilder::new().with_url(url.origin().ascii_serialization()).build()?) as Arc<dyn ObjectStore>
}
- "http" | "https" => {
-     Arc::new(HttpBuilder::new().with_url(url.origin().ascii_serialization()).build()?) as Arc<dyn ObjectStore>
- }
+ "http" | "https" => {
+     let builder = HttpBuilder::new().with_url(url.origin().ascii_serialization());
+     Arc::new(builder.build()?) as Arc<dyn ObjectStore>
+ }
small nit, just to make it easier on the eyes
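For readers unfamiliar with the pattern in the snippet above, here is a minimal, std-only sketch of the idea: the scheme selects the store, the store is rooted at the URL's origin, and the remaining path identifies the object. Everything in it is an assumption made for illustration. `StoreKind`, `split_url` and `route` are hypothetical names, and the hand-rolled parsing stands in for the `url` crate (it does not lowercase hosts, drop default ports, or handle userinfo the way `url.origin().ascii_serialization()` would). It is not the code in this PR.

```rust
/// Hypothetical stand-in for the store that would be built.
/// The real code constructs an `Arc<dyn ObjectStore>` instead.
#[derive(Debug, PartialEq)]
enum StoreKind {
    /// An HTTP(S) store rooted at `base_url` (scheme plus authority).
    Http { base_url: String },
    /// Any scheme this sketch does not handle.
    Unsupported(String),
}

/// Split "scheme://authority/path" into (scheme, authority, path).
/// Simplified parsing for illustration; only handles plain absolute URLs.
fn split_url(location: &str) -> Option<(&str, &str, &str)> {
    let (scheme, rest) = location.split_once("://")?;
    let (authority, path) = match rest.find('/') {
        Some(idx) => (&rest[..idx], &rest[idx..]),
        None => (rest, "/"),
    };
    if scheme.is_empty() || authority.is_empty() {
        return None;
    }
    Some((scheme, authority, path))
}

/// Pick a store from the scheme and return it with the object path.
fn route(location: &str) -> Option<(StoreKind, String)> {
    let (scheme, authority, path) = split_url(location)?;
    let kind = match scheme {
        // Both schemes share one store type, keyed by origin only.
        "http" | "https" => StoreKind::Http {
            base_url: format!("{}://{}", scheme, authority),
        },
        other => StoreKind::Unsupported(other.to_string()),
    };
    Some((kind, path.to_string()))
}

fn main() {
    let location =
        "https://datasets.clickhouse.com/hits_compatible/athena_partitioned/hits_1.parquet";
    let (kind, path) = route(location).expect("location should be an absolute URL");
    println!("store: {:?}", kind);
    println!("object path: {}", path);
}
```

Keying the store on the origin rather than the full URL means every file on the same host can share one store, with the path supplied per request.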
Thank you @kcolford and @Jefffrey for the review. It would be great to address @Jefffrey's comments.

I tried this out on the ClickBench file fetched via http and it worked like a charm:

❯ create external table hits stored as parquet location 'https://datasets.clickhouse.com/hits_compatible/athena_partitioned/hits_1.parquet';
0 rows in set. Query took 0.178 seconds.
❯ describe hits;
+-----------------------+-----------+-------------+
| column_name | data_type | is_nullable |
+-----------------------+-----------+-------------+
| WatchID | Int64 | YES |
| JavaEnable | Int16 | YES |
| Title | Binary | YES |
| GoodEvent | Int16 | YES |
| EventTime | Int64 | YES |
| EventDate | UInt16 | YES |
...
| RefererHash | Int64 | YES |
| URLHash | Int64 | YES |
| CLID | Int32 | YES |
+-----------------------+-----------+-------------+
105 rows in set. Query took 0.003 seconds.

This is a really nice improvement. Let us know if you need help updating the docs or tests.
Hi @kcolford -- I was wondering if you think you might have a chance to work on this PR soon? If not, perhaps I can help find someone to finish it up. Thanks again.
I think this feature is important, so if no one else gets a chance to get it over the line I will try to do so sometime this week.
Which issue does this PR close?
Closes #8752
Rationale for this change
See the linked issue.
What changes are included in this PR?
See the commit log.
Are these changes tested?
This functionality is trivial enough that I don't think it needs to be covered by unit tests, and I could not find anywhere in this repo to run a full integration test.
Are there any user-facing changes?
Users will now be able to use the `http` and `https` schemes when using `create external table`. Documentation hasn't been provided in this PR because it doesn't appear that this repo hosts that documentation.
No changes to public APIs.