Unify storage_options, URI params, and ObjectStoreParams #2536

Open
wjones127 opened this issue Jun 26, 2024 · 0 comments

Comments

@wjones127
Contributor

Right now, most options can be passed either as environment variables or through the storage_options argument. But some, like ddbTableName, are passed down as URI parameters:

# The s3+ddb:// URL scheme lets lance know that you want to
# use DynamoDB for writing to S3 concurrently
ds = lance.dataset("s3+ddb://my-bucket/mydataset?ddbTableName=mytable")

To unify these, we could make ddbTableName and the related options available as storage_options. Then we could parse all query parameters in URIs as storage options. That would make the following two equivalent:

lance.dataset("s3://my-bucket/my-table", storage_options={"ddbCommitLock": "true", "ddbTableName": "myDdbTable"})
lance.dataset("s3://my-bucket/my-table?ddbCommitLock=true&ddbTableName=myDdbTable")
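
To illustrate the idea, here is a minimal sketch of how query parameters could be folded into storage options. It uses only the Python standard library and is not lance's implementation. The helper name split_uri_options is hypothetical, and the precedence rule (explicit storage_options win over URI parameters) is an assumption, since the proposal above doesn't say which side should win.

```python
from urllib.parse import parse_qsl, urlsplit, urlunsplit


def split_uri_options(uri, storage_options=None):
    """Hypothetical helper: return (base_uri, merged_options).

    Query parameters in `uri` are treated as storage options.
    Assumption: explicitly passed storage_options override URI parameters.
    """
    parts = urlsplit(uri)
    # Start from the URI's query parameters, all kept as strings.
    merged = dict(parse_qsl(parts.query, keep_blank_values=True))
    # Explicit options take precedence (assumed, not specified in the issue).
    merged.update(storage_options or {})
    # Rebuild the URI without its query string.
    base = urlunsplit((parts.scheme, parts.netloc, parts.path, "", parts.fragment))
    return base, merged


from_uri = split_uri_options(
    "s3://my-bucket/my-table?ddbCommitLock=true&ddbTableName=myDdbTable"
)
from_options = split_uri_options(
    "s3://my-bucket/my-table",
    storage_options={"ddbCommitLock": "true", "ddbTableName": "myDdbTable"},
)
# Both forms resolve to the same base URI and the same options.
assert from_uri == from_options
```

With a normalization step like this, both call styles would reach the object store layer in a single form, so options such as ddbTableName would only need to be handled in one place.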

As part of this, we could simplify things by deprecating the s3+ddb scheme in favor of simply ddb.

@wjones127 wjones127 added the enhancement New feature or request label Jun 26, 2024
@wjones127 wjones127 self-assigned this Jun 26, 2024
@wjones127 wjones127 changed the title Unify storage_options and URI params Unify storage_options, URI params, and ObjectStoreParams Sep 4, 2024