Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

object_store: Should the object_store::path:Path allow a trailing '/' #7026

Open
stayrascal opened this issue Jan 27, 2025 · 1 comment
Open
Labels
question Further information is requested

Comments

@stayrascal
Copy link

Describe the bug
Hi, I noticed that the documentation of Path mentioned that Paths do not contain leading or trailing /, and the parsing logic will strip the prefix and suffix delimiter '/', which will lead some problems that the object key has a trailing '/', for example:

  1. We cannot head or put an object with a trailing '/', e.g. the upstream of object_store might want to create a 'dir', or head an object with a trailing '/' created by other application. Since the Path parsing logic will strip the trailing '/', the object key in the request is not what we expected.
// Assume there is an object "a/b/c/" existing in object storage, the bellow logic cannot head the object meta.
let path = Path::parse("a/b/c/") // the actual raw string is "a/b/c"
storage.head(&x3).await.unwrap()
  1. If we want to delete all objects(include the object has a trailing '/') under a given prefix, e.g. the delete_fixtures function doesn't work.
    Even thought the response content of object storage(e.g. the ListResponse of S3), but the trailing '/' will be remove during converting ListResponse to ListResult via Path::parse(xxx) or TryFrom::try_from, which lead to send a delete request contains unexpected object key later.
pub async fn delete_fixtures(storage: &DynObjectStore) {
    let paths = storage.list(None).map_ok(|meta| { meta.location } ).boxed();
    storage
        .delete_stream(paths)
        .try_collect::<Vec<_>>()
        .await
        .unwrap();
}

impl TryFrom<ListResponse> for ListResult {
    type Error = crate::Error;

    fn try_from(value: ListResponse) -> Result<Self> {
        let common_prefixes = value
            .common_prefixes
            .into_iter()
            .map(|x| Ok(Path::parse(x.prefix)?)) // will remove the trailing '/' if exists
            .collect::<Result<_>>()?;

        let objects = value
            .contents
            .into_iter()
            .map(TryFrom::try_from) // will remove the trailing '/' if exists
            .collect::<Result<_>>()?;

        Ok(Self {
            common_prefixes,
            objects,
        })
    }
}

To Reproduce

Expected behavior
I'm not sure the original reason that require strip the suffix delimiter during paring Path, and I maybe missing something.

I'm thinking it's reasonable that support suffix delimiter '/' during paring Path to support the above all mentioned case, because we cannot assume that all objects are created by object_store, if there are some objects with trailing '/' no matter it's a 'file' or 'dir', there might some unexpected behaviors in object_store.

Please correct me if I misunderstand something, and I'm happy to fix that if we are agree with support a trailing '/'.

Additional context

@stayrascal stayrascal added the bug label Jan 27, 2025
@tustvold tustvold added question Further information is requested and removed bug labels Jan 27, 2025
@tustvold
Copy link
Contributor

tustvold commented Jan 27, 2025

This is an intentional invariant documented here - https://docs.rs/object_store/latest/object_store/path/struct.Path.html.

Allowing for trailing / breaks portability as different systems interpret them differently. We intentionally do not expose a notion of directory, as this is not a concept found in object stores.

That being said we probably could/should skip over objects with a trailing / when listing. I would be happy to review a PR that made this change (edit: I think it might already do this)

Edit: I've created #7030 for the broader topic of directory support

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

2 participants