How transport + multihash + metadata + http multiaddr translate to requested resource #230

Open
hannahhoward opened this issue Oct 3, 2024 · 3 comments

Comments

@hannahhoward

hannahhoward commented Oct 3, 2024

Describe the issue

We want to add a new custom type of metadata protocol to records we put into IPNI. While we definitely want to get this in the Multicodecs table eventually, for now we'll use the private reserved space.

We want to be able to translate the HTTP multiaddr for a provider, plus components of the multihash used for lookup and/or the metadata for the protocol, into HTTP request URLs in the client that actually uses the transport after speaking to IPNI. Ideally we'd like maximum flexibility, so that we can update the provider multiaddr and in doing so implicitly update all of the request URLs, even across different metadata for different context IDs in that provider.

How Trustless HTTP Gateway Does This Already

To my knowledge transport-ipfs-gateway-http is the only well-defined HTTP transport. It does this translation implicitly: I take the HTTP multiaddr of the provider record, add ipfs/ to it, then add the stringified CID version of the multihash I just looked up plus any additional parameters, to construct a request URL.

What we want to do

For our protocol, we would like to offer a slightly more flexible schema that also uses the metadata to construct a URL:

Let's assume our metadata looks like this:

type MyTransportMetadata struct {
    someCID Link
}

Now I'd like to specify directly in the multiaddr a way to construct a URL using someCID. Importantly, providers should have flexibility in the url they construct.

For example, we store some stuff in S3 at a path that's effectively /${someCID}/${someCID}.car, but another provider might just want to use /cidsCollection/${someCID}.

The question is how to do this

Custom multiaddr type

Per https://github.com/libp2p/specs/blob/master/http/transport-component.md if you want application level semantics in your multiaddr, you should actually use your private multicodec + put any application data afterward.

So /http/my-transport/${someCID}/${someCID}.car for our case (with appropriate special chars inserted so the slashes don't cause new multiaddr segments, like in http-path).

Unfortunately, the problem here is that if you use the custom multicodec in the multiaddr, IPNI won't process or store the multiaddr at all (but I think won't error due to #30).

Just use /http-path

As I read the http-path multiaddr spec: https://github.com/multiformats/multiaddr/blob/master/protocols/http-path.md, there's nothing concrete that would prevent me from just throwing some identifiers for metadata parameters into an http-path component of a multiaddr. This would have the effect of rendering that multiaddr not actually translatable to a single HTTP resource on its own; rather, the client would need to combine the multiaddr with the metadata parameters to generate the HTTP request URL.
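To illustrate what "combining the multiaddr with the metadata parameters" could look like on the client, here is a minimal Go sketch. The `{someCID}` placeholder syntax and the `expandHTTPPath` helper are assumptions made up for this example; nothing in the http-path spec or in IPNI defines a template syntax. The only thing taken from the spec is that the http-path value is percent-encoded, so the sketch decodes it before substituting.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// expandHTTPPath decodes the percent-encoded value of an http-path
// component and substitutes "{name}" placeholders with metadata values.
// The placeholder syntax is hypothetical, chosen only for this sketch.
func expandHTTPPath(component string, vars map[string]string) (string, error) {
	path, err := url.PathUnescape(component)
	if err != nil {
		return "", fmt.Errorf("decoding http-path value: %w", err)
	}
	for name, value := range vars {
		// Escape the value so it cannot introduce extra path segments.
		path = strings.ReplaceAll(path, "{"+name+"}", url.PathEscape(value))
	}
	if strings.ContainsAny(path, "{}") {
		return "", fmt.Errorf("unresolved placeholder in %q", path)
	}
	return "/" + strings.TrimPrefix(path, "/"), nil
}

func main() {
	// Values that would come from the metadata attached to the record.
	meta := map[string]string{"someCID": "bafkqaaa"} // sample CID string

	// Two providers choosing different layouts for the same metadata.
	templates := []string{
		"%7BsomeCID%7D%2F%7BsomeCID%7D.car",
		"cidsCollection%2F%7BsomeCID%7D",
	}
	for _, tmpl := range templates {
		p, err := expandHTTPPath(tmpl, meta)
		if err != nil {
			panic(err)
		}
		fmt.Println("https://example.com" + p)
	}
}
```

Because the template lives in the provider multiaddr, changing the layout only requires updating the provider record; the metadata for each context ID stays untouched.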

Proposed path forward

I think we're just going to use http-path for now, unless you all have strong objections, and focus on a more correct solution if / when we get a permanent address in the multicodec table.

Other possible paths:

  • come to an agreement to put the custom multiaddr types into IPNI -- I'm pretty sure you did this previously for httppath when it was just a custom codec you all supported
  • engage more deeply on the multiaddr spec and try to get something formally adopted for this kind of API path template

Ultimately the upside of just using http-path for now is we can easily change it later by updating our provider records if we find a better way.

cc: @MarcoPolo

@willscott
Member

Using http-path in the provider multiaddr seems pretty reasonable to me, since it's a provider-level property.
I don't think that involves the use of the ipni metadata at all, right?

@bajtos

bajtos commented Oct 3, 2024

To my knowledge transport-ipfs-gateway-http is the only well-defined HTTP transport. It does this translation implicitly: I take the HTTP multiaddr of the provider record, add ipfs/ to it, then add the stringified CID version of the multihash I just looked up plus any additional parameters, to construct a request URL.

(...)
Now I'd like to specify directly in the multiaddr a way to construct a URL using someCID. Importantly, providers should have flexibility in the url they construct.

For example, we store some stuff in S3 at a path that's effectively /${someCID}/${someCID}.car, but another provider might just want to use /cidsCollection/${someCID}.

The question is how to do this

Have you considered using HTTP redirects on the provider side? The provider can respond to /ipfs/${someCID} by redirecting the client to /${someCID}/${someCID}.car or /cidsCollection/${someCID}, depending on where & how they serve the content.

Of course, the extra request roundtrip increases the latency, plus the provider has to pay the costs of handling this extra request, so this may not be a feasible solution for your use case.

From Spark's point of view, your proposal looks very reasonable 👍🏻

It adds a bit of complexity on our side, as well as for everyone else who may want to retrieve content after discovering the provider by querying IPNI, as we need to implement the new addressing based on the http-path template.

@hannahhoward
Author

As I said, we're just going to use http-path for now, with the caveat that we can easily evolve it over time.
