Encoded URLs with a + sign fail #4891
Comments
So it looks like we should be URL encoding. To fix this, we'll need to research whether we need to upload both URL encoded and unencoded for compatibility reasons or not...
Any update on this one? It's failing Rust builds for us and I wanted to see if there is any light at the end of the tunnel. Thank you so much!
Hi, yes, we talked about this in our most recent meeting, but due to vacations and other obligations, no one has the bandwidth to work on this immediately. If you have time to help, researching whether cargo can handle URLs using spaces returned from crates.io's
@carols10cents - Are there any further updates on this issue and the timing of when the code fix will be merged? Thanks!
No, it needs to be reviewed and tested, and no one has had time to do that yet.
Together with @jdno I've done a bit of investigation on this topic to figure out the root causes and potential solutions:

Investigation

https://datatracker.ietf.org/doc/html/rfc3986#section-2.2 defines
https://datatracker.ietf.org/doc/html/rfc1866#section-8.2.1 specifies that space
https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-keys.html defines

cargo

cargo requests dependencies without any percent-encoding in the URL path, e.g. https://crates.io/api/v1/crates/libgit2-sys/0.12.25+1.3.0/download

crates.io

For crates.io,

static.crates.io

For static.crates.io,

Additional context

https://github.com/rust-lang/crates.io-index/blob/60eb74e915ef71515ada464494839a372bcae87f/config.json#L2

Proposed Solution

Summary: We switch S3 over to use

Detailed Plan:

This should not disrupt the regular downloads, and it should still allow us to

We could optionally fix that last part by adjusting cargo to always encode

Alternatives

@rust-lang/crates-io @rust-lang/cargo @rust-lang/infra please let me know if you have any objections, suggestions or other comments. If not, I would start implementing the proposed solution in the coming weeks.
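To make the difference between the unencoded request and a percent-encoded one concrete, here is a minimal Python sketch. The helper `download_url` is hypothetical and only mirrors the shape of the download URL quoted in the investigation above; it is not code from cargo or crates.io.

```python
from urllib.parse import quote


def download_url(name: str, version: str, encode: bool) -> str:
    """Build a download URL in the shape quoted above (hypothetical helper)."""
    base = "https://crates.io/api/v1/crates"
    if encode:
        # safe="" percent-encodes every reserved character, including "+".
        name = quote(name, safe="")
        version = quote(version, safe="")
    return f"{base}/{name}/{version}/download"


# Unencoded form, as described for cargo in the investigation.
print(download_url("libgit2-sys", "0.12.25+1.3.0", encode=False))
# Percent-encoded form, as an encoding proxy might send it.
print(download_url("libgit2-sys", "0.12.25+1.3.0", encode=True))
```

Both URLs name the same version; they differ only in whether the `+` of the build metadata is sent literally or as `%2B`.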
Thanks for the write-up! It's pretty great to just follow the spec, though I guess it may break some 3rd-party registries if we do the adjustment on the Cargo side. Could you open an issue on rust-lang/cargo so that the Cargo team can also track it?
Step 3 is being implemented by #6665. @rust-lang/infra I will need your help on step 4 since I don't have access to the CDN infrastructure.
/cc @dtolnay since this might become relevant to https://github.com/dtolnay/get-all-crates depending on when step 4 is implemented :)
@Turbo87 could you ask in the t-infra channel on Zulip, please? All of us get a lot of pings and it will be easier to coordinate the discussion. (I don't have permissions to modify S3 or CloudFront.)
done :)
rust-lang/simpleinfra#313 has been merged and deployed this morning, which implements step 4 of the plan above. #6666 has subsequently been merged, which implements step 5 of the plan above. This PR will be deployed in the near future too.
The only thing remaining now is to remove all files from S3 that include a space character instead of the encoded
Since only cleanup for this is remaining and the original issue should be resolved, I'll close the issue now :)
An issue was reported in rust-lang/crates.io#4891 related to how the `+` character is encoded in URLs. The issue was fixed by ensuring files are uploaded with consistent file names and by rewriting URLs with the wrong encoding in the Content Delivery Networks. The test group will request various URLs with different encoding from the CDNs to ensure they work correctly.
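The rewrite-and-test idea in the summary above could be sketched roughly as follows. This is only an illustration under stated assumptions: `normalize_path` and `variants` are hypothetical names, the choice of `%2B` as the canonical form is an assumption, and none of this reflects the actual CDN configuration in rust-lang/simpleinfra.

```python
from urllib.parse import quote, unquote


def normalize_path(path: str) -> str:
    """Rewrite a request path so that "+" is always sent as "%2B".

    Assumption: the canonical form percent-encodes "+". Each segment is
    decoded first so that already-encoded input is not double-encoded.
    """
    return "/".join(quote(unquote(seg), safe="") for seg in path.split("/"))


def variants(path: str) -> list[str]:
    """Hypothetical test inputs: the same path with "+" literal and encoded."""
    return [path, path.replace("+", "%2B"), path.replace("+", "%2b")]


original = "/crates/libgit2-sys/libgit2-sys-0.12.25+1.3.0.crate"
for candidate in variants(original):
    # Every variant should collapse to the same canonical path.
    print(candidate, "->", normalize_path(candidate))
```

A real check would request each variant from the CDN and compare status codes; the sketch only shows that the variants can be reduced to one canonical path before lookup.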
Some clients, proxies, and/or repository managers (for instance JFrog Artifactory) may encode URLs before sending them upstream. Due to blocks on crates.io, some of these requests fail if there is a + in the URL.
E.g. if we go to this URL, it works fine:
https://static.crates.io/crates/libgit2-sys/libgit2-sys-0.12.25+1.3.0.crate
But if we go to the encoded URL, it gives us a 403 Access Denied:
https://static.crates.io/crates/libgit2-sys/libgit2-sys-0.12.25%2B1.3.0.crate
This has also exposed some more strange behavior.
If we encode the + sign as a space (%20 instead of %2B), it works:
https://static.crates.io/crates/libgit2-sys/libgit2-sys-0.12.25%201.3.0.crate
It even works if I put a literal space in the URL (though that may be my browser encoding it for me):
"https://static.crates.io/crates/libgit2-sys/libgit2-sys-0.12.25 1.3.0.crate"
Since URL encoding is fairly standard and other encoding works, I believe this is a bug.
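One reading that is consistent with these observations is that the file name is being decoded with form-style rules, where `+` means a space, rather than with plain path rules, where `+` is a literal plus. The Python sketch below only illustrates that difference using the standard library; the helper `file_name` is hypothetical, and this makes no claim about how S3 or the CDN is actually implemented.

```python
from urllib.parse import unquote, unquote_plus, urlsplit

URLS = [
    "https://static.crates.io/crates/libgit2-sys/libgit2-sys-0.12.25+1.3.0.crate",
    "https://static.crates.io/crates/libgit2-sys/libgit2-sys-0.12.25%2B1.3.0.crate",
    "https://static.crates.io/crates/libgit2-sys/libgit2-sys-0.12.25%201.3.0.crate",
]


def file_name(url: str, form_style: bool) -> str:
    """Decode the last path segment of `url` (hypothetical helper).

    form_style=False: plain percent-decoding, "+" stays a literal plus.
    form_style=True: form-urlencoded decoding, "+" becomes a space.
    """
    segment = urlsplit(url).path.rsplit("/", 1)[-1]
    return unquote_plus(segment) if form_style else unquote(segment)


for url in URLS:
    print(repr(file_name(url, form_style=False)),
          repr(file_name(url, form_style=True)))
```

Under form-style decoding, the `+` and `%20` URLs resolve to the same name containing a space, while the `%2B` URL resolves to a name containing a literal `+`. That matches the pattern reported above, where only the `%2B` request fails.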