provide a way to not download tests/, examples and other such common directories #13491
Comments
For Breaking down the extraneous files you mentioned:
The challenge is knowing which of the files are needed for the build and which are not. A heuristic can work but that can only go so far (much like we provide There is also the question of "what is essential". For example, explicitly referenced targets are sometimes required just to parse manifests today (see #13456). These would need to be figured out to get this to work.
For the most part, we are talking text here. I wonder if we'd be better spending our time focusing on aspects of this, like |
I would like to call out that the |
Ah, I looked for cargo fetch issues but missed that one. It's not exactly the same problem, but very close.
I also agree that crates.io should have everything, tests and examples included; I'm not advocating for dropping that! I would like the ability to not download them but, as I said, it's hard to know what to download and what not to. Does cargo check hashes again once a library has been fetched? If not, it would allow me to just delete the unneeded directories and files on my side without interfering with cargo itself.
I don't think we re-check hashes, but I would not rely on that never changing.
It should be a line or two of bash; it can be deleted if an update breaks it :)
Update: cargo is too well-made and checks hashes even in vendored dependencies, so my two lines of bash won't work 😞
A simpler way to do this, then, would be to allow not checking hashes when compiling?
Problem
By default `cargo` includes everything from the root of the package (or almost, see https://doc.rust-lang.org/cargo/reference/manifest.html?highlight=files#the-exclude-and-include-fields).

This is very nice for archiving purposes by crates.io, it allows example scraping from docs.rs and probably serves other purposes I haven't encountered, but it has an issue: it makes lots of CI downloads heavier than they need to be.

Looking through the dependencies we use at $work I see around 100 MiB of unused files just counting `tests/`, but there are also `.github`, `benches/`, `examples/` and probably more I missed, and we have only ~600 deps out of the 138k crates on crates.io (without even counting all the available versions).

We have a cache to avoid spamming crates.io whenever possible, so it's not like we request an extra 100 MiB on each CI run, but still, I don't like putting pressure on it if we can avoid it.

As linked above, https://doc.rust-lang.org/cargo/reference/manifest.html?highlight=files#the-exclude-and-include-fields is intended to help with that, but it's not automatic: people need to consider and maintain it for each crate (and from the start, else older versions will still have all the unused files).
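For anyone who wants to check a figure like the 100 MiB above against their own dependency set, here is a rough sketch. It assumes unpacked crate sources live under `$CARGO_HOME/registry/src/<index>/<crate>-<version>/` and only looks at the top-level directory names mentioned in this issue. The function names are made up for illustration, and it measures unpacked size on disk, so it overestimates what is actually downloaded (the `.crate` archives are compressed).

```python
import os
from pathlib import Path

# Directory names treated as "extras". Taken from this issue, not from cargo.
EXTRA_DIRS = ("tests", "benches", "examples", ".github")


def dir_size(path: Path) -> int:
    """Total size in bytes of all regular files under `path`."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = Path(root) / name
            # Skip symlinks so a link does not count its target's size.
            if file_path.is_file() and not file_path.is_symlink():
                total += file_path.stat().st_size
    return total


def extras_by_name(registry_src: Path) -> dict:
    """Sum the size of top-level extra directories across unpacked crates.

    Assumed layout: <registry_src>/<index>/<crate>-<version>/...
    """
    totals = {name: 0 for name in EXTRA_DIRS}
    for crate_dir in registry_src.glob("*/*"):
        if not crate_dir.is_dir():
            continue
        for name in EXTRA_DIRS:
            candidate = crate_dir / name
            if candidate.is_dir():
                totals[name] += dir_size(candidate)
    return totals


if __name__ == "__main__":
    cargo_home = Path(os.environ.get("CARGO_HOME", Path.home() / ".cargo"))
    totals = extras_by_name(cargo_home / "registry" / "src")
    for name, size in totals.items():
        print(f"{name + '/':<12} {size / 2**20:8.1f} MiB")
```

Running it after a `cargo fetch` in a project gives a per-directory breakdown for whatever is in the local registry cache, which may include crates from other projects too.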
Proposed Solution
We may need a way to say to cargo `only fetch what is absolutely necessary to build the crate`. Two issues with that:

`src/, Cargo.toml, license files, build.rs`? Some `-sys` crates may need more

It could take the form of `cargo fetch --no-extras` (to be bikeshedded)?

Notes
Note: I'm excluding binary crates here, but they could easily be included too, if only so `cargo install` downloads less if possible, but I don't think binary crates are downloaded as much as library ones.

I also don't know if the benefits would be worth it, since it could mean one of:

I have no data saying the bandwidth gains would be enough to offset either the compute or space costs; if anyone knows (the infra team?) I would be happy to be proven wrong!
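To make the "what is necessary" question concrete, here is a minimal sketch of the naive allow-list guessed at in the proposal (`src/`, `Cargo.toml`, license files, `build.rs`). Everything in it is hypothetical: the names `is_essential` and `split_package` are made up, and cargo does nothing like this today. It mainly shows where such a heuristic falls over.

```python
from pathlib import PurePosixPath

# Hypothetical allow-list based on the guess in this issue. Not cargo behavior.
ESSENTIAL_DIRS = {"src"}
ESSENTIAL_FILES = {"Cargo.toml", "build.rs"}
LICENSE_PREFIXES = ("license", "licence", "copying")


def is_essential(relative_path: str) -> bool:
    """Return True if a package-relative path is on the naive allow-list."""
    parts = PurePosixPath(relative_path).parts
    if not parts:
        return False
    top = parts[0]
    if len(parts) == 1:
        # Top-level file: manifest, build script, or something license-like.
        return top in ESSENTIAL_FILES or top.lower().startswith(LICENSE_PREFIXES)
    # Nested path: keep it only if it lives under an allowed directory.
    return top in ESSENTIAL_DIRS


def split_package(paths):
    """Split package-relative paths into (keep, drop) lists."""
    keep = [p for p in paths if is_essential(p)]
    drop = [p for p in paths if not is_essential(p)]
    return keep, drop


if __name__ == "__main__":
    keep, drop = split_package([
        "Cargo.toml",
        "build.rs",
        "LICENSE-MIT",
        "src/lib.rs",
        "tests/it.rs",
        "examples/demo.rs",
        "vendor/libfoo/foo.c",  # C sources a -sys crate would need
    ])
    print("keep:", keep)
    print("drop:", drop)
```

With this allow-list the `vendor/` C sources end up in the drop list, which is exactly the `-sys` problem mentioned above. It also drops targets that a manifest references explicitly, which ties into the manifest-parsing concern raised in the comments (see #13456).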