When should we next squash the index? #47
The actual script:

```sh
set -ex

now=`date '+%Y-%m-%d'`

git fetch origin
git reset --hard origin/master
head=`git rev-parse HEAD`

git push -f git@github.com:rust-lang/crates.io-index $head:refs/heads/snapshot-$now

msg=$(cat <<-END
Collapse index into one commit

Previous HEAD was $head, now on the \`snapshot-$now\` branch

More information about this change can be found [online] and on [this issue]

[online]: https://internals.rust-lang.org/t/cargos-crate-index-upcoming-squash-into-one-commit/8440
[this issue]: https://github.com/rust-lang/crates-io-cargo-teams/issues/47
END
)

new_rev=$(git commit-tree HEAD^{tree} -m "$msg")
git push \
    git@github.com:rust-lang/crates.io-index \
    $new_rev:refs/heads/master \
    --force-with-lease=refs/heads/master:$head
```

Edit: to include the critical `--force-with-lease`.

Squashing only requires push access to the crates.io-index repository, which any admin of the rust-lang GitHub organization has (and probably more).

I think it'd be best to do some measurements here directly correlated with the metrics we care about. The original rationale for squashing was that initial clones took quite a long time downloading so much history. As a result I would suspect that we should establish thresholds along the lines of "how big is the download and how much would we save with a squash?"
I'm also able to do it, and bors of course can (dunno if bors is an admin). I think that's it though.

This was discussed at the crates.io meeting. Here were the key points.
The main unresolved questions, which we'd like to get answers from the Cargo team on, are:
My personal answers to those questions, which do not represent consensus among any team(s), are:
A follow-up to @joshtriplett's suggestion to clone the index as-is: apparently we can get git to do this correctly! (Others should check if they are getting the same results.) The thing I tried (https://github.com/Eh2406/crates.io-index/commit/65419fd5f5b9758b95fa08f207276639b1426e43) is to add a new squash commit on top of the existing one from last time. I did not make a script, I just did it manually. It may be sufficient to just share the same root commit, if someone wants to give that a try.
Looks like it works with the root in common, using The root can be found with
For my own personal takes on some of the unresolved questions:
I don't have any problem with losing communication about this, I don't think it's really all that important especially now that it went so smoothly the first time. I do have a slightly different concern though. I think it would be a failure mode of Cargo if the index were automatically rolled up every day (defeating the purpose of delta updates), and having a fully automated process may cause us to not realize we're getting close to that situation. I am, however, very much in favor of automation. So to allay my concern I would request that a notification of some form be sent out to interested team members when a squash happens. (aka I just want an email of some form)
I would personally measure this in megabytes of data to download rather than either metric you mentioned, but commits are likely a good proxy for the megabytes being downloaded. My ideal metric would be something like "we shave 100MB off a clean download of the index", and the 100 number there is pulled out of thin air and could be more like 50 or something like that.
I think the first index squash went from roughly 90MB to 10MB (ish) for a clean initial download. Along those lines I'd say that we should wait until a squash would save at least 70MB before squashing.
@alexcrichton One question, if git can download a roll up in
AFAIK git just downloads objects and doesn't do any diffing at the fetch layer. Delta updates work because most indexes have a huge shared history. If we roll into one commit frequently there's no shared history so git will keep downloading the entire new history, which would be fresh each time. So to answer your question, I don't believe git can have any sort of delta update when the history is changed and so I would still consider it a failure mode.
For users who already have the latest version of the index, Git will generally see that the tree object for the single squashed commit is identical to the tree object it already has (since it has the same hash), so it will only download the single new commit object.

So another solution may be to always keep, say, the last month's worth of commits in the history, and only squash the bits that are older than one month. All users who have updated in the month before squashing will be able to download deltas, and only users with an even older version of the index will have to redownload it in full. When squashing the old commits, all commits on top of them will have to be rewritten, so users will have to redownload the commit objects. However, commit objects hardly contain any data, and the associated tree objects are identical, so they won't be retransmitted.

I did some experiments for this approach, and got somewhat mixed results with what Git is able to detect, but I believe it is possible to make it work. It would require some work to figure out the details, though.
We had some discussion in the crates.io Discord channel (can't figure out how to permalink it), and things aren't quite as easy as indicated in my previous comment. I may have time to do some experiments later this week, but I don't make any promises.
link to the discussion: https://discordapp.com/channels/442252698964721669/448525639469891595/597888610376613901
We did not have time to discuss this at the Cargo meeting today. So we don't have any new answers for @sgrif.
I was thinking maybe we open an issue on the index repo and have the script add a comment there; then anyone interested (in teams or not) can subscribe to that issue to get notifications. I would want to look into @Nemo157's suggestion for how to get git not to download the history at all well before we start doing a squash every week.
```
> git clone -b master --single-branch https://github.com/rust-lang/crates.io-index.git
...
Receiving objects: 100% (297740/297740), 67.54 MiB | 5.79 MiB/s, done.

> git clone -b master --single-branch https://github.com/smarnach/crates.io-index
Cloning into 'crates.io-index'...
...
Receiving objects: 100% (36539/36539), 14.01 MiB | 5.75 MiB/s, done.
```

So it looks like we save ~54 MiB today. Assuming a linear size per commit, we would hit 70 MiB saved at ~72k commits. So it looks like people's instincts are approximately in the same ballpark.
It sounds like we don't need to keep a window of commits on the main branch, and we just need to archive the squashed-away commits on an archive branch? And since the server has those available it can do deltas from those objects? That sounds perfect.
We discussed this at the Cargo meeting today.
Yes! Several of us would like some form of notification when it happens, but it does not need to be in advance and we do not need to publicize the event.
We realized that it was hard to make a decision due to a bikeshed effect; we all had different opinions, but none strong enough to convince anyone. So we decided: whatever is easiest for you to set up. If you need someone to make a decision: a daily check of whether we are over the commit limit.
After some discussion @ehuss pointed out that it is already noticeable, and @nrc pointed out that we want to have the script do something the first time it runs. We don't want it to break things on some random day in 3 months when we have none of this paged in. So if it is time based then every 6 months; if it is commit based then 50k. Most importantly, we can monitor it and adjust the threshold later if needed.

We had some discussion of whether this will cause existing users to download the full index on each squash day. My understanding from our discussion with @Nemo157 and @smarnach on Discord is that the current plan will not trigger a full download. The GitHub repo will always have a commit referencing all tree objects that the client will have, so GitHub will have what it needs to do a delta even when master has just been squashed. No git gc can remove the tree objects, as they're used by a backup branch. @ehuss wanted to recheck to make sure that this works as hoped.
Will move forward with a prototype that squashes when the commit count is >50k
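A minimal sketch of such a trigger, assuming it runs inside a checkout of the index; the 50k threshold comes from the discussion above, and the squash step itself is a placeholder for the script earlier in the thread:

```shell
# Hedged sketch: fire a squash only once past a commit-count threshold.
# The squash itself is a placeholder for the script discussed above.
threshold=50000
count=$(git rev-list --count HEAD)
if [ "$count" -gt "$threshold" ]; then
  echo "index has $count commits (> $threshold); time to squash"
fi
```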
I've been doing some tests, and Alex's original script seems to work pretty well. I've tried with a copy fetched by cargo that is anywhere from 10 to 1,000 to 10,000 commits old, and it seemed to properly download just the minimum necessary. A fresh download (delete CARGO_HOME) from a squashed index is about a 15MB download, which uses about 16MB of disk space. Compare that to the current size which is about 73MB download using about 79MB of disk space. The only issue I see is that for existing users, it does not release the disk usage. The only way I've determined to delete the old references is to run:
Cargo currently has a heuristic where it automatically runs |
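The exact command was elided above. Purely for illustration (these are standard git commands, not necessarily the ones meant in the thread), dropping references to the pre-squash history and reclaiming disk could look like:

```shell
# Illustrative sketch (standard git, not necessarily the elided command):
# expire reflog entries pointing at the old history, then prune the
# now-unreachable objects to reclaim disk space.
git reflog expire --expire-unreachable=now --all
git gc --prune=now
```

This is destructive: once the reflog entries are expired and the objects pruned, the pre-squash history can no longer be recovered locally.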
I'd be totally down for expanding Cargo's gc commands, and if Cargo can share indexes even across squashes that's even better!
@ehuss looks (https://git-scm.com/docs/git-reflog) like the

@sgrif what is the progress on the prototype?
@sgrif this recently came up again on internals, so I wanted to ping again to see if you've got progress on a prototype? I don't mind running the script manually one more time before we get automation set up again. If I don't hear back from you in a week or so I'll go ahead and do that and we can continue along the automation track!
Previous HEAD was e669e72, now on the `snapshot-2019-10-17` branch

More information about this change can be found online:

* https://internals.rust-lang.org/t/cargos-crate-index-upcoming-squash-into-one-commit/8440
* rust-lang/crates-io-cargo-teams#47
* https://internals.rust-lang.org/t/re-squash-the-crates-io-repository/11121
Ok I briefly talked with @sgrif on IRC and the index has been squashed! We'll be sure to have automation for the next one :)
It looks like the index has grown considerably since the last squash (looks like it is 75MB now, and can be squashed down to about 20MB). @rust-lang/crates-io is there any progress on automating the process? Is there anything I can do to help? If there are barriers to setting up a cron job, can someone run the script manually?
I've re-squashed the index
When you squash the index in the future, are you able to squash, as an example, everything older than 1 week instead of every commit in the repo at the time it's squashed? I only ask because I currently use the commit history as a changes feed for the crates index, and if all commits are squashed one day, I would potentially lose any changes since the last time my automated process checked the commit history. This would give me a week's buffer to run it before losing any information.
I don't think so. A commit with a long history does not have the same hash as a commit with 1 week of history. So if you only walk master, you're just going to see new commits that happen to do the same thing as the old commits but are not equal. The code to handle that may as well be code to walk the backup branches; it feels like the same level of complexity.
If you just compare the trees rather than walking commits it should work fine (e.g. from looking at the code I think
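As a hedged sketch of that tree-comparison idea: even after history is rewritten, the trees of any two commits can be diffed directly, so a changes feed can diff the last-seen tree against the new tip instead of walking commits. The variable names below are illustrative placeholders, not from the thread:

```shell
# Sketch: diff trees instead of walking commits. This works across a
# squash because tree objects are unchanged by history rewriting.
# "$last_seen" and "$new_tip" are illustrative placeholders.
git diff --name-status "$last_seen^{tree}" "$new_tip^{tree}"
```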
Looks like it may be that time once again.
This was last squashed on 2020-08-04, so we will need to automate the squashing if we're looking at doing this every few months.
Previous HEAD was 1b7e17a, now on the `snapshot-2020-11-20` branch

More information about this change can be found [online] and on [this issue]

[online]: https://internals.rust-lang.org/t/cargos-crate-index-upcoming-squash-into-one-commit/8440
[this issue]: rust-lang/crates-io-cargo-teams#47
@pietroalbini that was my original plan, but then I remembered that the deployed slug on Heroku doesn't include source/files from the git repo. With some tweaks something like
Previous HEAD was a5dcd84, now on the `snapshot-2021-05-05` branch

More information about this change can be found [online] and on [this issue]

[online]: https://internals.rust-lang.org/t/cargos-crate-index-upcoming-squash-into-one-commit/8440
[this issue]: rust-lang/crates-io-cargo-teams#47
Ran a manual squash: rust-lang/crates.io-index@4a44357
This adds a background job that squashes the index into a single commit. The current plan is to manually enqueue this job on a 6 week schedule, roughly aligning with new `rustc` releases. Before deploying this, we will need to make sure that the SSH key is allowed to do a force push to the protected master branch.

This job is derived from a [script] that was periodically run by the cargo team. There are a few minor differences relative to the original script:

* The push of the snapshot branch is no longer forced. The job will fail if run more than once on the same day. (If the first attempt fails before pushing a new root commit upstream, then retries should succeed as long as the snapshot can be fast-forwarded.)
* The push of the new root commit to the origin no longer uses `--force-with-lease` to reject the force push if new commits have been pushed there in parallel. Other than the occasional manual changes to the index (such as deleting crates), background jobs have exclusive write access to the index while running. Given that such manual changes are rare, this job completes quickly, and such manual tasks should be automated too, this is low risk.

The alternative is to shell out to git, because `libgit2` (and thus the `git2` crate) does not yet support this portion of the protocol.

[script]: rust-lang/crates-io-cargo-teams#47 (comment)
In today's crates.io team meeting, the team agreed that in terms of workload/coordination we have no concerns with scheduling an index squash every ~6 weeks. I have an initial implementation migrating the script into a background job at rust-lang/crates.io@a7efdcd. The main open item is working with infra to determine if we want to allow the SSH key used by the service to do a forced push to the repo or if that should be reserved for a special SSH key. Until now, the service has treated the index as fast-forward-only.
Previous HEAD was baed40a, now on the `snapshot-2021-06-23` branch

More information about this change can be found [online] and on [this issue].

[online]: https://internals.rust-lang.org/t/cargos-crate-index-upcoming-squash-into-one-commit/8440
[this issue]: rust-lang/crates-io-cargo-teams#47
Previous HEAD was ebab036, now on the `snapshot-2021-06-26` branch

More information about this change can be found [online] and on [this issue].

[online]: https://internals.rust-lang.org/t/cargos-crate-index-upcoming-squash-into-one-commit/8440
[this issue]: rust-lang/crates-io-cargo-teams#47
Add a background job for squashing the index

This adds a background job that squashes the index into a single commit. The current plan is to manually enqueue this job on a 6 week schedule, roughly aligning with new `rustc` releases. Before deploying this, we will need to make sure that the SSH key is allowed to do a force push to the protected master branch.

This job is derived from a [script] that was periodically run by the cargo team. Relative to the original script, the push of the snapshot branch is no longer forced. The job will fail if run more than once on the same day. (If the first attempt fails before pushing a new root commit upstream, then retries should succeed as long as the snapshot can be fast-forwarded.)

[script]: rust-lang/crates-io-cargo-teams#47 (comment)
The background job to run the squash has been merged, and was just run. Squashed commit: rust-lang/crates.io-index@3804ec0
Previous HEAD was 4181c62, now on the `snapshot-2021-07-02` branch

More information about this change can be found [online] and on [this issue].

[online]: https://internals.rust-lang.org/t/cargos-crate-index-upcoming-squash-into-one-commit/8440
[this issue]: rust-lang/crates-io-cargo-teams#47
The cargo index has been squashed again: rust-lang/crates.io-index@8fe6ce0
Previous HEAD was f954048, now on the `snapshot-2021-09-24` branch

More information about this change can be found [online] and on [this issue].

[online]: https://internals.rust-lang.org/t/cargos-crate-index-upcoming-squash-into-one-commit/8440
[this issue]: rust-lang/crates-io-cargo-teams#47
I've started noticing that crates.io index fetching is taking a while again on slow connections/CPUs. It looks like we're at more commits (44k) than before we last squashed (34k). Is it time to schedule a new squash?
Previous HEAD was 94b5429, now on the `snapshot-2021-12-21` branch

More information about this change can be found [online] and on [this issue].

[online]: https://internals.rust-lang.org/t/cargos-crate-index-upcoming-squash-into-one-commit/8440
[this issue]: rust-lang/crates-io-cargo-teams#47
Thanks for the reminder, @adamncasey. The index has been squashed.
Previous HEAD was ba5efd5, now on the `snapshot-2022-03-02` branch

More information about this change can be found [online] and on [this issue].

[online]: https://internals.rust-lang.org/t/cargos-crate-index-upcoming-squash-into-one-commit/8440
[this issue]: rust-lang/crates-io-cargo-teams#47
The index has been squashed. Previous HEAD was
@jtgeibel I was wondering if you could look at squashing again. I'm not sure if that is in a cron job or if it is still manual. It looks like it has been about 4 months since the last squash. The index is currently 237MB which is about the largest I've ever seen it, which can take a considerable amount of time to clone and unpack.
Previous HEAD was 075e7a6, now on the `snapshot-2022-07-06` branch

More information about this change can be found [online] and on [this issue].

[online]: https://internals.rust-lang.org/t/cargos-crate-index-upcoming-squash-into-one-commit/8440
[this issue]: rust-lang/crates-io-cargo-teams#47
Thanks for the ping @ehuss, invoking the squash is still manual. We still need to automate the archiving (to the archive repo) and eventual deletion of the snapshot branches (from the main repo). Previous HEAD was
@jtgeibel Just checking in again to see if we can get another squash. The index is currently over 150MB and 34434 commits and takes about a minute to clone on a fast-ish system.
Previous HEAD was 31a1d8c, now on the `snapshot-2022-08-31` branch

More information about this change can be found [online] and on [this issue].

[online]: https://internals.rust-lang.org/t/cargos-crate-index-upcoming-squash-into-one-commit/8440
[this issue]: rust-lang/crates-io-cargo-teams#47
Previous HEAD was

This is the next-to-smallest snapshot in terms of commits. I just deleted a temporary branch that was left behind on the main repo, so it is possible we weren't getting optimal compression server side. I plan to remove the snapshot branch from the main repo in about 10 days.
Last (only) time (https://internals.rust-lang.org/t/cargos-crate-index-upcoming-squash-into-one-commit/8440) we had 100k+ commits and we thought we waited a little too long (given how smoothly it went). Now we have 51k + ~1.5k/week.
The Cargo team discussed this today and we think we should do this soon. Not to interrupt whatever you are working on, but when you have a chance. Who has the permissions to run that script? Is it just @alexcrichton?
As the index grows we should have a policy for when we plan to do the squash. When we have a policy, we should plan to make a bot to ensure we follow it. It is reasonable to say that it is too soon, or we could make a simple policy for now and grow it as we need. The Cargo team discussed a policy like "when we remember, approximately every 3-6 months" or "... approximately at 50k commits" or "... approximately when the squash is half the size of the history"