gc: add --expire-to option #1843
/submit |
Submitted as pull.1843.git.1735041177817.gitgitgadget@gmail.com To fetch this version into
To fetch this version to local tag
|
There are issues in commit 4254269: |
4254269 to 5797579
Submitted as pull.1843.v2.git.1735611513.gitgitgadget@gmail.com To fetch this version into
To fetch this version to local tag
|
On the Git mailing list, ZheNing Hu wrote (reply to this):
On Tue, Dec 31, 2024 at 10:18, ZheNing Hu via GitGitGadget <gitgitgadget@gmail.com> wrote:
>
> From: ZheNing Hu <adlternative@gmail.com>
>
> This commit extends the functionality of `git gc`
> by adding a new option, `--expire-to=<dir>`. Previously,
> this feature was implemented in `git repack` (see 91badeb),
> allowing users to specify a directory where unreachable and
> expired cruft packs are stored during garbage collection.
> However, users had to run `git repack --cruft --expire-to=<dir>`
> followed by `git prune` to achieve similar results within `git gc`.
>
> By introducing `--expire-to=<dir>` directly into `git gc`,
> we simplify the process for users who wish to manage their
> repository's cleanup more efficiently. This change involves
> passing the `--expire-to=<dir>` parameter through to `git repack`,
> making it easier for users to set up a backup location for cruft
> packs that will be pruned.
>
> Signed-off-by: ZheNing Hu <adlternative@gmail.com>
> ---
> Documentation/git-gc.txt | 6 ++++++
> builtin/gc.c | 6 +++++-
> t/t6500-gc.sh | 6 ++++++
> 3 files changed, 17 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/git-gc.txt b/Documentation/git-gc.txt
> index 370e22faaeb..b4c0cf02972 100644
> --- a/Documentation/git-gc.txt
> +++ b/Documentation/git-gc.txt
> @@ -69,6 +69,12 @@ be performed as well.
> the `--max-cruft-size` option of linkgit:git-repack[1] for
> more.
>
> +--expire-to=<dir>::
> + When packing unreachable objects into a cruft pack, write a cruft
> + pack containing pruned objects (if any) to the directory `<dir>`.
> + See the `--expire-to` option of linkgit:git-repack[1] for
> + more.
> +
> --prune=<date>::
> Prune loose objects older than date (default is 2 weeks ago,
> overridable by the config variable `gc.pruneExpire`).
> diff --git a/builtin/gc.c b/builtin/gc.c
> index d52735354c9..77904694c9f 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -136,6 +136,7 @@ struct gc_config {
> char *prune_worktrees_expire;
> char *repack_filter;
> char *repack_filter_to;
> + char *repack_expire_to;
> unsigned long big_pack_threshold;
> unsigned long max_delta_cache_size;
> };
> @@ -441,6 +442,8 @@ static void add_repack_all_option(struct gc_config *cfg,
> if (cfg->max_cruft_size)
> strvec_pushf(&repack, "--max-cruft-size=%lu",
> cfg->max_cruft_size);
> + if (cfg->repack_expire_to)
> + strvec_pushf(&repack, "--expire-to=%s", cfg->repack_expire_to);
> } else {
> strvec_push(&repack, "-A");
> if (cfg->prune_expire)
> @@ -675,7 +678,6 @@ struct repository *repo UNUSED)
> const char *prune_expire_sentinel = "sentinel";
> const char *prune_expire_arg = prune_expire_sentinel;
> int ret;
> -
> struct option builtin_gc_options[] = {
> OPT__QUIET(&quiet, N_("suppress progress reporting")),
> { OPTION_STRING, 0, "prune", &prune_expire_arg, N_("date"),
> @@ -694,6 +696,8 @@ struct repository *repo UNUSED)
> PARSE_OPT_NOCOMPLETE),
> OPT_BOOL(0, "keep-largest-pack", &keep_largest_pack,
> N_("repack all other packs except the largest pack")),
> + OPT_STRING(0, "expire-to", &cfg.repack_expire_to, N_("dir"),
> + N_("pack prefix to store a pack containing pruned objects")),
> OPT_END()
> };
>
> diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh
> index ee074b99b70..d4b0653a9b7 100755
> --- a/t/t6500-gc.sh
> +++ b/t/t6500-gc.sh
> @@ -339,6 +339,12 @@ test_expect_success 'gc.maxCruftSize sets appropriate repack options' '
> test_subcommand $cruft_max_size_opts --max-cruft-size=3145728 <trace2.txt
> '
>
> +test_expect_success '--expire-to sets appropriate repack options' '
> + mkdir expired &&
> + GIT_TRACE2_EVENT=$(pwd)/trace2.txt git -C cruft--max-size gc --cruft --expire-to=./expired/pack &&
> + test_subcommand $cruft_max_size_opts --expire-to=./expired/pack <trace2.txt
> +'
> +
> run_and_wait_for_gc () {
> # We read stdout from gc for the side effect of waiting until the
> # background gc process exits, closing its fd 9. Furthermore, the
> --
> gitgitgadget
>
Hi, Jeff King, could you come and help take a look at this patch?
I would be very grateful if you have time!
ZheNing Hu
On the Git mailing list, ZheNing Hu wrote (reply to this):
This patch has been sitting for weeks with no review. Does anyone want
to help take a look?
On the Git mailing list, Jeff King wrote (reply to this):
On Tue, Dec 31, 2024 at 02:18:33AM +0000, ZheNing Hu via GitGitGadget wrote:
> diff --git a/builtin/gc.c b/builtin/gc.c
> index 77904694c9f..8656e1caff0 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -433,7 +433,8 @@ static int keep_one_pack(struct string_list_item *item, void *data UNUSED)
> static void add_repack_all_option(struct gc_config *cfg,
> struct string_list *keep_pack)
> {
> - if (cfg->prune_expire && !strcmp(cfg->prune_expire, "now"))
> + if (cfg->prune_expire && !strcmp(cfg->prune_expire, "now")
> + && !(cfg->cruft_packs && cfg->repack_expire_to))
> strvec_push(&repack, "-a");

I expected to see a mention of repack_expire_to here, but not
cfg->cruft_packs. These two are AND-ed together so we are only disabling
"repack -a" when both options ("--expire-to" and "--cruft") are passed.
Can we --expire-to without cruft? I.e., what should happen with:

  git gc --expire-to=some-path --prune=now --no-cruft

Looking at the underlying git-repack, it seems that we only respect
--expire-to at all when used with "--cruft", and don't otherwise
consider it. Which is what the manpage says ("Only useful with --cruft
-d").

But if we look at this proposed patch for example:

  https://lore.kernel.org/git/48438876fb42a889110e100a6c42ca84e93aac49.1733011259.git.me@ttaylorr.com/

then it is expanding how --expire-to is used during the pruning step.
OTOH, I think the way your patch 1 is structured means that we'd always
pass --expire-to to git-repack anyway, and I _think_ even with the patch
linked above that "repack -a -d --expire-to=whatever" would do the right
thing.

In which case the problem really is the combination of cruft packs and
expire-to. Just cruft packs by themselves do not need to override using
"-a" for "--prune=now" because we know that any such cruft pack would be
empty.

So I think this logic is correct. Taylor might have more thoughts,
though (and ideas on whether he intends to revisit that earlier patch).

I do think this change should probably be done as part of patch 1,
rather than introducing a buggy state and then fixing it in patch 2.

-Peff
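
The condition discussed above can be restated as a small standalone predicate. This is only a sketch: `should_pass_repack_a` and its parameters are hypothetical names that do not exist in builtin/gc.c; only the boolean expression itself is taken from the quoted hunk.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of Git. The parameters model the three
 * gc_config fields that the quoted condition reads:
 *   prune_expire     - value of --prune / gc.pruneExpire, or NULL
 *   cruft_packs      - non-zero when --cruft is in effect
 *   repack_expire_to - value of --expire-to, or NULL
 * Returns non-zero when gc would hand plain "-a" to repack.
 */
static int should_pass_repack_a(const char *prune_expire,
				int cruft_packs,
				const char *repack_expire_to)
{
	/* "-a" is only considered at all for --prune=now. */
	if (!prune_expire || strcmp(prune_expire, "now"))
		return 0;

	/* It is suppressed only when BOTH --cruft and --expire-to are set. */
	return !(cruft_packs && repack_expire_to);
}
```

Under this model, `--prune=now --cruft --expire-to=some-path` is the only combination with `--prune=now` for which the predicate is false; dropping either `--cruft` or `--expire-to` makes it true again, which matches the "AND-ed together" reading in the reply.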
On the Git mailing list, ZheNing Hu wrote (reply to this):
On Mon, Jan 13, 2025 at 17:17, Jeff King <peff@peff.net> wrote:
>
> On Tue, Dec 31, 2024 at 02:18:33AM +0000, ZheNing Hu via GitGitGadget wrote:
>
> > diff --git a/builtin/gc.c b/builtin/gc.c
> > index 77904694c9f..8656e1caff0 100644
> > --- a/builtin/gc.c
> > +++ b/builtin/gc.c
> > @@ -433,7 +433,8 @@ static int keep_one_pack(struct string_list_item *item, void *data UNUSED)
> > static void add_repack_all_option(struct gc_config *cfg,
> > struct string_list *keep_pack)
> > {
> > - if (cfg->prune_expire && !strcmp(cfg->prune_expire, "now"))
> > + if (cfg->prune_expire && !strcmp(cfg->prune_expire, "now")
> > + && !(cfg->cruft_packs && cfg->repack_expire_to))
> > strvec_push(&repack, "-a");
>
> I expected to see a mention of repack_expire_to here, but not
> cfg->cruft_packs. These two are AND-ed together so we are only disabling
> "repack -a" when both options ("--expire-to" and "--cruft") are passed.
> Can we --expire-to without cruft? I.e., what should happen with:
>
> git gc --expire-to=some-path --prune=now --no-cruft
>
> Looking at the underlying git-repack, it seems that we only respect
> --expire-to at all when used with "--cruft", and don't otherwise
> consider it. Which is what the manpage says ("Only useful with --cruft
> -d").
>
Yes, this is the current state of git-repack. The --expire-to option can
only be used with --cruft, which is why I check cruft_packs && repack_expire_to
as a double safeguard.
When using --no-cruft, the --expire-to option becomes irrelevant, so it
seems reasonable to leave `git gc --prune=now` as it is in that case and
keep passing -a to repack.
> But if we look at this proposed patch for example:
>
> https://lore.kernel.org/git/48438876fb42a889110e100a6c42ca84e93aac49.1733011259.git.me@ttaylorr.com/
>
> then it is expanding how --expire-to is used during the pruning step.
> OTOH, I think the way your patch 1 is structured means that we'd always
> pass --expire-to to git-repack anyway, and I _think_ even with the patch
> linked above that "repack -a -d --expire-to=whatever" would do the right
> thing.
>
I've taken a look at the patch, and I believe Taylor's changes are primarily
aimed at extending the --expire-to functionality within the --cruft feature,
rather than expecting --expire-to to be used on its own.
> In which case the problem really is the combination of cruft packs and
> expire-to. Just cruft packs by themselves do not need to override using
> "-a" for "--prune=now" because we know that any such cruft pack would be
> empty.
>
> So I think this logic is correct. Taylor might have more thoughts,
> though (and ideas on whether he intends to revisit that earlier patch).
>
> I do think this change should probably be done as part of patch 1,
> rather than introducing a buggy state and then fixing it in patch 2.
>
Yes, I agree with that, and perhaps a single patch will suffice.
> -Peff
- ZheNing Hu
5797579 to 0842ec3
/submit |
Submitted as pull.1843.v3.git.1736994932003.gitgitgadget@gmail.com To fetch this version into
To fetch this version to local tag
|
On the Git mailing list, Junio C Hamano wrote (reply to this):
"ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:
> From: ZheNing Hu <adlternative@gmail.com>
>
> This commit extends the functionality of `git gc`
> by adding a new option, `--expire-to=<dir>`. Previously,
> this feature was implemented in `git repack` (see 91badeb),
> allowing users to specify a directory where unreachable and
> expired cruft packs are stored during garbage collection.
> However, users had to run `git repack --cruft --expire-to=<dir>`
> followed by `git prune` to achieve similar results within `git gc`.
>
> By introducing `--expire-to=<dir>` directly into `git gc`,
> we simplify the process for users who wish to manage their
> repository's cleanup more efficiently. This change involves
> passing the `--expire-to=<dir>` parameter through to `git repack`,
> making it easier for users to set up a backup location for cruft
> packs that will be pruned.
Today I do not have enough time to do my usual commit log message
critique. Please use "git show -s --format=reference" when
referring to an earlier commit.
> Note: When git-gc is used with both `--cruft` and `--expire-to`,
> it does not pass `-a` to git-repack to delete all unreachable
> objects as `git gc --prune=now` originally did. Instead, it
> generates a cruft pack in the directory specified by expire-to.
Is this less important than "we added --expire-to to gc that is
passed down to underlying repack" in the previous paragraph?
Not removing the unreachables too early with "repack -a" is an
essential part of the design of this new feature to allow us not to
lose the cruft objects, so I was a bit surprised that this was
described as a "Note:".
> diff --git a/Documentation/git-gc.txt b/Documentation/git-gc.txt
> index 370e22faaeb..b4c0cf02972 100644
> --- a/Documentation/git-gc.txt
> +++ b/Documentation/git-gc.txt
> @@ -69,6 +69,12 @@ be performed as well.
> the `--max-cruft-size` option of linkgit:git-repack[1] for
> more.
>
> +--expire-to=<dir>::
> + When packing unreachable objects into a cruft pack, write a cruft
> + pack containing pruned objects (if any) to the directory `<dir>`.
> + See the `--expire-to` option of linkgit:git-repack[1] for
> + more.
Does "When packing unreachable objects into a cruft pack" mean that
this option is only meaningful with "--cruft"? As "--cruft" is on
by default, is it an error to pass "--no-cruft" when you use this
option?
"for more" -> "for more information" or something?
> diff --git a/builtin/gc.c b/builtin/gc.c
> index d52735354c9..8656e1caff0 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -136,6 +136,7 @@ struct gc_config {
> char *prune_worktrees_expire;
> char *repack_filter;
> char *repack_filter_to;
> + char *repack_expire_to;
> unsigned long big_pack_threshold;
> unsigned long max_delta_cache_size;
> };
> @@ -432,7 +433,8 @@ static int keep_one_pack(struct string_list_item *item, void *data UNUSED)
> static void add_repack_all_option(struct gc_config *cfg,
> struct string_list *keep_pack)
> {
> - if (cfg->prune_expire && !strcmp(cfg->prune_expire, "now"))
> + if (cfg->prune_expire && !strcmp(cfg->prune_expire, "now")
> + && !(cfg->cruft_packs && cfg->repack_expire_to))
> strvec_push(&repack, "-a");
Hmph. When "--expire-to=<there>" is given, we are dropping these
unreachable objects right away, but we said "--no-cruft", then we
say "repack -a". If we have both "--cruft" and "--expire-to=<there>",
then ...
> else if (cfg->cruft_packs) {
> strvec_push(&repack, "--cruft");
> @@ -441,6 +443,8 @@ static void add_repack_all_option(struct gc_config *cfg,
> if (cfg->max_cruft_size)
> strvec_pushf(&repack, "--max-cruft-size=%lu",
> cfg->max_cruft_size);
> + if (cfg->repack_expire_to)
> + strvec_pushf(&repack, "--expire-to=%s", cfg->repack_expire_to);
... we do the usual "repack --cruft --expire-to=<there>" in the next
block.
> @@ -675,7 +679,6 @@ struct repository *repo UNUSED)
> const char *prune_expire_sentinel = "sentinel";
> const char *prune_expire_arg = prune_expire_sentinel;
> int ret;
> -
> struct option builtin_gc_options[] = {
> OPT__QUIET(&quiet, N_("suppress progress reporting")),
> { OPTION_STRING, 0, "prune", &prune_expire_arg, N_("date"),
OK.
> @@ -694,6 +697,8 @@ struct repository *repo UNUSED)
> PARSE_OPT_NOCOMPLETE),
> OPT_BOOL(0, "keep-largest-pack", &keep_largest_pack,
> N_("repack all other packs except the largest pack")),
> + OPT_STRING(0, "expire-to", &cfg.repack_expire_to, N_("dir"),
> + N_("pack prefix to store a pack containing pruned objects")),
> OPT_END()
> };
OK.
> diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh
> index ee074b99b70..d4b0653a9b7 100755
> --- a/t/t6500-gc.sh
> +++ b/t/t6500-gc.sh
> @@ -339,6 +339,12 @@ test_expect_success 'gc.maxCruftSize sets appropriate repack options' '
> test_subcommand $cruft_max_size_opts --max-cruft-size=3145728 <trace2.txt
> '
>
> +test_expect_success '--expire-to sets appropriate repack options' '
> + mkdir expired &&
> + GIT_TRACE2_EVENT=$(pwd)/trace2.txt git -C cruft--max-size gc --cruft --expire-to=./expired/pack &&
> + test_subcommand $cruft_max_size_opts --expire-to=./expired/pack <trace2.txt
> +'
As "--cruft" is on by default, the command line does not have to
have it, but being explicit is good.
Should we also see what happens when "--no-cruft" is given?
Thanks. |
This patch series was integrated into seen via git@aa1682c. |
This branch is now known as |
This patch series was integrated into seen via git@9984f53. |
This patch series was integrated into seen via git@4ae0c92. |
There was a status update in the "New Topics" section about the branch "git gc" learned the "--expire-to" option and passes it down to underlying "git repack". Needs review. source: <pull.1843.v3.git.1736994932003.gitgitgadget@gmail.com> |
This patch series was integrated into seen via git@d36a896. |
This patch series was integrated into seen via git@c716f47. |
This patch series was integrated into seen via git@7f3e7d1. |
There was a status update in the "Cooking" section about the branch "git gc" learned the "--expire-to" option and passes it down to underlying "git repack". Needs review. source: <pull.1843.v3.git.1736994932003.gitgitgadget@gmail.com> |
On the Git mailing list, ZheNing Hu wrote (reply to this):
On Fri, Jan 17, 2025 at 02:23, Junio C Hamano <gitster@pobox.com> wrote:
>
> "ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > From: ZheNing Hu <adlternative@gmail.com>
> >
> > This commit extends the functionality of `git gc`
> > by adding a new option, `--expire-to=<dir>`. Previously,
> > this feature was implemented in `git repack` (see 91badeb),
> > allowing users to specify a directory where unreachable and
> > expired cruft packs are stored during garbage collection.
> > However, users had to run `git repack --cruft --expire-to=<dir>`
> > followed by `git prune` to achieve similar results within `git gc`.
> >
> > By introducing `--expire-to=<dir>` directly into `git gc`,
> > we simplify the process for users who wish to manage their
> > repository's cleanup more efficiently. This change involves
> > passing the `--expire-to=<dir>` parameter through to `git repack`,
> > making it easier for users to set up a backup location for cruft
> > packs that will be pruned.
>
> Today I do not have enough time to do my usual commit log message
> critique. Please use "git show -s --format=reference" when
> referring to an earlier commit.
>
Okay, I will change to using this format.
> > Note: When git-gc is used with both `--cruft` and `--expire-to`,
> > it does not pass `-a` to git-repack to delete all unreachable
> > objects as `git gc --prune=now` originally did. Instead, it
> > generates a cruft pack in the directory specified by expire-to.
>
> Is this less important than "we added --expire-to to gc that is
> passed down to underlying repack" in the previous paragraph?
>
I had thought that adding --expire-to to gc was key in this patch,
but the change to the implementation of --prune=now should
indeed be mentioned more.
> Not removing the unreachables too early with "repack -a" is an
> essential part of the design of this new feature to allow us not to
> lose the cruft objects, so I was a bit surprised that this was
> described as a "Note:".
>
You're right. This section shouldn't use a note; it should provide
a more detailed explanation instead.
> > diff --git a/Documentation/git-gc.txt b/Documentation/git-gc.txt
> > index 370e22faaeb..b4c0cf02972 100644
> > --- a/Documentation/git-gc.txt
> > +++ b/Documentation/git-gc.txt
> > @@ -69,6 +69,12 @@ be performed as well.
> > the `--max-cruft-size` option of linkgit:git-repack[1] for
> > more.
> >
> > +--expire-to=<dir>::
> > + When packing unreachable objects into a cruft pack, write a cruft
> > + pack containing pruned objects (if any) to the directory `<dir>`.
> > + See the `--expire-to` option of linkgit:git-repack[1] for
> > + more.
>
> Does "When packing unreachable objects into a cruft pack" mean that
> this option is only meaningful with "--cruft"? As "--cruft" is on
> by default, is it an error to pass "--no-cruft" when you use this
> option?
>
Currently, --expire-to can only be used together with --cruft.
Using --no-cruft together with --expire-to will not result in an error,
but --expire-to will not take effect either.
I should mention in the documentation that --expire-to and --cruft
need to be used together; otherwise --expire-to will not
have any effect.
> "for more" -> "for more information" or something?
>
OK, "for more information".
> > diff --git a/builtin/gc.c b/builtin/gc.c
> > index d52735354c9..8656e1caff0 100644
> > --- a/builtin/gc.c
> > +++ b/builtin/gc.c
> > @@ -136,6 +136,7 @@ struct gc_config {
> > char *prune_worktrees_expire;
> > char *repack_filter;
> > char *repack_filter_to;
> > + char *repack_expire_to;
> > unsigned long big_pack_threshold;
> > unsigned long max_delta_cache_size;
> > };
> > @@ -432,7 +433,8 @@ static int keep_one_pack(struct string_list_item *item, void *data UNUSED)
> > static void add_repack_all_option(struct gc_config *cfg,
> > struct string_list *keep_pack)
> > {
> > - if (cfg->prune_expire && !strcmp(cfg->prune_expire, "now"))
> > + if (cfg->prune_expire && !strcmp(cfg->prune_expire, "now")
> > + && !(cfg->cruft_packs && cfg->repack_expire_to))
> > strvec_push(&repack, "-a");
>
> Hmph. When "--expire-to=<there>" is given, we are dropping these
> unreachable objects right away, but we said "--no-cruft", then we
> say "repack -a". If we have both "--cruft" and "--expire-to=<there>",
> then ...
>
> > else if (cfg->cruft_packs) {
> > strvec_push(&repack, "--cruft");
> > @@ -441,6 +443,8 @@ static void add_repack_all_option(struct gc_config *cfg,
> > if (cfg->max_cruft_size)
> > strvec_pushf(&repack, "--max-cruft-size=%lu",
> > cfg->max_cruft_size);
> > + if (cfg->repack_expire_to)
> > + strvec_pushf(&repack, "--expire-to=%s", cfg->repack_expire_to);
>
> ... we do the usual "repack --cruft --expire-to=<there>" in the next
> block.
>
> > @@ -675,7 +679,6 @@ struct repository *repo UNUSED)
> > const char *prune_expire_sentinel = "sentinel";
> > const char *prune_expire_arg = prune_expire_sentinel;
> > int ret;
> > -
> > struct option builtin_gc_options[] = {
> > OPT__QUIET(&quiet, N_("suppress progress reporting")),
> > { OPTION_STRING, 0, "prune", &prune_expire_arg, N_("date"),
>
> OK.
>
> > @@ -694,6 +697,8 @@ struct repository *repo UNUSED)
> > PARSE_OPT_NOCOMPLETE),
> > OPT_BOOL(0, "keep-largest-pack", &keep_largest_pack,
> > N_("repack all other packs except the largest pack")),
> > + OPT_STRING(0, "expire-to", &cfg.repack_expire_to, N_("dir"),
> > + N_("pack prefix to store a pack containing pruned objects")),
> > OPT_END()
> > };
>
> OK.
>
> > diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh
> > index ee074b99b70..d4b0653a9b7 100755
> > --- a/t/t6500-gc.sh
> > +++ b/t/t6500-gc.sh
> > @@ -339,6 +339,12 @@ test_expect_success 'gc.maxCruftSize sets appropriate repack options' '
> > test_subcommand $cruft_max_size_opts --max-cruft-size=3145728 <trace2.txt
> > '
> >
> > +test_expect_success '--expire-to sets appropriate repack options' '
> > + mkdir expired &&
> > + GIT_TRACE2_EVENT=$(pwd)/trace2.txt git -C cruft--max-size gc --cruft --expire-to=./expired/pack &&
> > + test_subcommand $cruft_max_size_opts --expire-to=./expired/pack <trace2.txt
> > +'
>
> As "--cruft" is on by default, the command line does not have to
> have it, but being explicit is good.
>
> Should we also see what happens when "--no-cruft" is given?
>
With --no-cruft, gc will still run repack -a even when --expire-to is
given; I will add corresponding tests.
> Thanks.
Thanks. |
This patch series was integrated into seen via git@77d4d83. |
This commit extends the functionality of `git gc` by adding a new option, `--expire-to=<dir>`. This feature was previously implemented in 91badeb (builtin/repack.c: implement `--expire-to` for storing pruned objects, 2022-10-24), which allows users to specify a directory where cruft packs of unreachable, expired objects are stored during garbage collection. However, users had to run `git repack --cruft --expire-to=<dir>` followed by `git prune` to achieve results similar to `git gc`.

By introducing `--expire-to=<dir>` directly into `git gc`, we simplify the process for users who wish to manage their repository's cleanup more efficiently. This change involves passing the `--expire-to=<dir>` parameter through to `git repack`, making it easier for users to set up a backup location for cruft packs that will be pruned.

The original `git gc --prune=now` deletes all unreachable objects by passing the `-a` parameter to `git repack`. With the addition of the `--cruft` and `--expire-to` options, this default behavior has to change: instead of being deleted, these unreachable objects should be merged into a cruft pack and collected in the specified directory. Therefore, we do not pass `-a` to the repack command but instead pass `--cruft`, `--expire-to`, and `--cruft-expiration=now`.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
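
As a rough illustration of the behavior this commit message describes, the sketch below assembles the options that would be handed to `git repack`. It is a simplified model built on assumptions, not the code in builtin/gc.c: `struct repack_args_sketch`, `push_arg`, and `build_repack_args` are hypothetical names, other options such as `--max-cruft-size` are left out, and the `-A` branch is reduced to the bare flag.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_ARGS 4
#define MAX_ARG_LEN 256

/* Hypothetical fixed-size stand-in for the argument vector gc builds. */
struct repack_args_sketch {
	char argv[MAX_ARGS][MAX_ARG_LEN];
	int argc;
};

/* Append "<prefix><value>" as one argument; value may be "". */
static void push_arg(struct repack_args_sketch *args,
		     const char *prefix, const char *value)
{
	assert(args->argc < MAX_ARGS);
	snprintf(args->argv[args->argc++], MAX_ARG_LEN, "%s%s", prefix, value);
}

/*
 * prune_expire: value of --prune (may be NULL)
 * cruft:        non-zero when --cruft is in effect
 * expire_to:    value of --expire-to (may be NULL)
 */
static void build_repack_args(struct repack_args_sketch *args,
			      const char *prune_expire, int cruft,
			      const char *expire_to)
{
	args->argc = 0;

	if (prune_expire && !strcmp(prune_expire, "now")
	    && !(cruft && expire_to)) {
		/* Original --prune=now behavior: drop unreachable objects. */
		push_arg(args, "-a", "");
	} else if (cruft) {
		/* Collect unreachable objects into a cruft pack instead. */
		push_arg(args, "--cruft", "");
		if (prune_expire)
			push_arg(args, "--cruft-expiration=", prune_expire);
		if (expire_to)
			push_arg(args, "--expire-to=", expire_to);
	} else {
		push_arg(args, "-A", "");
	}
}
```

Calling `build_repack_args(&args, "now", 1, "../backup/pack")` in this model yields `--cruft`, `--cruft-expiration=now`, and `--expire-to=../backup/pack`, while the same call with `cruft` set to 0 falls back to `-a`. The path `../backup/pack` is only an example value.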
0842ec3 to 6946ccd
/submit |
Submitted as pull.1843.v4.git.1737704954987.gitgitgadget@gmail.com To fetch this version into
To fetch this version to local tag
|
This patch series was integrated into seen via git@099b60c. |
There was a status update in the "Cooking" section about the branch "git gc" learned the "--expire-to" option and passes it down to underlying "git repack". Needs review. source: <pull.1843.v3.git.1736994932003.gitgitgadget@gmail.com> |
This patch series was integrated into seen via git@3e9f614. |
This patch series was integrated into seen via git@4cda6dc. |
This patch series was integrated into seen via git@b47a572. |
This patch series was integrated into seen via git@04f3677. |
There was a status update in the "Cooking" section about the branch "git gc" learned the "--expire-to" option and passes it down to underlying "git repack". Will merge to 'next'? source: <pull.1843.v4.git.1737704954987.gitgitgadget@gmail.com> |
This patch series was integrated into seen via git@dea32b4. |
This patch series was integrated into seen via git@95558e3. |
This patch series was integrated into seen via git@b58c11b. |
This patch series was integrated into seen via git@590e903. |
There was a status update in the "Cooking" section about the branch "git gc" learned the "--expire-to" option and passes it down to underlying "git repack". Will merge to 'next'? source: <pull.1843.v4.git.1737704954987.gitgitgadget@gmail.com> |
This patch series was integrated into seen via git@17fb9d3. |
On the Git mailing list, Junio C Hamano wrote (reply to this):
"ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:
> From: ZheNing Hu <adlternative@gmail.com>
>
> This commit extends the functionality of `git gc`
> by adding a new option, `--expire-to=<dir>`. Previously,
> this feature was implemented in 91badeba32 (builtin/repack.c:
> implement `--expire-to` for storing pruned objects, 2022-10-24),
> which allows users to specify a directory where unreachable
> and expired cruft packs are stored during garbage collection.
> However, users had to run `git repack --cruft --expire-to=<dir>`
> followed by `git prune` to achieve similar results within `git gc`.
>
> By introducing `--expire-to=<dir>` directly into `git gc`,
> we simplify the process for users who wish to manage their
> repository's cleanup more efficiently. This change involves
> passing the `--expire-to=<dir>` parameter through to `git repack`,
> making it easier for users to set up a backup location for cruft
> packs that will be pruned.
>
> The original `git gc --prune=now` deletes all unreachable
> objects by passing the `-a` parameter to git repack. When the
> `--cruft` and `--expire-to` options are both given, it is necessary
> to modify this default behavior: rather than being deleted, the
> unreachable objects should be merged into a cruft pack and
> collected in the specified directory. Therefore, we do not pass `-a`
> to the repack command but instead pass `--cruft`, `--expire-to`,
> and `--cruft-expiration=now` to repack.
>
> Signed-off-by: ZheNing Hu <adlternative@gmail.com>
> ---
This hasn't seen any reaction for a while.
Does anybody have further comments? Otherwise let's mark it for
'next'.
Thanks.
> Documentation/git-gc.txt | 7 +++++++
> builtin/gc.c | 9 +++++++--
> t/t6500-gc.sh | 33 +++++++++++++++++++++++++++++++++
> 3 files changed, 47 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/git-gc.txt b/Documentation/git-gc.txt
> index 370e22faaeb..0eac8e85f08 100644
> --- a/Documentation/git-gc.txt
> +++ b/Documentation/git-gc.txt
> @@ -69,6 +69,13 @@ be performed as well.
> the `--max-cruft-size` option of linkgit:git-repack[1] for
> more.
>
> +--expire-to=<dir>::
> + When packing unreachable objects into a cruft pack, write a cruft
> + pack containing pruned objects (if any) to the directory `<dir>`.
> + This option only has an effect when used together with `--cruft`.
> + See the `--expire-to` option of linkgit:git-repack[1] for
> + more information.
> +
> --prune=<date>::
> Prune loose objects older than date (default is 2 weeks ago,
> overridable by the config variable `gc.pruneExpire`).
> diff --git a/builtin/gc.c b/builtin/gc.c
> index d52735354c9..8656e1caff0 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -136,6 +136,7 @@ struct gc_config {
> char *prune_worktrees_expire;
> char *repack_filter;
> char *repack_filter_to;
> + char *repack_expire_to;
> unsigned long big_pack_threshold;
> unsigned long max_delta_cache_size;
> };
> @@ -432,7 +433,8 @@ static int keep_one_pack(struct string_list_item *item, void *data UNUSED)
> static void add_repack_all_option(struct gc_config *cfg,
> struct string_list *keep_pack)
> {
> - if (cfg->prune_expire && !strcmp(cfg->prune_expire, "now"))
> + if (cfg->prune_expire && !strcmp(cfg->prune_expire, "now")
> + && !(cfg->cruft_packs && cfg->repack_expire_to))
> strvec_push(&repack, "-a");
> else if (cfg->cruft_packs) {
> strvec_push(&repack, "--cruft");
> @@ -441,6 +443,8 @@ static void add_repack_all_option(struct gc_config *cfg,
> if (cfg->max_cruft_size)
> strvec_pushf(&repack, "--max-cruft-size=%lu",
> cfg->max_cruft_size);
> + if (cfg->repack_expire_to)
> + strvec_pushf(&repack, "--expire-to=%s", cfg->repack_expire_to);
> } else {
> strvec_push(&repack, "-A");
> if (cfg->prune_expire)
> @@ -675,7 +679,6 @@ struct repository *repo UNUSED)
> const char *prune_expire_sentinel = "sentinel";
> const char *prune_expire_arg = prune_expire_sentinel;
> int ret;
> -
> struct option builtin_gc_options[] = {
> OPT__QUIET(&quiet, N_("suppress progress reporting")),
> { OPTION_STRING, 0, "prune", &prune_expire_arg, N_("date"),
> @@ -694,6 +697,8 @@ struct repository *repo UNUSED)
> PARSE_OPT_NOCOMPLETE),
> OPT_BOOL(0, "keep-largest-pack", &keep_largest_pack,
> N_("repack all other packs except the largest pack")),
> + OPT_STRING(0, "expire-to", &cfg.repack_expire_to, N_("dir"),
> + N_("pack prefix to store a pack containing pruned objects")),
> OPT_END()
> };
>
> diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh
> index ee074b99b70..74f7bd09046 100755
> --- a/t/t6500-gc.sh
> +++ b/t/t6500-gc.sh
> @@ -339,6 +339,39 @@ test_expect_success 'gc.maxCruftSize sets appropriate repack options' '
> test_subcommand $cruft_max_size_opts --max-cruft-size=3145728 <trace2.txt
> '
>
> +test_expect_success '--expire-to sets repack --expire-to' '
> + rm -rf expired &&
> + mkdir expired &&
> + expire_to="$(pwd)/expired/pack" &&
> + GIT_TRACE2_EVENT=$(pwd)/trace2.txt git -C cruft--max-size gc --cruft --expire-to="$expire_to" &&
> + test_subcommand $cruft_max_size_opts --expire-to="$expire_to" <trace2.txt
> +'
> +
> +test_expect_success '--expire-to with --prune=now sets repack --expire-to' '
> + rm -rf expired &&
> + mkdir expired &&
> + expire_to="$(pwd)/expired/pack" &&
> + GIT_TRACE2_EVENT=$(pwd)/trace2.txt git -C cruft--max-size gc --cruft --prune=now --expire-to="$expire_to" &&
> + test_subcommand git repack -d -l --cruft --cruft-expiration=now --expire-to="$expire_to" <trace2.txt
> +'
> +
> +
> +test_expect_success '--expire-to with --no-cruft sets repack -A' '
> + rm -rf expired &&
> + mkdir expired &&
> + expire_to="$(pwd)/expired/pack" &&
> + GIT_TRACE2_EVENT=$(pwd)/trace2.txt git -C cruft--max-size gc --no-cruft --expire-to="$expire_to" &&
> + test_subcommand git repack -d -l -A --unpack-unreachable=2.weeks.ago <trace2.txt
> +'
> +
> +test_expect_success '--expire-to with --no-cruft sets repack -a' '
> + rm -rf expired &&
> + mkdir expired &&
> + expire_to="$(pwd)/expired/pack" &&
> + GIT_TRACE2_EVENT=$(pwd)/trace2.txt git -C cruft--max-size gc --no-cruft --prune=now --expire-to="$expire_to" &&
> + test_subcommand git repack -d -l -a <trace2.txt
> +'
> +
> run_and_wait_for_gc () {
> # We read stdout from gc for the side effect of waiting until the
> # background gc process exits, closing its fd 9. Furthermore, the
>
> base-commit: 92999a42db1c5f43f330e4f2bca4026b5b81576f |
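The new tests all follow the same pattern: run `git gc` with `GIT_TRACE2_EVENT` pointing at a file, then check that the trace records a `git repack` child process with the expected arguments. A minimal standalone sketch of that check is shown here. `saw_subcommand` is a hypothetical stand-in, loosely modeled on the test suite's `test_subcommand` helper rather than copied from it, and it assumes the trace stores the child's arguments as a JSON array such as `["git","repack","-d"]`.

```shell
# Hypothetical stand-in for test_subcommand: succeed if the trace2
# event stream on stdin contains a child process whose argv is exactly
# the given words.
# Note: uses the global variable expr.
saw_subcommand () {
	# Turn the words into a JSON array body: "git","repack","-d"
	expr=$(printf '"%s",' "$@")
	expr=${expr%,}
	# -F: match literally, so "." in 2.weeks.ago is not a wildcard.
	grep -F -q -e "[$expr]"
}
```

After something like `GIT_TRACE2_EVENT=$(pwd)/trace2.txt git gc --cruft --expire-to="$expire_to"`, one could then run `saw_subcommand git repack -d -l --cruft --cruft-expiration=2.weeks.ago --expire-to="$expire_to" <trace2.txt` to confirm which options were passed down.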
This patch series was integrated into seen via git@bfa8551. |
There was a status update in the "Cooking" section about the branch "git gc" learned the "--expire-to" option and passes it down to underlying "git repack". Will merge to 'next'. cf. <> source: <pull.1843.v4.git.1737704954987.gitgitgadget@gmail.com> |
This patch series was integrated into seen via git@9e631d0. |
This patch series was integrated into next via git@e075d60. |
This patch series was integrated into seen via git@4eb6b33. |
This patch series was integrated into seen via git@56d7e7d. |
There was a status update in the "Cooking" section about the branch "git gc" learned the "--expire-to" option and passes it down to underlying "git repack". Will merge to 'master'. source: <pull.1843.v4.git.1737704954987.gitgitgadget@gmail.com> |
This patch series was integrated into seen via git@28f130d. |
There was a status update in the "Cooking" section about the branch "git gc" learned the "--expire-to" option and passes it down to underlying "git repack". Will merge to 'master'. source: <pull.1843.v4.git.1737704954987.gitgitgadget@gmail.com> |
This patch series was integrated into seen via git@7bb073a. |
This patch series was integrated into seen via git@5b9d01b. |
This patch series was integrated into master via git@5b9d01b. |
This patch series was integrated into next via git@5b9d01b. |
Closed via 5b9d01b. |
I want to perform a "safe" garbage collection for the Git repository
on the server, which avoids data corruption issues caused by
concurrent pushes during git GC. To achieve this, I currently need to
use `git repack --cruft --expire-to=<dir>` and `git prune`
in combination. However, it would be simpler if we could directly use
`--expire-to=<dir>` with the git-gc command.

v1: add --expire-to option to gc
v1 -> v2: fix git gc --prune=now with --expire-to
v2 -> v3: squash two patches into one
v3 -> v4: modify docs and commit message, and add more tests
cc: gitster@pobox.com
cc: me@ttaylorr.com
cc: peff@peff.net
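As a rough illustration of the single-command workflow requested above, the following sketch wraps the new option in a small function. `gc_with_backup` and the paths in the usage note are hypothetical, and the command only works with a Git that includes this series. As in the tests, the value passed to `--expire-to` is treated as a pack prefix inside a directory that already exists.

```shell
# Hypothetical wrapper: garbage-collect a repository while keeping the
# objects that get pruned in a backup location.
#   $1: path to the repository
#   $2: pack prefix for the backup cruft pack, e.g. /backup/expired/pack
# Note: sets the global variables repo and expire_to.
gc_with_backup () {
	repo=$1
	expire_to=$2

	# The tests create the target directory before running gc.
	mkdir -p "$(dirname "$expire_to")" &&
	# --cruft is on by default, but being explicit does no harm.
	git -C "$repo" gc --cruft --expire-to="$expire_to"
}
```

A call such as `gc_with_backup /srv/git/project.git /backup/expired/pack` would then take the place of running `git repack` and `git prune` by hand.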