-
-
Notifications
You must be signed in to change notification settings - Fork 5.7k
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Make LFS http_client parallel within a batch. #32369
Conversation
This is somewhat related to #32307 , as without that mirroring upstream lfs files is limited to 20 concurrent threads. For my own internal mirroring, setting |
modules/lfs/http_client.go
Outdated
} else { | ||
err := dc(object.Pointer, nil, object.Error) | ||
if errors.Is(object.Error, ErrObjectNotExist) { | ||
log.Warn("Ignoring missing upstream LFS object %-v: %v", object.Pointer, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to point out that this actually reduces the usefulness of this logline. I moved it out of modules/repository/repo.go
, but there it had access to which repository it was operating on, so you know where to investigate if you want to track down the lfs pointer/blob. Here in modules/lfs/http_client.go
, I don't have access to the repository name. I was thinking I could add a value to the Context which is passed down from repo to here?
dbb14d5
to
82754f0
Compare
modules/lfs/http_client.go
Outdated
} | ||
continue | ||
} | ||
func(groupCtx context.Context, object *ObjectResponse, dc DownloadCallback, uc UploadCallback, transferAdapter TransferAdapter) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://go.dev/blog/loopvar-preview
No need such trick?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please advise, are you suggesting that I can just do this?:
for _, object := range result.Objects {
go performSingleOperation(groupCtx, object, dc, uc, transferAdapter)
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, I think so if it is right.
Maybe just for { errGroup.Go(...) }
is enough?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So actually, this closure seems necessary. Admittedly I used this pattern because I'd seen it in practice elsewhere, not because I fully understood why it was in use, but errGroup.Go
only takes a function without arguments; you cannot pass arguments to functions inside errGroup. An anonymous function enclosure like this is the only way I see to do this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Admittedly I used this pattern because I'd seen it in practice elsewhere,
In history, it is due to https://go.dev/blog/loopvar-preview
(https://github.com/go-gitea/gitea/pull/32369/files#r1821738812)
not because I fully understood why it was in use
Well, that's why I would like to clarify the problem. My philosophy is that "do not write the code which you do not understand"
but
errGroup.Go
only takes a function without arguments; you cannot pass arguments to functions inside errGroup.
I do not see why it is a problem.
(the demo code in https://github.com/go-gitea/gitea/pull/32369/files#r1823705885)
for _, object := range result.Objects {
errGroup.Go(func() error {
return performSingleOperation(groupCtx, object, dc, uc, transferAdapter)
})
}
76ccd7c
to
46b730a
Compare
had one more force-push after |
modules/lfs/http_client.go
Outdated
return err | ||
} | ||
// this was NOT an 'upload' request inside the batch request, meaning it must be a 'download' request | ||
err := dc(object.Pointer, nil, object.Error) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still have the question here (a new thread):
Now the code is:
if uc != nil {
if err := uc(); err != nil {
return err
}
// did you miss a return here?
}
// It seems that when uc != nil and uc() == nil, then dc is called by mistake?
err := dc()
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question, but no I believe this is correct. What I think may be confusing is that the existing code on main assumes that the uploadCallback will return an error if it is called with a non-nil object.Error. This whole closure is predicated by that, we are guaranteed to have a non-nil object.Error and thus the existing logic (which perhaps not very defensive) is not flawed, a call to uc or dc will always also return an error. So the flow of this code is like:
- object.Error is non-nil, meaning this specific oid within a batch has some kind of error, maybe it was an upload request, maybe it was a download request, we don't know yet
- if the upload callback was non-nill, it means this was an upload request, execute the upload callback with the object.Error and get a guaranteed error returned
- if it was not an upload callback, we must have a download callback, as mentioned in the previous thread there are currently only two kinds of operations allowed by the lfs batch api. Previously this was in an 'else' block, I removed that by advisement of the linter, as the else block would always get executed and it outdents this code even further.
Put another way, when object.Error != nil
this closure will always return either error
from uploadCallback or downloadCallback, OR nil
from downloadCallback if the error should be ignored (404). No other value can be returned from this, it's just a switch for executing the correct upload/dowload callbacks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2. if the upload callback was non-nil, it means this was an upload request, execute the upload callback with the object.Error and get a guaranteed error returned
That's the problem. Which test or document guarantees that "upload callback returns an error" in this case?
The changed logic is fragile:
- It does change the original if-flow -- if "upload callback" returns nil. The old code just "continues" to next without calling "download callback", but new code calls "download callback".
- It is counter-intuitive, code readers must think a lot more about how "upload callback" work to understand whether there is a bug.
- If the upload callback behaviors would be changed in the future (for example: ignores some errors), the new logic is difficult to maintain.
To be intuitive, defensive programming, and easier to understand, I think we should choose one explicit behavior:
if uc != nil {
if err := uc(); err != nil {
return err
}
// choose one explicit behavior:
// * "return nil" means the upload callback could decide to ignore some kinds of errors, not stop the uploading
// * "return err" means whatever upload callback does, any error should stop the uploading
//
// Actually "return nil" is the old behavior, I think it was well-designed,
// and it makes it possible to ignore some errors.
return nil
}
46b730a
to
ef84dfa
Compare
@@ -113,6 +115,7 @@ func (c *HTTPClient) Upload(ctx context.Context, objects []Pointer, callback Upl | |||
return c.performOperation(ctx, objects, nil, callback) | |||
} | |||
|
|||
// performOperation takes a slice of LFS object pointers, batches them, and performs the upload/download operations concurrently in each batch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One more thing, I do not think "batch size" means "concurrency limit" in this case. If I understand correctly:
- "batch size" means how many objects to be processed in one time.
- "concurrency limit" means how many concurrent connections to the LFS server.
Maybe "batch size" could be 100 or 1000 without causing problem, but "concurrency limit" should be much smaller, eg: 5 or 10. A lot of connections to a LFS server might trigger their rate-limit protection, or be considered as somewhat DoS?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm always in favor of making things more configurable, however as an administrator I think most of the time you would want [lfs_client].BATCH_SIZE
= (the number of concurrent threads for upload/download within a batch). I'm happy to add a new config though, what should we call it and what should its default be? [lfs_client].BATCH_MAX_THREADS
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe [lfs_client].MAX_OPERATION_CONCURRENCY
? (since there is no "thread" concept in Golang)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair, how about [lfs_client].BATCH_OPERATION_CONCURRENCY
, as I feel there may be other single-threaded operations still in this code that we might want configs for then, and this is specific to lfs client batch operations? I updated this branch with that config, and defaulted it to what [lfs_client].BATCH_SIZE
if unset or <1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to match the default git-lfs's behavior:
lfs.concurrenttransfers
The number of concurrent uploads/downloads. Default 3.
2e5a251
to
13f2a7c
Compare
Signed-off-by: Royce Remer <royceremer@gmail.com>
13f2a7c
to
4653cad
Compare
last force-push was just a rebase from main, this should be ready to review again. I believe I've addressed all comments although suspect there may be opinions on my choice of the new config key under |
Made 2 commits:
|
@@ -66,6 +67,11 @@ func loadLFSFrom(rootCfg ConfigProvider) error { | |||
LFSClient.BatchSize = 20 | |||
} | |||
|
|||
if LFSClient.BatchOperationConcurrency < 1 { | |||
// match the default git-lfs's `lfs.concurrenttransfers` | |||
LFSClient.BatchOperationConcurrency = 3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the default in lfs client is actually 8, not 3: https://github.com/git-lfs/git-lfs/blob/main/lfshttp/client.go#L89 , but I've documented what you set here: https://gitea.com/gitea/docs/pulls/88
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch .... actually there are different documents (I read an out-dated one) .... the default value also ever changed. 😅
The new document: https://github.com/git-lfs/git-lfs/blob/main/docs/man/git-lfs-config.adoc also says 8
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will update it to 8 and update the documents.
-> Use 8 as default value for git lfs concurrency #32421
Thank you very much.
* giteaofficial/main: (21 commits) Use 8 as default value for git lfs concurrency (go-gitea#32421) Fix milestone deadline and date related problems (go-gitea#32339) Only query team tables if repository is under org when getting assignees (go-gitea#32414) Refactor RepoRefByType (go-gitea#32413) Refactor template ctx and render utils (go-gitea#32422) Refactor DateUtils and merge TimeSince (go-gitea#32409) Refactor markup package (go-gitea#32399) Add some handy markdown editor features (go-gitea#32400) Make LFS http_client parallel within a batch. (go-gitea#32369) Refactor repo legacy (go-gitea#32404) Replace DateTime with proper functions (go-gitea#32402) Fix git error handling (go-gitea#32401) Fix created_unix for mirroring (go-gitea#32342) Replace DateTime with DateUtils (go-gitea#32383) improve performance of diffs (go-gitea#32393) Refactor tests to prevent from unnecessary preparations (go-gitea#32398) Add artifacts test fixture (go-gitea#30300) Fix `missing signature key` error when pulling Docker images with `SERVE_DIRECT` enabled (go-gitea#32365) Fix a number of typescript issues (go-gitea#32308) Update go dependencies (go-gitea#32389) ...
No description provided.