WIP: Single Active Replication #21347
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #21347      +/-   ##
==========================================
+ Coverage   45.36%   46.23%   +0.86%
==========================================
  Files         244      247       +3
  Lines       13333    13883     +550
  Branches     2719     2875     +156
==========================================
+ Hits         6049     6419     +370
- Misses       6983     7127     +144
- Partials      301      337      +36
6763a2c to a997836
Thanks, @bupd, for your contribution! I believe this is a valuable new feature for Harbor that should follow the proposal process. There are several questions we need to briefly discuss:
We were planning to use
Indeed, PolicyID was never needed before. We can add it to the job arguments anyway, so that we can scan observations, parse the job arguments, and match the policyID against the current policy.
I believe even checking an artifact for existence produces an additional HTTP request. So we are still adding some extra efficiency.
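The matching step described above could look roughly like the following. This is a minimal sketch, not Harbor's actual code: the `RunningJob` type, the `has_running_job_for_policy` helper, and the assumption that job arguments are a JSON object with a `policy_id` key are all hypothetical.

```python
import json
from dataclasses import dataclass


@dataclass
class RunningJob:
    """Hypothetical stand-in for one running job observation."""
    job_id: str
    raw_args: str  # job arguments serialized as JSON (assumed layout)


def has_running_job_for_policy(jobs, policy_id):
    """Return True if any running job carries the given policy ID in its arguments."""
    for job in jobs:
        try:
            args = json.loads(job.raw_args)
        except json.JSONDecodeError:
            continue  # skip jobs whose arguments cannot be parsed
        if isinstance(args, dict) and args.get("policy_id") == policy_id:
            return True
    return False
```

A scheduler could call this before creating tasks and skip the new execution when it returns `True`.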
Regarding that statement: there is a fundamental deficit in Harbor replications. While a layer is being copied, a new replication will try to copy the same blob again. On a slow connection such as DSL, the same blob is copied over and over again. This results in a domino effect: each new replication halves the throughput, which slows the transfers, and because all replications run in parallel, the same thing happens with the next blob/layer. We have made an experiment with:
Signed-off-by: bupd <bupdprasanth@gmail.com> Co-authored-by: Maksym Trofimenko <maksym@container-registry.com>
Signed-off-by: bupd <bupdprasanth@gmail.com> Co-authored-by: Maksym Trofimenko <maksym@container-registry.com>
* Adds Skip If Running to Replication Policy Signed-off-by: bupd <bupdprasanth@gmail.com>
* Adds Checkbox in replication policy UI * Updates swagger & sql schema Signed-off-by: bupd <bupdprasanth@gmail.com>
Signed-off-by: bupd <bupdprasanth@gmail.com>
a997836 to a901167
* renamed all occurrences to single active replication * updated description and tests Signed-off-by: bupd <bupdprasanth@gmail.com>
a901167 to f50e1c2
fa151c0 to 9d9b400
Signed-off-by: bupd <bupdprasanth@gmail.com>
Signed-off-by: bupd <bupdprasanth@gmail.com>
Hello @wy65701436, below are my observations.
When single active replication is checked, replication executions per policy are limited to at most one. So if an execution is already running, other scheduled executions will be deferred.
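The "at most one execution per policy" rule can be sketched as a small guard. This is only an in-memory illustration under assumptions: the `SingleActiveGuard` class and its method names are hypothetical and do not reflect how Harbor tracks executions.

```python
import threading


class SingleActiveGuard:
    """Tracks which policy IDs currently have a running execution (hypothetical helper)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._running = set()

    def try_start(self, policy_id):
        """Mark the policy as running; return False if an execution is already active."""
        with self._lock:
            if policy_id in self._running:
                return False
            self._running.add(policy_id)
            return True

    def finish(self, policy_id):
        """Clear the running mark so the next scheduled execution can start."""
        with self._lock:
            self._running.discard(policy_id)
```

A scheduled trigger would call `try_start` first and defer the execution when it returns `False`; policies are independent of each other, so a running execution for one policy does not block another.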
Yes, exactly: Harbor only skips artifacts that have already been successfully replicated. In the scenario below I tried replicating three images of 512 MB each, but see the resulting repository quota and project quota, where the quota used should be less than 1.5 GB.
Below is the overall quota, attached here for reference. Thanks. You can check the two instances used in this experiment: reg1.bupd.xyz and reg2.bupd.xyz
Observation on Replication Performance with and without Feature
Speed Test Results:
Upload Speed: 9.50 Mbps (Data Used: 4.3 MB)
Images to Replicate: Variations of docker.io/vad1mo/1gb-random-file
Workflow 1 (with Feature):
1 x image (512 MB)
Insight
Replication with the feature enabled completed within the expected timeframe, showing stable upload speed and no packet loss.
Workflow 2 (without Feature):
Started: 2:55 PM
Status of Instance
Insight
The replication process failed without the feature enabled, highlighting that the feature is likely essential for successful image transfers between reg1.bupd.xyz and reg2.bupd.xyz. This indicates that the feature could be critical for ensuring stable uploads and completing transfers within the expected time.
Conclusion:
This observation suggests that this feature is necessary for reliable replication between registries. The data transfer rates and lack of packet loss when the feature is used make it an essential component for stable image replication.
you can say a workaround here:
The workaround of setting a longer replication interval, such as once a day, fails to address the need for timely synchronization across registries. For users who rely on Harbor to keep registries at different locations identical, frequent replication (e.g., every 5 minutes) is necessary to keep discrepancies between registries minimal. With a longer interval, users may end up with outdated or inconsistent images, which undermines the core purpose of replication. Thanks.
checks policyID before creating tasks Signed-off-by: Maksym Trofimenko <maksym@container-registry.com>
Head branch was pushed to by a user without write access
Signed-off-by: Maksym Trofimenko <maksym@container-registry.com>
Single Active Replication per replication policy
Proposal: goharbor/community#256
Summary
This PR addresses a long-standing issue where overlapping replications of the same artifact can occur in Harbor, leading to unnecessary resource consumption and poor performance. By introducing a "Disable parallel replication" checkbox in the replication policy, it ensures that replication tasks for the same policy do not run in parallel, preventing bandwidth overload and queue backups, especially for large artifacts.
Similar Issues
Related Issues
Why do we need this
Changes Made
Single active replication Checkbox in Replication UI.
single_active_replication column in sql.
Screenshots
Observation on Replication Performance with and without Feature
Speed Test Results:
Upload Speed: 9.50 Mbps (Data Used: 4.3 MB)
Latency: 37.96 ms (Jitter: 1.76 ms, Min: 25.11 ms, Max: 43.32 ms)
Packet Loss: 0.0%
Result URL: Speedtest Result
Images to Replicate: Variations of docker.io/vad1mo/1gb-random-file
Workflow 1 (with Feature):
1 x image (512 MB)
3 x images (~1.5 GB)
From: reg1.bupd.xyz to reg2.bupd.xyz
Replication: Normal
Started: 2:27 PM
Completed: 2:49 PM
Bandwidth Used: 1.42 GB
Theoretical Time: 22.5 minutes for 1.5 GB
Actual Time: 22 minutes (No packet loss or bandwidth issues)
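The theoretical time can be checked with a short calculation. This is a sketch under assumptions: the helper name `transfer_minutes` is hypothetical, the link is assumed to run at the measured 9.50 Mbps for the whole transfer, and protocol overhead is ignored.

```python
def transfer_minutes(size_bytes, link_mbps):
    """Ideal transfer time in minutes for size_bytes over a link of link_mbps (10^6 bits/s)."""
    bits = size_bytes * 8
    seconds = bits / (link_mbps * 1_000_000)
    return seconds / 60


# 1.5 GB read as binary gigabytes (GiB): about 22.6 minutes
binary_minutes = transfer_minutes(1.5 * 2**30, 9.5)

# 1.5 GB read as decimal gigabytes: about 21.1 minutes
decimal_minutes = transfer_minutes(1.5 * 10**9, 9.5)
```

Either reading lands close to the 22.5 minutes quoted above, and to the 22 minutes actually measured.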
Insight
Replication with the feature enabled completed within the expected timeframe, showing stable upload speed and no packet loss.
Workflow 2 (without Feature):
Started: 2:55 PM
Name: Destroyer
Bandwidth Used: 13+ GB
Result: Failed
Time Taken: ~4+hrs
Status of Instance (no longer functioning)
Insight
The replication process failed without the feature enabled, highlighting that the feature is likely essential for successful image transfers between reg1.bupd.xyz and reg2.bupd.xyz. This indicates that the feature could be critical for ensuring stable uploads and completing transfers within the expected time.
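Why overlapping executions hurt so much can be illustrated with a toy model. This is a sketch under loud assumptions, not a reproduction of the experiment: the function `simulate` is hypothetical, the 300-second trigger interval is an assumed value, every overlapping execution is assumed to re-send the full payload, and the link is assumed to be shared equally between in-flight executions.

```python
def simulate(size_mb, link_mbps, interval_s, horizon_s, single_active):
    """Toy model: return (completed executions, total MB sent) after horizon_s seconds."""
    size_mbit = size_mb * 8.0
    active = []        # remaining megabits for each in-flight execution
    completed = 0
    sent_mbit = 0.0
    for t in range(horizon_s):
        # A scheduled trigger fires every interval_s seconds.
        if t % interval_s == 0:
            # With single_active, the trigger is skipped while an execution runs.
            if not (single_active and active):
                active.append(size_mbit)
        if active:
            share = link_mbps / len(active)  # equal share of the link (assumption)
            still_running = []
            for remaining in active:
                sent = min(share, remaining)
                sent_mbit += sent
                if remaining - sent > 0:
                    still_running.append(remaining - sent)
                else:
                    completed += 1
            active = still_running
    return completed, sent_mbit / 8.0


# Assumed inputs: 1500 MB payload, 9.5 Mbps link, trigger every 300 s, 1400 s window.
done_single, mb_single = simulate(1500, 9.5, 300, 1400, single_active=True)
done_parallel, mb_parallel = simulate(1500, 9.5, 300, 1400, single_active=False)
```

In this model the single-active run finishes its one execution inside the window, while the parallel run keeps the link saturated without completing anything, because each new execution shrinks the share of the ones already in flight.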
Conclusion:
This observation suggests that this feature is necessary for reliable replication between registries. The data transfer rates and lack of packet loss when the feature is used make it an essential component for stable image replication.
Todo
Please indicate you've done the following:
Fix #19937