-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Fix manifest corruption #1756
Fix manifest corruption #1756
Conversation
…ncy on os.file.Sync()
…y on os.file.Sync()
4f2f2cf
to
bf9578f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome find @iluminae! Thanks for fixing it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 2 files reviewed, 2 unresolved discussions (waiting on @iluminae and @manishrjain)
a discussion (no related file):
Awesome @iluminae, this is a nice catch.
Thanks @jarifibrahim for the review. :)
fyi @billprovince @joshua-goldstein - +1 from me to get this in |
…rrent compactions (#1756) `manifestFile.rewrite` closes the file due to which `manifest.fp.Sync()` can fail. This leads to an updated manifest but non-updated tables set. Hence, the compaction process enters into an infinite loop failing with the `MANIFEST removes non-existing table` error. This PR fixes it. (cherry picked from commit 6ed45ae)
…rrent compactions (#1756) `manifestFile.rewrite` closes the file due to which `manifest.fp.Sync()` can fail. This leads to an updated manifest but non-updated tables set. Hence, the compaction process enters into an infinite loop failing with the `MANIFEST removes non-existing table` error. This PR fixes it. (cherry picked from commit 6ed45ae)
…rrent compactions (#1756) `manifestFile.rewrite` closes the file due to which `manifest.fp.Sync()` can fail. This leads to an updated manifest but non-updated tables set. Hence, the compaction process enters into an infinite loop failing with the `MANIFEST removes non-existing table` error. This PR fixes it. (cherry picked from commit 6ed45ae)
…rrent compactions (#1756) (#1820) This PR is a cherry pick of #1756. Addresses [this issue](https://discuss.dgraph.io/t/badgerdb-manifest-corruption-issue-solution/15915) on Discuss and[ this issue](#1819) on Badger. Co-authored-by: Kenan Kessler <kesken88@gmail.com>
…rrent compactions (#1756) (#1820) This PR is a cherry pick of #1756. Addresses [this issue](https://discuss.dgraph.io/t/badgerdb-manifest-corruption-issue-solution/15915) on Discuss and[ this issue](#1819) on Badger. Co-authored-by: Kenan Kessler <kesken88@gmail.com>
Update Badger from `7677fcb` to `6ed45ae` (on main-deprecated branch). This fixes dgraph-io/badger#1752 and dgraph-io/badger#1756. First CI failed [here](https://github.com/dgraph-io/dgraph/actions/runs/4208376321/jobs/7304322951). This issue was resolved by bumping up the Go version used in the container.
…rrent compactions (dgraph-io#1756) `manifestFile.rewrite` closes the file due to which `manifest.fp.Sync()` can fail. This leads to an updated manifest but non-updated tables set. Hence, the compaction process enters into an infinite loop failing with the `MANIFEST removes non-existing table` error. This PR fixes it.
This fixes the manifest corruption posted about here
I was able to reproduce the race condition every time by injecting a sleep into the sync call during tests only. I am not sure if you guys would rather have this or a test that is much longer and may or may not reproduce based on high concurrency (but does not inject the sleep) or no test at all.
Let me know - I made it so we can just revert the test commit and it should be good to go.
This change is