Skip to content

fix: Panic when closing DB while other goroutines doing reads #1986

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Closed
wants to merge 77 commits into from
Closed

fix: Panic when closing DB while other goroutines doing reads #1986

wants to merge 77 commits into from

Conversation

fatelei
Copy link
Contributor

@fatelei fatelei commented Aug 4, 2023

fix: Panic when closing DB while other goroutines doing reads

Problem

fix #1923

Solution

when close db, should set close flag firstly

darkn3rd and others added 30 commits April 15, 2021 08:28
zstd is not set by default even when cgo is enabled.
Add a Builder type in skiplist package which can be used to insert
sorted keys efficiently. Add a test and benchmark for it.
This change makes the skiplist grow for the case of sorted 
skiplist builder. The normal skiplist still cannot grow. 
Note: The growing skiplist is not thread safe.

Co-authored-by: Ahsan Barkati <ahsanbarkati@gmail.com>
In Dgraph, we already use Raft write-ahead log. Also, when we commit transactions, we update tens of thousands of keys in one go. To optimize this write path, this PR introduces a way to directly hand over Skiplist to Badger, short circuiting Badger's Value Log and WAL.

This feature allows Dgraph to generate Skiplists while processing mutations and just hand them over to Badger during commits. It also accepts a callback which can be run when Skiplist is written to disk. This is useful for determining when to create a snapshot in Dgraph.
…isher (#1697)

When a skip-list is handed over to badger we should also send the
entries in skiplist to the publisher so that all the subscribers get notified.
This PR adds DropPrefixNonBlocking and DropPrefixBlocking API that can be used to logically delete the data for specified prefixes.
DropPrefix now makes decision based on badger option AllowStopTheWorld whose default is to use DropPrefixBlocking.
With DropPrefixNonBlocking the data would not be cleared from the LSM tree immediately. It would be deleted eventually through compactions.

Co-authored-by: Rohan Prasad <prasad.rohan93@gmail.com>
Add benchmark tool for picktable benchmarking.
#1700)

This PR adds FullCopy option in Stream. This allows sending the table entirely to the writer. If this option is set to true we directly copy over the tables from the last 2 levels. This option increases the stream speed while also lowering the memory consumption on the DB that is streaming the KVs.
For 71GB, compressed and encrypted DB we observed 3x improvement in speed. The DB contained ~65GB in the last 2 levels while remaining in the above levels.

To use this option, the following options should be set in Stream.

stream.KeyToList = nil
stream.ChooseKey = nil
stream.SinceTs = 0
db.managedTxns = true

If we use stream writer for receiving the KVs, the encryption mode has to be the same in sender and receiver. This will restrict db.StreamDB() to use the same encryption mode in both input and output DB. Added TODO for allowing different encryption modes.
Remove "GitHub issues" reference. (we use discuss now)
Remove Datadog's ZSTD that requires CGO
Make Klauspost's ZSTD as default
This PR adds support for stream writing incrementally to the DB.
Adds an API: StreamWriter.PrepareIncremental

Co-authored-by: Manish R Jain <manish@dgraph.io>
While doing an incremental stream write, we should look at the first level on which there is no data. Earlier, due to a bug we were writing to a level that already has some tables.
I propose this simple fix for detecting conflicts in managed mode. Addresses https://discuss.dgraph.io/t/fatal-error-when-writing-conflicting-keys-in-managed-mode/14784.

When a write conflict exists for a managed DB, an internal assert can fail.
This occurs because a detected conflict is indicated with commitTs of 0, but handling the error is skipped for managed DB instances.

Rather than conflate conflict detection with a timestamp of 0, it can be indicated with another return value from hasConflict.
)

With the introduction of SinceTs, a bug was introduced #1653 that skips the pending entries.
The default value of SinceTs is zero. And for the transaction made at readTs 0, the pending entries have version set to 0. So they were also getting skipped.
* According to CHANGELOG.md#removed-apis. `TableLoadingMode` option is removed.
fixup the memory usage part of doc.

* Remove `Options.ValueLogLoadingMode` part.
requilence and others added 25 commits April 29, 2022 09:55
## Problem

Links are broken, and issue / PR templates don't exist. Codeowners is
out of date.

## Solution

We also add github templates to bring the repo to parity with Dgraph and
Ristretto. We update the codeowners file and fix some links.
## Problem

Badger tests are run through a bash script called test.sh. It is
currently not working, and there are no CI workflows on the Badger
repository.

## Solution

We update test.sh and add CI workflow steps. Test suite now works for
Linux and Mac (x86, M1). To run the test suite, simply run `make test`.
If you are on Mac, see remark below.

### Why a Makefile?

Badger depends on jemalloc for efficient memory allocation (see
[here](https://dgraph.io/blog/post/manual-memory-management-golang-jemalloc/),
and z package in Ristretto). While Badger can be built without jemalloc,
many users will probably want to benefit from it. The makefile makes it
easy to install the jemalloc dependency with `make jemalloc`. Also now
the test.sh script contains only test related functionality.

This also has the advantage that now the CI workflow only needs
essentially two steps:

```
make dependency
make test
```

### Remarks
- In pb/gen.sh, `go get` is
[deprecated](https://go.dev/doc/go-get-install-deprecation) as a way to
retrieve and install an executable into our $GOBIN, which is what we
want to do here. We use `go install` instead.
- In test.sh, Teamcity flags no longer needed
- In test.sh, Installjemalloc no longer needed as it is in the Makefile,
simplifying the test script
- In test.sh, tests are now run sequentially and not in parallel
(currently broken)
- We remove .travisci because it is unused
- Important note for Mac users: for historical reasons, certain tools on
MacOS are different from the standard GNU tools everyone else uses. One
of these is the `mktemp` command, which is used in the test suite. In
order to get the GNU version of this tool, you can run:
```
brew install coreutils
export PATH="$(brew --prefix)/opt/coreutils/libexec/gnubin:$PATH"
make test
```

This will temporarily modify your path so that the GNU version of mktemp
is called in the script. Without this fix the script will complain that
it doesn't recognize the p flag in `mktemp -d -p .`

#### To-do
- tune lint tests
- add code coverage
- potentially add build target for badger in makefile
- bring test suite to parity with [old teamcity
setup](https://teamcity.dgraph.io/project.html?projectId=Badger)
- prune dependencies
## Problem

Readme is out of date.

## Solution

We update the readme.
## Problem

Readme badges are out of date. 

## Solution

We update readme badges.
## Problem

Linter was not running due to Go config issue. See issue here
#1810.

## Solution

We refactor the setup for the linter. See relevant discussion here
golangci/golangci-lint#1920. Go setup for
linter is now on parity with the badger test CI workflow.
## Problem
 
Errcheck linter is failing.

## Solution

On [Dgraph](https://github.com/dgraph-io/dgraph/blob/main/.golangci.yml)
and
[Ristretto](https://github.com/dgraph-io/ristretto/blob/main/.golangci.yml)
repositories, neither is running errcheck. We temporarily disable
errcheck here to bring this repository to parity. We also do some
cleanup in the Readme.
## Problem

We are not reporting test coverage.

## Solution

We show test coverage with coveralls.
## Problem

Previously we were running a nightly Badger bank test (4 hours).

## Solution

We bring back the nightly badger bank test.
## Problem

Resolving more lint errors.

## Solution

We fix some basic gosimple and gofmt lint errors. We raise the lll line
length to 120 (as in Dgraph repo).
## Problem

Lint tests were failing.

## Solution
 
We comment out varcheck and gosec.  We will resolve these later.
## Problem

gas linter was running gosec, which we will resolve later.

## Solution

We comment out gas linter.
## Problem

CI jobs are not running when updates are made to release branch.

## Solution

Run all CI jobs when PR's are opened against release branch.
This PR adds CD steps for Badger releases. Artifacts (badger binary and
checksum) will be uploaded automatically to Github. Final step will be
to add artifacts to release. This reflects the process we already have
in place for Dgraph.

Badger build flags were taken from the [Dgraph release
script](https://github.com/dgraph-io/dgraph/blob/main/contrib/release.sh).
We add a Makefile to streamline the build process.

(cherry picked from commit 11c81e3)

## Remark

PR is duplicate (cherry-pick) because we have two branches running in
parallel (main and release/v3.2103).
Latest runner tag now uses ubuntu-22.04.  We pin to ubuntu 20.04.
Currently [appveyor
tests](https://ci.appveyor.com/project/manishrjain/badger/builds/42502297)
are failing in multiple places on Windows.

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg"
height="34" align="absmiddle"
alt="Reviewable"/>](https://reviewable.io/reviews/dgraph-io/badger/1775)
<!-- Reviewable:end -->
## Problem

Currently we only deploy amd64 badger CLI tool builds. We would like
arm64 builds too.

## Solution

Use an arm64 self-hosted runner to build arm64 badger CLI tool.
Signed-off-by: fatelei <fatelei@gmail.com>
@CLAassistant
Copy link

CLAassistant commented Aug 4, 2023

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
12 out of 32 committers have signed the CLA.

✅ darkn3rd
✅ SaveTheRbtz
✅ Avokadoen
✅ matino
✅ requilence
✅ smnzhu
✅ TensShinet
✅ pckhoi
✅ mjungsbluth
✅ MichelDiz
✅ joshua-goldstein
✅ fatelei
❌ atetubou
❌ MikkelHJuul
❌ bucanero
❌ NamanJain8
❌ manishrjain
❌ ahsanbarkati
❌ darkweak
❌ chenfengjin
❌ roysc
❌ Ashmita152
❌ qichengzx
❌ jalseth
❌ aman-bansal
❌ mihaiiorga
❌ danforbes
❌ iluminae
❌ dmacvicar
❌ jjzazuet
❌ BenjamenMeyer
❌ vitalvas
You have signed the CLA already but the status is still pending? Let us recheck it.

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

[BUG]: Panic when closing DB while other goroutines doing reads