Reinstate read-only lock on hooks access in dialHook to fix data race #3225

@LINKIWI LINKIWI commented Jan 11, 2025

Previously, in #3088, I removed the mutex guarding the implementation of dialHook to resolve an unbounded contention failure mode that could backpressure commands indefinitely during periods of server downtime.

However, that change reintroduced a data race, the very problem the lock was originally added to fix in #2814.

A minimal reproduction is as follows:

package main

import (
	"context"
	"fmt"

	"github.com/redis/go-redis/v9"
)

type h struct{}

func (h *h) DialHook(next redis.DialHook) redis.DialHook {
	return next
}

func (h *h) ProcessHook(next redis.ProcessHook) redis.ProcessHook {
	return next
}

func (h *h) ProcessPipelineHook(next redis.ProcessPipelineHook) redis.ProcessPipelineHook {
	return next
}

func exec() {
	ctx := context.Background()
	opts := &redis.Options{
		MinIdleConns: 5,
	}

	client := redis.NewClient(opts)
	client.AddHook(&h{})

	fmt.Println(client.Ping(ctx))
}

func main() {
	exec()
}

package main

import (
	"testing"
)

func TestExec(t *testing.T) {
	exec()
}

$ go test -v -race -count=1 ./cmd/...
=== RUN   TestExec
==================
WARNING: DATA RACE
Write at 0x00c0000f6130 by goroutine 8:
  github.com/redis/go-redis/v9.(*hooksMixin).chain()
      /home/kiwi/sync/code/external/go-redis/redis.go:126 +0x128
  github.com/redis/go-redis/v9.(*hooksMixin).AddHook()
      /home/kiwi/sync/code/external/go-redis/redis.go:117 +0x1b1
  github.com/redis/go-redis/v9/cmd.exec()
      /home/kiwi/sync/code/external/go-redis/cmd/main.go:31 +0xaa
  github.com/redis/go-redis/v9/cmd.TestExec()
      /home/kiwi/sync/code/external/go-redis/cmd/main_test.go:8 +0x1c
  testing.tRunner()
      /usr/lib/go/src/testing/testing.go:1690 +0x226
  testing.(*T).Run.gowrap1()
      /usr/lib/go/src/testing/testing.go:1743 +0x44

Previous read at 0x00c0000f6130 by goroutine 9:
  github.com/redis/go-redis/v9.(*hooksMixin).dialHook()
      /home/kiwi/sync/code/external/go-redis/redis.go:183 +0x8c
  github.com/redis/go-redis/v9.(*hooksMixin).dialHook-fm()
      <autogenerated>:1 +0x8f
  github.com/redis/go-redis/v9.newConnPool.func1()
      /home/kiwi/sync/code/external/go-redis/options.go:516 +0x9a
  github.com/redis/go-redis/v9/internal/pool.(*ConnPool).dialConn()
      /home/kiwi/sync/code/external/go-redis/internal/pool/pool.go:213 +0x16c
  github.com/redis/go-redis/v9/internal/pool.(*ConnPool).addIdleConn()
      /home/kiwi/sync/code/external/go-redis/internal/pool/pool.go:143 +0x54
  github.com/redis/go-redis/v9/internal/pool.(*ConnPool).checkMinIdleConns.func1()
      /home/kiwi/sync/code/external/go-redis/internal/pool/pool.go:126 +0x2e

Goroutine 8 (running) created at:
  testing.(*T).Run()
      /usr/lib/go/src/testing/testing.go:1743 +0x825
  testing.runTests.func1()
      /usr/lib/go/src/testing/testing.go:2168 +0x85
  testing.tRunner()
      /usr/lib/go/src/testing/testing.go:1690 +0x226
  testing.runTests()
      /usr/lib/go/src/testing/testing.go:2166 +0x8be
  testing.(*M).Run()
      /usr/lib/go/src/testing/testing.go:2034 +0xf17
  main.main()
      _testmain.go:45 +0x164

Goroutine 9 (running) created at:
  github.com/redis/go-redis/v9/internal/pool.(*ConnPool).checkMinIdleConns()
      /home/kiwi/sync/code/external/go-redis/internal/pool/pool.go:125 +0x6d
  github.com/redis/go-redis/v9/internal/pool.NewConnPool()
      /home/kiwi/sync/code/external/go-redis/internal/pool/pool.go:109 +0x25e
  github.com/redis/go-redis/v9.newConnPool()
      /home/kiwi/sync/code/external/go-redis/options.go:514 +0x35d
  github.com/redis/go-redis/v9.NewClient()
      /home/kiwi/sync/code/external/go-redis/redis.go:665 +0x208
  github.com/redis/go-redis/v9/cmd.exec()
      /home/kiwi/sync/code/external/go-redis/cmd/main.go:30 +0xa4
  github.com/redis/go-redis/v9/cmd.TestExec()
      /home/kiwi/sync/code/external/go-redis/cmd/main_test.go:8 +0x1c
  testing.tRunner()
      /usr/lib/go/src/testing/testing.go:1690 +0x226
  testing.(*T).Run.gowrap1()
      /usr/lib/go/src/testing/testing.go:1743 +0x44
==================
ping: PONG
    testing.go:1399: race detected during execution of test
--- FAIL: TestExec (0.00s)
FAIL
FAIL	github.com/redis/go-redis/v9/cmd	0.009s
FAIL

The race is caused by concurrent access to hs.current: when MinIdleConns > 0, the connection pool executes dialHook in a background goroutine while AddHook mutates hs.current. dialHook itself only needs read access. This PR changes the mutex to a sync.RWMutex and guards only the access to hs.current with the lock, which fixes the data race without regressing the connection contention unit test introduced in #3088.

With this patch, the example test above passes with the race detector enabled:

$ go test -v -race -count=1 ./cmd/...
=== RUN   TestExec
ping: PONG
--- PASS: TestExec (0.00s)
PASS
ok  	github.com/redis/go-redis/v9/cmd	1.010s


LINKIWI commented Jan 15, 2025

@ofekshenawa Would you be able to help review this one? It is a follow-up to #3088. Thanks.
