[fix] [broker] Part-1: Replicator can not created successfully due to an orphan replicator in the previous topic owner #21946
@poorbarcode Does this PR fix the issue mentioned in #21203?
Yes, the current PR also fixed the issue that #21203 tries to fix.
Is it possible to add a test to cover this case?
And it looks like we can simplify the fix by adding a new method `terminate()` to the replicator so that we don't need to mix the `closeProducer` and `closeReplicator` logic.
Rebase master
… an orphan replicator in the previous topic owner (apache#21946)
Because there are too many conflicts and there are no new releases for
… an orphan replicator in the previous topic owner (apache#21946) (cherry picked from commit 4924052) (cherry picked from commit 670aff0)
Motivation
There is a race condition that leaves an orphan replicator on the original owner of a topic and prevents the new owner of the topic from starting a replicator, which fails with `org.apache.pulsar.broker.service.BrokerServiceException$NamingException Producer with name 'pulsar.repl.{local_cluster}-->{remote_cluster}' is already connected to topic`.
- Scenario 1
- Scenario 2 (`replication_clusters`)

The current PR focuses on Scenario 1.
Steps of Scenario 1
Concurrent operations: `thread start replicator` and `unload bundle`.
- `pulsar.repl`
- closing
- `replicator.disconnect`
- `replicator.stat --> Stopped`
- `replicator.stat --> Starting`
- `replicator.stat --> Started`
- `readMoreEntries`: since there are no entries to read, this request is left pending
- `pulsar.repl`
- `Producer with name 'pulsar.repl.{local_cluster}-->{remote_cluster}' is already connected to topic`
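To illustrate why an orphan producer blocks the new owner, here is a minimal sketch of a name-keyed producer table that rejects a second connection under the same name. `ProducerNameRegistry`, the broker ids, and the cluster names `c1`/`c2` are hypothetical and only stand in for the real broker logic; this is not Pulsar's implementation.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified stand-in for a topic's producer table (not Pulsar code).
final class ProducerNameRegistry {
    // producer name -> id of the broker currently holding that name (assumed model)
    private final Map<String, String> ownerByName = new ConcurrentHashMap<>();

    /** Registers the producer name, or throws if the name is already taken. */
    void connect(String producerName, String brokerId) {
        String existing = ownerByName.putIfAbsent(producerName, brokerId);
        if (existing != null) {
            throw new IllegalStateException(
                    "Producer with name '" + producerName + "' is already connected to topic");
        }
    }

    /** Releases the name, but only if it is held by the given broker. */
    boolean disconnect(String producerName, String brokerId) {
        return ownerByName.remove(producerName, brokerId);
    }
}

final class OrphanProducerDemo {
    public static void main(String[] args) {
        ProducerNameRegistry remoteTopic = new ProducerNameRegistry();
        String name = "pulsar.repl.c1-->c2"; // hypothetical cluster names

        // The old owner connects and, because of the race, never disconnects.
        remoteTopic.connect(name, "old-broker");

        // The new owner tries to start its replicator under the same name.
        try {
            remoteTopic.connect(name, "new-broker");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model the new owner can only connect after the old owner releases the name, which is exactly what the orphan never does.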
Modifications
- Split `Replicator.State.Stopped` into `Producer_Stopped` and `Closed`.
- Add `terminate` to close the Replicator. `disconnect` is only used to close the internal producer.

A case that hit this issue
Picture-1: An orphan producer was left in the old broker; it is not associated with any topic/replicator.

Picture-2: After the topic is transferred to the new broker, it cannot start a new Replicator successfully.

Since the scenario is too complex, I cannot add a test, but I reproduced Scenario 1 locally.
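For readers following the Modifications section, the difference between `disconnect` and `terminate` can be sketched as a small state machine. The state names `Starting`, `Started`, `Producer_Stopped`, and `Closed` come from this description; the class `ReplicatorSketch`, its fields, and the synchronous `startProducer` are assumptions made for illustration and do not mirror `AbstractReplicator`.

```java
// Hypothetical sketch of the state split; not the code changed by this PR.
final class ReplicatorSketch {
    enum State { Starting, Started, Producer_Stopped, Closed }

    private State state = State.Producer_Stopped;
    private boolean producerOpen = false;

    /** Starts the internal producer; refused unless the producer is merely stopped. */
    synchronized boolean startProducer() {
        if (state != State.Producer_Stopped) {
            return false; // already starting/started, or Closed for good
        }
        state = State.Starting;
        producerOpen = true; // stands in for the asynchronous producer creation
        state = State.Started;
        return true;
    }

    /** Closes only the internal producer; the replicator may be started again. */
    synchronized void disconnect() {
        if (state == State.Closed) {
            return; // a closed replicator stays closed
        }
        producerOpen = false;
        state = State.Producer_Stopped;
    }

    /** Closes the replicator itself; this state is terminal. */
    synchronized void terminate() {
        producerOpen = false;
        state = State.Closed;
    }

    synchronized State state() {
        return state;
    }

    synchronized boolean isProducerOpen() {
        return producerOpen;
    }
}

final class ReplicatorSketchDemo {
    public static void main(String[] args) {
        ReplicatorSketch replicator = new ReplicatorSketch();
        replicator.startProducer();
        replicator.disconnect();                        // producer stopped, restart allowed
        System.out.println(replicator.startProducer()); // restart succeeds
        replicator.terminate();                         // e.g. when the topic is unloaded
        System.out.println(replicator.startProducer()); // late start request is refused
    }
}
```

The point of the terminal `Closed` state in this sketch is that a start request arriving after the topic has been closed is refused, so no producer is left behind on the old owner.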


#21948 fixes the following issues:
- `topic.unfenceTopicToResume` after `topic.close` failed.

Documentation
- `doc`
- `doc-required`
- `doc-not-needed`
- `doc-complete`

Matching PR in forked repository
PR in forked repository: x