[improve][ci] Disable test that causes OOME until the problem has been resolved #22586
In one of the heap dumps, there were 251,029 lambdas which all reference a
Using https://github.com/vlsi/mat-calcite-plugin to query the heap dump:
select this['arg$2.completeTopicName'], count(*) from "org.apache.pulsar.broker.resources.NamespaceResources$PartitionedTopicResources$$Lambda$1819+0x00007f08a8b65ee8" group by 1
In another heap dump:
select this['arg$2.completeTopicName'], count(*) from "org.apache.pulsar.broker.resources.NamespaceResources$PartitionedTopicResources$$Lambda$3405+0x00007fae50f7b000" group by 1
There are a few recent replicator-related changes: #21946, #21948 and #22537. @poorbarcode please check whether one of these changes is triggering the OOME issue, which is possibly related to deletion. There are a lot of entries for
Just wondering if the problem is somehow related to namespace deletion with replication enabled:
pulsar/pulsar-broker/src/main/java/org/apache/pulsar/broker/admin/impl/NamespacesBase.java, lines 216 to 344 in d7d5452
The concurrency issue is explained in #22541 (comment).
The namespace deletion in the test might be the code that triggers the problem:
pulsar/pulsar-broker/src/test/java/org/apache/pulsar/broker/service/ReplicatorSubscriptionTest.java, lines 821 to 824 in e81a20d
@poorbarcode do you have a chance to debug this issue?
There are more problems. Using the heap dump from https://github.com/apache/pulsar/actions/runs/8835173621/attempts/1?pr=22583:
select toString(this['stack.fn.arg$1']), count(*) from java.util.concurrent.CompletableFuture where this['stack.fn'] is not null group by 1 order by 2 desc
select toString(this['result.ex.detailMessage']), count(*) from java.util.concurrent.CompletableFuture where this['result.ex.detailMessage'] is not null group by 1 order by 2 desc
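For context on why the first query filters on `stack.fn`: in OpenJDK's implementation, a `CompletableFuture` that has not completed yet keeps its pending dependent actions (and whatever their lambdas capture) linked from its internal `stack` field, so a future that never completes retains all of them. The following is a minimal sketch of that behaviour using only the public `getNumberOfDependents()` method. The class and method names are hypothetical and unrelated to the Pulsar code, and it does not claim to reproduce the actual leak.

```java
import java.util.concurrent.CompletableFuture;

public class PendingFutureSketch {

    // Chains `callbacks` dependent stages onto `source` and returns how many
    // dependents the source reports afterwards. While `source` is incomplete,
    // each dependent (and the lambda it wraps) stays reachable from it.
    static int attachCallbacks(CompletableFuture<String> source, int callbacks) {
        for (int i = 0; i < callbacks; i++) {
            final int id = i; // captured by the lambda, so retained with it
            source.thenAccept(value ->
                    System.out.println("callback " + id + " got " + value));
        }
        return source.getNumberOfDependents();
    }

    // Completes the future exceptionally and returns the number of dependents
    // still registered afterwards; completion releases the pending actions.
    static int dependentsAfterFailure(CompletableFuture<String> source) {
        source.completeExceptionally(new IllegalStateException("illustrative failure"));
        return source.getNumberOfDependents();
    }

    public static void main(String[] args) {
        CompletableFuture<String> neverCompleted = new CompletableFuture<>();
        System.out.println("pending dependents: "
                + attachCallbacks(neverCompleted, 1000));
        System.out.println("dependents after completion: "
                + dependentsAfterFailure(neverCompleted));
    }
}
```

Grouping such pending futures by what their lambdas capture, as the query does, helps show which code path registers callbacks on futures that are never completed.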
Motivation
Unit test group 1 fails often with OOME. (example)
Modifications
The issue is most likely related to #21495 and org.apache.pulsar.broker.service.ReplicatorSubscriptionTest#testWriteMarkerTaskOfReplicateSubscriptions.
Disable the test until the problem has been resolved.
Documentation
doc
doc-required
doc-not-needed
doc-complete