[BUG] Observability plugin calling createIndex from onNodeStarted method in 2.x branches #1883

Open
shwetathareja opened this issue Nov 11, 2024 · 4 comments
Labels: bug (Something isn't working)

Comments

@shwetathareja
Member

What is the bug?

The Observability plugin implements the ClusterPlugin interface:

class ObservabilityPlugin : Plugin(), ActionPlugin, ClusterPlugin, SystemIndexPlugin {

In onNodeStarted it calls ObservabilityIndex.afterStart() (see the stack trace below), which calls createIndex:

override fun afterStart() {
    // create default index
    createIndex()

ClusterPlugin.onNodeStarted is called during node bootstrap (Node.java), and this call is causing an unhandled exception in the main thread:

    pluginsService.filterPlugins(ClusterPlugin.class).forEach(plugin -> plugin.onNodeStarted(clusterService.localNode()));

At this point the node is not fully initialized and the cluster has not formed, so why is the plugin trying to create an index from here?
The main branch does not appear to have the same code. Why have the branches diverged?

OpenSearchTimeoutException[java.util.concurrent.TimeoutException: Timeout waiting for task.]; nested: TimeoutException[Timeout waiting for task.];
Likely root cause: java.util.concurrent.TimeoutException: Timeout waiting for task.
        at org.opensearch.common.util.concurrent.BaseFuture$Sync.get(BaseFuture.java:257)
        at org.opensearch.common.util.concurrent.BaseFuture.get(BaseFuture.java:82)
        at org.opensearch.common.util.concurrent.FutureUtils.get(FutureUtils.java:94)
        at org.opensearch.action.support.AdapterActionFuture.actionGet(AdapterActionFuture.java:79)
        at org.opensearch.action.support.AdapterActionFuture.actionGet(AdapterActionFuture.java:68)
        at org.opensearch.observability.index.ObservabilityIndex.createIndex(ObservabilityIndex.kt:108)
        at org.opensearch.observability.index.ObservabilityIndex.afterStart(ObservabilityIndex.kt:91)
        at org.opensearch.observability.ObservabilityPlugin.onNodeStarted(ObservabilityPlugin.kt:84)
        at org.opensearch.plugins.ClusterPlugin.onNodeStarted(ClusterPlugin.java:111)
        at org.opensearch.node.Node.lambda$start$37(Node.java:1768)
        at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
        at org.opensearch.node.Node.start(Node.java:1768)
        at org.opensearch.bootstrap.Bootstrap.start(Bootstrap.java:339)
        at org.opensearch.bootstrap.Bootstrap.init(Bootstrap.java:413)
        at org.opensearch.bootstrap.OpenSearch.init(OpenSearch.java:181)
        at org.opensearch.bootstrap.OpenSearch.execute(OpenSearch.java:172)
        at org.opensearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:104)
        at org.opensearch.cli.Command.mainWithoutErrorHandling(Command.java:138)
        at org.opensearch.cli.Command.main(Command.java:101)
        at org.opensearch.bootstrap.OpenSearch.main(OpenSearch.java:138)
        at org.opensearch.bootstrap.OpenSearch.main(OpenSearch.java:104)

How can one reproduce the bug?
If cluster manager election takes longer than the wait inside createIndex, the call times out and the exception goes unhandled in the main thread.
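
The failure mode can be illustrated outside OpenSearch. The following is a minimal sketch in plain Java, not the plugin's code: `createIndexAsync`, the delays, and the returned strings are all hypothetical stand-ins. It only shows that a bounded blocking wait in a startup hook succeeds when the election is quick and times out when the election is slow.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BlockingStartupSketch {

    // Hypothetical stand-in for a create-index request: it completes only
    // after a simulated cluster manager election delay has elapsed.
    static CompletableFuture<Boolean> createIndexAsync(long electionDelayMillis) {
        return CompletableFuture.supplyAsync(
            () -> true,
            CompletableFuture.delayedExecutor(electionDelayMillis, TimeUnit.MILLISECONDS));
    }

    // Same shape as the reported call chain: the startup hook blocks on the
    // result with a bounded wait. The timeout is caught here only so the
    // outcome can be printed; in the issue it propagates out of Node.start().
    static String onNodeStarted(long electionDelayMillis, long waitMillis) {
        try {
            createIndexAsync(electionDelayMillis).get(waitMillis, TimeUnit.MILLISECONDS);
            return "index created";
        } catch (TimeoutException e) {
            return "timeout waiting for task";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        } catch (ExecutionException e) {
            return "failed";
        }
    }

    public static void main(String[] args) {
        // Fast election: the blocking wait succeeds.
        System.out.println(onNodeStarted(50, 2000));
        // Slow election: the wait expires before the request can complete.
        System.out.println(onNodeStarted(2000, 50));
    }
}
```

With these illustrative delays, the first call prints "index created" and the second prints "timeout waiting for task".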

@mengweieric
Collaborator

Looping in @YANG-DB to take a look, as it seems the change was introduced by him.

@YANG-DB
Member

YANG-DB commented Nov 26, 2024

@shwetathareja @mengweieric we need to investigate further how to solve this.
I'll update once we have better insight.

@cwperks
Member

cwperks commented Nov 26, 2024

@YANG-DB This is how security solves the same problem: https://github.com/opensearch-project/security/blob/main/src/main/java/org/opensearch/security/configuration/ConfigurationRepository.java#L188-L192

The security index is initialized within onNodeStarted, but it waits for the cluster state to be ready.
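
One way to picture "wait for the cluster state to be ready" is a listener that defers the work instead of blocking the startup thread. The sketch below is plain Java and is not the security plugin's implementation: `FakeClusterService`, `IndexInitializer`, and the boolean "state ready" signal are hypothetical stand-ins for whatever readiness check a real plugin would use.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

public class DeferredIndexInitSketch {

    // Hypothetical, minimal stand-in for a cluster service that notifies
    // listeners whenever the cluster state changes.
    static final class FakeClusterService {
        private final List<Consumer<Boolean>> listeners = new CopyOnWriteArrayList<>();

        void addListener(Consumer<Boolean> listener) {
            listeners.add(listener);
        }

        void removeListener(Consumer<Boolean> listener) {
            listeners.remove(listener);
        }

        // stateReady == true simulates "cluster manager elected, state recovered".
        void publish(boolean stateReady) {
            for (Consumer<Boolean> listener : listeners) {
                listener.accept(stateReady);
            }
        }
    }

    static final class IndexInitializer {
        final AtomicInteger createIndexCalls = new AtomicInteger();
        private final AtomicBoolean done = new AtomicBoolean();

        // Startup hook: registers a listener and returns immediately,
        // so nothing blocks while the cluster is still forming.
        void onNodeStarted(FakeClusterService clusterService) {
            clusterService.addListener(new Consumer<Boolean>() {
                @Override
                public void accept(Boolean stateReady) {
                    // Run the creation step once, the first time the state is ready.
                    if (stateReady && done.compareAndSet(false, true)) {
                        createIndexCalls.incrementAndGet(); // stand-in for the real create-index request
                        clusterService.removeListener(this);
                    }
                }
            });
        }
    }

    public static void main(String[] args) {
        FakeClusterService clusterService = new FakeClusterService();
        IndexInitializer initializer = new IndexInitializer();

        initializer.onNodeStarted(clusterService);               // returns at once
        System.out.println(initializer.createIndexCalls.get());  // prints 0

        clusterService.publish(false);  // cluster still forming: nothing happens
        clusterService.publish(true);   // state ready: index creation is triggered
        clusterService.publish(true);   // later updates: no second attempt
        System.out.println(initializer.createIndexCalls.get());  // prints 1
    }
}
```

In a real plugin the creation request issued from such a callback would presumably also need to be non-blocking, since the callback runs on the thread that delivers state updates; that detail is an assumption here, not something stated in this thread.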

@YANG-DB
Member

YANG-DB commented Nov 26, 2024

> @YANG-DB This is how security solves the same problem: https://github.com/opensearch-project/security/blob/main/src/main/java/org/opensearch/security/configuration/ConfigurationRepository.java#L188-L192
>
> The security index is initialized within onNodeStarted, but it waits for the cluster state to be ready

Thanks @cwperks, I'll look into it soon.
