feat: Hive listener integration #605

Open
wants to merge 44 commits into
base: main
from

Conversation

Maleware
Member

@Maleware Maleware commented May 27, 2025

Description

Adds listener support.

Definition of Done Checklist

  • Not all of these items are applicable to all PRs, the author should update this template to only leave the boxes in that are relevant
  • Please make sure all these things are done and tick the boxes

Author

  • Changes are OpenShift compatible
  • CRD changes approved
  • CRD documentation for all fields, following the style guide.
  • Helm chart can be installed and deployed operator works
  • Integration tests passed (for non trivial changes)
  • Changes need to be "offline" compatible

Reviewer

  • Code contains useful comments
  • Code contains useful logging statements
  • (Integration-)Test cases added
  • Documentation added or updated. Follows the style guide.
  • Changelog updated
  • Cargo.toml only contains references to git tags (not specific commits or branches)

Acceptance

  • Feature Tracker has been updated
  • Proper release label has been added
  • Roadmap has been updated

@Maleware
Member Author

=== NAME  kuttl
    harness.go:403: run tests finished
    harness.go:510: cleaning up
    harness.go:567: removing temp folder: ""
--- PASS: kuttl (2381.48s)
    --- PASS: kuttl/harness (0.00s)
        --- PASS: kuttl/harness/smoke_postgres-12.5.6_hive-3.1.3_openshift-false_s3-use-tls-false (109.29s)
        --- PASS: kuttl/harness/resources_hive-4.0.1_openshift-false (25.30s)
        --- PASS: kuttl/harness/kerberos-hdfs_postgres-12.5.6_hive-4.0.1_hdfs-latest-3.4.1_zookeeper-latest-3.9.3_krb5-1.21.1_openshift-false_kerberos-realm-PROD.MYCORP_kerberos-backend-mit (174.61s)
        --- PASS: kuttl/harness/kerberos-hdfs_postgres-12.5.6_hive-4.0.0_hdfs-latest-3.4.1_zookeeper-latest-3.9.3_krb5-1.21.1_openshift-false_kerberos-realm-PROD.MYCORP_kerberos-backend-mit (173.43s)
        --- PASS: kuttl/harness/kerberos-hdfs_postgres-12.5.6_hive-3.1.3_hdfs-latest-3.4.1_zookeeper-latest-3.9.3_krb5-1.21.1_openshift-false_kerberos-realm-PROD.MYCORP_kerberos-backend-mit (191.31s)
        --- PASS: kuttl/harness/kerberos-s3_postgres-12.5.6_hive-4.0.1_krb5-1.21.1_openshift-false_s3-use-tls-true_kerberos-realm-PROD.MYCORP_kerberos-backend-mit (109.84s)
        --- PASS: kuttl/harness/kerberos-s3_postgres-12.5.6_hive-4.0.1_krb5-1.21.1_openshift-false_s3-use-tls-false_kerberos-realm-PROD.MYCORP_kerberos-backend-mit (107.09s)
        --- PASS: kuttl/harness/kerberos-s3_postgres-12.5.6_hive-4.0.0_krb5-1.21.1_openshift-false_s3-use-tls-true_kerberos-realm-PROD.MYCORP_kerberos-backend-mit (132.82s)
        --- PASS: kuttl/harness/kerberos-s3_postgres-12.5.6_hive-4.0.0_krb5-1.21.1_openshift-false_s3-use-tls-false_kerberos-realm-PROD.MYCORP_kerberos-backend-mit (110.03s)
        --- PASS: kuttl/harness/kerberos-s3_postgres-12.5.6_hive-3.1.3_krb5-1.21.1_openshift-false_s3-use-tls-true_kerberos-realm-PROD.MYCORP_kerberos-backend-mit (119.68s)
        --- PASS: kuttl/harness/kerberos-s3_postgres-12.5.6_hive-3.1.3_krb5-1.21.1_openshift-false_s3-use-tls-false_kerberos-realm-PROD.MYCORP_kerberos-backend-mit (118.54s)
        --- PASS: kuttl/harness/upgrade_postgres-12.5.6_hive-old-3.1.3_hive-new-4.0.1_openshift-false (70.49s)
        --- PASS: kuttl/harness/cluster-operation_hive-latest-4.0.1_openshift-false (61.16s)
        --- PASS: kuttl/harness/logging_postgres-12.5.6_hive-4.0.0_openshift-false (79.92s)
        --- PASS: kuttl/harness/resources_hive-4.0.0_openshift-false (26.53s)
        --- PASS: kuttl/harness/resources_hive-3.1.3_openshift-false (26.19s)
        --- PASS: kuttl/harness/external-access_hive-latest-4.0.1_openshift-false (36.33s)
        --- PASS: kuttl/harness/orphaned-resources_hive-latest-4.0.1_openshift-false (37.07s)
        --- PASS: kuttl/harness/logging_postgres-12.5.6_hive-4.0.1_openshift-false (78.39s)
        --- PASS: kuttl/harness/smoke_postgres-12.5.6_hive-4.0.1_openshift-false_s3-use-tls-false (102.36s)
        --- PASS: kuttl/harness/logging_postgres-12.5.6_hive-3.1.3_openshift-false (87.95s)
        --- PASS: kuttl/harness/smoke_postgres-12.5.6_hive-4.0.1_openshift-false_s3-use-tls-true (96.98s)
        --- PASS: kuttl/harness/smoke_postgres-12.5.6_hive-4.0.0_openshift-false_s3-use-tls-false (102.11s)
        --- PASS: kuttl/harness/smoke_postgres-12.5.6_hive-4.0.0_openshift-false_s3-use-tls-true (100.83s)
        --- PASS: kuttl/harness/smoke_postgres-12.5.6_hive-3.1.3_openshift-false_s3-use-tls-true (103.19s)
PASS

@Maleware Maleware marked this pull request as ready for review May 28, 2025 13:22
@Maleware
Member Author

Maleware commented Jun 4, 2025

I might have run into stackabletech/hdfs-operator#686 during development.

I noticed that my discovery ConfigMap sometimes contains an empty string. The behaviour is flaky, but more often than not the empty string appears:

Expected

apiVersion: v1
data:
  HIVE: thrift://hive-postgres-s3-metastore-default.default.svc.cluster.local:9083
kind: ConfigMap
metadata:

Flaky faulty one

apiVersion: v1
data:
  HIVE: ""
kind: ConfigMap
metadata:
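
A minimal sketch, not taken from this PR, of how the empty `HIVE: ""` value could be turned into an explicit error instead of being written to the ConfigMap. It assumes the empty string arises when no usable listener address is available at the time the ConfigMap is built, which this thread does not confirm. `IngressAddress`, `DiscoveryError` and `metastore_url` are hypothetical names for illustration, not the operator's API.

```rust
use std::fmt;

/// Hypothetical, simplified stand-in for one address reported by a Listener.
#[derive(Debug, Clone, PartialEq)]
pub struct IngressAddress {
    pub host: String,
    pub port: Option<u16>,
}

/// Reasons why no discovery URL can be built (names are assumptions).
#[derive(Debug, PartialEq)]
pub enum DiscoveryError {
    /// The listener has not reported any address.
    NoAddress,
    /// An address was reported, but its host is empty.
    EmptyHost,
    /// The address does not carry the expected port.
    NoPort,
}

impl fmt::Display for DiscoveryError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            DiscoveryError::NoAddress => write!(f, "listener has no address"),
            DiscoveryError::EmptyHost => write!(f, "listener address has an empty host"),
            DiscoveryError::NoPort => write!(f, "listener address has no port"),
        }
    }
}

impl std::error::Error for DiscoveryError {}

/// Builds `thrift://<host>:<port>` from the first address only,
/// and fails instead of returning an empty string.
pub fn metastore_url(addresses: &[IngressAddress]) -> Result<String, DiscoveryError> {
    let first = addresses.first().ok_or(DiscoveryError::NoAddress)?;
    if first.host.trim().is_empty() {
        return Err(DiscoveryError::EmptyHost);
    }
    let port = first.port.ok_or(DiscoveryError::NoPort)?;
    Ok(format!("thrift://{}:{}", first.host, port))
}
```

With a guard like this, the caller could skip or retry writing the ConfigMap when an error is returned, rather than publishing an empty value.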

@lfrancke lfrancke moved this to Development: In Progress in Stackable Engineering Jun 4, 2025
@Maleware Maleware changed the title WIP: Listener integration feat: Hive listener integration Jun 5, 2025
@Maleware
Member Author

Maleware commented Jun 5, 2025

🟢

=== NAME  kuttl
    harness.go:403: run tests finished
    harness.go:510: cleaning up
    harness.go:567: removing temp folder: ""
--- PASS: kuttl (38.21s)
    --- PASS: kuttl/harness (0.00s)
        --- PASS: kuttl/harness/external-access_hive-latest-4.0.1_openshift-false (38.17s)
PASS

@Maleware Maleware moved this from Development: In Progress to Development: Waiting for Review in Stackable Engineering Jun 5, 2025
Maleware and others added 2 commits June 11, 2025 15:27
Co-authored-by: Malte Sander <contact@maltesander.com>
@maltesander maltesander linked an issue Jun 11, 2025 that may be closed by this pull request
Comment on lines 135 to 140
listener_ref: Listener,
rolegroup: &String,
chroot: Option<&str>,
) -> Result<String, Error> {
// We only need the first address corresponding to the rolegroup
let listener_address = listener_ref
Member

This should probably be called listener, as it isn't a ref.

Suggested change
- listener_ref: Listener,
- rolegroup: &String,
- chroot: Option<&str>,
- ) -> Result<String, Error> {
- // We only need the first address corresponding to the rolegroup
- let listener_address = listener_ref
+ listener: Listener,
+ rolegroup: &String,
+ chroot: Option<&str>,
+ ) -> Result<String, Error> {
+ // We only need the first address corresponding to the rolegroup
+ let listener_address = listener

Member Author

This is gone, as it is no longer needed.

@@ -448,6 +417,10 @@ pub struct MetaStoreConfig {
/// Time period Pods have to gracefully shut down, e.g. `30m`, `1h` or `2d`. Consult the operator documentation for details.
#[fragment_attrs(serde(default))]
pub graceful_shutdown_timeout: Option<Duration>,

/// This field controls which [ListenerClass](DOCS_BASE_URL_PLACEHOLDER/listener-operator/listenerclass.html) is used to expose the webserver.
#[serde(default)]
Member

All other fields use fragment_attrs; is that needed here too?

Suggested change
- #[serde(default)]
+ #[fragment_attrs(serde(default))]

Member

Also, do we want the String::default() to be called? Doesn't it have to be a valid class, or it falls back to cluster-internal?

Member Author

Also outdated, I think.

Yes, now it falls back to cluster-internal if listenerClass is not given.
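
A minimal sketch of the fallback described here, under the assumption that the role-level setting is an optional string and that an empty value should be treated like a missing one. `RoleConfigSketch`, `DEFAULT_LISTENER_CLASS` and `effective_listener_class` are hypothetical names; the operator's real types are not shown in this thread.

```rust
/// Default ListenerClass used when none is configured, as stated in the comment above.
pub const DEFAULT_LISTENER_CLASS: &str = "cluster-internal";

/// Hypothetical role-level config; only the field under discussion is modelled.
#[derive(Debug, Default, Clone)]
pub struct RoleConfigSketch {
    pub listener_class: Option<String>,
}

impl RoleConfigSketch {
    /// Returns the configured class, or the default if it is unset or empty.
    /// Treating an empty string as "unset" is an assumption of this sketch;
    /// it avoids the `String::default()` pitfall raised in the review.
    pub fn effective_listener_class(&self) -> &str {
        self.listener_class
            .as_deref()
            .filter(|class| !class.is_empty())
            .unwrap_or(DEFAULT_LISTENER_CLASS)
    }
}
```

Resolving the default in one accessor keeps the fallback in a single place, so callers never see an empty class name.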

hive,
&resolved_product_image.app_version_label,
&HiveRole::MetaStore.to_string(),
"discovery",
Member

Would this cause a collision if there are multiple role groups defined?

Member Author

Also outdated, as it's at the role level now.

@Maleware Maleware moved this from Development: In Review to Development: In Progress in Stackable Engineering Jun 18, 2025
Comment on lines 508 to 530
// Init listener struct. Collect listener after applied to cluster_resources
// to use listener object in later created discovery configMap
let mut listener = Listener::new("name", ListenerSpec::default());
let role_config = hive.role_config(&hive_role);
if let Some(GenericRoleConfig {
pod_disruption_budget: pdb,
if let Some(HiveMetastoreRoleConfig {
common: GenericRoleConfig {
pod_disruption_budget: pdb,
},
listener_class,
}) = role_config
{
add_pdbs(pdb, hive, &hive_role, client, &mut cluster_resources)
.await
.context(FailedToCreatePdbSnafu)?;

let group_listener: Listener =
build_group_listener(hive, &resolved_product_image, &hive_role, listener_class)?;
listener = cluster_resources
.add(client, group_listener)
.await
.with_context(|_| ApplyGroupListenerSnafu {
role: hive_role.to_string(),
})?;
Member

I would split the pdbs and listener stuff like so:

    let role_config = hive.role_config(&hive_role);
    if let Some(HiveMetastoreRoleConfig {
        common: GenericRoleConfig {
            pod_disruption_budget: pdb,
        },
        ..
    }) = role_config
    {
        add_pdbs(pdb, hive, &hive_role, client, &mut cluster_resources)
            .await
            .context(FailedToCreatePdbSnafu)?;
    }

    // std's SipHasher is deprecated, and DefaultHasher is unstable across Rust releases.
    // We don't /need/ stability, but it's still nice to avoid spurious changes where possible.
    let mut discovery_hash = FnvHasher::with_key(0);
    if let Some(HiveMetastoreRoleConfig { listener_class, .. }) = role_config {
        let group_listener: Listener =
            build_group_listener(hive, &resolved_product_image, &hive_role, listener_class)?;
        let listener = cluster_resources
            .add(client, group_listener)
            .await
            .context(ApplyGroupListenerSnafu {
                role: hive_role.to_string(),
            })?;

        for discovery_cm in discovery::build_discovery_configmaps(
            hive,
            hive,
            hive_role,
            &resolved_product_image,
            None,
            listener,
        )
        .await
        .context(BuildDiscoveryConfigSnafu)?
        {
            let discovery_cm = cluster_resources
                .add(client, discovery_cm)
                .await
                .context(ApplyDiscoveryConfigSnafu)?;
            if let Some(generation) = discovery_cm.metadata.resource_version {
                discovery_hash.write(generation.as_bytes())
            }
        }
    }
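
To illustrate the remark about hash stability in the snippet above, here is a self-contained sketch that folds resource versions into a hash using a hand-written 64-bit FNV-1a. It is not the operator's code: `Fnv1a64` and `discovery_hash` are hypothetical names, and unlike the `FnvHasher::with_key(0)` call above, this sketch starts from the standard FNV offset basis, so the two would produce different values.

```rust
use std::hash::Hasher;

/// Minimal 64-bit FNV-1a, written out so the sketch needs no external crate.
/// The algorithm is fixed, so results do not change between Rust releases.
pub struct Fnv1a64(u64);

impl Fnv1a64 {
    const OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
    const PRIME: u64 = 0x0000_0100_0000_01b3;

    pub fn new() -> Self {
        Fnv1a64(Self::OFFSET_BASIS)
    }
}

impl Hasher for Fnv1a64 {
    fn write(&mut self, bytes: &[u8]) {
        for byte in bytes {
            // FNV-1a: XOR the byte in first, then multiply by the prime.
            self.0 ^= u64::from(*byte);
            self.0 = self.0.wrapping_mul(Self::PRIME);
        }
    }

    fn finish(&self) -> u64 {
        self.0
    }
}

/// Folds the resource versions of all applied discovery ConfigMaps into one value.
/// A changed version of any ConfigMap changes the result.
pub fn discovery_hash<'a>(resource_versions: impl IntoIterator<Item = &'a str>) -> u64 {
    let mut hasher = Fnv1a64::new();
    for version in resource_versions {
        hasher.write(version.as_bytes());
    }
    hasher.finish()
}
```

Such a value could, for example, be used to notice that a discovery ConfigMap changed between reconciliations; how the operator consumes it is outside what this thread shows.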

Member Author

done with aa56092

Comment on lines 990 to 991
let recommended_object_labels: ObjectLabels<'_, v1alpha1::HiveCluster> =
build_recommended_labels(
Member

Why type that?

Member Author

Overlooked it. Probably IDE autocompletion whenever you double click the type :(

Member Author

done with cae1450

obj_ref: ObjectRef<Service>,
},
#[snafu(display("could not find port [{port_name}] for rolegroup listener {role}"))]
NoServicePort { port_name: String, role: String },
#[snafu(display("service [{obj_ref}] port [{port_name}] does not have a nodePort "))]
NoNodePort {
Member

NoNodePort, FindEndpoints, InvalidNodePort etc. are unused.

Member Author

done with aa56092

Labels
None yet
Projects
Status: Development: In Progress
Development

Successfully merging this pull request may close these issues.

Integrate Hive Operator with Listener Operator
3 participants