Change default value for dfs.ha.nn.not-become-active-in-safemode to false #458

Merged: 3 commits, Jan 19, 2024
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -11,6 +11,7 @@ All notable changes to this project will be documented in this file.
### Changed

- `operator-rs` `0.56.1` -> `0.57.0` ([#433]).
- Change default value of `dfs.ha.nn.not-become-active-in-safemode` from `true` to `false` ([#458]).

### Fixed

@@ -19,6 +20,7 @@ All notable changes to this project will be documented in this file.

[#433]: https://github.com/stackabletech/hdfs-operator/pull/433
[#451]: https://github.com/stackabletech/hdfs-operator/pull/451
[#458]: https://github.com/stackabletech/hdfs-operator/pull/458

## [23.11.0] - 2023-11-24

10 changes: 9 additions & 1 deletion rust/operator-binary/src/hdfs_controller.rs
@@ -494,6 +494,15 @@ fn rolegroup_config_map(
// IMPORTANT: these folders must be under the volume mount point, otherwise they will not
// be formatted by the namenode, or used by the other services.
// See also: https://github.com/apache-spark-on-k8s/kubernetes-HDFS/commit/aef9586ecc8551ca0f0a468c3b917d8c38f494a0
+ //
+ // Notes on configuration choices
+ // ==============================
+ // We used to set `dfs.ha.nn.not-become-active-in-safemode` to true here due to
+ // badly worded HDFS documentation:
+ // https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithNFS.html
+ // This caused a deadlock, with no namenode becoming active, during startup after
+ // HDFS had been completely down for a while.

hdfs_site_xml = HdfsSiteConfigBuilder::new(hdfs_name.to_string())
.dfs_namenode_name_dir()
.dfs_datanode_data_dir(merged_config.data_node_resources().map(|r| r.storage))
@@ -508,7 +517,6 @@ fn rolegroup_config_map(
.dfs_client_failover_proxy_provider()
.security_config(hdfs)
.add("dfs.ha.fencing.methods", "shell(/bin/true)")
- .add("dfs.ha.nn.not-become-active-in-safemode", "true")
.add("dfs.ha.automatic-failover.enabled", "true")
.add("dfs.ha.namenode.id", "${env.POD_NAME}")
// the extend with config must come last in order to have overrides working!!!
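The comment about overrides coming last can be illustrated with a minimal sketch. It is not the operator's actual `HdfsSiteConfigBuilder`: `SiteConfigSketch`, its `extend` method, and `build_sketch` are hypothetical stand-ins, and the sketch assumes the builder behaves like an ordered sequence of map insertions. Under that assumption, removing the hard-coded `.add(...)` leaves the property unset by default, while a user-supplied override applied last can still set it.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the operator's `HdfsSiteConfigBuilder`;
/// it only models "later writes win" on a key/value map.
#[derive(Default)]
struct SiteConfigSketch {
    props: BTreeMap<String, String>,
}

impl SiteConfigSketch {
    /// Set a single property, replacing any earlier value for the same key.
    fn add(mut self, key: &str, value: &str) -> Self {
        self.props.insert(key.to_string(), value.to_string());
        self
    }

    /// Merge user-supplied overrides. Because this runs after the `add`
    /// calls, override values replace the operator's values.
    fn extend(mut self, overrides: &BTreeMap<String, String>) -> Self {
        self.props
            .extend(overrides.iter().map(|(k, v)| (k.clone(), v.clone())));
        self
    }
}

/// Build the property map the way the diff above does after this change:
/// no explicit `dfs.ha.nn.not-become-active-in-safemode`, overrides last.
fn build_sketch(overrides: &BTreeMap<String, String>) -> BTreeMap<String, String> {
    SiteConfigSketch::default()
        .add("dfs.ha.fencing.methods", "shell(/bin/true)")
        .add("dfs.ha.automatic-failover.enabled", "true")
        .extend(overrides)
        .props
}

fn main() {
    // A user who still wants the old behaviour supplies it as an override.
    let mut overrides = BTreeMap::new();
    overrides.insert(
        "dfs.ha.nn.not-become-active-in-safemode".to_string(),
        "true".to_string(),
    );

    for (key, value) in &build_sketch(&overrides) {
        println!("{key} = {value}");
    }
}
```

If `extend` were called before the `add` calls instead, the operator's values would overwrite the user's, which is why the order matters.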