feat(stackable-operator): Add git-sync support #1024

Open
wants to merge 4 commits into base: main

@siegfriedweber (Member) commented May 7, 2025

Description

Add git-sync support

Currently used in:

Definition of Done Checklist

  • Not all of these items are applicable to all PRs, the author should update this template to only leave the boxes in that are relevant
  • Please make sure all these things are done and tick the boxes
# Author
- [x] Changes are OpenShift compatible
- [x] Integration tests passed (for non trivial changes)
# Reviewer
- [ ] Code contains useful comments
- [ ] (Integration-)Test cases added
- [ ] Documentation added or updated
- [ ] Changelog updated
- [ ] Cargo.toml only contains references to git tags (not specific commits or branches)
# Acceptance
- [ ] Feature Tracker has been updated
- [ ] Proper release label has been added

@siegfriedweber marked this pull request as ready for review May 8, 2025 14:31
@siegfriedweber requested a review from a team May 8, 2025 14:31
@siegfriedweber moved this to "Development: Waiting for Review" in Stackable Engineering May 8, 2025
@Techassi self-requested a review May 8, 2025 14:31
@Techassi moved this from "Development: Waiting for Review" to "Development: In Review" in Stackable Engineering May 9, 2025
@Techassi changed the title from "Add git-sync support" to "feat(stackable-operator): Add git-sync support" May 9, 2025

@Techassi (Member) left a comment

Initial, partial review. I haven't looked at the unit tests yet.


mod v1alpha1_impl;

#[versioned(version(name = "v1alpha1"))]

praise: Nice job versioning this right from the start!

Comment on lines +21 to +22
/// The git repository URL that will be cloned, for example: `https://github.com/stackabletech/airflow-operator`.
pub repo: String,

note: Is there any particular reason why this field is named `repo`? I think we should name it `repository`.

note: Additionally, the type of this field should be `Url` instead of a plain `String`.

Comment on lines +30 to +35
/// Location in the Git repository containing the resource.
///
/// It can optionally start with `/`, however, no trailing slash is recommended.
/// An empty string (``) or slash (`/`) corresponds to the root folder in Git.
#[serde(default = "GitSync::default_git_folder")]
pub git_folder: PathBuf,

note: All other fields document the default value. This one should then also do that.

Comment on lines +26 to +28
/// Since git-sync v4.x.x this field is mapped to the flag `--ref`.
#[serde(default = "GitSync::default_branch")]
pub branch: String,

note: This is a perfect case for a future v1alpha2 version of this struct, to rename the field to ref instead.

Comment on lines +43 to +45
/// Since git-sync v4.x.x this field is mapped to the flag `--period`.
#[serde(default = "GitSync::default_wait")]
pub wait: Duration,

note: Another candidate for version v1alpha2 to rename the field to period.

Comment on lines +207 to +221
let internal_args = [
    Some(("--repo".to_string(), git_sync.repo.to_owned())),
    Some(("--ref".to_string(), git_sync.branch.to_owned())),
    Some(("--depth".to_string(), git_sync.depth.to_string())),
    Some((
        "--period".to_string(),
        format!("{}s", git_sync.wait.as_secs()),
    )),
    Some(("--link".to_string(), GIT_SYNC_LINK.to_string())),
    Some(("--root".to_string(), GIT_SYNC_ROOT_DIR.to_string())),
    one_time.then_some(("--one-time".to_string(), "true".to_string())),
]
.into_iter()
.flatten()
.collect::<BTreeMap<_, _>>();

suggestion: Remove the `Some` wrappers (and the `flatten` call, which is then no longer required) and also remove the explicit `into_iter` call.

Suggested change
let mut internal_args = BTreeMap::from([
    ("--repo".to_string(), git_sync.repo.to_owned()),
    ("--ref".to_string(), git_sync.branch.to_owned()),
    ("--depth".to_string(), git_sync.depth.to_string()),
    (
        "--period".to_string(),
        format!("{}s", git_sync.wait.as_secs()),
    ),
    ("--link".to_string(), GIT_SYNC_LINK.to_string()),
    ("--root".to_string(), GIT_SYNC_ROOT_DIR.to_string()),
]);
if one_time {
    internal_args.insert("--one-time".into(), "true".into());
}
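
For readers who want to try the suggested shape in isolation, the following is a minimal, standalone sketch and not the operator's actual code. The helper `build_internal_args`, its parameter types, and the values of `GIT_SYNC_LINK` and `GIT_SYNC_ROOT_DIR` are assumptions made for illustration, and `std::time::Duration` stands in for the `Duration` type used in the PR.

```rust
use std::collections::BTreeMap;
use std::time::Duration;

// Placeholder values: the real GIT_SYNC_LINK and GIT_SYNC_ROOT_DIR constants
// are not shown in this thread.
const GIT_SYNC_LINK: &str = "current";
const GIT_SYNC_ROOT_DIR: &str = "/tmp/git";

/// Hypothetical helper that builds the operator-owned git-sync arguments.
fn build_internal_args(
    repo: &str,
    branch: &str,
    depth: u32,
    wait: Duration,
    one_time: bool,
) -> BTreeMap<String, String> {
    let mut internal_args = BTreeMap::from([
        ("--repo".to_string(), repo.to_string()),
        ("--ref".to_string(), branch.to_string()),
        ("--depth".to_string(), depth.to_string()),
        // `as_secs` returns whole seconds, so sub-second parts are dropped.
        ("--period".to_string(), format!("{}s", wait.as_secs())),
        ("--link".to_string(), GIT_SYNC_LINK.to_string()),
        ("--root".to_string(), GIT_SYNC_ROOT_DIR.to_string()),
    ]);
    // The optional flag is only inserted when it was requested.
    if one_time {
        internal_args.insert("--one-time".into(), "true".into());
    }
    internal_args
}
```

Because every mandatory entry is unconditional, `BTreeMap::from` can take the array directly, and the single optional flag becomes an ordinary `insert` on the mutable map.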

Comment on lines +223 to +228
let internal_git_config = [(
    GIT_SYNC_SAFE_DIR_OPTION.to_string(),
    GIT_SYNC_ROOT_DIR.to_string(),
)]
.into_iter()
.collect::<BTreeMap<_, _>>();

suggestion: Simplify this.

Suggested change
let internal_git_config = BTreeMap::from([(
    GIT_SYNC_SAFE_DIR_OPTION.to_string(),
    GIT_SYNC_ROOT_DIR.to_string(),
)]);

// (https://github.com/stackabletech/airflow-operator/pull/381)
// used this condition to find Git configs. It is also used here
// for backwards-compatibility:
if key.to_lowercase().ends_with("-git-config") {

note: This condition seems a little odd. As far as I can see, the only way to provide custom Git configs in git-sync is the `--git-config` flag. Why don't we use the following expression instead:

if key.to_lowercase() == "--git-config" {}

Whichever approach we end up with, the value we compare against should live in a constant.
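
To make the difference between the two conditions concrete, here is a small standalone sketch. The constant names `GIT_CONFIG_FLAG` and `LEGACY_GIT_CONFIG_SUFFIX` and both function names are hypothetical; the thread only asks that the compared value live in a constant.

```rust
/// Hypothetical constant for the exact flag name.
const GIT_CONFIG_FLAG: &str = "--git-config";

/// Hypothetical constant for the suffix used by the backwards-compatible check.
const LEGACY_GIT_CONFIG_SUFFIX: &str = "-git-config";

/// Suffix match, as in the condition under review.
fn is_git_config_key_legacy(key: &str) -> bool {
    key.to_lowercase().ends_with(LEGACY_GIT_CONFIG_SUFFIX)
}

/// Exact match, as proposed in the comment above.
fn is_git_config_key_strict(key: &str) -> bool {
    key.to_lowercase() == GIT_CONFIG_FLAG
}
```

Both functions accept `--git-config` in any letter case. Only the suffix match also accepts keys such as `-git-config` or `my-git-config`, which is the behavior the backwards-compatibility comment in the code appears to preserve.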

Comment on lines +244 to +248
if internal_git_config.keys().any(|key| value.contains(key)) {
    tracing::warn!("Config option {value:?} contains a value for {GIT_SYNC_SAFE_DIR_OPTION} that overrides the value of this operator. Git-sync functionality will probably not work as expected!");
}
user_defined_git_configs.push(value.to_owned());

question: Do we want to allow that? This can potentially break what the operator wants to do. Basically user freedom vs opinionated approach.
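
As background for this question, the check in the diff is a substring test against the operator's own config keys. The sketch below isolates that test; the function name `overrides_internal_config` is hypothetical, and the assumption that `GIT_SYNC_SAFE_DIR_OPTION` equals `safe.directory` is not confirmed in this thread.

```rust
use std::collections::BTreeMap;

// Assumed value; the real constant is not shown in this thread.
const GIT_SYNC_SAFE_DIR_OPTION: &str = "safe.directory";

/// Returns true if the user-supplied config value mentions any key that the
/// operator sets itself. This mirrors the substring test from the diff.
fn overrides_internal_config(
    internal_git_config: &BTreeMap<String, String>,
    value: &str,
) -> bool {
    internal_git_config
        .keys()
        .any(|key| value.contains(key.as_str()))
}

/// Hypothetical helper that builds the operator-owned Git config map.
fn internal_git_config() -> BTreeMap<String, String> {
    BTreeMap::from([(
        GIT_SYNC_SAFE_DIR_OPTION.to_string(),
        "/tmp/git".to_string(), // placeholder for GIT_SYNC_ROOT_DIR
    )])
}
```

Since this is a plain substring match, a value that merely mentions the key somewhere (for example inside another option's value) would also trigger the warning. That may matter if the check is ever turned from a warning into a hard rejection.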

Comment on lines +268 to +270
let mut args = internal_args;
args.extend(user_defined_args);
args.insert("--git-config".to_string(), format!("'{git_config}'"));

question: Any particular reason why we don't operate on `internal_args` directly? One of the above suggestions even makes it `mut`.
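
A minimal sketch of what operating on the map directly could look like, assuming the maps are `BTreeMap<String, String>` as the diff suggests. The function `merge_args` is a hypothetical wrapper added so that the snippet stands alone.

```rust
use std::collections::BTreeMap;

/// Hypothetical wrapper: merges user-defined arguments into the internal ones
/// in place and sets the combined Git config last.
fn merge_args(
    mut internal_args: BTreeMap<String, String>,
    user_defined_args: BTreeMap<String, String>,
    git_config: &str,
) -> BTreeMap<String, String> {
    // `extend` replaces the value of any key that already exists, so
    // user-defined arguments win over the internal defaults.
    internal_args.extend(user_defined_args);
    // Inserted last, so this entry overrides both earlier sources.
    internal_args.insert("--git-config".to_string(), format!("'{git_config}'"));
    internal_args
}
```

The precedence is the same as in the reviewed lines: internal defaults first, then user-defined arguments, then the combined `--git-config` value. Only the intermediate `args` binding disappears.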

Labels
None yet
Projects
Status: Development: In Review
3 participants