Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

[CELEBORN-1802] Fail the celeborn master/worker start if CELEBORN_CONF_DIR is not directory #3030

Closed
wants to merge 3 commits into from

Conversation

turboFei
Copy link
Member

@turboFei turboFei commented Dec 25, 2024

What changes were proposed in this pull request?

Fail the celeborn master/worker start if CELEBORN_CONF_DIR is not directory. Otherwise, the process would run into unexpected status.

Why are the changes needed?

In the celeborn-daemon.sh , if we specify the --config <conf-dir> option. It would fail the master/worker start if the conf-dir is not a directory, likes the systemctl ConditionPathExists=$CELEBORN_CONF_DIR requirement check.

echo " <conf-dir> Override CELEBORN_CONF_DIR"

if [ "$1" == "--config" ]
then
shift
conf_dir="$1"
if [ ! -d "$conf_dir" ]
then
echo "ERROR : $conf_dir is not a directory"
usage
exit 1
else

But before this PR, for the start master/worker scripts, it did not check if the CELEBORN_CONF_DIR is dirctory because the scripts did not leverage --config <conf-dir> option.

In this PR, we check the final CELEBORN_CONF_DIR before start celeborn, so that all the scripts would check if the CELEBORN_CONF_DIR is a directory before start.

Does this PR introduce any user-facing change?

Yes, it would fail the start if CELEBORN_CONF_DIR is not a directory.

How was this patch tested?

image

@turboFei turboFei changed the title [CELEBORN-1802] Fail the celeborn master/worker start on CELEBORN_CONF_DIR does not exists [CELEBORN-1802] Fail the celeborn master/worker start on CELEBORN_CONF_DIR does not exist Dec 25, 2024
@turboFei turboFei changed the title [CELEBORN-1802] Fail the celeborn master/worker start on CELEBORN_CONF_DIR does not exist [CELEBORN-1802] Fail the celeborn master/worker start on CELEBORN_CONF_DIR is not directory Dec 25, 2024
@turboFei turboFei marked this pull request as draft December 25, 2024 02:41
@turboFei turboFei changed the title [CELEBORN-1802] Fail the celeborn master/worker start on CELEBORN_CONF_DIR is not directory [CELEBORN-1802] Fail the celeborn master/worker start if CELEBORN_CONF_DIR is not directory Dec 25, 2024
@turboFei turboFei marked this pull request as ready for review December 25, 2024 02:44
@turboFei
Copy link
Member Author

turboFei commented Dec 25, 2024

Maybe I can use the below commands in systemctl.d/supervisor.d directly.

$CELEBORN_HOME/sbin/celeborn-daemon.sh --config $CELEBORN_CONF_DIR start org.apache.celeborn.service.deploy.master.Master

$CELEBORN_HOME/sbin/celeborn-daemon.sh --config $CELEBORN_CONF_DIR start org.apache.celeborn.service.deploy.master.Worker 1

@turboFei turboFei closed this Dec 25, 2024
@turboFei
Copy link
Member Author

turboFei commented Dec 25, 2024

reopen it incase someone think it is necessary

@turboFei turboFei reopened this Dec 25, 2024
@turboFei turboFei marked this pull request as draft December 25, 2024 03:46
@turboFei turboFei marked this pull request as ready for review December 25, 2024 04:09
@turboFei turboFei requested a review from pan3793 December 25, 2024 04:09
Copy link
Member

@pan3793 pan3793 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change makes sense to me, also cc @FMX @RexXiong

Copy link
Contributor

@RexXiong RexXiong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, Thanks @pan3793

Copy link
Contributor

@FMX FMX left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks. Merged into main(v0.6.0).

@FMX FMX closed this in 1b3bd6e Dec 26, 2024
@turboFei turboFei deleted the check_config_dir branch December 26, 2024 18:21
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants