This repository has been archived by the owner on Nov 9, 2020. It is now read-only.

Implement "config rm --unlink". #1186

Merged
merged 3 commits into from
Apr 27, 2017

Contributor

@lipingxue lipingxue commented Apr 24, 2017

Fixed #1138.
This PR includes the change to implement "config rm --unlink".

[ ] In SingleNode, rm --local will remove the local DB, rm --unlink will return an error message to say this command is not supported in SingleNode configuration.

[ ] In MultiNode, rm --unlink will print a message and remove the local symlink. rm --local will return an error message saying that rm --local is not supported in MultiNode configuration.

Testing:

Configured in SingleNodeMode:

  • run config rm
[root@sc2-rdops-vm01-dhcp-34-30:~] /usr/lib/vmware/vmdkops/bin/vmdkops_admin.py config rm
 
        Shared DB removal is not supported. For removing  local configuration, use --local flag.
        For removing shared DB,  run 'vmdkops_admin config rm --unlink' on ESX hosts using this DB,
        and manually remove the vmdkops_config.db file from shared storage.
       
  • run config rm --unlink
[root@sc2-rdops-vm01-dhcp-34-30:~] /usr/lib/vmware/vmdkops/bin/vmdkops_admin.py config rm --unlink
Warning: For extra safety, removal operation requires '--confirm' flag.
[root@sc2-rdops-vm01-dhcp-34-30:~] /usr/lib/vmware/vmdkops/bin/vmdkops_admin.py config rm --unlink --confirm
'rm --unlink' is not supported when Config DB  is in SingleNode mode. Use 'rm --local' to remove the local link or local DB.
  • run config rm --local
[root@sc2-rdops-vm01-dhcp-34-30:~] /usr/lib/vmware/vmdkops/bin/vmdkops_admin.py config rm --local --confirm
Moved /etc/vmware/vmdkops/auth-db to backup file /etc/vmware/vmdkops/auth-db.bak_Thu_Apr_20_23:13:07_2017

[root@sc2-rdops-vm01-dhcp-52-142:~] /usr/lib/vmware/vmdkops/bin/vmdkops_admin.py config rm --unlink --confirm
Removed link /etc/vmware/vmdkops/auth-db
[root@sc2-rdops-vm01-dhcp-52-142:~]
[root@sc2-rdops-vm01-dhcp-52-142:~]
[root@sc2-rdops-vm01-dhcp-52-142:~] /usr/lib/vmware/vmdkops/bin/vmdkops_admin.py status
=== Service:
Version: 0.13.87ce18c-0.0.1
Status: Running
Pid: 734227
Port: 1019
LogConfigFile: /etc/vmware/vmdkops/log_config.json
LogFile: /var/log/vmware/vmdk_ops.log
LogLevel: INFO
=== Authorization Config DB:
DB_SharedLocation: N/A
DB_Mode: NotConfigured (no local DB, no symlink to shared DB)
DB_LocalPath: N/A

Configured in MultiNodeMode:

  • run config rm
[root@sc2-rdops-vm01-dhcp-52-142:~] /usr/lib/vmware/vmdkops/bin/vmdkops_admin.py config rm
 
        Shared DB removal is not supported. For removing  local configuration, use --local flag.
        For removing shared DB,  run 'vmdkops_admin config rm --unlink' on ESX hosts using this DB,
        and manually remove the vmdkops_config.db file from shared storage.
  • run config rm --local
[root@sc2-rdops-vm01-dhcp-52-142:~] /usr/lib/vmware/vmdkops/bin/vmdkops_admin.py config rm --local
Warning: For extra safety, removal operation requires '--confirm' flag.
[root@sc2-rdops-vm01-dhcp-52-142:~]
[root@sc2-rdops-vm01-dhcp-52-142:~] /usr/lib/vmware/vmdkops/bin/vmdkops_admin.py config rm --local --confirm
'rm --local' is not supported when Config DB is in MultiNode mode.Use 'rm --unlink' to remove the local link to shared DB.
  • run config rm --unlink
[root@sc2-rdops-vm01-dhcp-52-142:~] /usr/lib/vmware/vmdkops/bin/vmdkops_admin.py config rm --unlink --confirm
Removed link /etc/vmware/vmdkops/auth-db
[root@sc2-rdops-vm01-dhcp-52-142:~]
[root@sc2-rdops-vm01-dhcp-52-142:~]
[root@sc2-rdops-vm01-dhcp-52-142:~] /usr/lib/vmware/vmdkops/bin/vmdkops_admin.py status
=== Service:
Version: 0.13.87ce18c-0.0.1
Status: Running
Pid: 734227
Port: 1019
LogConfigFile: /etc/vmware/vmdkops/log_config.json
LogFile: /var/log/vmware/vmdk_ops.log
LogLevel: INFO
=== Authorization Config DB:
DB_SharedLocation: N/A
DB_Mode: NotConfigured (no local DB, no symlink to shared DB)
DB_LocalPath: N/A
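
The DB_Mode line in the status output distinguishes three states. Below is a minimal sketch of how such a check could work, assuming the mode is inferred purely from what sits at the local auth-db path: a symlink means MultiNode, a regular file means SingleNode, and nothing means NotConfigured. The name detect_mode, the string labels, and the inference rule itself are illustrative assumptions, not the project's actual implementation.

```python
import os
import tempfile


def detect_mode(local_path):
    """Classify the config DB mode from what exists at local_path (assumed rule)."""
    # islink must be checked first: isfile follows symlinks and would also be True.
    if os.path.islink(local_path):
        return "MultiNode"      # symlink to a DB on shared storage
    if os.path.isfile(local_path):
        return "SingleNode"     # real DB file kept locally
    return "NotConfigured"      # no local DB, no symlink to shared DB


def demo():
    """Exercise the three states in a scratch directory and return the modes seen."""
    modes = []
    with tempfile.TemporaryDirectory() as tmp:
        local = os.path.join(tmp, "auth-db")
        shared = os.path.join(tmp, "vmdkops_config.db")
        modes.append(detect_mode(local))      # nothing there yet

        with open(local, "w"):                # create an empty local DB file
            pass
        modes.append(detect_mode(local))
        os.remove(local)

        with open(shared, "w"):               # shared DB plus a local symlink to it
            pass
        os.symlink(shared, local)
        modes.append(detect_mode(local))
    return modes
```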

@@ -1301,16 +1305,38 @@ def config_rm(args):
# This asks for double confirmation, and removes the local link or DB (if any)
# NEVER deletes the shared database - instead prints help

-if not args.local:
+if not args.local and not args.unlink:
return err_out("""
Shared DB removal is not supported. For removing local configuration, use --local flag.
Contributor

Nit: remove extra space - "removing local"

Contributor Author

The whole message has been rewritten per Mark's comments.

return err_out("""
Shared DB removal is not supported. For removing local configuration, use --local flag.
-For removing shared DB, run 'vmdkops_admin config rm --local' on ESX hosts using this DB,
+For removing shared DB, run 'vmdkops_admin config rm --unlink' on ESX hosts using this DB,
Contributor

Nit: remove extra space - "DB, run"

Contributor Author

Same as above.

@@ -1301,16 +1305,38 @@ def config_rm(args):
# This asks for double confirmation, and removes the local link or DB (if any)
# NEVER deletes the shared database - instead prints help

-if not args.local:
+if not args.local and not args.unlink:
return err_out("""
Shared DB removal is not supported. For removing local configuration, use --local flag.
Contributor

Nit: I'd suggest moving "Shared DB removal is not supported" to the last statement.

Contributor Author

Same as above.

and manually remove the {} file from shared storage.
""".format(auth_data.CONFIG_DB_NAME))

# Check the existing config mode
Contributor

Can we extract this section as a separate method for code reuse?

Contributor Author

Issue #1187 is filed to track this.

mode = auth.mode # for usage outside of the 'with'
except auth_data.DbAccessError as ex:
return err_out(str(ex))

if not args.confirm:
Contributor

Let's move this argument check before line 1315.

Contributor Author

Done

pass
elif mode == auth_data.DBMode.MultiNode:
if args.local:
return err_out("'rm --local' is not supported when " + DB_REF + "is in MultiNode mode."
Contributor

Nit: add a space before and after "is in MultiNode mode."

Contributor Author

DB_REF already includes the space.

elif mode == auth_data.DBMode.SingleNode:
if args.unlink:
return err_out("'rm --unlink' is not supported when " + DB_REF + " is in SingleNode mode."
" Use 'rm --local' to remove the local link or local DB.")
Contributor

Why "local link or local DB"? Do we also create a symlink to local DB? If yes, we should say "remove local link and local DB" or simply "remove local DB configuration"

Contributor Author

Done.

Contributor

@msterin msterin left a comment

Thanks for putting it together. Some of the messages need rework (suggestions included), but mainly the new code is unnecessarily complex. Suggestions are also included; please take a look.

return err_out("""
Shared DB removal is not supported. For removing local configuration, use --local flag.
-For removing shared DB, run 'vmdkops_admin config rm --local' on ESX hosts using this DB,
+For removing shared DB, run 'vmdkops_admin config rm --unlink' on ESX hosts using this DB,
and manually remove the {} file from shared storage.
Contributor

I think the message no longer matches the functionality.
Here it should say:
"""
DB removal is an irreversible operation. Please use the --local flag for removing the DB in SingleNode mode,
and use --unlink to unlink from the DB in MultiNode mode.
Note that --unlink will not remove a shared DB, but simply configure the current ESXi host to stop using it.
For removing a shared DB, run 'vmdkops_admin config rm --unlink' on ESXi hosts using this DB, and then manually remove the actual DB file from shared storage.
"""

Contributor Author

Done.

if not args.confirm:
return err_out("Warning: For extra safety, removal operation requires '--confirm' flag.")

if mode == auth_data.DBMode.NotConfigured:
pass
Contributor

I think it should say "Nothing to do - DB is not configured".
It should also be handled together with all the other no-op cases - see below.

Contributor Author

Done

and manually remove the {} file from shared storage.
""".format(auth_data.CONFIG_DB_NAME))

# Check the existing config mode
Contributor

Missing check:
`if args.local and args.unlink: err_out("cannot use both flags together" + help message above about when to use what)`

Contributor Author

Done.
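
The "cannot use both flags together" rule could also be expressed at the parser level. The following is a minimal sketch using the standard library's argparse mutually exclusive groups. It is an alternative to the explicit check requested above, not what this PR does; build_parser and flags_conflict are hypothetical names, and only the flag names come from the PR.

```python
import argparse


def build_parser():
    """Hypothetical parser for 'config rm'; only the flag names come from the PR."""
    parser = argparse.ArgumentParser(prog="config rm")
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--local", action="store_true",
                       help="remove the local DB (SingleNode mode)")
    group.add_argument("--unlink", action="store_true",
                       help="remove the local link to the shared DB (MultiNode mode)")
    parser.add_argument("--confirm", action="store_true",
                        help="required for any removal")
    return parser


def flags_conflict(argv):
    """Return True if argparse rejects argv (it raises SystemExit on a usage error)."""
    try:
        build_parser().parse_args(argv)
    except SystemExit:
        return True
    return False
```

The trade-off is that argparse prints its own generic usage error, whereas an explicit check in config_rm can print the mode-specific help text discussed in this review.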

if mode == auth_data.DBMode.NotConfigured:
pass
elif mode == auth_data.DBMode.MultiNode:
if args.local:
Contributor

The whole code should be simpler, something like this:

if mode == DBMode.MultiNode:
    if args.local:
        return err_out(...)
    else:
        rm link
        return service_reset

if mode == DBMode.SingleNode:
    if args.unlink:
        return err_out(...)
    else:
        move_to_backup
        return service_reset

# all other cases
print("Nothing to do. Mode = <mode>")

Contributor Author

Done.
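
The simplified structure suggested above can be sketched as a pure decision function, which makes the mode and flag combinations easy to test in isolation. This is an illustration under assumptions: DBMode here is a stand-in for auth_data.DBMode, plan_rm and its return strings are hypothetical, and the --confirm check and the actual file operations are left out.

```python
from enum import Enum


class DBMode(Enum):
    """Stand-in for auth_data.DBMode; the real enum may have more members."""
    NotConfigured = 0
    SingleNode = 1
    MultiNode = 2


def plan_rm(mode, local, unlink):
    """Return (ok, action) for a 'config rm' request without touching any files."""
    if local and unlink:
        return (False, "error: --local and --unlink cannot be used together")
    if not local and not unlink:
        return (False, "error: one of --local or --unlink is required")
    if mode == DBMode.MultiNode:
        if local:
            return (False, "error: 'rm --local' is not supported in MultiNode mode")
        return (True, "remove link")
    if mode == DBMode.SingleNode:
        if unlink:
            return (False, "error: 'rm --unlink' is not supported in SingleNode mode")
        return (True, "move DB to backup")
    # all other cases (e.g. NotConfigured)
    return (True, "nothing to do")
```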

auth.connect()
info = auth.get_info()
mode = auth.mode # for usage outside of the 'with'
except auth_data.DbAccessError as ex:
Contributor

I think it's generally a bad idea to catch exceptions halfway through a command.
We should let them bubble up to main and just print a proper error message there.
I think that's what we already do, don't we?
Can you check what the behavior is if there is no catch here and connect() does throw DbAccessError? (This is not a blocker, so if you are out of time feel free to skip this comment.)

Contributor Author

We cannot let the code continue here. When the exception is raised, we cannot get the correct config mode from the DB, and the code that follows depends on the config mode. We have similar logic in config_init() too.
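
The two positions in this thread can be compared side by side. In the sketch below, DbAccessError is a stand-in for auth_data.DbAccessError and every function name is hypothetical. Whether the project's real main already has a top-level handler of this kind is not confirmed anywhere in this conversation.

```python
class DbAccessError(Exception):
    """Stand-in for auth_data.DbAccessError."""


def read_mode(connect):
    """Call connect() and return the mode it reports; may raise DbAccessError."""
    return connect()


def rm_catching_locally(connect):
    """Style used in the PR: stop right away because later logic needs the mode."""
    try:
        mode = read_mode(connect)
    except DbAccessError as ex:
        return "error: {}".format(ex)
    return "mode is {}".format(mode)


def rm_bubbling(connect):
    """Reviewer's alternative: no try/except here, the caller deals with failures."""
    return "mode is {}".format(read_mode(connect))


def main(command, connect):
    """Top-level handler that turns any DbAccessError into one error message."""
    try:
        return command(connect)
    except DbAccessError as ex:
        return "error: {}".format(ex)


def broken_connect():
    """Simulate a DB that cannot be opened."""
    raise DbAccessError("cannot open config DB")
```

In both styles the mode-dependent code never runs after a failed connect; the difference is only where the error message is produced.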

@lipingxue
Contributor Author

@shaominchen @msterin I have addressed your comments, please take a look. Thanks!

Contributor

@shaominchen shaominchen left a comment

LGTM.

@govint
Contributor

govint commented Apr 26, 2017

@lipingxue, please add the tests below:

  1. For both local and multi-node cases
    a. Remove the DB and then run vmgroup commands to create/delete a group. Just to see if anything is broken and the user is able to use the system normally.

  2. For multi-node case
    a. Two ESX hosts with shared DS.
    b. Config rm --unlink the DB on one node, confirm that the other node is able to use the DB with no issue.
    c. Check the node (in (b)) can restore the link on "config init". The next config init on the node that did the "config rm --unlink" should check that there is already a DB on the shared DS and link to it. Confirm that this is happening.
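
The multi-node scenario in item 2 can be simulated without ESX hosts by treating a scratch directory as the shared datastore. This is a rough sketch under that assumption: unlink_db, init_db, and the host_a/host_b paths are hypothetical, and the real config init does considerably more than create a symlink.

```python
import os
import tempfile


def unlink_db(local_link):
    """Remove only the local symlink; never follow it to the shared DB."""
    if not os.path.islink(local_link):
        raise ValueError("{} is not a symlink".format(local_link))
    os.remove(local_link)


def init_db(local_link, shared_db):
    """Link to the shared DB, creating the DB file only if it does not exist yet."""
    if not os.path.exists(shared_db):
        with open(shared_db, "w"):
            pass
    os.symlink(shared_db, local_link)


def read(path):
    """Read a file's content, following symlinks."""
    with open(path) as f:
        return f.read()


def run_scenario():
    """init on two hosts, unlink on host A, then init again on host A."""
    with tempfile.TemporaryDirectory() as tmp:
        shared = os.path.join(tmp, "vmdkops_config.db")
        host_a = os.path.join(tmp, "host_a_auth-db")
        host_b = os.path.join(tmp, "host_b_auth-db")

        init_db(host_a, shared)
        init_db(host_b, shared)
        with open(shared, "w") as f:
            f.write("group1")                  # pretend a vmgroup was created

        unlink_db(host_a)
        b_still_reads = read(host_b) == "group1"
        a_link_gone = not os.path.lexists(host_a)

        init_db(host_a, shared)                # must reuse, not recreate, the DB
        a_sees_old_data = read(host_a) == "group1"
        return b_still_reads, a_link_gone, a_sees_old_data
```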

if mode == auth_data.DBMode.MultiNode:
if args.local:
return err_out("'rm --local' is not supported when " + DB_REF + "is in MultiNode mode."
" Use 'rm --unlink' to remove the local link to shared DB.")
Contributor

@govint govint Apr 26, 2017

Is it necessary to have both --local and --unlink? --unlink is sufficient for both the single-node and multi-node cases: in the single-node case it destroys the DB (or backs it up), and in the multi-node case it simply unlinks the host from the shared DB. Since the configured mode is already known to the system, it can differentiate the cases and figure out how to interpret the unlink.

The --local flag just seems extra, and the user needs to keep track of the mode.

A host may have a local datastore with a DB on it and a shared DS with a DB on it. How does the user decide which DB on which DS to remove? Don't we need to pass the DS name on the command line? The same question applies to multiple shared DSes on the same host.

and use '--unlink' to unlink from DB in MultiNode mode.
"""
)

if not args.confirm:
return err_out("Warning: For extra safety, removal operation requires '--confirm' flag.")
Contributor

Not related to this change, but why does the user need to provide two options, --local and --confirm? It would make sense if the confirmation were asked for by the CLI. If the user provides "rm --local", then they want to remove the DB; typing out another "--confirm" doesn't make any difference, they still want to remove the DB.
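
A CLI-driven confirmation of the kind described here could look like the sketch below. It is only an illustration: confirmed is a hypothetical helper, the prompt wording is invented, and the PR itself keeps the '--confirm' flag with no interactive prompt.

```python
def confirmed(confirm_flag, ask=input):
    """Return True if removal may proceed.

    confirm_flag mirrors '--confirm'; when it is absent, fall back to asking.
    'ask' is injectable so the prompt can be tested without a terminal.
    """
    if confirm_flag:
        return True
    answer = ask("Remove the config DB? Type 'yes' to continue: ")
    return answer.strip().lower() == "yes"
```

Keeping the flag as an override leaves the command usable from scripts, while the prompt covers interactive use.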

Contributor

@govint govint left a comment

Can you also try this out on VSAN and say how the DB is configured there? Is it local or multi-node?

@pshahzeb
Contributor

pshahzeb commented Apr 27, 2017

@govint
I ran the tests that you mentioned with this change. The results are as follows:

  1. After removing the DB in both modes, the admin command fails with the following error:
Please init configuration with 'vmdkops_admin.py config init' before changing it.
  2. For the multi-node test, the behavior is exactly as you described, i.e. the node that unlinks the DB detects the symlink when config init is done again. Access from the second ESX node remains consistent. I also verified that vmgroups are consistently visible across ESX nodes when config init is done (init -> unlink -> init).

> Can you also try this out on VSAN and say how the DB is configured there? Is it local or multi-node?

I did the above tests for shared mode on VSAN itself. It's configured the same way. Were you implying something else?

@govint
Contributor

govint commented Apr 27, 2017

@pshahzeb, thanks for this verification.

For VSAN, my doubt was: since it's a single datastore across all hosts in the cluster (yes?), is the DB visible to all nodes, and hence is it local or multi-node? And is the link to the DB visible to all nodes (it's a single datastore)? Is it then a single link, or one per node?

Contributor

@govint govint left a comment

Looks good.

@pshahzeb pshahzeb merged commit 43a34d2 into master Apr 27, 2017
@shuklanirdesh82 shuklanirdesh82 deleted the config_rm_unlink.liping branch April 27, 2017 21:21