
[SRM] After disaster recovery, vmgroup for volumes of user vmgroup are listed as N/A at secondary site #1786

Closed
ashahi1 opened this issue Aug 21, 2017 · 5 comments

Comments

@ashahi1
Contributor

ashahi1 commented Aug 21, 2017

Steps:

  1. Initialized the config and created a vmgroup and added a vm
  2. After VM was added to vmgroup, created new volumes
  3. Ran Disaster Recovery from VC
  4. After DR finished successfully, the admin CLI listed volumes of the user vmgroup as N/A.
    Also, the VM no longer belonged to the user vmgroup - it became part of the default vmgroup and could see volumes of the default vmgroup.

Detailed steps and their output are as follows:

  1. Initialized the config and created a vmgroup and added a vm
[root@w3-stsrm-017:~] /usr/lib/vmware/vmdkops/bin/vmdkops_admin.py config init --local
Warning: this feature is EXPERIMENTAL
Creating new DB at /etc/vmware/vmdkops/auth-db
Warning: Local configuration will not survive ESXi reboot. See KB2043564 for details
[root@w3-stsrm-017:~]
[root@w3-stsrm-017:~]
[root@w3-stsrm-017:~]
[root@w3-stsrm-017:~] /usr/lib/vmware/vmdkops/bin/vmdkops_admin.py vmgroup create --name vmgroup1 --default-datastore _VM_DS --vm-list Ubuntu-14.04-Docker11-12
vmgroup 'vmgroup1' is created. Do not forget to run 'vmgroup vm add' to add vm to vmgroup.
[root@w3-stsrm-017:~]
[root@w3-stsrm-017:~] /usr/lib/vmware/vmdkops/bin/vmdkops_admin.py vmgroup ls
Uuid                                  Name      Description                Default_datastore  VM_list
------------------------------------  --------  -------------------------  -----------------  ------------------------
11111111-1111-1111-1111-111111111111  _DEFAULT  This is a default vmgroup  _VM_DS
6979ae23-bb79-4952-bd42-cdbdf8a7821e  vmgroup1                             _VM_DS             Ubuntu-14.04-Docker11-12

[root@w3-stsrm-017:~]
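
As a sanity check (not part of the original repro), the local DB created by 'config init --local' is just a file on the ESXi host at the path printed above, so its presence can be verified with:

# Verify the local config DB exists at the path reported by 'config init'
ls -l /etc/vmware/vmdkops/auth-db
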
  2. After VM was added to vmgroup, created new volumes
root@sc-rdops-vm02-dhcp-52-237:~# docker volume create -d vsphere --name TestVolVG1
TestVolVG1
root@sc-rdops-vm02-dhcp-52-237:~# docker volume create -d vsphere --name TestVolVG1-2
TestVolVG1-2
root@sc-rdops-vm02-dhcp-52-237:~# docker volume create -d vsphere --name TestVolVG1-3
TestVolVG1-3
root@sc-rdops-vm02-dhcp-52-237:~#
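
To confirm from the docker host side how a volume was created (before checking the admin CLI on ESX), the standard docker volume commands can be used; the exact metadata fields returned are driver-specific, so none are shown here:

# List the volumes and dump one volume's metadata as reported by the plugin
docker volume ls
docker volume inspect TestVolVG1
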


[root@w3-stsrm-017:~] /usr/lib/vmware/vmdkops/bin/vmdkops_admin.py volume ls
Volume        Datastore                VMGroup   Capacity  Used  Filesystem  Policy  Disk Format  Attached-to  Access      Attach-as               Created By              Created Date          
------------  -----------------------  --------  --------  ----  ----------  ------  -----------  -----------  ----------  ----------------------  ----------------------  ----------------------
TestVol22     snap-7a9248ba-ABR_U_360  _DEFAULT  100MB     13MB  ext4        N/A     thin         detached     read-write  independent_persistent  Ubuntu-14.04-Docker1..  Fri Aug 18 21:47:13 ..
TestVolVG1    snap-7a9248ba-ABR_U_360  vmgroup1  100MB     13MB  ext4        N/A     thin         detached     read-write  independent_persistent  Ubuntu-14.04-Docker1..  Fri Aug 18 21:51:40 ..
TestVolVG1-2  snap-7a9248ba-ABR_U_360  vmgroup1  100MB     13MB  ext4        N/A     thin         detached     read-write  independent_persistent  Ubuntu-14.04-Docker1..  Fri Aug 18 21:51:50 ..
TestVolVG1-3  snap-7a9248ba-ABR_U_360  vmgroup1  100MB     13MB  ext4        N/A     thin         detached     read-write  independent_persistent  Ubuntu-14.04-Docker1..  Fri Aug 18 21:51:55 ..
  3. Ran Disaster Recovery from VC
  4. After disaster recovery finished successfully, volumes of the user vmgroup were listed as N/A
 [root@w3-stsrm-010:~] /usr/lib/vmware/vmdkops/bin/vmdkops_admin.py volume ls
Volume        Datastore              VMGroup   Capacity  Used  Filesystem  Policy  Disk Format  Attached-to  Access      Attach-as              Created By            Created Date         
------------  ---------------------  --------  --------  ----  ----------  ------  -----------  -----------  ----------  ---------------------  --------------------  ---------------------
TestVol22     snap-0cef40db-ABR_U..  _DEFAULT  100MB     13MB  ext4        N/A     thin         detached     read-write  independent_persist..  Ubuntu-14.04-Docke..  Fri Aug 18 21:47:13..
TestVolVG1    snap-0cef40db-ABR_U..  N/A       100MB     13MB  ext4        N/A     thin         detached     read-write  independent_persist..  Ubuntu-14.04-Docke..  Fri Aug 18 21:51:40..
TestVolVG1-2  snap-0cef40db-ABR_U..  N/A       100MB     15MB  ext4        N/A     thin         detached     read-write  independent_persist..  Ubuntu-14.04-Docke..  Fri Aug 18 21:51:50..
TestVolVG1-3  snap-0cef40db-ABR_U..  N/A       100MB     13MB  ext4        N/A     thin         detached     read-write  independent_persist..  Ubuntu-14.04-Docke..  Fri Aug 18 21:51:55..
@ashahi1 ashahi1 changed the title [SRM] After disaster recovery, vmgroup for volumes of user vmgroup as listed as N/A at secondary site [SRM] After disaster recovery, vmgroup for volumes of user vmgroup are listed as N/A at secondary site Aug 22, 2017
@govint
Contributor

govint commented Aug 23, 2017

@ashahi1 could you also post the config status from both the source and destination hosts, and list the vmgroups on both hosts? If the config DB isn't copied over, then it can't be helped. Should we document that multi-tenancy isn't supported with SRM for now?
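
For reference, a minimal set of commands to collect on each host (protected and recovery) would look like the following; this assumes the same admin CLI path used above and that the build includes a 'config status' subcommand alongside 'config init':

# Run on both the protected and the recovery ESXi host and compare the output
/usr/lib/vmware/vmdkops/bin/vmdkops_admin.py config status
/usr/lib/vmware/vmdkops/bin/vmdkops_admin.py vmgroup ls
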

@ashahi1
Contributor Author

ashahi1 commented Aug 24, 2017

@govint Yes, it has been documented in Interop.md

@govint
Contributor

govint commented Sep 1, 2017

Checked SRM, and the provisions that exist today don't allow replicating specific files on an ESX host to a backup host. The config DB isn't replicated and won't be available in a DR scenario, so the observed behavior is inevitable.

If backup and DR are a priority for the plugin, then we could consider options like: a) create a docker service configured to pull the local tenancy configuration and push it to specific backup host(s), ensuring one instance is hosted per ESX node; or b) configure the ESX service itself with a backup ESX host to which it replicates its config data. Either way, unless the plugin/service can handle replication on its own, DR can't be supported at this time. (A minimal sketch of the copy step common to both options follows.)
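
Either option boils down to getting the config DB off the protected host. Purely as an illustration (the plugin has no built-in replication today, which is exactly the gap this issue describes), and assuming SSH/scp between the two ESXi hosts is enabled with key-based auth, where 'backup-esx' is a placeholder hostname:

# Periodically copy the local tenancy DB (created by 'config init --local')
# from the protected host to a recovery host; the path is the one shown above.
DB=/etc/vmware/vmdkops/auth-db
scp "$DB" root@backup-esx:/etc/vmware/vmdkops/auth-db
# The vmdkops service on the recovery host would still have to pick up the
# copied DB; there is no supported procedure for that today.
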

If it's not a priority (tenancy is experimental), then we can close this issue as documented.

@govint
Contributor

govint commented Sep 6, 2017

@ashahi1 @tusharnt, I'm restating the options below so we can review and agree on an approach.

  1. Upgrade the ESX service so it can be configured with a backup ESX host to which it replicates its config DB. SRM today doesn't handle something like this, nor is it possible to include specific services to be mirrored between nodes, so this builds out more functionality to maintain on the ESX side, which, long term, I'm not sure is appropriate. A leaner, smaller-function ESX service is going to be easier to maintain (e.g. for VMC).

  2. With VMC, the notion of datastores on the ESX host just isn't exposed (users don't want to know what datastores are on their hosts or what VMs (names/IDs) are running on the VMC hosts), so the tenancy model we have is probably not usable there. Nor can users log into the hosts to manage vmgroups. We could instead revamp the support for tenancy and, at a high level, do this:

    - Support user-based tenancy at the level of the plugin, with user groups just like vmgroups.
    - Support storage allocation instead of datastores assigned to vmgroups, and manage volume capacity against the storage allocated to the group.
    - Keep the config in, say, a key-value store accessible to all nodes in the host cluster (a minimal sketch follows below).
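
Purely as an illustration of the last point (this is not an existing feature; the key layout and the choice of etcd are assumptions), the tenancy config could live in a cluster-visible key-value store instead of a per-host DB:

# Hypothetical key layout in an etcd cluster reachable from all ESX nodes,
# using the vmgroup, datastore and VM names from the repro above
etcdctl put /vmdkops/vmgroups/vmgroup1/default_datastore _VM_DS
etcdctl put /vmdkops/vmgroups/vmgroup1/vms/Ubuntu-14.04-Docker11-12 ""
# Any host (primary or recovery) reads the same view of the vmgroup
etcdctl get --prefix /vmdkops/vmgroups/vmgroup1/
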

@govint
Contributor

govint commented Sep 12, 2017

Closing this issue, as SRM doesn't support arbitrary services on ESX (only VMs) and that's not going to change for vSphere. The earlier update suggesting a different approach to vmgroup-based tenancy can be tracked as a separate issue.

@govint govint closed this as completed Sep 12, 2017