This repository has been archived by the owner on May 3, 2024. It is now read-only.

dtm0: add online recovery DLD #1981

Merged
merged 75 commits into from
Aug 30, 2022


andriytk
Contributor

@andriytk andriytk commented Jul 13, 2022

Add a DLD for the new Online Recovery approach in DTM0. Recovery now always runs automatically and no longer needs a special RECOVERING state from Hare. In short, when no pmsg has arrived from a participant for a while and that participant is not in the TRANSIENT state, we start sending REDO messages to it.

The DLD also outlines the new log pruning process.
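
The trigger condition described above can be illustrated with a minimal sketch. It is not taken from the DLD or from the Motr sources: the type `struct participant`, the states `P_ONLINE` and `P_TRANSIENT`, the function `redo_needed()`, and the timeout parameter are all hypothetical names chosen for illustration, and time is assumed to be a plain counter in seconds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical participant states; illustrative names, not Motr's. */
enum participant_state {
    P_ONLINE,
    P_TRANSIENT,
};

/* Hypothetical per-participant bookkeeping kept by the sender. */
struct participant {
    enum participant_state p_state;
    uint64_t               p_last_pmsg; /* time of the last pmsg, seconds */
};

/*
 * Return true when REDO messages should be sent to the participant:
 * it is not TRANSIENT and no pmsg has arrived for more than `timeout`
 * seconds. The timeout value itself is an assumption of this sketch.
 */
static bool redo_needed(const struct participant *p, uint64_t now,
                        uint64_t timeout)
{
    assert(p != NULL);
    if (p->p_state == P_TRANSIENT)
        return false;
    /* Guard against unsigned wrap-around if the clock appears to go back. */
    return now > p->p_last_pmsg && now - p->p_last_pmsg > timeout;
}
```

Under these assumptions, a participant that last sent a pmsg at time 100 would qualify for REDO at time 200 with a 30-second timeout, unless it is in the TRANSIENT state.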

ivan-alekhin and others added 30 commits June 17, 2022 19:16
NOTE: drlink-simple gets stuck with "-n 2" for some reason.
TODOs:
1. Address todos in dtm0/ut/log.c
2. Generate "random" txd (see todo in cas-client code).
3. Fix cas UTs.
Signed-off-by: Andriy Tkachuk <andriy.tkachuk@seagate.com>
Conflicts:
	be/btree.h
... to make it pass.

Signed-off-by: Andriy Tkachuk <andriy.tkachuk@seagate.com>
@cla-bot

cla-bot bot commented Jul 28, 2022

Thanks for your contribution!
The CLA bot has flagged your contribution as not having a Contributor License Agreement
in place. Note that this is not needed in the overwhelming majority of instances and this warning will usually be ignored.
The code reviewers will make a determination and may ask you to sign a CLA or may choose to ignore this warning.
More information about this can be found here.

Signed-off-by: Andriy Tkachuk <andriy.tkachuk@seagate.com>

@madhavemuri madhavemuri changed the base branch from main to dtm/log August 1, 2022 06:25
@madhavemuri madhavemuri changed the base branch from dtm/log to main August 1, 2022 06:26

@madhavemuri madhavemuri changed the base branch from main to dtm/log August 1, 2022 06:29
@madhavemuri madhavemuri changed the base branch from dtm/log to main August 1, 2022 06:36
@madhavemuri madhavemuri changed the base branch from main to dtm/log August 1, 2022 06:36
@hessio hessio added the Status: Merge Conflicts PR has conflicts that need to resolved before it can be merged label Aug 9, 2022
@stale

stale bot commented Aug 13, 2022

This issue/pull request has been marked as needs attention as it has been left pending without new activity for 4 days. Tagging @nkommuri @mehjoshi @huanghua78 for appropriate assignment. Sorry for the delay & Thank you for contributing to CORTX. We will get back to you as soon as possible.

Base automatically changed from dtm/log to main August 26, 2022 13:10

@madhavemuri
Contributor

madhavemuri commented Aug 30, 2022

@rkothiya: You can consider the following commit description:

dtm: add DTM0 online meta-data recovery DLD

Added updated DTM0 DLD and implementation plan.

Signed-off-by: Andriy Tkachuk <andriy.tkachuk@seagate.com>
Signed-off-by: Shashank Parulekar <Shashank.Parulekar@seagate.com>
Co-authored-by: Ivan Alekhin <ivan.alekhin@seagate.com>
Co-authored-by: Maksym Medvied <maksym.medvied@seagate.com>
Co-authored-by: Madhavrao Vemuri <madhav.vemuri@seagate.com>

@rkothiya
Contributor

Jenkins CI Result: Motr#1664

Motr Test Summary

Test Result | Count | Info
❌ Failed | 1

01motr-single-node/00userspace-tests

🏁 Skipped | 32

01motr-single-node/28sys-kvs
01motr-single-node/35m0singlenode
01motr-single-node/04initscripts
01motr-single-node/37protocol
02motr-single-node/51kem
02motr-single-node/20rpc-session-cancel
02motr-single-node/10pver-assign
02motr-single-node/21fsync-single-node
02motr-single-node/13dgmode-io
02motr-single-node/14poolmach
02motr-single-node/11m0t1fs
02motr-single-node/26motr-user-kernel-tests
02motr-single-node/08spiel
03motr-single-node/06conf
03motr-single-node/36spare-reservation
04motr-single-node/34sns-repair-1n-1f
04motr-single-node/08spiel-sns-repair-quiesce
04motr-single-node/28sys-kvs-kernel
04motr-single-node/11m0t1fs-rconfc-fail
04motr-single-node/08spiel-sns-repair
04motr-single-node/19sns-repair-abort
04motr-single-node/22sns-repair-ios-fail
05motr-single-node/18sns-repair-quiesce
05motr-single-node/12fwait
05motr-single-node/16sns-repair-multi
05motr-single-node/07mount-fail
05motr-single-node/15sns-repair-single
05motr-single-node/23sns-abort-quiesce
05motr-single-node/17sns-repair-concurrent-io
05motr-single-node/07mount
05motr-single-node/07mount-multiple
05motr-single-node/12fsync

✔️ Passed | 44

01motr-single-node/43m0crate
01motr-single-node/05confgen
01motr-single-node/06hagen
01motr-single-node/52motr-singlenode-sanity
01motr-single-node/01net
01motr-single-node/01kernel-tests
01motr-single-node/03console
01motr-single-node/02rpcping
02motr-single-node/07m0d-fatal
02motr-single-node/67fdmi-plugin-multi-filters
02motr-single-node/53clusterusage-alert
02motr-single-node/41motr-conf-update
03motr-single-node/61sns-repair-motr-1n-1f
03motr-single-node/72spiel-sns-motr-repair-quiesce
03motr-single-node/08spiel-multi-confd
03motr-single-node/69sns-repair-motr-quiesce
03motr-single-node/62sns-repair-motr-mf
03motr-single-node/70sns-failure-after-repair-quiesce
03motr-single-node/63sns-repair-motr-1k-1f
03motr-single-node/60sns-repair-motr-1f
03motr-single-node/66sns-repair-motr-abort-quiesce
03motr-single-node/24motr-dix-repair-lookup-insert-spiel
03motr-single-node/68sns-repair-motr-shutdown
03motr-single-node/64sns-repair-motr-ios-fail
03motr-single-node/71spiel-sns-motr-repair
03motr-single-node/24motr-dix-repair-lookup-insert-m0repair
03motr-single-node/04sss
03motr-single-node/65sns-repair-motr-abort
04motr-single-node/73motr-io-small-disks
04motr-single-node/48motr-raid0-io
04motr-single-node/74motr-di-corruption-detection
04motr-single-node/49motr-rpc-cancel
04motr-single-node/25m0kv
04motr-single-node/44motr-rm-lock-cc-io
04motr-single-node/45motr-rmw
05motr-single-node/23dix-repair-m0repair
05motr-single-node/43motr-sync-replication
05motr-single-node/42motr-utils
05motr-single-node/45motr-sns-repair-N-1
05motr-single-node/40motr-dgmode
05motr-single-node/23dix-repair-quiesce-m0repair
05motr-single-node/23spiel-dix-repair-quiesce
05motr-single-node/44motr-sns-repair
05motr-single-node/23spiel-dix-repair

Total | 77

CppCheck Summary

   Cppcheck: No new warnings found 👍

@andriytk andriytk changed the title Add DTM0 DLD Add DTM0 Online Recovery DLD Aug 30, 2022
@andriytk andriytk changed the title Add DTM0 Online Recovery DLD dtm0: add online recovery DLD Aug 30, 2022
@rkothiya rkothiya merged commit 2698e71 into main Aug 30, 2022
@andriytk andriytk deleted the dtm0/dld branch September 1, 2022 09:01
Labels
needs-attention Status: Merge Conflicts PR has conflicts that need to resolved before it can be merged
10 participants