Concurrent shard moves with colocated table creation might fail to create placement for the new table #6050
Comments
I don't think we support parallel shard moves. We might be missing a lock somewhere.
I created a script which does almost exactly the above, but couldn't reproduce the problem yet: https://gist.github.com/onderkalaci/9c72d32adea42848e6368a7e943cf246 Needs more investigation.
We do seem to acquire the proper lock here: two shard moves block each other:
select * from citus_lock_waits;
┌─[ RECORD 1 ]──────────────────────────┬───────────────────────────────────────────────────────────────────────────────────────────────────┐
│ waiting_gpid │ 10000049454 │
│ blocking_gpid │ 10000049328 │
│ blocked_statement │ select citus_move_shard_placement(120072, 'localhost', 9700, 'localhost', 9702, 'force_logical'); │
│ current_statement_in_blocking_process │ SELECT citus_move_shard_placement(120066, 'localhost', 9700, 'localhost', 9702, 'force_logical'); │
│ waiting_nodeid │ 1 │
│ blocking_nodeid │ 1 │
└───────────────────────────────────────┴───────────────────────────────────────────────────────────────────────────────────────────────────┘
Time: 87.205 ms
-- waiting on an advisory lock
select * from pg_locks where pid = 49454 and not granted;
┌─[ RECORD 1 ]───────┬───────────────────────────────┐
│ locktype │ advisory │
│ database │ 13236 │
│ relation │ │
│ page │ │
│ tuple │ │
│ virtualxid │ │
│ transactionid │ │
│ classid │ 0 │
│ objid │ 0 │
│ objsubid │ 12 │
│ virtualtransaction │ 12/39861 │
│ pid │ 49454 │
│ mode │ ExclusiveLock │
│ granted │ f │
│ fastpath │ f │
│ waitstart │ 2022-07-14 14:47:02.270842+02 │
└────────────────────┴───────────────────────────────┘
I'm now investigating whether there are any issues with deferred drop. Note that we ensured that there is only one maintenance daemon running on the server.
Might be related to #4909, marking for visibility.
Steps to repro (annotations below; see the reconstructed script after them):
-- session 2
-- back to session 1
-- this command got blocked
-- switch to session 2 again
COMMIT;
-- now, from any node, see that the placement for test_2 is lost
-- oops, one shard DOES NOT HAVE any placements
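The commands themselves aren't shown above, so here is a minimal two-session sketch of what the repro plausibly looks like. The table names test_1/test_2 (test_2 appears in the annotations), the colocate_with option, and the reuse of the shard ID and ports from the citus_lock_waits output are assumptions, not the original script.

-- session 1: create a distributed table (names are assumed)
CREATE TABLE test_1 (a int);
SELECT create_distributed_table('test_1', 'a');

-- session 2: create a colocated table inside an open transaction
BEGIN;
CREATE TABLE test_2 (a int);
SELECT create_distributed_table('test_2', 'a', colocate_with => 'test_1');

-- back to session 1
-- this command got blocked (assumed: it waits on session 2's open transaction)
SELECT citus_move_shard_placement(120066, 'localhost', 9700, 'localhost', 9702, 'force_logical');

-- switch to session 2 again
COMMIT; -- releases the lock; the shard move in session 1 proceeds

-- now, from any node, check which shards of test_2 have no placements:
SELECT s.shardid
FROM pg_dist_shard s
WHERE s.logicalrelid = 'test_2'::regclass
  AND NOT EXISTS (SELECT 1 FROM pg_dist_placement p WHERE p.shardid = s.shardid);
-- oops, one shard DOES NOT HAVE any placements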