All target table reads & writes blocked because of autovacuum on repack temporary table #371

Open
schmave opened this issue Nov 9, 2023 · 3 comments

Comments

@schmave

schmave commented Nov 9, 2023

I'm running pg_repack 1.4.7 on Postgres 14 hosted on Amazon RDS.

I see other issues related to autovacuum, but I don't think any of them are the same as this. Thanks for taking a look.

While pg_repack was repacking a table (call it A) in my database the other night, all reads and writes to that table were failing. I looked at the running processes and saw that this query had been running for over half an hour:

postgres | 6881 | | autovacuum: VACUUM ANALYZE repack.table_36175838 (to prevent wraparound)

After I canceled that process with select pg_cancel_backend(6881);, reads and writes on the original table A were able to proceed.
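
For anyone hitting the same thing, a query along these lines (just a sketch) lists long-running autovacuum workers so the offending pid can be identified and then cancelled:

-- list autovacuum workers and how long their transactions have been running
SELECT pid, now() - xact_start AS running_for, query
FROM pg_stat_activity
WHERE backend_type = 'autovacuum worker';

-- cancel the stuck worker by pid (6881 in this case)
SELECT pg_cancel_backend(6881);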

In investigating the outage after the fact, I saw this in the logs. I believe that "canceling autovacuum task" was caused by me running pg_cancel_backend.

2023-11-08 09:37:27 UTC::@:[6881]:ERROR:  canceling autovacuum task
2023-11-08 09:37:27 UTC::@:[6881]:CONTEXT:  while scanning block 384003 of relation "repack.table_36175838"
	automatic vacuum of table "cs1.repack.table_36175838"
2023-11-08 09:37:27 UTC:172.31.65.98(54604):postgres@cs1:[28599]:LOG:  process 28599 acquired AccessExclusiveLock on relation 116814165 of database 16401 after 2233899.660 ms
2023-11-08 09:37:27 UTC:172.31.65.98(54604):postgres@cs1:[28599]:CONTEXT:  SQL statement "DROP TABLE IF EXISTS repack.table_36175838 CASCADE"
2023-11-08 09:37:27 UTC:172.31.65.98(54604):postgres@cs1:[28599]:STATEMENT:  SELECT repack.repack_drop($1, $2)
2023-11-08 09:37:27 UTC:172.31.65.98(54604):postgres@cs1:[28599]:LOG:  duration: 2233917.744 ms  execute <unnamed>: SELECT repack.repack_drop($1, $2)
2023-11-08 09:37:27 UTC:172.31.65.98(54604):postgres@cs1:[28599]:DETAIL:  parameters: $1 = '36175838', $2 = '4'
2023-11-08 09:37:27 UTC:172.31.65.98(54604):postgres@cs1:[28599]:LOG:  duration: 2233917.741 ms  plan:
	Query Text: SELECT repack.repack_drop($1, $2)
	Result  (cost=0.00..0.01 rows=1 width=4) (actual rows=1 loops=1)
	  Buffers: shared hit=1016 read=15 dirtied=28
	  I/O Timings: read=8.343

Here's an example of a query not being able to run because of locks held by process 28599:

2023-11-08 09:37:01 UTC:172.31.76.15(33070):postgres@cs1:[29360]:LOG:  process 29360 still waiting for AccessShareLock on relation 36175838 of database 16401 after 1000.045 ms at character 1034
2023-11-08 09:37:01 UTC:172.31.76.15(33070):postgres@cs1:[29360]:DETAIL:  Process holding the lock: 28599. Wait queue: 28809, 10941, 29360.
2023-11-08 09:37:01 UTC:172.31.76.15(33070):postgres@cs1:[29360]:STATEMENT:  SELECT * FROM A WHERE (A.b IS NULL AND A.c = 602146)
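
The lock wait queue can also be inspected live; a sketch using pg_blocking_pids() (available since Postgres 9.6):

-- show which backends are waiting on locks and which pids block them
SELECT pid, pg_blocking_pids(pid) AS blocked_by, wait_event_type, query
FROM pg_stat_activity
WHERE cardinality(pg_blocking_pids(pid)) > 0;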

Should pg_repack prevent autovacuum on the temporary table or automatically kill any processes that prevent it from acquiring a lock to delete the table? Maybe I'm doing something wrong. Any advice is much appreciated.
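
(For what it's worth, there is a per-table storage parameter that turns autovacuum off, sketched below, but as far as I know vacuums run "to prevent wraparound", like the one in my logs, still happen even with it disabled, so I'm not sure it would have helped here.)

-- hypothetical mitigation: disable regular autovacuum on one table
-- (anti-wraparound vacuums still run regardless of this setting)
ALTER TABLE repack.table_36175838 SET (autovacuum_enabled = false);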

@za-arthur
Collaborator

From your logs I see that the lock contention happened between repack.repack_drop() and an autovacuum worker, and that repack.repack_drop() blocked all other reads and writes.

At this stage pg_repack should already have swapped the original and temporary tables, so it isn't clear which client was reading or writing that temporary table. As far as I understand, pg_repack itself shouldn't try to read or write the temporary table.

Does another application try to read the temporary table?
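
If it helps, a query like this (just a sketch; the relation name is taken from your logs and only resolves while the temporary table still exists) would show which sessions hold or are waiting for locks on it:

-- sessions holding or awaiting locks on the repack temporary table
SELECT l.pid, l.mode, l.granted, a.query
FROM pg_locks l
JOIN pg_stat_activity a ON a.pid = l.pid
WHERE l.relation = 'repack.table_36175838'::regclass;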

@za-arthur
Collaborator

pg_repack 1.5.0 includes a fix, #298. It resolves a slightly different issue (previously pg_repack locked the original table for a long time), but it might help in your case.

@schmave
Author

schmave commented Nov 29, 2023

Thanks for taking a look at this! I don't think that there is any application reading the temporary table, though I could be mistaken about that. Glad to hear that #298 might be helpful here.
