Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Doubts about repacked tables relfrozenxid #152

Open
dvarrazzo opened this issue Oct 7, 2017 · 11 comments
Open

Doubts about repacked tables relfrozenxid #152

dvarrazzo opened this issue Oct 7, 2017 · 11 comments

Comments

@dvarrazzo
Copy link
Member

I'm down in the rabbit hole trying to understand #135. Looking at what the swap_head_or_index_files() does here, it seems that the original author wanted to set the repacked table's relfrozenxid to the greatest of the two, but the code does the opposite (or better it sets the relfrozenxid on the temporary table that then it drops).

contrib_regression=# select  relfrozenxid, txid_current() from pg_class where oid = 'test'::regclass;
 relfrozenxid | txid_current 
--------------+--------------
       444833 |       444901
(1 row)

contrib_regression=# vacuum full test;
VACUUM
contrib_regression=# select  relfrozenxid, txid_current() from pg_class where oid = 'test'::regclass;
 relfrozenxid | txid_current 
--------------+--------------
       444902 |       444903
(1 row)

contrib_regression=# cluster test using test_pkey;
CLUSTER
contrib_regression=# select  relfrozenxid, txid_current() from pg_class where oid = 'test'::regclass;
 relfrozenxid | txid_current 
--------------+--------------
       444905 |       444906
(1 row)

What pg_repack does is sort of backwards:

contrib_regression=# \! /usr/lib/postgresql/9.6/bin/pg_repack -d contrib_regression -t test
INFO: repacking table "test"

# adding a couple of LOG in the code:
LOG:  relform1->relfrozenxid = 444905
LOG:  relform2->relfrozenxid = 444908

contrib_regression=# select  relfrozenxid, txid_current() from pg_class where oid = 'test'::regclass;
 relfrozenxid | txid_current 
--------------+--------------
       444905 |       444917
(1 row)

shouldn't we set the repacked table frozen xid to the one of the new table?

@schmiddy
Copy link
Member

schmiddy commented Oct 9, 2017

You may already know all this @dvarrazzo , but just to summarize some code archaelogy from a little digging I did: the comment above our swap_heap_or_index_files says:

This is a copy of swap_relation_files in cluster.c, but it also swaps relfrozenxid.

Looks like way back in the days of PG 8.2 , the signature was: swap_relation_files(Oid r1, Oid r2). And then for PG 8.3, that changed to swap_relation_files(Oid r1, Oid r2, TransactionId fronzenXid. I bet the original author copied from PG 8.2, and added in this logic for relfrozenxid copying which maybe he mixed up. I think we should try to update our swap_relation_files() signature to mimic the slightly-more-recent version of cluster.c from PG. (And at some point we should catch up to the even more recent changes to swap_relation_files()...)

@dvarrazzo
Copy link
Member Author

Yes I noticed the fork happened between 8.2 and 8.3. We can go forward version-by-version I guess.

The function signature in the latest version has sprouted a lot of arguments...

@bashtanov
Copy link
Contributor

bashtanov commented Nov 1, 2017

I'm sorry I noticed this conversation only after I've opened #157.

Anyway, why is it a bad idea to swap relfrozenxids together with the data? It looks correct, as table's relfrozenxid is consistent with its data.

For the new table, it's actually the transaction the table was created, unless the table was autovacuumed in the meanwhile, which I doubt repack transactions and locks would allow to happen. Even if it would, autovacuum would update the relfrozenxid correctly. So it looks like this value is good enough and we cannot use any later one instead.

As far as I understand, core postgres CLUSTER needs something more sophisticated because it copies the tuples in low-level way, without changing their xmin or xmax. Unlike them, repack uses INSERT by SELECT, which changes the xids to newer ones, so we shouldn't blindly copy postgres' swap_relation_files xids behaviour.

@bashtanov
Copy link
Contributor

@dvarrazzo @schmiddy I would appreciate your opinion on my comment and PR

@dvarrazzo
Copy link
Member Author

Hi @bashtanov sorry for not getting back to you.

My intention is to work on this problem and on #135 in a few days, during the xmas break. I'll look into your MR but I guess it will be "swallowed" by a sea of other changes aligning the swap_head_or_index_files() to the current state in Postgres code.

@MichaelDBA
Copy link
Collaborator

Where are we on this issue and the underlying issue that is also still open related to a problem with logical replication and pg_repack (#135)? I haven't seen an update in a few months. Also I see a subsequent issue that is related to this one as well (#157).

I hesitate to use pg_repack anymore in my v10 logical replication environment.

@dvarrazzo
Copy link
Member Author

dvarrazzo commented Mar 22, 2018

@MichaelDBA We are discussing just now about it in #157, as you have probably read (so why your message here, even?).

The issue (which really is this, #152 is only marginal) is complex and is about postgres internals evolution: the people working in this project have only a marginal understanding of what happened in the 10 years after this code was written and my request for help has gone basically ignored. I can speak only for myself: I have limited time for unpaid work, mine is volunteering. If work will be done, you will benefit of it for free, as you have done so far. If you want to contribute to the problem in a more constructing way than "I hesitate to use pg_repack anymore" you are welcome. If you are a core Postgres developer then please help fixing the problem or convince (pay, lure, threaten) someone to do it.

@MichaelDBA
Copy link
Collaborator

MichaelDBA commented Mar 22, 2018

Thanks for the comments, Daniele. I guess what I was interested in is what thread to follow on this issue. Couldn't we just close #152 and #135 saying it is being resolved through issue, #157?

That way we don't have to monitor multiple issues kinda dealing with the same problem. As a side question, is there a future plan to integrate pg_repack into core postgreSQL at some point? Then core guys have to take care of it. Bringing online non-blocking defrag feature into PostgreSQL seems a great selling point.

@dvarrazzo
Copy link
Member Author

@MichaelDBA

#157 is a pull request which fixes #152, which is a bug about xid swapped. Pretty much standard github workflow.

#135 is a bug that says logical decoding is broken, which might be fixed updating the code to the current architecture and is not closed by #157.

Talking with the Postgres core people is a disheartening experience I have gone through several time already: I am not at their level, both technically and masochistically. I keep on doing it if I feel the project at hand needs it, but TBH I have better used of my time than being publicly called a moron, so I save the taste for the special occasions. Yes, I believe this project shouldn't even exist and postgres should defrag its table by itself, but they all seem fine the way with the way things are (Bloat? What bloat?) and I don't think I am able to talk them into it.

@MichaelDBA
Copy link
Collaborator

MichaelDBA commented Mar 22, 2018

Thanks again, Daniele, for your updates on the respective issues and how they really should be handled separately. On another note, can you look at another issue I opened a on January 18 and there has been no feedback for almost 30 days now, #167. I think that is very important since autovacuum stops pg_repack in its tracks!

@ardentperf
Copy link

Old issue, but ISTM the discussion never really came to a conclusion. My comment here is that if repack is not processing visibility (ie. copying dead rows) then the relfrozenxid needs to be preserved. OTOH, if repack is processing visibility information as it copies data (skipping dead rows, essentially vacuuming as it copies) then it should be safe to update relfrozenxid afterwards. This all needs a very careful look, of course, but that makes sense to me at a high level.

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants