Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Add pg_lsn postgres data type #4499

Open
wants to merge 6 commits into
base: master
Choose a base branch
from
Open

Conversation

arpad-m
Copy link

@arpad-m arpad-m commented Feb 19, 2025

A pg_lsn encodes an offset into the Postgres Write Ahead Log (WAL). It is a native type of Postgres.

Copy link
Member

@Ten0 Ten0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for your contribution!

I have no strong opinion on whether this should live in Diesel (I'd say I'm 80% sure that others will say yes, but unless I'm mistaken this could alternatively be implemented in any given schema crate and I doubt many people use this, and this seems to add non trivial code) so I'll let others voice their opinions there.

That being said If we do indeed merge this I think I notice a few things that we might want to address before merging.

Thanks again! 🤗

/// [`pg_lsn`]: https://www.postgresql.org/docs/current/datatype-pg-lsn.html
#[cfg(feature = "postgres_backend")]
#[derive(Debug, Clone, Copy, Default, QueryId, SqlType)]
#[diesel(postgres_type(name = "pg_lsn"))]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doesn't it have a constant oid, like most other types?
If so then we should prefer using that so as to avoid doing unnecessary dynamic type OID lookups.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indeed it has a constant oid, will make it use that one instead.

Comment on lines 20 to 21
let val = value.as_bytes().read_u64::<NetworkEndian>().map_err(|_| {
Box::<dyn Error + Send + Sync>::from("invalid lsn format: input isn't 8 bytes.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unless I'm mistaken read_u64 will succeed as long as input length is >= 8 bytes, but we might want to check that there is nothing left as well, to minimize the odds of mistakes:

if bytes.len() < 8 {
return emit_size_error(
"Received less than 8 bytes while decoding an i64. \
Was an Integer expression accidentally marked as BigInt?",
);
}
if bytes.len() > 8 {
return emit_size_error(
"Received more than 8 bytes while decoding an i64. \
Was an expression of a different type expression accidentally marked as BigInt?"
);
}

Comment on lines 58 to 62
if last.ident == "PgLsn" {
"pg_lsn".to_string()
} else {
last.ident
.to_string()
.split("_")
.collect::<Vec<_>>()
.join(" ")
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's unclear to me how this is consistent with what was previously written: previous writing seems to take something that's snake and turn it into separate words, while this seems to take something that's PascalCase, and turn it into snake, and I don't understand where these are mixed and this is a legitimate special case.
Could you please clarify this for me?
(Disclaimer: I have not investigated this in detail.)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without this change, it would take PgLsn and turn it into PGLSN. For the othe types like TimestampTz this is not a problem: TIMESTAMPTZ is accepted by postgres. But PGLSN isn't. The underscore is required: either pg_lsn or PG_LSN, or some other casing.

I tried calling the type Pg_Lsn, but that would not just be against Rust's naming guidelines (it would trigger lints), it also created issues with other parts of the diesel tooling which want to turn pg_lsn into PgLsn.

Given that there seem to be no other types with an underscore, I figured the easiest fix is to just special case it in this piece of code. It allows us to use PgLsn for the type name without trouble.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, did you figure out what the previous code was intending to do then ? I don't see a reason why it replacing underscores with spaces was something it wanted to do before though. (It might be linked to the fact that in Diesel 1.4 SQL type names inferred by the CLI might contain underscores, IIRC... But maybe that isn't relevant anymore)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see a reason why it replacing underscores with spaces was something it wanted to do before though.

Don't really know but I didn't want to create spacebar heater situations where this change broke it for people.

In any case, I don't think we should call the type Pg_Lsn. It would then require changes in other code that turns pg_lsn into PgLsn (to make it turn it to Pg_Lsn). But your call in the end.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In any case, I don't think we should call the type Pg_Lsn. It would then require changes in other code that turns pg_lsn into PgLsn (to make it turn it to Pg_Lsn). But your call in the end.

Yeah no I agree that PgLsn is the correct name but I'm weary of making a change here without understanding why the code that was here before was here, because I don't see how these are consistent, and I'm wondering if the previous code even makes sense anymore now.

Don't really know

I'll have to investigate this further then.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pg_lsn is a bit of an outlier that it has an actual underscore in the type name. E.g. we don't want it to turn TimestampTz into timestamp_tz, because postgres doesn't use an underscore in that type. And this is the case for the other types that use camel case as well.

Copy link
Member

@Ten0 Ten0 Feb 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah but if the other types are supposed to use upper camel (pascal) here why is the previous code looking for underscores? is what I'm wondering.
Because if it's not doing anything, then there's a chance that this entire code should just be pascal to snake instead of special casing PgLsn.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A generic pascal to snake would fail because it would turn stuff like TimestampTz into timestamp_tz. pg_lsn is the only diesel-supported type with an underscore, so I feel like hardcoding this makes sense. If more types with an underscore is added, one can turn the single case into a list.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can tell now (as person that have written that old code 🤦 ) there was no case where this was required. I checked our type defs and did not found any even hidden type def that contains an _ so I'm not sure why that splitting was there is the first place. Given that I would suggest to just remove the splitting, but keeping the special casing for pg_lsn. If it turns out to be important, we well certainly learn about that in the future from a bug report. After all that part of the code is not 100% mission critical as it is only used to generate migrations in the first place.

Copy link
Author

@arpad-m arpad-m Feb 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed the _ to code as per your request. I don't care in either direction about that part. When I changed the code originally I didn't know whether it was important or not so I erred on the side of not touching it.

Copy link
Member

@weiznich weiznich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for contribution. This looks fine to me beside that minor remark about the commented out code.

Comment on lines 58 to 62
if last.ident == "PgLsn" {
"pg_lsn".to_string()
} else {
last.ident
.to_string()
.split("_")
.collect::<Vec<_>>()
.join(" ")
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can tell now (as person that have written that old code 🤦 ) there was no case where this was required. I checked our type defs and did not found any even hidden type def that contains an _ so I'm not sure why that splitting was there is the first place. Given that I would suggest to just remove the splitting, but keeping the special casing for pg_lsn. If it turns out to be important, we well certainly learn about that in the future from a bug report. After all that part of the code is not 100% mission critical as it is only used to generate migrations in the first place.

Comment on lines 263 to 265
/*diesel::sql_query("SET lc_messages TO 'en_US.UTF-8'")
.execute(&mut conn)
.unwrap();*/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any particular reason to comment out this code?

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants