-
-
Notifications
You must be signed in to change notification settings - Fork 367
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Interval type improvements #1067
Conversation
If SQLAlchemy doesn't have this hint, it uses default psycopg2 behavior, which loses information in the case of intervals. Performance drag, but rarely used.
... hopefully
Codecov Report
@@ Coverage Diff @@
## master #1067 +/- ##
==========================================
+ Coverage 92.70% 92.77% +0.06%
==========================================
Files 108 109 +1
Lines 3852 3888 +36
==========================================
+ Hits 3571 3607 +36
Misses 281 281
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me overall, I added a couple of comments. @mathemancer, please resolve before merge.
I also think @silentninja should review this since he more recently worked on date and time types.
Also remove a bit of cruft.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. Nice work @mathemancer, I learned a few sqlalchemy tricks, thanks. Can you add the interval standard as ISO 8601
to the Engineering decision
mathesar/api/db/viewsets/columns.py
Outdated
@@ -69,19 +70,13 @@ def create(self, request, table_pk=None): | |||
) | |||
else: | |||
raise database_base_api_exceptions.ProgrammingAPIException(e) | |||
except TypeError as e: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why have you removed capturing this exception?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, good catch. I changed that so I could get more debugging output at some point and forgot to revert the change.
@@ -29,6 +29,7 @@ class TypeOptionSerializer(MathesarErrorMessageMixin, serializers.Serializer): | |||
length = serializers.IntegerField(required=False) | |||
precision = serializers.IntegerField(required=False) | |||
scale = serializers.IntegerField(required=False) | |||
fields = serializers.CharField(required=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be helpful to add this api change to the PR description.
f'fields "{self.impl.fields}" is not in {all_fields}' | ||
) | ||
|
||
def column_expression(self, col): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have any issues with using column_expression
, I am just curious to know if you have thought about setting a local intervalstyle
instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem with that is we need to avoid psycopg2
interpreting the result as an actual INTERVAL
. The column expression casts to TEXT
at the DB layer, which then gets picked up by psycopg2
as a python str
. The problem with letting psycopg2
pick up INTERVAL
s otherwise is that it uses a python timedelta
to represent them, but timedelta
makes some different choices than PosgreSQL (e.g., assuming 30 days can be accumulated into 1 month) and the conversion is therefore inaccurate. This would result in erroneous info being passed to the UI, and in fact make viewing some values (e.g., 37 days) impossible.
intervalstyle
doesn't change the actual return type, and so doesn't really solve the problem by itself. We'd still need to cast to text using a column_expression
. Given that, I opted to handle the formatting ourselves, since the PostgreSQL iso_8601
style is slightly out-of-spec, and made a few choices that don't work that well for our use case (IMO). It also makes the formatting that would happen completely visible, rather than hidden in the PostgreSQL docs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense, thanks for the explanation!
I'm currently writing up a wiki page that will cover the spec. It's a little more involved. |
Fixes #430
This adds a custom interval type at the SQLAlchemy level that maps to the default PostgreSQL type. Further, we can now accept
precision
andfields
arguments when creating or altering columns involving theINTERVAL
type.Technical details
The
precision
type option takes an integer (1-6) as input. Thefields
type option takes a string, and defines which fields the interval stores. Acceptable strings are:If both
precision
andfields
are specified, thenfields
must include seconds, sinceprecision
applies to the seconds field.The reason for a custom type was initially to ensure that we didn't convert PostgreSQL
INTERVAL
s into Pythontimedelta
s, since that conversion is lossy. It also standardizes the output and some aspects of input for intervals. Future PRs will similarly standardize output of other time-related types.For reference, we'll be using the ISO 8601 spec as reduced by RFC 3339 for standardized output, and always-acceptable (at the DB layer) input. We will also attempt to handle any string as input using the default PostgreSQL date / time / duration parsing. So, in the case of intervals, we have strings like
Each variable is an integer with the exception of seconds for output (seconds can be a float). For input, decimals are allowed, and will be converted appropriately. Also, inputs will "carry over" when possible. Seconds and minutes will aggregate into hours, but hours won't aggregate into days since some days are different numbers of hours around DST changes. Days won't aggregate into months, but months will aggregate into years. For output, any missing units (e.g. zero minutes) will have actual zeroes so the client can count on each part being in the returned string. For input, this is not necessary (but you do need to include the
T
separator between the date and time sections if you include time values).As a bonus, this PR also fixes some bugs in the constraints API tests that were preventing the pipeline from passing.
Finally, there
Checklist
Update index.md
).master
branch of the repositoryvisible errors.
Developer Certificate of Origin
Developer Certificate of Origin