Interval type improvements #1067

mathemancer · 2022-02-14T13:32:58Z

Fixes #430

This adds a custom interval type at the SQLAlchemy level that maps to the default PostgreSQL type. Further, we can now accept precision and fields arguments when creating or altering columns involving the INTERVAL type.

Technical details

The precision type option takes an integer (1-6) as input. The fields type option takes a string, and defines which fields the interval stores. Acceptable strings are:

YEAR
MONTH
DAY
HOUR
MINUTE
SECOND
YEAR TO MONTH
DAY TO HOUR
DAY TO MINUTE
DAY TO SECOND
HOUR TO MINUTE
HOUR TO SECOND
MINUTE TO SECOND

If both precision and fields are specified, then fields must include seconds, since precision applies to the seconds field.
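To make the rules above concrete, here is a minimal sketch of the validation they imply. The helper name `validate_interval_options` and the case/whitespace normalization are assumptions for illustration only; the actual checks in this PR live in `db/types/interval.py` and may differ.

```python
# Hypothetical helper illustrating the option rules described above.
# It is not the implementation from db/types/interval.py.
VALID_FIELDS = (
    "YEAR", "MONTH", "DAY", "HOUR", "MINUTE", "SECOND",
    "YEAR TO MONTH", "DAY TO HOUR", "DAY TO MINUTE", "DAY TO SECOND",
    "HOUR TO MINUTE", "HOUR TO SECOND", "MINUTE TO SECOND",
)


def validate_interval_options(precision=None, fields=None):
    """Return (precision, normalized_fields) or raise ValueError."""
    if precision is not None:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(precision, bool) or not isinstance(precision, int):
            raise ValueError("precision must be an integer")
        # Range as stated in the PR description.
        if not 1 <= precision <= 6:
            raise ValueError("precision must be between 1 and 6")
    if fields is None:
        return precision, None
    # Assumption: accept any case and collapse repeated whitespace.
    normalized = " ".join(str(fields).upper().split())
    if normalized not in VALID_FIELDS:
        raise ValueError(f'fields "{fields}" is not in {VALID_FIELDS}')
    # precision applies to seconds, so fields must end at SECOND.
    if precision is not None and not normalized.endswith("SECOND"):
        raise ValueError("fields must include seconds when precision is set")
    return precision, normalized
```

With this sketch, `validate_interval_options(3, "day to second")` is accepted, while `validate_interval_options(3, "YEAR TO MONTH")` raises a `ValueError` because the fields do not include seconds.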

The reason for a custom type was initially to ensure that we didn't convert PostgreSQL INTERVALs into Python timedeltas, since that conversion is lossy. It also standardizes the output and some aspects of input for intervals. Future PRs will similarly standardize output of other time-related types.

For reference, we'll be using the ISO 8601 spec as reduced by RFC 3339 for standardized output, and always-acceptable (at the DB layer) input. We will also attempt to handle any string as input using the default PostgreSQL date / time / duration parsing. So, in the case of intervals, we have strings like

f"P{years}Y{months}M{days}DT{hours}H{minutes}M{seconds}S"

Each variable is an integer with the exception of seconds for output (seconds can be a float). For input, decimals are allowed, and will be converted appropriately. Also, inputs will "carry over" when possible. Seconds and minutes will aggregate into hours, but hours won't aggregate into days since some days are different numbers of hours around DST changes. Days won't aggregate into months, but months will aggregate into years. For output, any missing units (e.g. zero minutes) will have actual zeroes so the client can count on each part being in the returned string. For input, this is not necessary (but you do need to include the T separator between the date and time sections if you include time values).
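The carry-over and zero-filling rules above can be sketched in plain Python. This is an illustration under stated assumptions, not the code from this PR: the function name `format_interval` is hypothetical, it only handles non-negative values, and the exact way fractional seconds are rendered is a guess.

```python
# Hypothetical formatter showing the carry-over rules described above.
def format_interval(years=0, months=0, days=0, hours=0, minutes=0, seconds=0):
    """Render non-negative interval parts as a full ISO 8601 duration."""
    # Months carry into years.
    carry_years, months = divmod(months, 12)
    years += carry_years
    # Seconds carry into minutes, and minutes carry into hours.
    carry_minutes, seconds = divmod(seconds, 60)
    minutes += int(carry_minutes)
    carry_hours, minutes = divmod(minutes, 60)
    hours += carry_hours
    # Hours do NOT carry into days, and days do NOT carry into months.
    # Assumption: whole seconds are printed without a decimal point.
    if float(seconds).is_integer():
        seconds_text = str(int(seconds))
    else:
        seconds_text = str(seconds)
    # Every unit is always present, even when it is zero.
    return f"P{years}Y{months}M{days}DT{hours}H{minutes}M{seconds_text}S"


print(format_interval(days=37))
print(format_interval(months=14, hours=25, minutes=61, seconds=75))
```

The first call prints `P0Y0M37DT0H0M0S` (37 days stay as days), and the second prints `P1Y2M0DT26H2M15S` (14 months become 1 year 2 months, while 26 hours are not turned into a day).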

As a bonus, this PR also fixes some bugs in the constraints API tests that were preventing the pipeline from passing.

Finally, there

Checklist

My pull request has a descriptive title (not a vague title like Update index.md).
My pull request targets the master branch of the repository.
My commit messages follow best practices.
My code follows the established code style of the repository.
I added tests for the changes I made (if applicable).
I added or updated documentation (if applicable).
I tried running the project locally and verified that there are no visible errors.

Developer Certificate of Origin

Developer Certificate of Origin
Version 1.1

Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
1 Letterman Drive
Suite D4700
San Francisco, CA, 94129

Everyone is permitted to copy and distribute verbatim copies of this
license document, but changing it is not allowed.


Developer's Certificate of Origin 1.1

By making a contribution to this project, I certify that:

(a) The contribution was created in whole or in part by me and I
    have the right to submit it under the open source license
    indicated in the file; or

(b) The contribution is based upon previous work that, to the best
    of my knowledge, is covered under an appropriate open source
    license and I have the right under that license to submit that
    work with modifications, whether created in whole or in part
    by me, under the same open source license (unless I am
    permitted to submit under a different license), as indicated
    in the file; or

(c) The contribution was provided directly to me by some other
    person who certified (a), (b) or (c) and I have not modified
    it.

(d) I understand and agree that this project and the contribution
    are public and that a record of the contribution (including all
    personal information I submit with it, including my sign-off) is
    maintained indefinitely and may be redistributed consistent with
    this project or the open source license(s) involved.

codecov-commenter · 2022-02-15T05:52:24Z

Codecov Report

Merging #1067 (c93359f) into master (a7591cb) will increase coverage by 0.06%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master    #1067      +/-   ##
==========================================
+ Coverage   92.70%   92.77%   +0.06%     
==========================================
  Files         108      109       +1     
  Lines        3852     3888      +36     
==========================================
+ Hits         3571     3607      +36     
  Misses        281      281

Flag	Coverage Δ
pytest-backend	`92.77% <100.00%> (+0.06%)`	⬆️

Impacted Files	Coverage Δ
db/columns/base.py	`92.06% <ø> (ø)`
db/columns/operations/select.py	`100.00% <100.00%> (ø)`
db/types/__init__.py	`100.00% <100.00%> (ø)`
db/types/exceptions.py	`100.00% <100.00%> (ø)`
db/types/interval.py	`100.00% <100.00%> (ø)`
db/types/operations/cast.py	`100.00% <100.00%> (ø)`
mathesar/api/db/viewsets/columns.py	`91.39% <100.00%> (+0.09%)`	⬆️
mathesar/api/serializers/columns.py	`98.79% <100.00%> (+0.01%)`	⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a7591cb...c93359f. Read the comment docs.

kgodey

Looks good to me overall, I added a couple of comments. @mathemancer, please resolve before merge.

I also think @silentninja should review this since he more recently worked on date and time types.

silentninja

Looks good to me. Nice work @mathemancer, I learned a few SQLAlchemy tricks, thanks. Can you add the interval standard as ISO 8601 to the Engineering decision?

silentninja · 2022-02-16T00:35:05Z

mathesar/api/db/viewsets/columns.py

@@ -69,19 +70,13 @@ def create(self, request, table_pk=None):
                    )
                else:
                    raise database_base_api_exceptions.ProgrammingAPIException(e)
-            except TypeError as e:


Why have you removed capturing this exception?

Ah, good catch. I changed that so I could get more debugging output at some point and forgot to revert the change.

silentninja · 2022-02-16T00:41:55Z

mathesar/api/serializers/columns.py

@@ -29,6 +29,7 @@ class TypeOptionSerializer(MathesarErrorMessageMixin, serializers.Serializer):
    length = serializers.IntegerField(required=False)
    precision = serializers.IntegerField(required=False)
    scale = serializers.IntegerField(required=False)
+    fields = serializers.CharField(required=False)


It would be helpful to add this api change to the PR description.

silentninja · 2022-02-16T01:12:57Z

db/types/interval.py

+                    f'fields "{self.impl.fields}" is not in {all_fields}'
+                )
+
+    def column_expression(self, col):


I don't have any issues with using column_expression, I am just curious to know if you have thought about setting a local intervalstyle instead.

The problem with that is we need to avoid psycopg2 interpreting the result as an actual INTERVAL. The column expression casts to TEXT at the DB layer, which then gets picked up by psycopg2 as a Python str. The problem with letting psycopg2 pick up INTERVALs otherwise is that it uses a Python timedelta to represent them, but timedelta makes some different choices than PostgreSQL (e.g., assuming 30 days can be accumulated into 1 month), and the conversion is therefore inaccurate. This would result in erroneous info being passed to the UI, and in fact make viewing some values (e.g., 37 days) impossible.

intervalstyle doesn't change the actual return type, and so doesn't really solve the problem by itself. We'd still need to cast to text using a column_expression. Given that, I opted to handle the formatting ourselves, since the PostgreSQL iso_8601 style is slightly out-of-spec, and made a few choices that don't work that well for our use case (IMO). It also makes the formatting that would happen completely visible, rather than hidden in the PostgreSQL docs.
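A small standard-library sketch of why a timedelta cannot faithfully represent these values. The helper `to_timedelta_lossy` is hypothetical and only mimics a months-as-30-days conversion for illustration; it is not psycopg2's actual code.

```python
from datetime import timedelta


def to_timedelta_lossy(months=0, days=0, hours=0):
    """Mimic a conversion that treats each month as 30 days (assumption)."""
    return timedelta(days=months * 30 + days, hours=hours)


# Two different intervals collapse into the same timedelta.
one_month_seven_days = to_timedelta_lossy(months=1, days=7)
thirty_seven_days = to_timedelta_lossy(days=37)
print(one_month_seven_days == thirty_seven_days)

# timedelta also folds hours into days, which PostgreSQL does not do.
twenty_five_hours = to_timedelta_lossy(hours=25)
print(twenty_five_hours.days, twenty_five_hours.seconds)
```

The first line printed is `True`: once converted, "1 month 7 days" and "37 days" cannot be told apart. The second line is `1 3600`, showing that 25 hours has been split into one day plus one hour. Casting to TEXT at the DB layer sidesteps both problems because the value never becomes a timedelta.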

That makes sense, thanks for the explanation!

mathemancer · 2022-02-16T06:03:05Z

Looks good to me. Nice work @mathemancer, I learned a few SQLAlchemy tricks, thanks. Can you add the interval standard as ISO 8601 to the Engineering decision?

I'm currently writing up a wiki page that will cover the spec. It's a little more involved.

mathemancer added 26 commits January 28, 2022 00:51

ISO8601 interval support

a4b3dec

add initial interval type tests

6b9613a

remaining interval tests

1fab711

add args and validation to custom interval type

7139599

add tests for fields and precision setting

d3a0db5

clean up formatting, remove extraneous print

11f1e74

double cast default values so SQLAlchemy picks up type

b2a3189

If SQLAlchemy doesn't have this hint, it uses default psycopg2 behavior, which loses information in the case of intervals. Performance drag, but rarely used.

update casting tests with new interval output strings

ef44272

fix API test to reflect new interval output

0395dc1

fix bug preventing certain type casts

037da48

handle type options at db layer for interval

46603b5

handle type options (no errors) at API layer

c1f5d37

Merge branch 'master' into interval_type

6a5d1c3

Merge branch 'master' into interval_type

9e74573

compile cast expr in tests, pass as object otherwise

3bcb896

add new type parameter exceptions to viewsets

4bb431c

add tests for creation of interval cols with options

3036187

fix flake8 problem

ff6ba8a

specify JSON content type for constraints API tests

faa16ba

break cache to try getting tests to run

ade460b

add verbose output, reduce coverage for debugging (REVERT ME)

c658d3e

run all tests in pipeline

72bed82

remove engine echo from tests for slightly reduced output

fa03722

add more debugging output

b87736b

make constraint API tests order independent

76e2f48

... hopefully

Merge branch 'master' into interval_type

e2995e4

revert debugging github workflow changes

0ad0435

mathemancer marked this pull request as ready for review February 15, 2022 06:39

mathemancer requested a review from a team February 15, 2022 06:40

github-actions bot requested review from dmos62, eito-fis, kgodey, pavish and silentninja February 15, 2022 06:40

kgodey assigned kgodey and silentninja Feb 15, 2022

kgodey approved these changes Feb 15, 2022

db/filters/operations/apply.py (outdated, resolved)

db/tests/types/test_interval.py (resolved)

mathemancer and others added 2 commits February 16, 2022 00:00

add test for interval default setting/getting

d0c60ba

Also remove a bit of cruft.

Merge branch 'master' into interval_type

f1e149a

kgodey removed their assignment Feb 15, 2022

kgodey added the "pr-status: review" label Feb 15, 2022

silentninja requested changes Feb 16, 2022

silentninja assigned mathemancer and unassigned silentninja Feb 16, 2022

silentninja added the "pr-status: revision" label and removed the "pr-status: review" label Feb 16, 2022

mathemancer added 2 commits February 16, 2022 12:49

add accidentally removed exception handling statement

7c45f70

Merge branch 'interval_type' of github.com:centerofci/mathesar into i…

c93359f

…nterval_type

mathemancer requested a review from silentninja February 16, 2022 06:07

silentninja enabled auto-merge February 16, 2022 08:03

silentninja approved these changes Feb 16, 2022

silentninja merged commit cbe1f94 into master Feb 16, 2022

silentninja deleted the interval_type branch February 16, 2022 08:04

silentninja mentioned this pull request Feb 17, 2022

Add initial spec for handling dates and times mathesar-foundation/mathesar-wiki#35

Merged
