Add initial spec for handling dates and times #35
Conversation
Everything seems sound. I've no comments.
Quick question for the front end devs: Right now, I have the spec specifying using 'AD' or 'BC' to denote the era in all cases for consistency. However, it's often omitted for recent dates (e.g., the last few hundred years or so). My thinking is that the API should be consistent there, and the UI can omit the era information if it wants. Does this work for you all?

Edit: Also, I stuck the era at the end of both date and date/time combo strings. My thinking on that is that this will be easiest to strip off if it's not wanted. On the other hand, it's slightly inconsistent, since it means the string from the
1. Users should be able to interact with dates, times, datetimes, and time durations in Mathesar in the most human-friendly way possible
2. The UI should be extremely forgiving w.r.t. date/time input, and also display dates and times in a human-friendly way.
3. The API should be extremely forgiving w.r.t. date/time input, but should return dates, times, and time durations in a more programmatic, i.e., consistent way than the UI.
4. The UI and API should be consistent in the following way: If the input string is understood by both UI and API, they should always result in the _same date/time_ being stored, regardless of the input method.
> If the input string is understood by both UI and API, they should always result in the same date/time being stored, regardless of the input method.
Agreed!
I want to highlight a related conversation in which I said:
> Some situations may have edge cases which create difficulty in making the client's results consistent with the server's results.

> checking if something is an ISO datetime shouldn't be that hard
I know we're not critiquing implementation yet, but I just want to offer a thought for people to consider early-on during implementation...
If we had a Node backend, reusing the same parsing code across client and server would be trivial, and I'd be inclined to push quite a bit of validation logic to the front end, for date/time and other stuff too. However, with our front end and back end using different languages, it's not trivial. I'm wary of implementing separate parsers (or even simple validators) in python and TypeScript, because as this document demonstrates, there are edge cases to consider.
The first thing I'd do here is to look for a library which implements a parser that we can run on both the client and server. I haven't spent time to research this, but here's an example of a small library I've played with before (not related to dates/times) that runs the same code in Python and a browser (via wasm). If such a library already exists for parsing ISO 8601 and/or RFC 3339, then I'd be inclined to pursue it a bit. If we did decide to use a 3rd-party library, then all these specs would become subject to its implementation details -- which is why I raise this point now, instead of during our own implementation phase.
@mathemancer, have you looked for any 3rd-party libs that already do this parsing?
If a suitable 3rd-party library doesn't already exist, then I'm a bit torn about how I'd want to proceed. Personally, I'd be quite interested in writing one in Rust, but that's likely more work than we want to bite off right now. I would accept an approach where we duplicate this logic in Python and TypeScript, if others are really set on it. But if everything were up to me, I'd probably begin by delegating full responsibility to the back-end for all the date/time parsing.
A couple of problems arise if we try to do parsing in Python. Notably, parsing into Python date/time types will break for:

- years with more than 4 digits
- years B.C.E.
- durations given with at least 30 days (Python's `timedelta` always assumes 30 days is a month)
- durations given with at least 24 hours (Python's `timedelta` always assumes 24 hours is a day; not true around DST changes)
A lot of what I'm doing right now is avoiding letting python have anything to do with date and/or time parsing.
So, I'm currently doing all parsing in PostgreSQL. It's more sophisticated, and also guarantees consistency with what's actually being stored for obvious reasons.
> A lot of what I'm doing right now is avoiding letting python have anything to do with date and/or time parsing
Ok I'm confused then. Can you help me understand? For example, if the user imports a CSV file containing a string that we'd like to parse into a date, where would the code exist that would do that parsing?
Arrow, a Python library for parsing datetimes, is quite similar to moment.js, so we could consider using it if necessary. We are already using it to validate timestamp-related display options. But I am not sure we should be parsing on the service layer, as most of the Python libraries would convert to a Python-based `timedelta`, which has issues with precision.
> I'm currently doing all parsing in PostgreSQL
Ok I didn't see this edit you made when I wrote my previous comment. Have you considered using something like plv8 and writing all the parsing logic in JavaScript? I've never done this but I've heard good things about it.
What language are you currently planning to use to write the parsing logic?
I don't think the UI should understand date/time while the user is entering it. The only purpose in this case would be to perform validations, and I think validations should always be done on the server for this.
- We intend to allow the user to enter any string which may be a date or time, e.g. 'yesterday', 'an hour ago'.
- We would probably want to extend these in the future (with more such common strings, or even additional languages).
- We can only do that without conflict if we do it in one place: either the server or the client, and it makes most sense to do it on the server.
After the user enters it, the UI will absolutely need to understand valid date/time strings. This would be possible as the server will always respond with a well-defined format as mentioned further down in the spec. This can be used for any operation the UI needs as well as for formatting the strings based on display options.
> What language are you currently planning to use to write the parsing logic?
We are not writing any parsing logic ourselves on the backend; PostgreSQL has excellent parsing logic built in.
> Ok I didn't see this edit you made when I wrote my previous comment. Have you considered using something like plv8 and writing all the parsing logic in JavaScript? I've never done this but I've heard good things about it.

Anything custom would need to be done in PL/pgSQL (or plain SQL). The reason is that we can't count on being able to install extensions. Some users may be using managed PostgreSQL instances (e.g., Amazon's RDS) where installing extensions isn't possible.
```python
f'{year}-{month}-{day} {era}'
```
Here, `year` will have a minimum of 4 digits, using leading zeroes for years like 0042 BC. Years with more than 4 digits have the correct number of digits. `month` is a 2-digit integer, as is `day` (with leading zeroes as needed). `era` will always be included, and will be either `BC` or `AD` as appropriate.
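To make the padding and era rules concrete, here is a small, purely illustrative Python sketch; it is not how Mathesar produces these strings (PostgreSQL does), and the function name is invented for the example:

```python
def format_canonical_date(year: int, month: int, day: int, era: str) -> str:
    # Illustration only: year padded to at least 4 digits, month/day to 2,
    # with the era ('BC' or 'AD') always appended at the end.
    return f'{year:04d}-{month:02d}-{day:02d} {era}'

assert format_canonical_date(42, 3, 15, 'BC') == '0042-03-15 BC'
assert format_canonical_date(2022, 1, 7, 'AD') == '2022-01-07 AD'
assert format_canonical_date(12345, 12, 31, 'AD') == '12345-12-31 AD'  # >4 digits preserved
```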
> Here, `year` will have a minimum of 4 digits

I'm happy to see that we'll be Y10k compliant 😁
#### Regarding relative times

There are some relative dates / times, e.g. "tomorrow", which make sense to store in a date column. These will _not_ be stored in a "relative" way, though. They'll be stored as a static date or time (one day later than the current day, in the example). As for durations, there's some weird overlap between an interval "2 days ago" and a specific time "2 days ago". We'll choose whether to store the duration or a static time based on the column type.
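As a hedged sketch of what choosing "based on the column type" could look like when delegating to PostgreSQL (the function, the psycopg2-style connection, and the casting approach are illustrative assumptions, not Mathesar's actual code):

```python
def parse_for_column(conn, raw: str, column_type: str) -> str:
    """Hand the raw user string to PostgreSQL and let the target column's
    type decide how it's interpreted. Casting the result back to text
    avoids pulling the value into Python date/time types."""
    # column_type comes from trusted column metadata, e.g. 'date' or 'interval'
    with conn.cursor() as cur:
        cur.execute(f"SELECT CAST(%s AS {column_type})::text", (raw,))
        return cur.fetchone()[0]

# 'tomorrow' in a date column is resolved to a static date at parse time:
#     parse_for_column(conn, 'tomorrow', 'date')
# '2 days ago' in an interval column is kept as a (negative) duration:
#     parse_for_column(conn, '2 days ago', 'interval')
```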
> We'll choose whether to store the duration or a static time based on the column type.
This is very cool!
I imagine we might eventually want to support date/time parsing for contexts in which we don't have a column type. For example, say I want to create a filter condition that selects all records with `date_added` within the past 2 days. In that case, I think we'd want to honor the fact that the user entered a duration. That ought to be obvious during implementation, but I just wanted to point this out for others to consider.
- `'1 year 1 month, 1 day 1 hour 1 minute 1.1 seconds'` maps to `P1Y1M1DT1H1M1.1S`
- `'0 years 13 months 31 days 23 hours 60 minutes 61.1 seconds'` maps to `P1Y1M31DT24H1M1.1S`
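For anyone who wants to check these mappings against PostgreSQL directly, here is a hedged sketch (not part of the spec; it assumes a psycopg2-style connection) that relies on PostgreSQL's `IntervalStyle` setting to emit intervals in ISO 8601 form:

```python
def to_iso8601_duration(conn, raw: str) -> str:
    """Parse a duration string with PostgreSQL and return its ISO 8601 form."""
    with conn.cursor() as cur:
        cur.execute("SET intervalstyle = 'iso_8601'")
        cur.execute("SELECT CAST(%s AS interval)::text", (raw,))
        return cur.fetchone()[0]

# Per the examples above, we'd expect:
#     to_iso8601_duration(conn, '1 year 1 month, 1 day 1 hour 1 minute 1.1 seconds')
#     == 'P1Y1M1DT1H1M1.1S'
```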
These are helpful examples of our expected input/output.
From reading these two examples, I infer:
- Commas do not matter
- Pluralization of words does not matter
Is this correct?
It might also be nice to strip the string `and` from the input, so that users can enter `1 year and 1 month`.
Yes, that's correct (at least for the DB-layer parsing). As for stripping "and", we might be able to do something for single inserts, but I don't think it would be very simple (with the default `INTERVAL` type) for `COPY`-based bulk imports. We could probably add something as a feature at some point, but the implementation would be quite difficult.
Sounds good!
Here's an additional edge case to consider too, which I don't fully understand: https://en.wikipedia.org/wiki/Year_zero#ISO_8601. Maybe not worth thinking about more, though I'm not sure.
Here's another thing I just realized might be missing from this spec (though perhaps the scope of the spec is narrower than I'm imagining). In contexts where we accept a date, the user should be able to enter a string like "1/7/2022", right? What about "6 Aug, 1870"? What about "7/2" (inferring the current year)? There are lots of variations here obviously, and a good deal of ambiguity too, which requires knowledge about locale to resolve. For example, "1/7/2022" might become "2022-01-07" in the USA but "2022-07-01" in many European countries. These situations are tricky. Should this spec deal with those situations too?
## Overall input flow

If the UI is able to parse a date, time, duration, etc., it should reformat it into the appropriate canonical form described below and send that as the input to be stored to the API. If it's not able to parse the input, it should send the raw input string to the API and let the service layer / database try to parse the string. This helps satisfy goal number 5. If the service layer is able to parse the string, then store it. Otherwise, raise a validation error.
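As a rough illustration of the data-layer half of this flow, here is a minimal sketch assuming psycopg2 as the driver (none of the names below come from the Mathesar codebase): the raw string is cast to the column's type, and a failed cast surfaces as a validation error.

```python
import psycopg2
from psycopg2 import sql

def store_raw_value(conn, table: str, column: str, column_type: str, raw: str):
    """Let PostgreSQL parse the raw string; translate a parse failure
    into a validation error for the API layer."""
    query = sql.SQL("INSERT INTO {table} ({column}) VALUES (CAST(%s AS {type_}))").format(
        table=sql.Identifier(table),
        column=sql.Identifier(column),
        type_=sql.SQL(column_type),  # trusted type name from column metadata
    )
    try:
        with conn.cursor() as cur:
            cur.execute(query, (raw,))
    except psycopg2.DataError as err:
        raise ValueError(f"Could not parse {raw!r} as {column_type}") from err
```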
> let the service layer / database try to parse the string. This helps satisfy goal number 5. If the service layer is able to parse the string, then store it. Otherwise, raise a validation error.

I am not sure we should attempt to do the parsing on the service layer, as most of the Python libraries would convert it to a Python-based `timedelta`, which has issues with precision.
By "service layer", I'm mostly meaning the database. Perhaps I should be clearer there. We're going to strictly avoid using any python date / time types.
@mathemancer Looks great overall. Two changes requested:
(1) I think we need documentation (or a link to documentation) about the formats that PostgreSQL supports while parsing dates, times, and durations so that we can make sure the frontend is consistent with it. Ideally, what frontend library we use should also be part of this spec but we can get to that once we have the requirements. It should also help resolve @seancolsen's question here.
(2) See my comment on line 21.
## Overall input flow

If the UI is able to parse a date, time, duration, etc., it should reformat it into the appropriate canonical form described below and send that as the input to be stored to the API. If it's not able to parse the input, it should send the raw input string to the API and let the service layer / database try to parse the string. This helps satisfy goal number 5. If the service layer is able to parse the string, then store it. Otherwise, raise a validation error.
I don't think the frontend should be compelled to send raw input to the API if it can't parse it; this seems unnecessary given goal 5. Someone might enter "pizza" in a datetime field; we don't need the backend to know that's not a datetime.
After reading @pavish's comment and thinking about it more, it seems like the frontend wants to rely entirely on the backend for validation. I think this is fine for now, which makes my comment above irrelevant.
I do think we should eventually look into whether we can do validation on the frontend (to make the UX snappier), but we can update this spec if/when it comes to that.
My current thinking is to do the "default" parsing in the DB (and therefore for the API), and then the front end could extend that bit by bit, adding validation iteratively. But if we assume that the DB will need to attempt parsing whatever it's handed from the beginning, it takes pressure off adding any intelligence in the front end for a while.
Sounds good.
2. The UI should be extremely forgiving w.r.t. date/time input, and also display dates and times in a human-friendly way.
3. The API should be extremely forgiving w.r.t. date/time input, but should return dates, times, and time durations in a more programmatic, i.e., consistent way than the UI.
4. The UI and API should be consistent in the following way: If the input string is understood by both UI and API, they should always result in the _same date/time_ being stored, regardless of the input method.
5. The dates/times understood by the API should be a subset of those understood by the UI. That is, there should be no dates/times/time durations that are understood by the API, but which can't be entered through the UI.
I don't think we need to restrict what is accepted by the API based on the UI understanding user input.

> understood by the UI

The UI only needs to understand dates & times after the user enters and saves them, in which case the backend will provide them in a well-defined format. See my comment above.

The UI needs to understand date and time, but it does not need to understand user input.
I could definitely be convinced to keep the UI layer pretty "thin" for this. That would certainly make keeping things consistent easier.
Ah, @pavish I just realized the source of confusion. By "understood by the UI", I don't mean it needs to be handled by code at the UI layer. I mean that, from a user perspective, the UI should accept all input that the API accepts; the UI shouldn't invalidate a string that the API could have accepted. I did want to leave the possibility open that the UI could have some fancy parsing that would allow accepting more input styles, and then parse them into RFC 3339 compliant strings to submit to the API.
I'll think about how to word that more clearly.
@mathemancer I have some concerns about goal 5 which I've mentioned in different comments. Apart from that, everything looks good to me.
These all parse correctly using the default PostgreSQL handling except "7/2", which is just too ambiguous. As far as the locale-sensitive ones, they'll be parsed according to the database locale (which is the
@mathemancer The fact that we're leaning on parsing functionality that's already built into Postgres is a pretty big piece of the picture which I completely failed to see upon first reading of the spec. Thanks for clarifying. With that additional knowledge, my opinion to delegate validation and parsing responsibility to the back-end is further strengthened. And it seems like we have a rough consensus on that now too.
I'm curious to play with this a bit.
No special SQL is necessary, you can insert the values as-is. See https://www.db-fiddle.com/f/gijuboYyi7k5fRX5ABnwba/0
I spent a little time playing with PostgreSQL's parsing using some ad-hoc queries. Comparing it to the date parsing in Google Sheets, I only found a few inconsistencies:

We'll need to be aware of these limitations and document them. For example, if a user exports a spreadsheet to CSV which contains the date
I don't think we need to do this – different software handles dates and times in different ways, and I don't think it's feasible for us to keep track of them all. We also don't necessarily want to be consistent, see Scientists rename human genes to stop Microsoft Excel from misreading them as dates. I think we should assume the user is responsible for the format they use when exporting their CSVs and not try to reconcile different parsers. Also, the default representation of dates in Google Sheets is fairly unambiguous. I tried formatting a column in Google Sheets with values
@kgodey Sorry to harp on this a bit. My opinions are diminishing in strength as we get further into the weeds, but I want to offer a few clarifying points. I'm not arguing for any modifications to our strategy. I'm just trying to share some information with others as food-for-thought while considering our date parsing.
No apologies are necessary! I'm glad that you clarified what you were saying.
You're correct, I somehow missed that they were stored as dates during my initial Google Sheets testing.
For imports, it does matter how our date parsing works, since we suggest a data type for a column based on whether we can parse it.
I thought you were suggesting that we do an exhaustive check of how our parsing differs from common spreadsheet tools and document every case; I don't think that's worth putting time into (and it doesn't seem like that's what you were actually suggesting).
Agreed. Mathesar won't let you set columns with that data to a date and that's good.
I do too. This is where a datetime parsing library on the frontend could come in handy: we could parse user input into ISO format datetimes. The tricky thing here would be to make sure that the frontend parsing is a superset of the backend parsing, or be okay with different parsing on the frontend/backend. I would like to explore frontend parsing after the alpha release but, as you've pointed out, it's good enough for now.

Agreed. At some point (well after alpha), we could even check with the user how they want us to process 2-digit dates when we detect them.
Good point. Fortunately strings like
I'd still rather handle this on the back end so that the parsing logic works identically for user input and imports. We could add a very thin layer of input cleaning before the string is sent to Postgres. Within that layer, we'd do some simple regex transformation using very strict patterns, and we'd only apply the transform when the column type is time-related:

```python
import re

def expand_shorthand_time(input: str) -> str:
    # Expand shorthand times like '3pm' into '3:00pm' so Postgres can parse them.
    return re.sub(re.compile(r'(?i)\b(1[12]|0?\d)(\s*[ap]m)\b'), r'\1:00\2', input)

def test_expand_shorthand_time():
    e = expand_shorthand_time
    # Matches
    assert e('3pm') == '3:00pm'
    assert e('11pm') == '11:00pm'
    assert e('12pm') == '12:00pm'
    assert e('1 AM') == '1:00 AM'
    assert e('07 am') == '07:00 am'
    assert e('from 11 am to 12 pm and beyond') == 'from 11:00 am to 12:00 pm and beyond'
    # Non-matches
    assert e('111 pm') == '111 pm'      # too many digits
    assert e('13 pm') == '13 pm'        # number > 12
    assert e('3 amigos') == '3 amigos'  # no word boundary
```
Good idea. We could also use some regex logic to detect these kinds of dates and issue a warning to the user explaining the situation.
Unfortunately we can't do this for imports under our current architecture. We don't read the CSV in Python at all; we just pass it directly to Postgres (which has built-in CSV import functionality). We could do it for individually entered data, either on the frontend or the backend. I think it would be okay for us to support more parsing formats on the frontend than during import; it's more important that people be able to add data easily/quickly when they're doing it manually.
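For context, here is a minimal sketch of the import path described above, assuming psycopg2 (the table name and COPY options are illustrative): the file is streamed straight into PostgreSQL's `COPY`, so PostgreSQL, not Python, parses any dates it contains.

```python
def import_csv(conn, table_name: str, csv_path: str):
    """Stream a CSV file directly into PostgreSQL via COPY ... FROM STDIN."""
    # table_name is trusted metadata here, not user input.
    copy_stmt = f"COPY {table_name} FROM STDIN WITH (FORMAT csv, HEADER true)"
    with conn.cursor() as cur, open(csv_path, 'r') as f:
        cur.copy_expert(copy_stmt, f)
```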
Ahh I see. That makes sense.
Clarify where parsing happens, add more examples
@mathemancer I'll review in more detail tomorrow (when I'm actually working), but I think it would make sense to document (probably in an appendix-type place) the parsing differences between Google Sheets (or spreadsheets in general) that @seancolsen found.
If we want to extend the set of understandable date/time input strings in the future, then the UI should parse the more exotic input strings into the appropriate canonical form described below and send that as the input to be stored to the API. Care should be taken to ensure that _if the UI parses the string_, then it's either
- a string which wouldn't be handled by the database parsing, or
- the parsing results in an identical value being stored as the one that would have resulted from sending the raw string to the API.

If the UI is not able to parse the input, it should still send the raw input string to the API and let the data layer try to parse the string. This helps satisfy goal number 5. If the data layer is able to parse the string, then store it. Otherwise, raise a validation error.
> the UI should parse the more exotic input strings into the appropriate canonical form
This statement is at odds with my understanding of the resolution of our discussion. With the "6 pm" case as an example, I don't want to fully parse that into canonical form. I'd rather only do the minimum amount of work necessary to convert it into a format that Postgres can understand -- in this case "6:00 pm" because that's much simpler and thus has less risk of violating goal 4.
I suggest replacing lines 23-27 with:
We have identified strings like "6 pm" as having a format which Postgres cannot parse but which we would like to accept as user input. There may be other such formats too. For the sake of task prioritization, we will not attempt to accept such strings before the alpha release. If we want to extend the set of understandable date/time input strings in the future then we will evaluate the best implementation on a case-by-case basis. For the "6 pm" format, we would likely implement a thin string transformation in the UI layer, converting the string to "6:00 pm" (without fully parsing it into canonical form). With this implementation, "6 pm" would only be accepted in the UI, and not through imports, which would be okay.
A complete rewrite of lines 23-27 seems unnecessary, I think the detail is useful.
I do agree that the UI shouldn't need to parse it into the canonical representation, just an API-understood representation.
I also think the "6 pm" example would be good to include in addition.
This is a good point. My intent with the "canonical representation" is more of a restriction on API output, and also a sort of "guaranteed input" that conforms to a well-known RFC, and by extension should be possible to achieve using libraries to avoid custom work in the UI layer. That being the case, I'll reword this to make the "canonical form" a non-restrictive option for input. I was mostly hoping to avoid the UI implementer needing to read any PostgreSQL docs to figure out what forms would be acceptable.
I feel quite confident that we're all on the same page about everything now, so I'm a little hesitant to continue with critique. For the sake of clarity in the spec, though, I made a small suggestion.
@mathemancer See comments below. I reiterate that I think it would make sense to document (probably in an appendix-type place) the parsing differences between Google Sheets (or spreadsheets in general) that @seancolsen found and posted in this comment.
Looks good overall, I don't need to re-review before merge.
This PR adds a spec to the wiki showing how we'll deal with dates, times, date/time combinations, and time durations.