should dt-* parsing do date and time parsing for all values? #12
My answer to the question in the title would be Yes. I feel like As I wrote in #8 and on IRC (emphasis added that is implicit to this issue):
|
Here is a real-world example we ran into today on http://indieweb.org/events.
<span class="h-event vevent">
<span class="dt-start dtstart">
<span class="value" title="August 1, 2018">2018-08-01</span>
<span class="value" title="20:30">20:30<span style="display: none;">-5:00</span></span>
</span>–
<span class="dt-end dtend">22:00<span style="display: none;">-5:00</span></span> (-5:00 <abbr>UTC</abbr>):
<span class="p-content">An informal online get together for people new to blogging, building websites, or using IndieWeb plugins on WordPress.</span>
</span>
php-mf2 parse:
{
"items": [
{
"type": [
"h-event"
],
"properties": {
"content": [
"An informal online get together for people new to blogging, building websites, or using IndieWeb plugins on WordPress."
],
"start": [
"2018-08-01 20:30-0500"
],
"end": [
"22:00-5:00-0500"
]
}
}
],
"rels": {},
"rel-urls": {},
"debug": {
"package": "https://packagist.org/packages/mf2/mf2",
"source": "https://github.com/indieweb/php-mf2",
"version": "v0.4.5",
"note": [
"This output was generated from the php-mf2 library available at https://github.com/indieweb/php-mf2",
"Please file any issues with the parser at https://github.com/indieweb/php-mf2/issues",
"Using the Masterminds HTML5 parser"
]
}
}
mf2py parse:
{
"rels": {},
"items": [
{
"type": [
"h-event"
],
"properties": {
"content": [
"An informal online get together for people new to blogging, building websites, or using IndieWeb plugins on WordPress."
],
"start": [
"2018-08-01"
],
"end": [
"22:00-5:00"
]
}
}
],
"rel-urls": {},
"debug": {
"source": "https://github.com/microformats/mf2py",
"version": "1.1.1",
"markup parser": "html5lib",
"description": "mf2py - microformats2 parser for python"
}
} |
I guess this makes sense. VCP and the HTML rules for the |
After having microformats/tests#29 confirmed and resolved, the lack of this being in the standard is the only thing preventing the Rust parser from being fully compliant, thus enabling this: #12 (comment) (Originally published at: https://jacky.wtf/2022/6/yy8Z) |
I found some more edge cases that this spec update should cover:
<div class="h-event">
<span class="dt-start">
<span class="value">2022-07-05T17:30-08:00</span>
</span>
</div>
This "value" is used as-is, with no normalization to remove the "T" or the colon in the timezone offset:
{
"items": [
{
"type": [
"h-event"
],
"properties": {
"start": [
"2022-07-05T17:30-08:00"
],
"name": [
"2022-07-05T17:30-08:00"
]
}
}
]
}
Similarly for:
<div class="h-event">
<span class="dt-start">
<span class="value">2022-07-05T17:30</span>
</span>
</div>
{
"items": [
{
"type": [
"h-event"
],
"properties": {
"start": [
"2022-07-05T17:30"
],
"name": [
"2022-07-05T17:30"
]
}
}
]
} |
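For illustration, here is a minimal sketch of the kind of normalization being asked for in the edge cases above. It assumes the target form is the one php-mf2 emits earlier in this thread (a space instead of "T", and a four-digit timezone offset without the colon). `normalize_datetime` is a hypothetical helper, not a function from php-mf2, mf2py, or the Rust parser.

```python
import re

# Hypothetical pattern: a date, a "T" or space separator, a time, and an
# optional "Z" or numeric offset. This is an assumption for illustration,
# not the wording of the microformats2 parsing spec.
_DATETIME_RE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2})"
    r"[Tt ]"
    r"(?P<time>\d{1,2}:\d{2}(?::\d{2})?)"
    r"(?:(?P<z>Z)|(?P<sign>[+-])(?P<hh>\d{1,2}):?(?P<mm>\d{2}))?$"
)


def normalize_datetime(value: str) -> str:
    """Return value with "T" replaced by a space and the offset colon removed.

    Anything that does not look like a full datetime is returned unchanged,
    so no information is lost.
    """
    match = _DATETIME_RE.match(value.strip())
    if match is None:
        return value
    result = f"{match['date']} {match['time']}"
    if match["z"]:
        result += "Z"
    elif match["sign"]:
        # Pad a single-digit hour ("-5:00") to two digits ("-0500").
        result += f"{match['sign']}{int(match['hh']):02d}{match['mm']}"
    return result


print(normalize_datetime("2022-07-05T17:30-08:00"))
print(normalize_datetime("2022-07-05T17:30"))
```

Under these assumptions the first edge case would come out as `2022-07-05 17:30-0800` and the second as `2022-07-05 17:30`. Strings the pattern does not recognize pass through untouched.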
I mentioned before how this is an upstream blocker to getting the Rust library fully compatible. That's changed, but normalization would simplify parsing (and testing) date values, so I'm throwing my vote in favor of it and am curious to hear whether anyone else is in favor as well. (Originally published at: https://jacky.wtf/2023/10/evyZ) |
I'm also in favour of normalizing date values everywhere, be it VCP or not. Parsers already have to perform normalization sometimes, so it adds no appreciable complexity to parsers, while simplifying things for consumers of the output of parsers. My own parser already does this by default, for what it's worth. |
While we're at it it might be worthwhile to drop
It's a pretty straightforward application of Postel's law, with no information lost, and no new formats added. |
Currently in http://microformats.org/wiki/microformats2-parsing#parsing_a_dt-_property, special date and time parsing is only done as part of step 1, for VCP handling.
The proposal is to move (and thus extract from VCP and inline into mf2 parsing) the "date and time parsing rules" mentioned in step 1 to after all the value retrieval is done, before returning a value.
This would be a larger fix that should also incorporate accepting the proposals in issues #4 and #8.
I don't have a specific real world example for this particular proposal, thus the issue title is a question. All feedback welcome, and especially real world examples that would be helped by this beyond the smaller fixes noted in #4 and #8.
Feedback explicitly requested from: @sknebel @gRegorLove @Zegnat. Thanks!
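A rough sketch of the reordering proposed above, for discussion only. It assumes the value retrieval steps have already produced three candidates (the VCP result, an attribute value such as `datetime` or `title`, and the element's text), and that normalization is a separate function applied once at the end. `parse_dt_value` and `toy_normalize` are hypothetical names, and the toy normalizer is a placeholder for the real date and time parsing rules.

```python
from typing import Callable, Optional


def toy_normalize(value: str) -> str:
    """Placeholder normalizer: only swaps the first "T" for a space."""
    return value.replace("T", " ", 1)


def parse_dt_value(
    vcp_value: Optional[str],
    attr_value: Optional[str],
    text_value: str,
    normalize: Callable[[str], str] = toy_normalize,
) -> str:
    """Pick the raw value in priority order, then normalize it once.

    Today the date and time rules run only inside the VCP branch. In this
    sketch they run on whatever value was retrieved, just before returning.
    """
    if vcp_value is not None:
        raw = vcp_value
    elif attr_value is not None:
        raw = attr_value
    else:
        raw = text_value
    return normalize(raw.strip())


# A non-VCP value still goes through normalization.
print(parse_dt_value(None, None, " 2022-07-05T17:30 "))
```

The point of the single exit is that consumers would see the same output shape no matter which retrieval step produced the value.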