Should u-* parsing special case img srcset? #7

Open
tantek opened this issue Mar 27, 2017 · 15 comments

@tantek
Member

tantek commented Mar 27, 2017

Responsive images add the 'srcset' attribute to the 'img' tag which can then provide a variety of image resolutions, both in density and pixel size.

Should u-* on an img tag pay attention to the srcset attribute, and if so, how?

Summary of srcset here https://indieweb.org/srcset

@snarfed
Member

snarfed commented Sep 22, 2017

yes! use case: snarfed/bridgy#592 (comment) . a few people want bridgy publish to support srcset, specifically to publish the largest available image that's under a silo's size limit. e.g. twitter's is 5MB.

cc @jonnybarnes and @petermolnar for example posts.

@jonnybarnes

Here’s an initial post that uses srcset: https://jonnybarnes.uk/notes/GE

@jonnybarnes

It appears php-mf2 leaves the srcset attribute blank: https://pin13.net/mf2/?url=https%3A%2F%2Fjonnybarnes.uk%2Fnotes%2FGE

@sknebel
Member

sknebel commented Mar 12, 2018

#2 proposes already to expand images into an object with url and alt attributes. This object could be expanded to cover srcset as well.

Looking at the linked examples, my understanding is that the sizes attribute doesn't need to be preserved, since it depends on the design of the specific page and does not transfer to other displays (a downstream consumer displaying the image could insert a sizes attribute in their own page, matching their own stylesheets).

based on example HTML from the wiki:

<img 
  class="u-photo"
  src="http://example.com/img-360x540.jpg" 
  alt="img" 
  srcset="img-240x360.jpg 360w, img-360x540.jpg 540w, img-653x980.jpg 980w, img-853x1280.jpg 1280w" 
  sizes="(min-width: 960px) 50vw, 100vw"
/>

The parser could just pass on the content of srcset, leaving the downstream consumer to handle it completely, giving output along the lines of


"photo": {
   "value": "http://example.com/img-360x540.jpg",
   "alt": "img",
   "srcset": "img-240x360.jpg 360w, img-360x540.jpg 540w, img-653x980.jpg 980w, img-853x1280.jpg 1280w" 
}

, or transform it into some kind of more detailed structure. (Suggestions for what that would ideally look like for a consumer? It seems @snarfed would only want something from which the full list of images is easy to get, since the parameters don't actually matter for bridgy?)

@sknebel
Member

sknebel commented Jul 2, 2018

@grantcodes I believe you were interested in this too, any feedback?

@grantcodes

@sknebel yeah, I didn't comment as my use case is very different, and probably not a microformats issue.

I am trying to generate html from mf2 json and very rarely want to show a full resolution image. Your proposal may work for that but honestly I have not thought about it enough to be sure.

@snarfed
Member

snarfed commented Apr 3, 2020

friendly nudge! this came up in chat just now: https://chat.indieweb.org/dev/2020-04-03#t1585936229416600

@dshanske
Member

dshanske commented Apr 24, 2020

Would suggest srcset might be best parsed out.

"srcset": "img-240x360.jpg 360w, img-360x540.jpg 540w, img-653x980.jpg 980w, img-853x1280.jpg 1280w"

turning into

"srcset": {
   "360w": "img-240x360.jpg",
   "540w": "img-360x540.jpg",
   "980w": "img-653x980.jpg",
   "1280w": "img-853x1280.jpg"
}

@jgarber623
Member

Building on @dshanske's comment with some use cases pulled from MDN's Responsive Images tutorial. URLs are resolved based on a hypothetical source page/base URL at http://example.com/.

Simple <img> element

<img src="elva-fairy-800w.jpg" alt="Elva dressed as a fairy">

"photo": [
  {
    "value": "http://example.com/elva-fairy-800w.jpg",
    "alt": "Elva dressed as a fairy"
  }
]

<img> element with srcset attribute

<img srcset="elva-fairy-480w.jpg 480w,
             elva-fairy-800w.jpg 800w"
     sizes="(max-width: 600px) 480px,
            800px"
     src="elva-fairy-800w.jpg"
     alt="Elva dressed as a fairy">

"photo": [
  {
    "value": "http://example.com/elva-fairy-800w.jpg",
    "srcset": {
      "480w": "http://example.com/elva-fairy-480w.jpg",
      "800w": "http://example.com/elva-fairy-800w.jpg"
    },
    "alt": "Elva dressed as a fairy"
  }
]

<img> element with srcset attribute and implied descriptor

<img srcset="elva-fairy-320w.jpg,
             elva-fairy-480w.jpg 1.5x,
             elva-fairy-640w.jpg 2x"
     src="elva-fairy-640w.jpg"
     alt="Elva dressed as a fairy">

"photo": [
  {
    "value": "http://example.com/elva-fairy-640w.jpg",
    "srcset": {
      "1x": "http://example.com/elva-fairy-320w.jpg",
      "1.5x": "http://example.com/elva-fairy-480w.jpg",
      "2x": "http://example.com/elva-fairy-640w.jpg"
    },
    "alt": "Elva dressed as a fairy"
  }
]

The above example demonstrates how to handle this note from the documentation:

If no descriptor is specified, the source is assigned the default descriptor: 1x.
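
For the HTML-generation use case raised earlier in the thread, a consumer could turn a photo value of this shape back into an img element. What follows is only a minimal Python sketch, assuming the srcset object maps descriptors to absolute URLs as in the examples above. `photo_to_img` is a hypothetical helper name, and the `sizes` value is supplied by the consumer to match its own layout, not by the parser.

```python
from html import escape


def photo_to_img(photo, sizes=None):
    """Render an mf2 photo value (plain URL string or object) as an <img> tag."""
    if isinstance(photo, str):
        photo = {"value": photo}
    attrs = {"src": photo["value"]}
    if photo.get("srcset"):
        # Each entry becomes an image candidate string: "<url> <descriptor>".
        attrs["srcset"] = ", ".join(
            f"{url} {descriptor}" for descriptor, url in photo["srcset"].items()
        )
    if sizes:
        # Hypothetical: chosen by the consumer for its own stylesheet.
        attrs["sizes"] = sizes
    if "alt" in photo:
        attrs["alt"] = photo["alt"]
    return "<img " + " ".join(f'{k}="{escape(v)}"' for k, v in attrs.items()) + ">"
```

The sketch does not handle URLs that contain commas, which would be ambiguous once joined back into a srcset value.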

@jgarber623
Member

Don't look now, but there's also the imagesrcset attribute on <link> elements: https://html.spec.whatwg.org/multipage/indices.html#attributes-3

Same values permitted: a comma-separated list of image candidate strings.

@dshanske
Member

The conclusion at the end of today's Microformats online session was that someone should try implementing this behind an experimental flag to move forward with this idea.

jgarber623 added a commit to jgarber623/micromicro that referenced this issue Sep 24, 2022
This commit adds a new ImageElementParser class that encompasses
existing custom `img[alt]` handling with the addition of `srcset`
parsing based on the rules I proposed in
microformats/microformats2-parsing#7.

See my comment here:

microformats/microformats2-parsing#7 (comment)
@jgarber623
Member

I've just published MicroMicro v3.1.0 (RubyGems, GitHub release notes) which implements srcset parsing. 🎉

The implementation matches the examples in my comment above along with feedback in the #microformats IRC/Slack channel.

@jgarber623
Member

Following up on @tantek's request for implementation details, here's where we collectively landed on parsing an img element's srcset attribute. This logic builds upon the microformats2 parsing spec's parse an img element for src and alt which applies to u-* property parsing (link) and implied photo property parsing (link).

For reference, relevant classes and methods from MicroMicro's implementation:

Based on the existing specification, feedback in IRC/Slack, Mozilla's HTMLImageElement.srcset documentation, and the WHATWG HTML specification's srcset attribute documentation, I implemented the following (presented as an update to the existing microformats2 parsing specification):

parse an img element for src, srcset, and alt

To parse an img element for src, srcset, and alt attributes:

  • if img[alt] or img[srcset]
    • return a new {} structure with
      • value: the element's src attribute as a normalized absolute URL, following the containing document's language's rules for resolving relative URLs (e.g. in HTML, use the current URL context as determined by the page, and first <base> element, if any).
      • srcset: if present, the element's srcset attribute value parsed as a new {} structure with key/value pairs representing the attribute value's valid image candidates. See "parse a srcset attribute value" below.
      • alt: if present, the element's alt attribute value, otherwise omit this key.
  • else
    • return the element's src attribute as a normalized absolute URL, following the containing document's language's rules for resolving relative URLs (e.g. in HTML, use the current URL context as determined by the page, and first <base> element, if any).

parse a srcset attribute value

  • create a new {} structure
  • split the srcset attribute's value on the , (comma) character
  • for each image candidate string in the resulting array:
    • strip leading and trailing whitespace
    • parse the URL from the image candidate string as all non-whitespace characters from the beginning of the string until either a whitespace character or the end of the string (whichever is first)
    • if remaining non-whitespace characters exist
      • parse the condition descriptor from the image candidate string as any remaining non-whitespace characters
    • if no remaining non-whitespace characters exist
      • assign the default condition descriptor 1x
    • prepare a new key/value pair:
      • key: the condition descriptor (e.g. 1x)
      • value: the image candidate's URL as a normalized absolute URL, following the containing document's language's rules for resolving relative URLs (e.g. in HTML, use the current URL context as determined by the page, and first <base> element, if any).
    • if the {} structure does not contain a key equal to the condition descriptor (e.g. 1x), add the new key/value pair to the {} structure
    • if the {} structure already contains a key equal to the condition descriptor, ignore the new key/value pair
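
The two procedures above can be sketched together in Python. This is an illustration of the listed steps, not MicroMicro's actual (Ruby) code. The function names, the attribute-dict input shape, and the handling of empty candidates are assumptions, and `base_url` is assumed to already account for any <base> element.

```python
from urllib.parse import urljoin


def parse_srcset(value, base_url):
    """Map each descriptor to the first absolute URL seen for it."""
    candidates = {}
    # Naive comma split, as in the steps above; URLs that themselves contain
    # commas would need the fuller WHATWG parsing algorithm.
    for candidate in value.split(","):
        parts = candidate.split()  # strips and splits on any whitespace
        if not parts:
            continue  # assumption: ignore empty candidates (e.g. trailing comma)
        descriptor = "".join(parts[1:]) or "1x"  # default descriptor is 1x
        # setdefault keeps the first URL for a descriptor and ignores repeats.
        candidates.setdefault(descriptor, urljoin(base_url, parts[0]))
    return candidates


def parse_img(attrs, base_url):
    """attrs: dict of the img element's attributes (hypothetical input shape)."""
    src = urljoin(base_url, attrs.get("src", ""))
    if "alt" not in attrs and "srcset" not in attrs:
        return src  # plain URL string when neither alt nor srcset is present
    result = {"value": src}
    if "srcset" in attrs:
        result["srcset"] = parse_srcset(attrs["srcset"], base_url)
    if "alt" in attrs:
        result["alt"] = attrs["alt"]
    return result
```

Because Python dicts preserve insertion order, the keys come out in the order their descriptors first appear in the attribute value.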

☝🏻 Feedback and refinement would be welcome.

That's a long-winded spec-y way of saying…

Take a srcset attribute value like this:

/image-480.jpg 480w, /image.jpg, /image-640.jpg 640w, /image-2x.jpg 2x, /image-nope.jpg 640w, /image-nope.jpg

…and return a hash like:

{
  "480w": "https://jgarber.example/image-480.jpg",
  "1x": "https://jgarber.example/image.jpg",
  "640w": "https://jgarber.example/image-640.jpg",
  "2x": "https://jgarber.example/image-2x.jpg"
}

For a real-world example, see this photo on my website and the results parsed by micromicro.cc (which uses v3.1.0 of MicroMicro).

The relevant snippet from the microformats2 JSON:

"photo": [
  {
    "value": "https://assets.sixtwothree.org/uploads/photos/256/5394F7C7-97B5-465D-AF6D-B9CC85F98212_medium.jpg",
    "srcset": {
      "500w": "https://assets.sixtwothree.org/uploads/photos/256/5394F7C7-97B5-465D-AF6D-B9CC85F98212_small.jpg",
      "750w": "https://assets.sixtwothree.org/uploads/photos/256/5394F7C7-97B5-465D-AF6D-B9CC85F98212_medium.jpg",
      "1000w": "https://assets.sixtwothree.org/uploads/photos/256/5394F7C7-97B5-465D-AF6D-B9CC85F98212_large.jpg"
    },
    "alt": ""
  }
]
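
A consumer with the use case described earlier in the thread (publish the largest image that fits under a silo's size limit) could work from this structure directly. The Python sketch below is an assumption-laden illustration: `largest_under_limit` and the caller-supplied `size_of` function are hypothetical, and only width (`w`) descriptors are considered.

```python
def largest_under_limit(srcset, size_of, limit_bytes):
    """Return the URL of the widest candidate whose byte size fits the limit.

    srcset: mapping of descriptor -> URL, as in the JSON above.
    size_of: caller-supplied function mapping a URL to its size in bytes
             (hypothetical; it might issue a HEAD request and read
             Content-Length).
    """
    # Keep only width descriptors such as "750w"; density descriptors
    # like "2x" say nothing about pixel dimensions and are skipped.
    widths = {
        int(descriptor[:-1]): url
        for descriptor, url in srcset.items()
        if descriptor.endswith("w") and descriptor[:-1].isdigit()
    }
    for width in sorted(widths, reverse=True):
        if size_of(widths[width]) <= limit_bytes:
            return widths[width]
    return None  # nothing fits; the caller could fall back to "value"
```

Width is used here only as a proxy for "largest"; the actual byte size still has to be looked up per candidate.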

@jgarber623
Member

jgarber623 commented Sep 25, 2022

Also, the MicroMicro test suite includes several test files (HTML and JSON) consistent with the microformats/tests repo. Those files are available here:

https://github.com/jgarber623/micromicro/tree/main/spec/support/fixtures/micromicro_test_suite

@gdog2u

gdog2u commented Jan 10, 2024

I'm curious how/if incorrect usages of srcset should be handled. I received some dubious advice some time ago that an appropriate way to serve webp images was to set the src attribute to an initial value of a more common file type, and then use srcset to serve the webp file. This allows the browser to use the webp image if it can, or otherwise fall back on the initial value. While this does indeed work, the result is that I have two images in the srcset with no sizes defined.

An example of how many of the images on my site are displayed.

<img srcset="logo.webp, logo.jpeg" src="logo.jpeg" alt="The logo for my site" class="u-photo">

If I understand the specification documented above by @jgarber623 correctly, then I presume that the following structure would be attempted,

"photo": [
  {
    "value": "https://my-website.com/logo.jpeg",
    "srcset": {
      "1x": "https://my-website.com/logo.webp",
      "1x": "https://my-website.com/logo.jpeg"
    },
    },
    "alt": "The logo for my site"
  }
]

I imagine the actual result would be that only the last entry in the srcset would be listed at the 1x key.

While I understand that this is explicitly against the WHATWG specs, it is accepted behavior in both Chrome 120.0.6099.201, and Firefox 121.0. Similarly, if I got this info somewhere, surely other people did too.

Is it worth it to try and consider this possibility in these specs?
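
Read literally, the steps proposed earlier in the thread keep the first URL seen for a descriptor and ignore later ones, so this markup would yield a single 1x entry pointing at logo.webp rather than the last entry. A minimal Python sketch of just that rule, assuming https://my-website.com/ as the base URL (taken from the example JSON above); `first_url_per_descriptor` is a hypothetical name:

```python
from urllib.parse import urljoin


def first_url_per_descriptor(srcset, base_url):
    parsed = {}
    for candidate in srcset.split(","):
        parts = candidate.split()
        if not parts:
            continue  # assumption: skip empty candidates
        descriptor = "".join(parts[1:]) or "1x"  # no descriptor means 1x
        if descriptor not in parsed:  # the first occurrence wins
            parsed[descriptor] = urljoin(base_url, parts[0])
    return parsed


print(first_url_per_descriptor("logo.webp, logo.jpeg", "https://my-website.com/"))
# {'1x': 'https://my-website.com/logo.webp'}
```

Under that reading the logo.jpeg candidate is dropped from srcset, though it remains available as the photo's "value".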
