-
Notifications
You must be signed in to change notification settings - Fork 44
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Are headers always defined with ABNF? #74
Comments
my 2 cents: I can see header fields not being defined by ABNF (but why?). But even in that case, the core protocol semantics should remain wrt list handling (such as the fact that empty elements get ignored) - anything else would be an incompatible change, no? |
I don't read that as requiring it be defined using ABNF. It is requiring that its field-value definition be equivalent to a comma-separated list that matches the grammar rule #(values) in ABNF. This is required because recipients will extract and combine fields without/before knowing their definition. Perhaps we could just say that. shrug |
AIUI Julian is saying that there's something more here; if your header can be combined, it has to be combined using exactly the rules defined in section 7 (e.g., regarding how whitespace is handled, multiple headers, etc.). If I define a header field that allows comma separation, I think it's OK to do so without ABNF, and I think it can define its own handling for exactly how that combination should happen -- keeping in mind that upstream handlers might combine it in the specified fashion. Proposal: Replace the text above with:
It would also be great if we could move some of the logic for combining headers out of the #rule (section 11 currently) and into somewhere around 4.2. Field Values. |
See slightly related issue #7 |
Discussed in Montreal; Roy suggested that we specify senders can only generate headers with multiple instances where the field's syntax allows recombining with commas. Also make explicit that recipients can blindly combine multiple instances. |
And conversely should we make it clearer that unless a field is defined as possibly containing commas in its value, a recipient may rightfully split it around commas (and strip LWS) and consider each part as an individual value for that field ? This basically means that a recipient could iterate over "values" between commas in fields it doesn't know (thus with no harm) and properly deal with those it knows (date, cookie, expires, etc which support commas in the value). |
You can't split on comma without knowing the syntax of the list elements. |
@reschke definitely, but what I mean is that if we stipulate that multiple header fields may be concatenated by default, then conversely it could be assumed that an unknown header field containing commas might be the result of a concatenation. At the very least, suggesting that passing commas in a new header field should be thought about twice because some recipients along the chain might not know how this field is formed seems like a good idea to me. |
So I think the currently proposal (post-Montreal) is:
|
Slight revision:
|
I had in mind something a little more general that would include fields which are defined like Vary:
OTOH, I personally feel that the existing requirement is a bit misguided. It doesn't seem to help in practice. |
Reading the example mentioning the intermediary adding the field makes me think that the main problem we're having is caused by this combination of facts :
We can't use point 3 above on users to avoid point 2, it's for developers. It's not realistic either for developers to implement an exhaustive list of permitted/not permitted header fields in their product, nor to always look up every single added field and emit "500 server error" every time the rule matches. So we're left with MUST NOT that affect hard-coded implementations (which is already important) but we don't help recipients recover when they face the situation, despite it being declared unlikely by the specification. For this reason I think that the effort should be mostly put on guidance to recover from such violations which in my opinion are extremely common and are the reason we're discussing this. We already did this to recover from multiple "content-length" values, there definitely is some value in going further to generalize this to any invalid case. |
I agree with @wtarreau I think. I've put quite a bit of effort into defining parsers on the combined value for |
Same. I've been thinking we need to have general advice that single-value header fields should always define such error handling, and we should do an audit of the ones defined by HTTP. That said, I still think this text is helpful. I could see modifying it to make the target of the requirement clearer, or removing the 2119 language. |
https://tools.ietf.org/html/rfc7230#section-3.2.2 says:
This ties sender behaviour to how the header field is specified; do we require headers to be defined using ABNF?
If so, we should say so explicitly. If not, we should clarify this requirement.
Furthermore, Section 7 ties handling of specific syntax (e.g., empty lines) to the #-rule. If headers can be defined without ABNF, we should clarify whether that behaviour is required, or specified to #-specified headers.
See also httpwg/http-extensions#596
The text was updated successfully, but these errors were encountered: