getMediaTypeParams() doesn't handle quoted-string values #245

Open
bkdotcom opened this issue Aug 21, 2024 · 0 comments
https://www.rfc-editor.org/rfc/rfc7231#section-3.1.1.1

media-type = type "/" subtype *( OWS ";" OWS parameter )
type       = token
subtype    = token

The type/subtype MAY be followed by parameters in the form of name=value pairs.

parameter      = token "=" ( token / quoted-string )

A parameter value that matches the token production can be transmitted either as a token or within a quoted-string. The quoted and unquoted values are equivalent. For example, the following examples are all equivalent, but the first is preferred for consistency:

text/html;charset=utf-8
text/html;charset=UTF-8
Text/HTML;Charset="utf-8"
text/html; charset="utf-8"
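
The token/quoted-string equivalence described above can be sketched in PHP as follows. `unquoteParamValue` is a hypothetical helper name used only for illustration and is not part of the library; it also leaves letter case alone, so it does not cover the case-insensitivity of charset values.

```php
<?php
/**
 * Hypothetical helper: return the value a parameter carries,
 * whether it was sent as a token or as a quoted-string.
 */
function unquoteParamValue(string $raw): string
{
    if (\strlen($raw) >= 2 && $raw[0] === '"' && \substr($raw, -1) === '"') {
        // Drop the enclosing quotes, then unescape quoted-pairs (\x becomes x).
        return \preg_replace('/\\\\(.)/s', '$1', \substr($raw, 1, -1));
    }
    return $raw; // already a token
}

// Both transmitted forms yield the same value.
var_dump(unquoteParamValue('utf-8') === unquoteParamValue('"utf-8"')); // bool(true)
```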

Contrived example:

application/json;charSet="UTF-8"; FOO = "b; a\"r"
(The grammar does not allow whitespace around the "=", but we can handle it.)

expected:

array(
  'charset' => 'UTF-8',  // charset values are case-insensitive, so this one in particular should probably be strtolower'd
  'foo' => 'b; a"r',
)

actual:
Undefined array key 1

array(
  'charset' => '"UTF-8"',
  'foo·' => ' "b',
  'a\"r"' => null,
)
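
The actual output above is consistent with a parser that splits on ";" and then on "=" without any awareness of quoting. The sketch below reproduces it under that assumption; `naiveMediaTypeParams` is a hypothetical stand-in, not the library's current code, and it uses `?? null` where the real code presumably reads index 1 directly and triggers the "Undefined array key 1" warning.

```php
<?php
/**
 * Hypothetical naive parser: splits on ";" and "=" and ignores quoting.
 */
function naiveMediaTypeParams(string $contentType): array
{
    $parts = \explode(';', $contentType);
    \array_shift($parts); // drop "type/subtype"
    $params = array();
    foreach ($parts as $part) {
        // A ";" inside a quoted-string has already split the value in two here.
        $pair = \explode('=', \trim($part), 2);
        $params[\strtolower($pair[0])] = $pair[1] ?? null;
    }
    return $params;
}

// Yields the keys 'charset', 'foo ' (trailing space) and 'a\"r"'.
\var_export(naiveMediaTypeParams('application/json;charSet="UTF-8"; FOO = "b; a\"r"'));
```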

A possible fix would be something like this:

public function getMediaTypeParams(): array
{
    $contentType = $this->getContentType();

    if ($contentType === null) {
        return array();
    }

    // Drop the leading "type/subtype" so only the parameter list remains.
    $paramString = \preg_replace('/^.*?[;,]\s*/', '', $contentType);
    $regexToken = '[^\\s"=;,]+';
    // quoted-string: any backslash-escaped character, or anything other than " and \
    $regexQuotedString = '"(?:\\\\.|[^"\\\\])*"';
    $regex = '/
        (?P<key>' . $regexToken . ')
        \s*=\s*    # standard does not allow whitespace around =
        (?P<value>' . $regexQuotedString . '|' . $regexToken . ')
        /x';

    \preg_match_all($regex, $paramString, $matches, PREG_SET_ORDER);

    $params = array();
    foreach ($matches as $kvp) {
        $key = \strtolower($kvp['key']);
        $value = $kvp['value'];
        if ($value[0] === '"') {
            // Strip only the enclosing quotes, then unescape quoted-pairs (\x becomes x).
            $value = \preg_replace('/\\\\(.)/s', '$1', \substr($value, 1, -1));
        }
        $params[$key] = $value;
    }
    return $params;
}

The fix could go further and strtolower the value if the key is charset.
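
A minimal sketch of that extra step, assuming it runs on the already-parsed parameter array. `normalizeCharset` is a hypothetical name, and only the charset value is lowercased because other parameter values may be case-sensitive.

```php
<?php
/**
 * Hypothetical post-processing step: lowercase the charset value only.
 */
function normalizeCharset(array $params): array
{
    if (isset($params['charset'])) {
        $params['charset'] = \strtolower($params['charset']);
    }
    return $params;
}

$params = normalizeCharset(array('charset' => 'UTF-8', 'foo' => 'b; a"r'));
// $params['charset'] is now 'utf-8'; $params['foo'] is unchanged.
```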



@bkdotcom bkdotcom changed the title getMediaTypeParams doesn't handle quoted-string values getMediaTypeParams() doesn't handle quoted-string values or whitespace Aug 21, 2024
@bkdotcom bkdotcom changed the title getMediaTypeParams() doesn't handle quoted-string values or whitespace getMediaTypeParams() doesn't handle quoted-string values Aug 21, 2024