Formalize how to express null value in xml #3959

Open
ahmednfwela opened this issue Jul 16, 2024 · 1 comment
Labels
media and encoding Issues regarding media type support and how to encode data (outside of query/path params) xml
Milestone

Comments

@ahmednfwela

Since XML has no concept of null, how should we validate a null value, both as an attribute and as an element?

Consider the following OpenAPI 3.0 schema:

person:
  type: object
  required:
    - name
    - attrName
  properties:
    name:
      type: string
      nullable: true
    attrName:
      type: string
      nullable: true
      xml:
        attribute: true

Note that required here prevents us from simply omitting the attribute or element.

I have thought about this, and here are some of the approaches I came up with:

For elements

Approach 1: self closing tags

<person>
  <name />
</person>

Pros: Makes sense to whoever reads it.
Cons: None that I can think of, except that some XML parsers may treat a self-closing tag as equivalent to an empty string and not distinguish between the two, so the form does not survive round-tripping.
For example, <name /> may come back as <name></name>.
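As one concrete illustration of the round-tripping concern, here is a minimal sketch using Python's standard xml.etree.ElementTree (chosen only as an example parser; other parsers may behave differently). It parses both forms and compares what the application sees and what gets written back.

```python
import xml.etree.ElementTree as ET

# The two candidate encodings from approaches 1 and 2.
self_closing = ET.fromstring("<person><name /></person>")
start_end = ET.fromstring("<person><name></name></person>")

# After parsing, neither element carries any text content,
# so the application cannot tell which form was in the input.
name_a = self_closing.find("name")
name_b = start_end.find("name")
print(repr(name_a.text), repr(name_b.text))

# Serializing either tree produces identical output, so the
# original spelling of the empty element is not preserved.
out_a = ET.tostring(self_closing)
out_b = ET.tostring(start_end)
print(out_a == out_b)
```

With this parser, at least, the two forms cannot carry different meanings.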

Approach 2: empty string

<person>
  <name></name>
</person>

Cons: If the property is of type string, a null cannot be distinguished from a non-null empty string.
Workaround: require string values to be wrapped in double quotes, e.g.

  • this is null:
<name></name>
  • this is empty string:
<name>""</name>
  • this is valid string:
<name>"hello"</name>
  • this is not a valid string:
<name>hello</name>

Of course, this workaround is very problematic, since most parsers consider "" a valid two-character string.

Approach 3: special marker attribute

<name xsi:nil="true"></name>
<name xsi:nil="true"/>

Pros: Represents nulls consistently without having to inspect the element's contents; this is also how XML Schema does it.
Cons: Size overhead of adding xsi:nil="true" wherever a null is used.
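To show how a consumer might tell null from empty under approach 3, here is a rough sketch in Python with xml.etree.ElementTree. The helper name element_value and the nickname element are hypothetical and exist only for this illustration; the namespace URI is the standard XML Schema instance namespace.

```python
import xml.etree.ElementTree as ET

XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"

def element_value(elem):
    """Hypothetical helper: None when xsi:nil is true, otherwise the text."""
    # xsd:boolean allows both "true" and "1" as lexical forms of true.
    nil = elem.get(f"{{{XSI_NS}}}nil", "false")
    if nil in ("true", "1"):
        return None
    # An empty element without xsi:nil is treated as the empty string.
    return elem.text or ""

doc = ET.fromstring(
    f'<person xmlns:xsi="{XSI_NS}">'
    '<name xsi:nil="true"/>'
    '<nickname></nickname>'  # hypothetical extra element, not in the schema above
    '</person>'
)

print(element_value(doc.find("name")))      # null
print(repr(element_value(doc.find("nickname"))))  # empty string
```

The check depends only on the attribute, so it gives the same answer for the self-closing and start/end-tag forms.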

For attributes

Approach 1: empty string

<person attrName="" />

Approach 2: disallow nullable attributes altogether

Make xml.attribute: true and nullable: true mutually exclusive.
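If this rule were adopted, a linter could flag violations with a check along these lines. This is only a sketch over a schema already loaded into a Python dict; the function name nullable_attribute_paths is made up for illustration, and it only walks properties, not every schema keyword.

```python
def nullable_attribute_paths(schema, path="#"):
    """Yield paths of schemas that set both nullable: true and xml.attribute: true."""
    is_nullable = schema.get("nullable") is True
    is_attribute = schema.get("xml", {}).get("attribute") is True
    if is_nullable and is_attribute:
        yield path
    # Recurse into object properties only (a simplification for this sketch).
    for prop, sub in schema.get("properties", {}).items():
        yield from nullable_attribute_paths(sub, f"{path}/properties/{prop}")

# The person schema from the top of this issue, as a dict.
person = {
    "type": "object",
    "required": ["name", "attrName"],
    "properties": {
        "name": {"type": "string", "nullable": True},
        "attrName": {
            "type": "string",
            "nullable": True,
            "xml": {"attribute": True},
        },
    },
}

conflicts = list(nullable_attribute_paths(person))
print(conflicts)
```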

@ralfhandl ralfhandl added the xml label Jul 17, 2024
@ralfhandl
Contributor

@ahmednfwela Thanks for reporting this, and for the detailed research and references.

Preliminary analysis for elements

XML 1.0, section 3.1 "Start-Tags, End-Tags, and Empty-Element Tags" states that the element forms <name></name> and <name/> are equivalent and represent an element with no content, aka an empty element, so approaches 1 and 2 are equivalent.

The meaning of "empty" seems to depend on context/implementation; for string-valued elements "empty" means the empty string.

So approach 3 (xsi:nil="true") seems to be the way forward.

Preliminary analysis for attributes

XML 1.0, section 3.3.3 "Attribute-Value Normalization" describes an algorithm that MUST be applied before the value of an attribute is passed to the application or checked for validity. This algorithm begins with a normalized value consisting of the empty string, then appends to it. Thus attribute values are always strings, potentially the empty string, and never null.

So approach 2 (disallow nullable attributes) seems to be the way forward.
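The attribute behaviour can be observed with a small sketch in Python's xml.etree.ElementTree (one parser, used here purely as an example). A present attribute always yields a string; the only other state an application sees is "attribute absent", which the required keyword in the schema above rules out.

```python
import xml.etree.ElementTree as ET

with_empty = ET.fromstring('<person attrName="" />')
without = ET.fromstring('<person />')

# A present attribute is always a string, here the empty string.
empty_value = with_empty.get("attrName")

# Python's None here means "attribute absent", not an XML null value.
missing_value = without.get("attrName")

print(repr(empty_value), repr(missing_value))
```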

@ralfhandl ralfhandl added this to the v3.2.0 milestone Jul 25, 2024
@handrews handrews added the media and encoding Issues regarding media type support and how to encode data (outside of query/path params) label Jul 29, 2024