This repository has been archived by the owner on Mar 25, 2024. It is now read-only.

Inconsistent Quoting Behavior in String Serialization #397

Open
NanezX opened this issue Nov 17, 2023 · 2 comments · May be fixed by #398

Comments


NanezX commented Nov 17, 2023

Issue Description:

When serde_yaml deserializes a YAML file, quoted strings lose their quotes, which is expected. During serialization, however, manually adding quotes to a string results in extra quotes being added to the output. This makes it hard to work with YAML files that require specific quoting and leads to malformed output.

Steps to Reproduce:

1. Deserialize a YAML file containing quoted strings.
2. Change a string value and attempt to serialize it back.
3. Observe the unexpected behavior in the serialized output.

Expected Behavior:

During serialization, there should be a way to explicitly specify whether quotes should be added to a string, and this behavior should be consistent with manual addition of quotes.

Actual Behavior:

Serialization adds extra quotes or behaves inconsistently when attempting to preserve quotes around strings.

Example:

specVersion: 0.0.4
schema:
  file: ./schema.graphql
address: "0x0000000000000000000000000000000000000000"

After deserialization, I get the address as a string without quotes. Let's say I want to change the address to 0xffaed20B7B67e498A3bEEf97386ec1849EFeE6Ac, so I use that value as a string. The serialized output will then contain the address without quotes.

But if I use '0xffaed20B7B67e498A3bEEf97386ec1849EFeE6Ac' (with quotes in the string), the serialization will add three single quotes, so the output will be:

specVersion: 0.0.4
schema:
  file: ./schema.graphql
address: '''0xffaed20B7B67e498A3bEEf97386ec1849EFeE6Ac'''

On the other hand, if I use double quotes when adding the string, the output will be:

specVersion: 0.0.4
schema:
  file: ./schema.graphql
address: '"0xffaed20B7B67e498A3bEEf97386ec1849EFeE6Ac"'
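Both outputs above are consistent with YAML's single-quoted scalar style, in which a literal `'` inside the scalar is written as `''`. The following is a minimal std-only sketch of that escaping rule, not serde_yaml's actual code; `yaml_single_quoted` is a hypothetical helper introduced only for illustration.

```rust
/// Hypothetical helper: wrap `value` in a YAML single-quoted scalar.
/// Inside this style the only escape is a doubled single quote
/// (`''` stands for one literal `'`).
fn yaml_single_quoted(value: &str) -> String {
    format!("'{}'", value.replace('\'', "''"))
}

fn main() {
    // The Rust strings themselves contain the quote characters,
    // as in the two cases described in the issue.
    let with_single = "'0xffaed20B7B67e498A3bEEf97386ec1849EFeE6Ac'";
    let with_double = "\"0xffaed20B7B67e498A3bEEf97386ec1849EFeE6Ac\"";

    // Each embedded ' is doubled and the whole value is wrapped once more.
    println!("address: {}", yaml_single_quoted(with_single));
    // A " needs no escaping inside a single-quoted scalar.
    println!("address: {}", yaml_single_quoted(with_double));
}
```

Read this way, the quote characters typed into the string are treated as part of the string's content, and the serializer quotes the whole value so that they survive a round trip.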

How should I handle this? Is this an issue, or is it on purpose? How can it be fixed?

Environment:

serde version: 1.0.192
serde_yaml version: 0.9.27
Rust version: rustc 1.73.0
Operating System: Ubuntu 22.04 LTS


thedavidmeister commented Nov 28, 2023

#319 seems related potentially

from what i've seen so far, it seems like the root issue is that you can have some String in the struct that is being serialized by serde_yaml::to_string and this calls serialize_str internally, but then the yaml output can be something that isn't a string at all

even if you implement custom serialization with serialize_with it still has the same issue because the logic that un-strings the input data happens inside serialize_str and all the functions that might help avoid it are marked private inside serde_yaml.

i think this is actually a pretty well known issue with the yaml spec itself, which is why strict yaml exists

https://hitchdev.com/strictyaml/why/implicit-typing-removed/

This issue and #319 both seem to be instances of "the norway problem" and unless you control the parser and can change it, the only solution seems to be to quote everything that's intended to be a string, so that it doesn't accidentally get parsed as a not-string downstream later.
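To illustrate the "quote everything" workaround, here is a minimal std-only sketch that writes a value as a YAML double-quoted scalar. `yaml_double_quoted` is a hypothetical helper, not a serde_yaml API, and it covers only the most common escapes; it assumes you are assembling or post-processing the YAML text yourself.

```rust
/// Hypothetical helper: emit `value` as a YAML double-quoted scalar so that
/// a downstream parser cannot retype it as a bool, null, or number.
/// Only backslash, double quote, newline, and tab are escaped here; other
/// control characters are passed through unchanged in this sketch.
fn yaml_double_quoted(value: &str) -> String {
    let mut out = String::with_capacity(value.len() + 2);
    out.push('"');
    for c in value.chars() {
        match c {
            '\\' => out.push_str("\\\\"),
            '"' => out.push_str("\\\""),
            '\n' => out.push_str("\\n"),
            '\t' => out.push_str("\\t"),
            _ => out.push(c),
        }
    }
    out.push('"');
    out
}

fn main() {
    // "no" is the classic Norway-problem value; the address is from the issue.
    println!("country: {}", yaml_double_quoted("no"));
    println!(
        "address: {}",
        yaml_double_quoted("0xffaed20B7B67e498A3bEEf97386ec1849EFeE6Ac")
    );
}
```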

@dtolnay the logic inside serialize_str is fairly complex, trying to infer a "scalar style", it's a bit old now but do you remember the intent here?

@thedavidmeister

For this issue specifically the offending code path is

            fn visit_str<E>(self, v: &str) -> Result<Self::Value, E> {
                Ok(if crate::de::digits_but_not_number(v) {
                    ScalarStyle::SingleQuoted
                } else {
                    ScalarStyle::Any
                })
            }

The else branch is taken for this string because it can be parsed as a hexadecimal integer, so the string is then treated as though it were an integer and emitted without quotes.
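The branch selection in the snippet can be modelled in isolation. The sketch below is std-only and is not the crate's code: the body of `digits_but_not_number` is an assumption inferred from its name (digit-only strings with a leading zero, such as "007"), and `style_for_str` is a hypothetical wrapper that mirrors only the quoted `visit_str`.

```rust
/// Scalar styles named in the quoted snippet.
#[derive(Debug, PartialEq)]
enum ScalarStyle {
    SingleQuoted,
    Any,
}

/// Assumed stand-in for `crate::de::digits_but_not_number`: true for strings
/// that consist only of ASCII digits (after an optional sign) and start with
/// a leading zero, such as "007".
fn digits_but_not_number(scalar: &str) -> bool {
    let unsigned = scalar
        .strip_prefix(|c: char| c == '-' || c == '+')
        .unwrap_or(scalar);
    unsigned.len() > 1
        && unsigned.starts_with('0')
        && unsigned[1..].bytes().all(|b| b.is_ascii_digit())
}

/// Hypothetical wrapper mirroring the `visit_str` branch quoted above.
fn style_for_str(v: &str) -> ScalarStyle {
    if digits_but_not_number(v) {
        ScalarStyle::SingleQuoted
    } else {
        ScalarStyle::Any
    }
}

fn main() {
    // The 'x' and the hex letters are not ASCII digits, so the else branch runs.
    let address = "0xffaed20B7B67e498A3bEEf97386ec1849EFeE6Ac";
    println!("{:?}", style_for_str(address));
    // A digit-only string with a leading zero takes the quoted branch.
    println!("{:?}", style_for_str("007"));
}
```

Under these assumptions the address from the issue ends up with `ScalarStyle::Any`, which leaves the emitter free to write it as a plain, unquoted scalar.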

I also tried "!!str 0xdeadbeef", "!Str 0xdeadbeef" and "!String 0xdeadbeef" for the value but this causes the output yaml to single quote the whole string, it doesn't recognise these tags as instructions to respect the string type

@thedavidmeister thedavidmeister linked a pull request Nov 28, 2023 that will close this issue