fix(torii-erc): lenient parsing for invalid json for erc721 #2977

Larkooo · 2025-01-31T11:18:25Z

Summary by CodeRabbit

New Features
- Introduced a utility function to sanitize JSON strings by escaping unescaped quotes.
Bug Fixes
- Improved error handling and logging mechanisms in ERC721 token processing.
- Enhanced metadata fetching error reporting.
Chores
- Streamlined logging statements for better diagnostic clarity.
- Added tests to validate the new JSON sanitization functionality.

coderabbitai · 2025-01-31T11:20:20Z

Walkthrough

Ohayo, sensei! The pull request focuses on refining error handling and logging in the crates/torii/sqlite/src/executor/erc.rs file. The changes primarily target the process_register_erc721_token_query method, introducing a match statement for clearer error management and enhanced logging mechanisms. Additionally, a new utility function, sanitize_json_string, has been added in crates/torii/sqlite/src/utils.rs to improve JSON string sanitation. These modifications enhance log clarity and error context without fundamentally altering the core functionality of the methods.

Changes

File	Change Summary
`crates/torii/sqlite/src/executor/erc.rs`	- Replaced nested structure with a `match` statement in `process_register_erc721_token_query` - Enhanced error logging with `debug!` and `warn!` macros - Improved metadata parsing log messages - Removed `trace!` logging statement
`crates/torii/sqlite/src/utils.rs`	- Added public function `sanitize_json_string` for escaping unescaped quotes - Introduced test function `test_sanitize_json_string` to validate the new function

Possibly related PRs

fix(torii): handle failed metadata fetch and empty metadata in db #2755: The changes in the main PR, specifically the modifications to error handling in the process_register_erc721_token_query method, are directly related to the enhancements made in the same method in the retrieved PR, which also focuses on improving error handling for metadata fetching.
fix(torii): avoid panic on fetch token URI fail #2750: The changes in the main PR are related to the same method process_register_erc721_token_query in the erc.rs file, focusing on error handling improvements, while the retrieved PR also modifies the same method to enhance error handling for fetching the token URI.
fix(torii-tokens-graphql): returns tokens balances with no metadata #2764: The changes in the main PR are related to the addition of the sanitize_json_string function, which is also introduced in the retrieved PR, indicating a direct connection at the code level.

Suggested reviewers

glihm

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Generate unit testing code for this file.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai generate unit testing code for this file.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read src/utils.ts and generate unit testing code.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
- @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

coderabbitai

Actionable comments posted: 1

🔭 Outside diff range comments (1)

crates/torii/sqlite/src/executor/erc.rs (1)
Line range hint 303-325: Ohayo! Enhanced logging for data URI handling, sensei!

The additional debug logging helps track the data URI parsing process and decoded content.

However, there are two issues to address:

The decoded string is logged but not used (we're using the original decoded.0)

The formatting needs adjustment per pipeline error

Apply these fixes:
-                debug!(decoded_str = %decoded_str, "Decoded metadata");
-
-                let json: serde_json::Value = serde_json::from_str(&String::from_utf8_lossy(&decoded.0))
+                let json: serde_json::Value = serde_json::from_str(&decoded_str)
🧰 Tools

🪛 GitHub Actions: ci

[error] 322-324: Code formatting error: Incorrect line breaks and indentation in JSON parsing logic

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f4b733d and b2472e0.

📒 Files selected for processing (1)

crates/torii/sqlite/src/executor/erc.rs (4 hunks)

🧰 Additional context used

🪛 GitHub Actions: ci

crates/torii/sqlite/src/executor/erc.rs

[error] 245-247: Code formatting error: Incorrect block formatting for match expression with metadata handling

[error] 322-324: Code formatting error: Incorrect line breaks and indentation in JSON parsing logic

🔇 Additional comments (1)

crates/torii/sqlite/src/executor/erc.rs (1)

13-13: Ohayo! Import optimization looks good, sensei!

The logging imports have been correctly optimized to include only the levels actually used in the code.

coderabbitai · 2025-01-31T11:20:23Z

crates/torii/sqlite/src/executor/erc.rs

+            let metadata = Self::fetch_metadata(&token_uri).await;
+
+            match metadata {
+                Ok(metadata) => serde_json::to_string(&metadata).context("Failed to serialize metadata")?,
+                Err(err) => {
+                    debug!(error = %err, token_uri = %token_uri, "Error fetching metadata");
+                    warn!(
+                        contract_address = format!("{:#x}", register_erc721_token.contract_address),
+                        token_id = %register_erc721_token.actual_token_id,
+                        "Error fetching metadata, empty metadata will be used instead.",
+                    );
+                    "".to_string()
+                }


⚠️ Potential issue

Ohayo! Nice error handling improvements, sensei!

The match-based error handling with dual-level logging provides better visibility into metadata fetch failures.

However, the block formatting needs adjustment to fix the pipeline error.

Apply this formatting fix:

- let metadata = Self::fetch_metadata(&token_uri).await; - - match metadata { + match Self::fetch_metadata(&token_uri).await {

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

let metadata = Self::fetch_metadata(&token_uri).await;

match metadata {

Ok(metadata) => serde_json::to_string(&metadata).context("Failed to serialize metadata")?,

Err(err) => {

debug!(error = %err, token_uri = %token_uri, "Error fetching metadata");

warn!(

contract_address = format!("{:#x}", register_erc721_token.contract_address),

token_id = %register_erc721_token.actual_token_id,

"Error fetching metadata, empty metadata will be used instead.",

);

"".to_string()

}

match Self::fetch_metadata(&token_uri).await {

Ok(metadata) => serde_json::to_string(&metadata).context("Failed to serialize metadata")?,

Err(err) => {

debug!(error = %err, token_uri = %token_uri, "Error fetching metadata");

warn!(

contract_address = format!("{:#x}", register_erc721_token.contract_address),

token_id = %register_erc721_token.actual_token_id,

"Error fetching metadata, empty metadata will be used instead.",

);

"".to_string()

}

🧰 Tools

🪛 GitHub Actions: ci

[error] 245-247: Code formatting error: Incorrect block formatting for match expression with metadata handling

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (2)

crates/torii/sqlite/src/utils.rs (2)

56-106: Ohayo sensei! Consider using a JSON parser for more robust handling.

While the current implementation works for basic cases, it might miss edge cases that a proper JSON parser would handle. Consider using serde_json for more robust parsing and string sanitization.

-pub fn sanitize_json_string(s: &str) -> String {
-    let mut result = String::new();
-    let mut chars = s.chars().peekable();
-    let mut in_string = false;
-    
-    while let Some(c) = chars.next() {
-        match c {
-            '"' => {
-                if !in_string {
-                    // Starting a string
-                    result.push('"');
-                    in_string = true;
-                } else {
-                    // Check next char to see if this is the end of the string
-                    match chars.peek() {
-                        Some(&':') | Some(&',') | Some(&'}') => {
-                            // This is end of a JSON string
-                            result.push('"');
-                            in_string = false;
-                        }
-                        _ => {
-                            // This is an internal quote that needs escaping
-                            result.push_str("\\\"");
-                        }
-                    }
-                }
-            }
-            '\\' => {
-                if let Some(&next) = chars.peek() {
-                    if next == '"' {
-                        // Already escaped quote, preserve it without adding extra escapes
-                        result.push('\\');
-                        result.push('"');
-                        chars.next(); // Consume the quote
-                    } else {
-                        // Regular backslash
-                        result.push('\\');
-                    }
-                } else {
-                    result.push('\\');
-                }
-            }
-            _ => {
-                result.push(c);
-            }
-        }
-    }
-    
-    result
+use serde_json::{Value, json};
+
+pub fn sanitize_json_string(s: &str) -> String {
+    // Parse the input as JSON
+    if let Ok(mut value) = serde_json::from_str::<Value>(s) {
+        // If parsing succeeds, the string is already valid JSON
+        return s.to_string();
+    }
+    
+    // If parsing fails, try to escape quotes in string values
+    let escaped = s.replace(r#"""#, r#"\""#);
+    if let Ok(_) = serde_json::from_str::<Value>(&escaped) {
+        return escaped;
+    }
+    
+    // If all attempts fail, return the original string
+    s.to_string()
}

🧰 Tools

🪛 GitHub Actions: ci

[warning] 58-58: Code formatting issue: unnecessary blank line.

[warning] 101-101: Code formatting issue: unnecessary blank line.

221-232: Add more test cases for edge scenarios.

The current test cases cover basic scenarios but miss important edge cases.

 #[test]
 fn test_sanitize_json_string() {
     let input = r#"{"name":""Rage Shout" DireWolf"}"#;
     let expected = r#"{"name":"\"Rage Shout\" DireWolf"}"#;
     let sanitized = sanitize_json_string(input);
     assert_eq!(sanitized, expected);
 
     let input_escaped = r#"{"name":"\"Properly Escaped\" Wolf"}"#;
     let expected_escaped = r#"{"name":"\"Properly Escaped\" Wolf"}"#;
     let sanitized_escaped = sanitize_json_string(input_escaped);
     assert_eq!(sanitized_escaped, expected_escaped);
+    
+    // Test nested objects
+    let input_nested = r#"{"metadata":{"name":""Nested" Object"}}"#;
+    let expected_nested = r#"{"metadata":{"name":"\"Nested\" Object"}}"#;
+    let sanitized_nested = sanitize_json_string(input_nested);
+    assert_eq!(sanitized_nested, expected_nested);
+    
+    // Test arrays
+    let input_array = r#"{"names":[""First"", ""Second""]}"#;
+    let expected_array = r#"{"names":["\"First\"", "\"Second\""]}"#;
+    let sanitized_array = sanitize_json_string(input_array);
+    assert_eq!(sanitized_array, expected_array);
+    
+    // Test empty string
+    assert_eq!(sanitize_json_string(""), "");
+    
+    // Test invalid JSON
+    let input_invalid = r#"{"name":}"#;
+    assert_eq!(sanitize_json_string(input_invalid), input_invalid);
 }

🧰 Tools

🪛 GitHub Actions: ci

[warning] 224-224: Code formatting issue: unnecessary blank line.

[warning] 231-231: Code formatting issue: unnecessary blank line.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 42aa4c3 and 8a52e55.

📒 Files selected for processing (2)

crates/torii/sqlite/src/executor/erc.rs (4 hunks)
crates/torii/sqlite/src/utils.rs (2 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

crates/torii/sqlite/src/executor/erc.rs

🧰 Additional context used

🪛 GitHub Actions: ci

crates/torii/sqlite/src/utils.rs

[warning] 58-58: Code formatting issue: unnecessary blank line.

[warning] 101-101: Code formatting issue: unnecessary blank line.

[warning] 224-224: Code formatting issue: unnecessary blank line.

[warning] 231-231: Code formatting issue: unnecessary blank line.

coderabbitai · 2025-02-03T05:18:25Z

crates/torii/sqlite/src/utils.rs

@@ -53,6 +53,58 @@ pub fn sql_string_to_felts(sql_string: &str) -> Vec<Felt> {
    sql_string.split(SQL_FELT_DELIMITER).map(|felt| Felt::from_str(felt).unwrap()).collect()
 }

+/// Sanitizes a JSON string by escaping unescaped double quotes within string values.
+pub fn sanitize_json_string(s: &str) -> String {
+    let mut result = String::new();


⚠️ Potential issue

Fix formatting issues flagged by CI.

Remove unnecessary blank lines as indicated by the pipeline failures.

pub fn sanitize_json_string(s: &str) -> String { - let mut result = String::new(); // ... rest of the function match c { _ => { result.push(c); } - } // ... rest of the function let sanitized = sanitize_json_string(input); assert_eq!(sanitized, expected); - let input_escaped = r#"{"name":"\"Properly Escaped\" Wolf"}"#; // ... rest of the test let sanitized_escaped = sanitize_json_string(input_escaped); assert_eq!(sanitized_escaped, expected_escaped); - }

Also applies to: 101-101, 224-224, 231-231

🧰 Tools

🪛 GitHub Actions: ci

[warning] 58-58: Code formatting issue: unnecessary blank line.

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (3)

crates/torii/sqlite/src/utils.rs (3)

56-57: Enhance function documentation.

Ohayo sensei! The documentation could be more comprehensive to help users understand the function's purpose and usage better.

-/// Sanitizes a JSON string by escaping unescaped double quotes within string values.
+/// Sanitizes a JSON string by properly escaping unescaped double quotes within string values.
+///
+/// This function processes JSON strings that may have improperly escaped quotes within their string values,
+/// making them valid for JSON parsing. It maintains string boundary awareness to only escape quotes that
+/// are inside JSON string values.
+///
+/// # Arguments
+/// * `s` - A string slice that contains a JSON string that may have unescaped quotes
+///
+/// # Returns
+/// A String with all necessary quotes properly escaped
+///
+/// # Example
+/// ```
+/// let input = r#"{"name":""Rage Shout" DireWolf"}"#;
+/// let sanitized = sanitize_json_string(input);
+/// assert_eq!(sanitized, r#"{"name":"\"Rage Shout\" DireWolf"}"#);
+/// ```

58-106: Consider adding error handling for malformed JSON.

The function assumes well-formed JSON input but could handle edge cases better.

 pub fn sanitize_json_string(s: &str) -> String {
-    let mut result = String::new();
+    let mut result = String::with_capacity(s.len() + 10); // Pre-allocate with some extra space for escapes
     let mut chars = s.chars().peekable();
     let mut in_string = false;
+    let mut last_char_pos = 0;
 
     while let Some(c) = chars.next() {
+        last_char_pos += 1;
         match c {
             '"' => {
                 if !in_string {
                     // Starting a string
                     result.push('"');
                     in_string = true;
                 } else {
                     // Check next char to see if this is the end of the string
                     match chars.peek() {
                         Some(&':') | Some(&',') | Some(&'}') => {
                             // This is end of a JSON string
                             result.push('"');
                             in_string = false;
                         }
                         _ => {
                             // This is an internal quote that needs escaping
                             result.push_str("\\\"");
                         }
                     }
                 }
             }
             // ... rest of the match cases ...
         }
     }
 
+    // Check for unclosed strings
+    if in_string {
+        warn!(
+            "Unclosed string detected at position {}. Input may be malformed.",
+            last_char_pos
+        );
+    }
+
     result
 }

221-232: Enhance test coverage with additional cases.

The current tests cover basic scenarios, but we should add more edge cases for robustness.

 #[test]
 fn test_sanitize_json_string() {
     let input = r#"{"name":""Rage Shout" DireWolf"}"#;
     let expected = r#"{"name":"\"Rage Shout\" DireWolf"}"#;
     let sanitized = sanitize_json_string(input);
     assert_eq!(sanitized, expected);
 
     let input_escaped = r#"{"name":"\"Properly Escaped\" Wolf"}"#;
     let expected_escaped = r#"{"name":"\"Properly Escaped\" Wolf"}"#;
     let sanitized_escaped = sanitize_json_string(input_escaped);
     assert_eq!(sanitized_escaped, expected_escaped);
+
+    // Test empty string
+    assert_eq!(sanitize_json_string(""), "");
+
+    // Test string without quotes
+    let no_quotes = r#"{"number":42}"#;
+    assert_eq!(sanitize_json_string(no_quotes), no_quotes);
+
+    // Test nested objects
+    let nested = r#"{"data":{"name":""Nested" Object"}}"#;
+    let expected_nested = r#"{"data":{"name":"\"Nested\" Object"}}"#;
+    assert_eq!(sanitize_json_string(nested), expected_nested);
+
+    // Test array of strings
+    let array = r#"{"tags":["Hello "World"", "Test"]}"#;
+    let expected_array = r#"{"tags":["Hello \"World\"", "Test"]}"#;
+    assert_eq!(sanitize_json_string(array), expected_array);
 }

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d45f9fc and 0721b87.

📒 Files selected for processing (2)

crates/torii/sqlite/src/executor/erc.rs (4 hunks)
crates/torii/sqlite/src/utils.rs (2 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

crates/torii/sqlite/src/executor/erc.rs

⏰ Context from checks skipped due to timeout of 90000ms (3)

GitHub Check: docs
GitHub Check: ensure-wasm
GitHub Check: build

codecov · 2025-02-03T09:17:29Z

Codecov Report

Attention: Patch coverage is 67.18750% with 21 lines in your changes missing coverage. Please review.

Project coverage is 57.02%. Comparing base (f4b733d) to head (0721b87).
Report is 4 commits behind head on main.

Files with missing lines	Patch %	Lines
crates/torii/sqlite/src/executor/erc.rs	0.00%	15 Missing ⚠️
crates/torii/sqlite/src/utils.rs	87.75%	6 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2977      +/-   ##
==========================================
- Coverage   57.10%   57.02%   -0.09%     
==========================================
  Files         424      425       +1     
  Lines       56215    56352     +137     
==========================================
+ Hits        32104    32137      +33     
- Misses      24111    24215     +104

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

fix(torii): metadata data uri parsing for erc721

b2472e0

coderabbitai bot reviewed Jan 31, 2025

View reviewed changes

Larkooo added 2 commits January 31, 2025 18:25

escape unescaped quotes

42aa4c3

sanitize json string

8a52e55

coderabbitai bot reviewed Feb 3, 2025

View reviewed changes

remove print

d45f9fc

Larkooo changed the title ~~fix(torii): metadata data uri parsing for erc721~~ fix(torii-erc): metadata data uri parsing for erc721 Feb 3, 2025

Larkooo changed the title ~~fix(torii-erc): metadata data uri parsing for erc721~~ fix(torii-erc): lenient parsing for invalid json for erc721 Feb 3, 2025

fmt

0721b87

coderabbitai bot reviewed Feb 3, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(torii-erc): lenient parsing for invalid json for erc721 #2977

fix(torii-erc): lenient parsing for invalid json for erc721 #2977

Larkooo commented Jan 31, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Jan 31, 2025 •

edited

Loading

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

Documentation and Community

coderabbitai bot left a comment

coderabbitai bot Jan 31, 2025

coderabbitai bot left a comment

coderabbitai bot Feb 3, 2025

coderabbitai bot left a comment

codecov bot commented Feb 3, 2025

fix(torii-erc): lenient parsing for invalid json for erc721 #2977

Are you sure you want to change the base?

fix(torii-erc): lenient parsing for invalid json for erc721 #2977

Conversation

Larkooo commented Jan 31, 2025 • edited by coderabbitai bot Loading

Summary by CodeRabbit

coderabbitai bot commented Jan 31, 2025 • edited Loading

Walkthrough

Changes

Possibly related PRs

Suggested reviewers

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

Documentation and Community

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot Jan 31, 2025

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot Feb 3, 2025

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

codecov bot commented Feb 3, 2025

Codecov Report

Larkooo commented Jan 31, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Jan 31, 2025 •

edited

Loading