Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

derive_named_parameters() method that works with latest SQLite #2354

Closed
simonw opened this issue Jun 12, 2024 · 5 comments
Closed

derive_named_parameters() method that works with latest SQLite #2354

simonw opened this issue Jun 12, 2024 · 5 comments

Comments

@simonw
Copy link
Owner

simonw commented Jun 12, 2024

Related:

@simonw
Copy link
Owner Author

simonw commented Jun 12, 2024

Code to fix, because the opcode trick no longer works:

@documented
async def derive_named_parameters(db: "Database", sql: str) -> List[str]:
"""
Given a SQL statement, return a list of named parameters that are used in the statement
e.g. for ``select * from foo where id=:id`` this would return ``["id"]``
"""
explain = "explain {}".format(sql.strip().rstrip(";"))
possible_params = _re_named_parameter.findall(sql)
try:
results = await db.execute(explain, {p: None for p in possible_params})
return [row["p4"].lstrip(":") for row in results if row["opcode"] == "Variable"]
except sqlite3.DatabaseError:
return possible_params

@simonw
Copy link
Owner Author

simonw commented Jun 12, 2024

Got ChatGPT Code Interpreter to have a go at this for me: https://chatgpt.com/share/f2ce4904-184c-4825-847d-30467c3a8236 - it came up with a pattern that first strips all comments, single-quoted and double-quoted strings and then extracts parameters from what's left.

@simonw
Copy link
Owner Author

simonw commented Jun 12, 2024

This new version of the function passes all of the existing tests:

@documented
async def derive_named_parameters(db: "Database", sql: str) -> List[str]:
    """
    Given a SQL statement, return a list of named parameters that are used in the statement

    e.g. for ``select * from foo where id=:id`` this would return ``["id"]``
    """
    # Remove single-line comments
    sql = re.sub(r"--.*", "", sql)
    # Remove multi-line comments
    sql = re.sub(r"/\*.*?\*/", "", sql, flags=re.DOTALL)
    # Remove single-quoted strings
    sql = re.sub(r"'(?:''|[^'])*'", "", sql)
    # Remove double-quoted strings
    sql = re.sub(r'"(?:\"\"|[^"])*"', "", sql)
    # Extract parameters from what is left
    return re.findall(r":(\w+)", sql)

But... it doesn't need to take the db argument any more and it doesn't need to be async - but it is a documented function (glad we are not at the 1.0 final release).

I'm going to add a new, non-async function and switch to that, but I'll leave an async undocumented version in there so plugins that use it don't break.

simonw added a commit that referenced this issue Jun 12, 2024
simonw added a commit to asg017/datasette that referenced this issue Jun 12, 2024
@simonw simonw closed this as completed Jun 12, 2024
@simonw
Copy link
Owner Author

simonw commented Jun 12, 2024

New implementation:

_single_line_comment_re = re.compile(r"--.*")
_multi_line_comment_re = re.compile(r"/\*.*?\*/", re.DOTALL)
_single_quote_re = re.compile(r"'(?:''|[^'])*'")
_double_quote_re = re.compile(r'"(?:\"\"|[^"])*"')
_named_param_re = re.compile(r":(\w+)")
@documented
def named_parameters(sql: str) -> List[str]:
"""
Given a SQL statement, return a list of named parameters that are used in the statement
e.g. for ``select * from foo where id=:id`` this would return ``["id"]``
"""
# Remove single-line comments
sql = _single_line_comment_re.sub("", sql)
# Remove multi-line comments
sql = _multi_line_comment_re.sub("", sql)
# Remove single-quoted strings
sql = _single_quote_re.sub("", sql)
# Remove double-quoted strings
sql = _double_quote_re.sub("", sql)
# Extract parameters from what is left
return _named_param_re.findall(sql)
async def derive_named_parameters(db: "Database", sql: str) -> List[str]:
"""
This undocumented but stable method exists for backwards compatibility
with plugins that were using it before it switched to named_parameters()
"""
return named_parameters(sql)

@simonw
Copy link
Owner Author

simonw commented Jun 12, 2024

I don't think those comments are needed with the clear names for the _re compiled regexes.

simonw added a commit that referenced this issue Jun 12, 2024
simonw added a commit that referenced this issue Jun 21, 2024
@simonw simonw mentioned this issue Aug 5, 2024
# for free to join this conversation on GitHub. Already have an account? # to comment
Projects
None yet
Development

No branches or pull requests

1 participant