Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

[BUG] Poor handling of boolean expressions in WHERE clauses #3266

Open
Swiddis opened this issue Jan 27, 2025 · 3 comments
Open

[BUG] Poor handling of boolean expressions in WHERE clauses #3266

Swiddis opened this issue Jan 27, 2025 · 3 comments
Assignees
Labels
bug Something isn't working calcite calcite migration releated dynamic-test Issues found by or related to Dynamic Testing SQL

Comments

@Swiddis
Copy link
Collaborator

Swiddis commented Jan 27, 2025

What is the bug?

This is a bug for three related issues around booleans in WHERE clauses. I suspect without proof that they all have a similar root cause. All examples here use the ecommerce dataset as an example index, but should be reproducible with any index.

How can one reproduce the bug?

  1. WHERE FALSE matches all records.

Query:

POST _plugins/_sql/_explain
{
  "query": "SELECT COUNT(*) FROM opensearch_dashboards_sample_data_ecommerce WHERE FALSE"
}

Result:

{
  "schema": [
    {
      "name": "COUNT(*)",
      "type": "integer"
    }
  ],
  "datarows": [
    [
      4675 // should be 0
    ]
  ],
  "total": 1,
  "size": 1,
  "status": 200
}
  1. WHERE NOT(x) causes an error for any constant boolean.

Query:

POST _plugins/_sql
{
  "query": "SELECT * FROM opensearch_dashboards_sample_data_ecommerce WHERE NOT(FALSE)"
}

Result:

{
  "error": {
    "reason": "Invalid SQL query",
    "details": "inner bool query clause cannot be null",
    "type": "IllegalArgumentException"
  },
  "status": 400
}

The same happens with WHERE NOT(TRUE), WHERE NOT(NOT(FALSE)), WHERE NOT(FALSE OR TRUE), and similar.

  1. Internal druid exceptions raised when building expressions with NOT(NULL).

Query:

POST _plugins/_sql
{
  "query": "SELECT * FROM opensearch_dashboards_sample_data_ecommerce WHERE TRUE = NOT(NULL)"
}

Result:

{
  "error": {
    "reason": "Invalid SQL query",
    "details": "err find condition class com.alibaba.druid.sql.ast.expr.SQLBooleanExpr",
    "type": "SqlParseException"
  },
  "status": 400
}

NOT(NULL) is equivalent to NULL, so this whole expression should be equivalent to WHERE NULL and return no records1.

What is the expected behavior?
These expressions should be correctly evaluated and applied. They're somewhat weird examples for human-written queries, but automatic query builders may produce queries like this, especially for WHERE FALSE.

What is your host/environment?

Do you have any screenshots?
N/A

Do you have any additional context?
Found by the WIP distributed-testing suite. See: #3220

Footnotes

  1. It's worth noting that SQL really uses ternary logic: NULL lives among the typical boolean values and the 3 values generate their own truth tables. This is (for better or for worse) pretty fundamental to SQL's operation and is the principle that TLP is built on. As such, in boolean handling, we should really treat NULL as a bona fide boolean.

@Swiddis Swiddis added bug Something isn't working untriaged labels Jan 27, 2025
@Swiddis
Copy link
Collaborator Author

Swiddis commented Jan 27, 2025

Another parsing exception happens if you compare non-constants with NOT(_). Here's a query that uses a slightly-roundabout way of expressing "find all orders that are either bulk orders of cheap products, or single-orders of expensive products" (there probably exists another context where this type of construction is more natural):

POST _plugins/_sql
{
  "query": "SELECT * FROM opensearch_dashboards_sample_data_ecommerce WHERE (total_quantity > 1) = NOT(products.base_price > 10.0)"
}

Result:

{
  "error": {
    "reason": "Invalid SQL query",
    "details": "No enum constant org.opensearch.sql.legacy.domain.Where.CONN.=",
    "type": "IllegalArgumentException"
  },
  "status": 400
}

Swiddis added a commit to Swiddis/opensearch-sql-distributed-testing that referenced this issue Jan 27, 2025
It's annoying to have tests repeatedly output the same issues. An example bug that was found after adding the disableConfigExprs option: opensearch-project/sql#3266 (comment)

Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
@Swiddis
Copy link
Collaborator Author

Swiddis commented Jan 27, 2025

Ended up disabling NOT in distributed-testing since it seems like it's fundamentally broken if you breathe on it the wrong way.

@Swiddis
Copy link
Collaborator Author

Swiddis commented Jan 28, 2025

Another one: if you create an index with a boolean field, then WHERE bool_field = NULL also throws an exception (should evaluate to NULL for all values of bool_field). So in disttest we also need to disable generating NULL literals.

@Swiddis Swiddis added the dynamic-test Issues found by or related to Dynamic Testing label Jan 28, 2025
@penghuo penghuo added the calcite calcite migration releated label Jan 28, 2025
@Swiddis Swiddis added the SQL label Jan 29, 2025
@Swiddis Swiddis changed the title [BUG] Poor handling of constant booleans in WHERE clauses [BUG] Poor handling of boolean expressions in WHERE clauses Jan 29, 2025
@Swiddis Swiddis self-assigned this Jan 29, 2025
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
bug Something isn't working calcite calcite migration releated dynamic-test Issues found by or related to Dynamic Testing SQL
Projects
None yet
Development

No branches or pull requests

3 participants