Optimization: Enable SIP based on cardinality information without a filter #4700

Open
ray6080 opened this issue Jan 13, 2025 · 0 comments
ray6080 commented Jan 13, 2025

Description

When applying SIP (sideways information passing) from the build side to the probe side, or from the probe side to the build side, we currently do so only when there is a filter on the build or probe side. This avoids the worst cases, where cardinality estimation can be very wrong. However, it also rules out SIP in scenarios where the number of distinct output values is small and itself serves as a "filter":

  1. The number of distinct dst nodes scanned from the rel table can be small. E.g., MATCH (a)-[e]->(b) RETURN b.id. For the plan JOIN(SCAN(b))(Scan(a)Extend(e)), Scan(a)Extend(e) can be selective if e contains only a few distinct b nodes as dst.
  2. A table function can output few distinct nodes. Vector index search is an example: it usually outputs only the top k tuples, which is usually far fewer than the rows in the base node table. Note that SIPDirection::FORCE_BUILD_TO_PROBE is currently used in vector index search to force SIP; it should be refactored away as part of addressing this issue.
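
To make the proposal concrete, here is a rough sketch of what a cardinality-aware decision could look like. It is an illustration only, not the planner's actual code: `SideStats`, `SipChoice`, `choose_sip`, and the `max_ratio` threshold are all hypothetical names and values, and the statistics are assumed to be estimates the planner already has. A side may pass information if it has a filter (the current rule) or if its estimated number of distinct join keys is small relative to the rows on the other side (the proposed rule). Keeping the threshold conservative is one possible guard against bad estimates.

```python
from dataclasses import dataclass
from enum import Enum


class SipChoice(Enum):
    """Hypothetical outcome of the decision; not the real SIPDirection enum."""
    NONE = "none"
    BUILD_TO_PROBE = "build_to_probe"
    PROBE_TO_BUILD = "probe_to_build"


@dataclass
class SideStats:
    """Hypothetical planner estimates for one side of a hash join."""
    has_filter: bool
    est_distinct_keys: float  # estimated distinct join-key values this side outputs
    est_rows: float           # estimated rows this side would scan without SIP


def can_pass(source: SideStats, target: SideStats, max_ratio: float) -> bool:
    """Return True if `source` looks selective enough to mask `target`."""
    if source.has_filter:
        return True  # existing rule: a filter is enough
    # Proposed rule: few distinct keys relative to the other side act as a filter.
    return source.est_distinct_keys <= max_ratio * target.est_rows


def choose_sip(build: SideStats, probe: SideStats, max_ratio: float = 0.01) -> SipChoice:
    build_ok = can_pass(build, probe, max_ratio)
    probe_ok = can_pass(probe, build, max_ratio)
    if build_ok and probe_ok:
        # Both qualify: pass from the side with fewer distinct keys.
        if build.est_distinct_keys <= probe.est_distinct_keys:
            return SipChoice.BUILD_TO_PROBE
        return SipChoice.PROBE_TO_BUILD
    if build_ok:
        return SipChoice.BUILD_TO_PROBE
    if probe_ok:
        return SipChoice.PROBE_TO_BUILD
    return SipChoice.NONE


# Top-k style case: 100 distinct nodes on the build side, 1M rows on the probe side.
top_k = SideStats(has_filter=False, est_distinct_keys=100, est_rows=100)
base_table = SideStats(has_filter=False, est_distinct_keys=1_000_000, est_rows=1_000_000)
print(choose_sip(top_k, base_table))
```

With a rule along these lines, the vector index search case would pick build-to-probe from the estimates alone, without needing a forced direction. The right threshold, and whether distinct-value estimates are reliable enough for it, are open questions.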