-
Notifications
You must be signed in to change notification settings - Fork 18
docs(ASOF JOIN): Clarify timestamp control and precedence #197
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Open
jerrinot
wants to merge
3
commits into
main
Choose a base branch
from
jh_asof_designated
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from all commits
Commits
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -327,31 +327,91 @@ Result: | |
|
||
</div> | ||
|
||
### Timestamp considerations | ||
### How ASOF JOIN uses timestamps | ||
|
||
`ASOF` join can be performed only on tables or result sets that are ordered by | ||
time. When a table is created with a | ||
[designated timestamp](/docs/concept/designated-timestamp/) the order of records | ||
is enforced and the timestamp column name is in the table metadata. `ASOF` join | ||
uses this timestamp column from metadata. | ||
`ASOF JOIN` requires tables or subqueries to be ordered by time. The best way to meet this requirement is to use a | ||
[designated timestamp](/docs/concept/designated-timestamp/), which is set when you create a table. | ||
This not only enforces the chronological order of your data but also tells QuestDB which column to use for time-series | ||
operations automatically. | ||
|
||
:::caution | ||
#### Default behavior | ||
|
||
`ASOF` join requires that the tables or subqueries have designated timestamps. This means | ||
they have an ascending order timestamp column, which may need to be specified with `timestamp(ts)`. See below! | ||
By default, an `ASOF JOIN` will always use the designated timestamp of the tables involved. | ||
|
||
::: | ||
This behavior is so fundamental that it extends to subqueries in a unique way: even if you do not explicitly SELECT the | ||
designated timestamp column in a subquery, QuestDB implicitly propagates it. The join is performed correctly under the | ||
hood using this hidden timestamp, which is then omitted from the final result set. | ||
|
||
In case tables do not have a designated timestamp column, but data is in | ||
chronological order, timestamp columns can be specified at runtime: | ||
This makes most `ASOF JOIN` queries simple and intuitive. | ||
|
||
```questdb-sql | ||
```questdb-sql title="ASOF JOIN with designated timestamp" | ||
-- The 'trades' table has 'trade_ts' as its designated timestamp. | ||
-- Even though 'trade_ts' is not selected in the subquery, | ||
-- it is used implicitly for the ASOF JOIN. | ||
WITH trades_subset AS ( | ||
SELECT symbol, price, amount FROM trades | ||
) | ||
SELECT * | ||
FROM (a timestamp(ts)) | ||
ASOF JOIN (b timestamp (ts)); | ||
FROM trades_subset ASOF JOIN quotes ON (symbol); | ||
``` | ||
|
||
### SQL Performance Hints for ASOF JOIN | ||
In more complicated subqueries, the implicit propagation of the designated timestamp may not work QuestDB responses with an error | ||
`left side of time series join has no timestamp`. In such cases, your subquery should explicitly include the designated | ||
timestamp column in the `SELECT` clause to ensure it is used for the join. | ||
|
||
#### The standard override method: Using ORDER BY | ||
|
||
The easiest and safest way to join on a different timestamp column is to use an `ORDER BY ... ASC` clause in your subquery. | ||
|
||
When you sort a subquery by a `TIMESTAMP` column, QuestDB makes that column the new designated timestamp for the subquery's results. The subsequent `ASOF JOIN` will automatically detect and use this new timestamp. | ||
|
||
Example: Joining on `ingestion_time` instead of the default `trade_ts` | ||
|
||
```questdb-sql title="ASOF JOIN with custom timestamp" | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Also demo-able? |
||
WITH trades_ordered_by_ingestion AS ( | ||
SELECT symbol, price, ingestion_time | ||
FROM trades | ||
WHERE symbol = 'QDB' | ||
-- This ORDER BY clause tells QuestDB to use 'ingestion_time' | ||
-- as the new designated timestamp for this subquery. | ||
ORDER BY ingestion_time ASC | ||
) | ||
-- No extra syntax is needed here. The ASOF JOIN automatically uses | ||
-- the new designated timestamp from the subquery. | ||
SELECT * | ||
FROM trades_ordered_by_ingestion | ||
ASOF JOIN quotes ON (symbol); | ||
``` | ||
|
||
#### Using the timestamp() syntax | ||
|
||
The `timestamp()` syntax is an expert-level hint for the query engine. It should only be used to manually assign a | ||
timestamp to a dataset that does not have one, without forcing a sort. | ||
|
||
You should only use this when you can guarantee that your data is already sorted by that timestamp column. Using | ||
`timestamp()` incorrectly on unsorted data will lead to incorrect join results. | ||
|
||
The primary use case is performance optimization on a table that has no designated timestamp in its schema, but where | ||
you know the data is physically stored in chronological order. Using the `timestamp()` hint avoids a costly ORDER BY | ||
operation. | ||
|
||
```questdb-sql title="ASOF JOIN with timestamp()" | ||
-- Use this ONLY IF 'my_unsorted_table' has NO designated timestamp, | ||
-- but you can guarantee its data is already physically ordered by 'event_time'. | ||
|
||
SELECT * | ||
FROM (my_unsorted_table timestamp(event_time)) | ||
ASOF JOIN another_table ON (symbol); | ||
``` | ||
|
||
To summarize: | ||
|
||
1. By default, the table's designated timestamp is used. | ||
2. To join on a different column, the standard method is to `ORDER BY` that column in a subquery. | ||
3. Use the `timestamp()` syntax as an expert-level hint to avoid a sort on a table with no designated timestamp, if and | ||
only if you are certain the data is already sorted. | ||
|
||
### SQL performance hints for ASOF JOIN | ||
|
||
QuestDB supports SQL hints that can optimize non-keyed ASOF join performance when filters are applied to the joined table: | ||
|
||
|
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Demo-able query?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unfortunately no:(
in fact, I found a bug while trying to find an equivalent demo-able query 😬