Commit
Fix streaming when the same file occurs with different DVs in the same batch

## Description

There was an edge case in streaming with deletion vectors (DVs) in the source: in `ignoreChanges` mode, if the same file occurred with different DVs in the same batch (or both with and without a DV), we would read the file with the wrong DV, because the DVs are broadcast to the scans by data file path.

This PR fixes the issue by reading files from different versions in separate scans and then taking the union of the results to build the final `DataFrame` for the batch.

Added new tests with two DML commands (DELETE->DELETE and DELETE->INSERT) in the same batch for all change modes.

## Does this PR introduce _any_ user-facing changes?

No.

Closes #1899

Signed-off-by: larsk-db <lars.kroll@databricks.com>
GitOrigin-RevId: 43a2d479832ba9b4be7b888c0a633e725729744a
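
The failure mode and the fix can be illustrated with a toy model in plain Python. This is only a sketch under stated assumptions, not the code changed by this commit: `FileAction`, `DATA`, `scan`, and `scan_per_version` are hypothetical names, a deletion vector is modelled as a set of deleted row indices, and a "scan" is a function that looks up DVs by data file path.

```python
from collections import defaultdict
from typing import NamedTuple, Optional


class FileAction(NamedTuple):
    """Hypothetical stand-in for a file entry in a streaming batch."""
    version: int                        # table version the entry came from
    path: str                           # data file path
    deleted_rows: Optional[frozenset]   # toy deletion vector; None means no DV


# Hypothetical file contents, keyed by data file path.
DATA = {"part-0.parquet": ["a", "b", "c", "d"]}


def scan(actions):
    """One scan: DVs are keyed by path, so each path can carry only one DV."""
    dv_by_path = {a.path: a.deleted_rows for a in actions}  # later entries overwrite
    rows = []
    for a in actions:
        dv = dv_by_path[a.path] or frozenset()
        rows += [(a.version, r) for i, r in enumerate(DATA[a.path]) if i not in dv]
    return rows


def scan_per_version(actions):
    """Sketch of the fix: one scan per version, then the union of the results."""
    by_version = defaultdict(list)
    for a in actions:
        by_version[a.version].append(a)
    result = []
    for version in sorted(by_version):
        result += scan(by_version[version])  # within a version, a path has one DV
    return result


# DELETE -> DELETE on the same file, both versions landing in one batch.
batch = [
    FileAction(1, "part-0.parquet", frozenset({0})),
    FileAction(2, "part-0.parquet", frozenset({0, 1})),
]

single_scan = scan(batch)            # version 1 is read with version 2's DV
per_version = scan_per_version(batch)
```

In this model the single scan drops row `"b"` from version 1, because the path-keyed lookup only retains the DV from version 2. Grouping by version first keeps each lookup unambiguous, at the cost of running one scan per distinct version in the batch.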