[Audit][BUG] Ensure GPU handles user specified repartition in Spark 3.4 when AQE is enabled #6678
Labels
audit_3.4.0
Audit related tasks for 3.4.0
bug
Something isn't working
P1
Nice to have for release
Spark 3.4+
Spark 3.4+ issues
Describe the bug
Spark updated AQE to handle a user-specified repartition (e.g. using `df.repartition(<num>)`) in the output when AQE is enabled. Previously Spark did not have to respect this when AQE was enabled, because AQE optimization by definition could never fully respect this partitioning after a shuffle. Spark now ensures that the number of output partitions matches what the user requested via `repartition` by adjusting the shuffle partitions afterwards.

Spark commit - apache/spark@801ca252f4
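A minimal sketch of the behavior under audit, assuming a SparkSession named `spark` with AQE turned on; the partition count (7) and dataset are arbitrary examples, not from the Spark commit itself:

```scala
import org.apache.spark.sql.SparkSession

// Build a session with adaptive query execution enabled.
val spark = SparkSession.builder()
  .appName("repartition-aqe-check")
  .master("local[*]")
  .config("spark.sql.adaptive.enabled", "true")
  .getOrCreate()

val df = spark.range(0L, 1000000L).toDF("id")

// The user explicitly asks for 7 partitions. On Spark 3.4, AQE is expected
// to adjust the final shuffle so this user-specified number is still
// respected in the output, rather than being coalesced away.
val repartitioned = df.repartition(7)

// Triggering execution and checking the resulting partition count.
assert(repartitioned.rdd.getNumPartitions == 7)
```

For the plugin, the check would be that the GPU shuffle path preserves this same guarantee when it replaces the CPU exchange.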