I would like the CPU to handle a Delta table's metadata-related queries entirely.
The reason is that some Delta table metadata queries, such as the one that reads the _delta_log (JSON files), contain CPU fallbacks.
If the _delta_log is huge (say, millions of rows), the performance penalty of those CPU fallbacks is not trivial.
If the CPU simply handled those metadata queries, their performance should at least be similar to a CPU-only run.
@andygrove I checked the query plan and elapsed time before and after setting spark.rapids.sql.optimizer.enabled=true, and the results are similar: about 30s in both cases.
The plans are different, though.