Skip to content

fix: add stacker and maybe_grow on recursion guard #1468

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Closed
wants to merge 1 commit into from

Conversation

Eason0729
Copy link
Contributor

Changed

under #[cfg(feature="std")], stacker::maybe_grow was called in RecursionCounter::try_decrease to prevent stack overflow

I also tested on Windows(stack size 1MB), it seems fine.

reproducible code from original issue

use datafusion::prelude::{ParquetReadOptions, SessionContext};

#[tokio::main(flavor = "current_thread")]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let ctx = SessionContext::new();
    ctx.register_parquet(
        "parquet_table",
        "batches.parquet",
        ParquetReadOptions::default(),
    )
    .await?;

    let sql_query = "
    SELECT seq
    FROM (
      SELECT seq,
        LEAD(seq) OVER (ORDER BY seq) AS next_seq
        FROM parquet_table
    ) AS subquery
    WHERE next_seq - seq > 1";

    let df = ctx.sql(sql_query).await?;

    df.show().await?;
    Ok(())
}

Missing

  • Return error when growing stack fail:
    growing stack on unsupported target seems to be no-op on stacker side, so I am unable to detect it.
  • Test: which part should be tested?
  • Observe performance impact

@Eason0729
Copy link
Contributor Author

I evaluated stacker as part of sqlparser and I thought it was doing some too crazy stuff that made it hard to use in embedded / wasm environments. Maybe that is better now

Originally posted by @alamb in apache/datafusion#9375 (comment)

I think the use of stacker here is fine under #[cfg(feature="std")], and I totally forget to add stacker as optional dependency 😄 .

@Eason0729 Eason0729 marked this pull request as ready for review October 13, 2024 06:41
@Eason0729
Copy link
Contributor Author

I have add stacker as optional dependency, so it won't be used in no std environment.

*It will continue to stack overflow in this case.

What's your opinion on the use of stacker? @alamb

@alamb
Copy link
Contributor

alamb commented Oct 15, 2024

I have add stacker as optional dependency, so it won't be used in no std environment.

*It will continue to stack overflow in this case.

What's your opinion on the use of stacker? @alamb

I think stacker is a clever hack but we should fix the code for real to avoid deeply nested callstacks rather than hoping that a third-party library will bail us out.

I don't think it is a good idea to use stacker as I don't want to be responsible for trying to debug any issues caused by its use. That may sound selfish, but I am already at my cognitive complexity load for sqlparser (and various other crates)

@Eason0729
Copy link
Contributor Author

That's okay. (I think this PR as a draft to demonstrate what I would like to change

Then I would try to reduce stack usage with another PR.

@Eason0729
Copy link
Contributor Author

Can I mark this PR as draft again or just create a new one?

@iffyio
Copy link
Contributor

iffyio commented Oct 18, 2024

@Eason0729 thanks for looking to improve this behavior! I think should be good to create a new PR (we can link also this PR in the description since they're related)

@iffyio
Copy link
Contributor

iffyio commented Nov 6, 2024

Marking this as draft in the meantime

@iffyio iffyio mentioned this pull request Nov 16, 2024
1 task
@Eason0729 Eason0729 closed this Dec 17, 2024
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants