Cannot use connector with master=local[N] #20

Open
dolfinus opened this issue Jan 10, 2025 · 0 comments
I'm trying to use this connector with a Spark session created with .master("local[2]"), so that data is written to Greenplum over 2 parallel connections instead of just one. Unfortunately, I get errors like:

insert ... from ext2563ed72a064489dab799d61e85532a3_0' failed, ERROR: invalid input syntax for type date: "2024-"  (seg40 slice1 my-seg-01:10000 pid=57946)
  Where: External table ext2563ed72a064489dab799d61e85532a3_0, line 9831 of gpfdist://myhost:49152/output.pipe, column file_date
ERROR [gpfdist-write2563ed72-a064-489d-ab79-9d61e85532a3]: ERROR: current transaction is aborted, commands ignored until end of transaction block
Exception in thread "gpfdist-write2563ed72-a064-489d-ab79-9d61e85532a3" java.lang.Exception: writeUUID=2563ed72-a064-489d-ab79-9d61e85532a3 timeout on waiting job completion
        at com.itsumma.gpconnector.writer.GreenplumBatchWrite$SqlThread.run(GreenplumBatchWrite.scala:280)
        at java.base/java.lang.Thread.run(Thread.java:1623)

After that, the write just hangs.

This is what the Executors page of the Spark UI shows:

[image]

Executors are stuck in this lock:

Both are waiting to write their current status to the RMI master, but neither of them can do so. If I start the Spark session with just one executor, or use df.coalesce(1), the error does not reproduce.
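
For anyone who wants to reproduce this, here is a minimal PySpark sketch of the failing setup and the coalesce(1) workaround described above. The data source name "its-greenplum", the app name, and the option keys in the usage note are assumptions and should be checked against the connector README. The helper names are hypothetical.

```python
from typing import Any, Dict

# Assumption: the data source name registered by the connector.
GREENPLUM_FORMAT = "its-greenplum"


def build_local_session(threads: int = 2):
    """Create a local Spark session with the given number of worker threads."""
    # Imported lazily so the module can be loaded without pyspark installed.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .master(f"local[{threads}]")  # local[2] is the setup that hangs
        .appName("gp-local-write-repro")  # hypothetical app name
        .getOrCreate()
    )


def write_single_connection(df: Any, options: Dict[str, str]) -> None:
    """Workaround from this report: collapse to one partition before writing."""
    (
        df.coalesce(1)  # a single partition, so a single write task
        .write.format(GREENPLUM_FORMAT)
        .options(**options)
        .mode("append")
        .save()
    )
```

Usage would look roughly like write_single_connection(df, {"url": "...", "user": "...", "password": "...", "dbtable": "..."}), with all four option keys being assumptions. This avoids the hang only by giving up the parallel write, so it is a workaround and not a fix.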
