Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

[prism] Preprocess failure - Expected Runner Flatten Node - but wasn't #31992

Closed
Tracked by #29650
lostluck opened this issue Jul 25, 2024 · 0 comments · Fixed by #32042
Closed
Tracked by #29650

[prism] Preprocess failure - Expected Runner Flatten Node - but wasn't #31992

lostluck opened this issue Jul 25, 2024 · 0 comments · Fixed by #32042
Assignees

Comments

@lostluck
Copy link
Contributor

lostluck commented Jul 25, 2024

Six tests are failing the Java ValidatesRunner SplittableDoFnTest suite, with

java.lang.RuntimeException: The Runner experienced the following error during execution:
jobFailed job-188[splittabledofntest0testwindowedsideinputwithcheckpointsbounded-damondouglas-0709161343-776ac8a0]: preprocess validation failure of stage 8: expected runner flatten node, but wasn't: [eParDo-SDFWithMultipleOutputsPerBlockAndSideInputBounded--ParMultiDo-SDFWithMultipleOutputsPerBlockAn_processandsplit] -- map[nParDo-SDFWithMultipleOutputsPerBlockAndSideInputBounded--ParMultiDo-SDFWithMultipleOutputsPerBlockAn_splitnsized:nParDo-SDFWithMultipleOutputsPerBlockAndSideInputBounded--ParMultiDo-SDFWithMultipleOutputsPerBlockAn_splitnsized singleton/Combine.GloballyAsSingletonView/CombineValues/Values/Values/Map/ParMultiDo(Anonymous).output:singleton/Combine.GloballyAsSingletonView/CombineValues/Values/Values/Map/ParMultiDo(Anonymous).output]

expected runner flatten node, but wasn't indicates the stage got fused with multiple parallel inputs somehow, which isn't permitted.

Error is from here:
https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/runners/prism/internal/preprocess.go#L484

An initial change should clarify what the problem actually is "stage requires multiple parallel inputs". The only stage permitted to have multiple parallel inputs is a runner side Flatten. The error should further clarify what's being printed out, along with explicit counts from each, to validate reading the error (eg. "stage has %v main inputs and isn't a runner side flatten. transforms %d %v, inputs %v" or similar)

Iterate against a locally running prism instance:

TEST=org.apache.beam.sdk.transforms.SplittableDoFnTest 
./gradlew :runners:portability:java:ulrLoopbackValidatesRunnerTests -PjobEndpoint=localhost:8073 --tests="$TEST"

There are six failing tests.


Offhand this appears as though after expansion to a splittableDoFn, there's a single transform in the stage, but somehow fusion has lead to multiple inputs for the stage.

All of the tests also use side inputs. So the problem is likely in the fusion code, when handling side inputs in synthetic SDF components. https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/runners/prism/internal/preprocess.go#L525

@lostluck lostluck self-assigned this Jul 31, 2024
@lostluck lostluck added this to the 2.59.0 Release milestone Jul 31, 2024
lostluck added a commit to lostluck/beam that referenced this issue Jul 31, 2024
lostluck added a commit that referenced this issue Jul 31, 2024
* [#31992] Send side inputs for all SDF phases.

* delint.

---------

Co-authored-by: lostluck <13907733+lostluck@users.noreply.github.com>
reeba212 pushed a commit to reeba212/beam that referenced this issue Dec 4, 2024
)

* [apache#31992] Send side inputs for all SDF phases.

* delint.

---------

Co-authored-by: lostluck <13907733+lostluck@users.noreply.github.com>
# for free to join this conversation on GitHub. Already have an account? # to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant