What's the difference from parallel_split_test to parallel_tests? #26

sobrinho · 2024-06-18T16:23:57Z

Hey there!

From the documentation it's not clear, and at first glance it looks like parallel_tests is more robust.

Am I missing something here?

grosser · 2024-06-18T16:38:55Z

"Split a big test file into multiple chunks and run them in parallel"
parallel_tests can't do that
this library is not as robust or well-documented, more of a POC that I never really ended up using in any big project

sobrinho · 2024-06-18T16:50:59Z

So we could say that given two files:

file_a.spec:1 => 1sec
file_a.spec:2 => 5sec
file_b.spec:1 => 5sec

Assuming 2 CPU, parallel_tests would run:

process1: file_a
process2: file_b

And parallel_split_test would run:

process1: file_a:1, file_b:1
process2: file_a:2

What I'm trying to get is which gem is the best.

We have hundreds of thousands of tests and we run 100 machines here, each machine with 4 CPU.

Although we have a file here and there that alone takes 10 minutes or even more to run.
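
To make the two-file example above concrete, here is a minimal Ruby sketch that balances the same timings once per file and once per example. The `balance` helper and its greedy "longest first" strategy are assumptions made for illustration; this is not the algorithm that either gem actually implements.

```ruby
# Greedy "longest first" balancing: sort items by runtime (descending) and
# always give the next item to the process with the smallest total so far.
# Illustration only; not taken from parallel_tests or parallel_split_test.
def balance(timings, process_count)
  groups = Array.new(process_count) { { items: [], total: 0.0 } }
  timings.sort_by { |name, seconds| [-seconds, name] }.each do |name, seconds|
    target = groups.min_by { |group| group[:total] }
    target[:items] << name
    target[:total] += seconds
  end
  groups
end

# Per-example timings from the comment above (seconds).
EXAMPLE_TIMINGS = {
  "file_a.spec:1" => 1.0,
  "file_a.spec:2" => 5.0,
  "file_b.spec:1" => 5.0
}.freeze

# Whole-file timings: sum the examples that share a file name.
FILE_TIMINGS = EXAMPLE_TIMINGS
  .group_by { |name, _| name.split(":").first }
  .transform_values { |pairs| pairs.sum { |_, seconds| seconds } }

by_file = balance(FILE_TIMINGS, 2)
by_example = balance(EXAMPLE_TIMINGS, 2)

by_file.each_with_index do |group, i|
  puts "by file    process#{i + 1}: #{group[:items].join(', ')} (#{group[:total]}s)"
end
by_example.each_with_index do |group, i|
  puts "by example process#{i + 1}: #{group[:items].join(', ')} (#{group[:total]}s)"
end
```

With only these three examples the slowest process takes 6 seconds either way, so the granularity makes no difference here. It starts to matter when a single file dominates the run, as with the 10-minute files mentioned above.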

sobrinho · 2024-06-18T16:53:33Z

Also, we are noticing that parallel_split_test doesn't balance specs so the machine gets bored by running one process for too long compared to the others.

grosser · 2024-06-18T16:59:32Z

use parallel_tests, it's much more advanced and especially with lots of tests should be much better

sobrinho · 2024-06-18T17:34:54Z

I will give it a try then, thanks!

sobrinho · 2024-06-18T20:49:49Z

@grosser this might be more of a support question than an issue, so let me know if I should open it somewhere else, but how would I get parallel_tests to use ParallelTests::RSpec::RuntimeLogger to balance between different machines with more than 1 process each?

I have 50 machines running in parallel with 4 vCPU each so I would like to balance the time between machines AND processes.

I can use something like https://github.com/mtsmfm/split-test to split the time between machines, but then a scenario where one file takes 22min won't be balanced and I will have 3 CPUs on the designated machine doing nothing.

We might take the path where we use 200 machines with 1 CPU instead but again that scenario won't be covered since one of the machines will have that outlier file.

grosser · 2024-06-19T00:42:28Z

the runtime logger output needs to be collected for all tests, so either run locally and store it or combine in CI see https://github.com/grosser/parallel_tests?tab=readme-ov-file#even-test-group-runtimes especially the step 3 example
for tests that are larger than anything else you'd have to split them (something dumb like foo_a_spec.rb/foo_b_spec.rb, possibly automated, but ideally on a per-feature level, whatever works ...)
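
As a rough illustration of the "combine in CI" option, the Ruby sketch below merges runtime logs collected from several machines into one. It assumes each log line has the form `path:seconds`; the `merge_runtime_logs` helper, the sample paths, and the sample timings are all hypothetical, and the step 3 example in the linked README remains the authoritative reference.

```ruby
# Merge several runtime logs into one, keeping the slowest time seen per file.
# Assumption: each line looks like "spec/foo_spec.rb:12.34" (path, colon, seconds).
def merge_runtime_logs(log_contents)
  merged = Hash.new(0.0)
  log_contents.each do |content|
    content.each_line do |line|
      path, _, raw_seconds = line.strip.rpartition(":")
      seconds = Float(raw_seconds, exception: false)
      next if path.empty? || seconds.nil? # skip blank or malformed lines

      merged[path] = [merged[path], seconds].max
    end
  end
  merged.sort.map { |path, seconds| "#{path}:#{seconds}" }.join("\n")
end

# Hypothetical logs, one per CI machine. In CI these would be read from the
# downloaded artifacts, for example with File.read.
machine_1 = "spec/models/user_spec.rb:3.2\nspec/features/checkout_spec.rb:1320.0\n"
machine_2 = "spec/models/order_spec.rb:7.5\n\nnot a timing line\n"

combined = merge_runtime_logs([machine_1, machine_2])
puts combined
```

The combined output would then be stored wherever the next run expects to find its runtime log.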

sobrinho · 2024-06-27T19:07:16Z

@grosser we ended up optimizing that big file so its runtime is close to the other files, and we used split-test; things look better now.

We still see an outlier here and there, but not as many anymore.

Random thought: why not use the junit output formatter to spit out the file and line number to be able to balance them?

Assuming a well-behaved file, it wouldn't change much, but for outlier files (we had one at 22 min and the next was around 10 min) it would be split! :)

grosser · 2024-06-27T22:39:08Z

the artifact upload/download looks nice! the runtime formatter gives a nice easy to split output, so that should be usable too but if junit is builtin to all test runners and has that info that might be simpler


grosser closed this as completed Jun 18, 2024

stevenharman mentioned this issue Dec 12, 2024

Parallel scenarios? grosser/parallel_tests#747

Closed
