What's the difference from parallel_split_test to parallel_tests? #26

Closed
sobrinho opened this issue Jun 18, 2024 · 9 comments

Comments

@sobrinho

Hey there!

From the documentation, the difference isn't clear, and at first glance parallel_tests looks more robust.

Am I missing something here?

@grosser
Owner

grosser commented Jun 18, 2024

"Split a big test file into multiple chunks and run them in parallel"
parallel_tests can't do that.
This library is not as robust or well-documented; it's more of a POC that I never really ended up using in any big project.

@grosser closed this as completed Jun 18, 2024

@sobrinho
Author

So we could say that given two files:

file_a.spec:1 => 1sec
file_a.spec:2 => 5sec
file_b.spec:1 => 5sec

Assuming 2 CPUs, parallel_tests would run:

process1: file_a
process2: file_b

And parallel_split_test would run:

process1: file_a:1, file_b:1
process2: file_a:2

What I'm trying to figure out is which gem is the better fit.

We have hundreds of thousands of tests and we run 100 machines here, each with 4 CPUs.

However, we have a file here and there that takes 10 minutes or more to run on its own.
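
To make the comparison above concrete, here is a minimal sketch of greedy longest-first balancing, applied once per file and once per example. The helpers `balance`, `slowest`, and `per_file` are hypothetical and are not how either gem is implemented; the `skewed` data set is made up to show a file that dominates the run.

```ruby
# Hypothetical helper: greedy "longest first" assignment of timed items to N processes.
# Illustration only; not the algorithm used by parallel_tests or parallel_split_test.
def balance(timings, processes)
  buckets = Array.new(processes) { { items: [], total: 0.0 } }
  timings.sort_by { |name, seconds| [-seconds, name] }.each do |name, seconds|
    lightest = buckets.min_by { |bucket| bucket[:total] }
    lightest[:items] << name
    lightest[:total] += seconds
  end
  buckets
end

# Runtime of the slowest process, i.e. how long the whole run takes.
def slowest(timings, processes)
  balance(timings, processes).map { |bucket| bucket[:total] }.max
end

# Collapse "file:line" timings into per-file totals.
def per_file(examples)
  examples.each_with_object(Hash.new(0)) do |(id, seconds), sums|
    sums[id.split(":").first] += seconds
  end
end

# Timings from the example above, in seconds.
examples = { "file_a.spec:1" => 1, "file_a.spec:2" => 5, "file_b.spec:1" => 5 }
# Made-up timings where one file dominates.
skewed = { "big.spec:1" => 5, "big.spec:2" => 5, "small.spec:1" => 1 }

[examples, skewed].each do |set|
  puts "per file: #{slowest(per_file(set), 2)}s, per example: #{slowest(set, 2)}s"
end
```

With the three timings from the example, both strategies end up with a slowest process of 6 seconds. The per-example split only pays off when a single file dominates, as in the made-up `skewed` set (10 seconds per file versus 6 seconds per example).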

@sobrinho
Author

Also, we are noticing that parallel_split_test doesn't balance specs, so a machine ends up mostly idle while one process runs much longer than the others.

@grosser
Owner

grosser commented Jun 18, 2024

Use parallel_tests; it's much more advanced and should work much better, especially with lots of tests.

@sobrinho
Author

I will give it a try then, thanks!

@sobrinho
Author

@grosser this might be more of a support question than an issue, so let me know if I should open it somewhere else, but how would I get parallel_tests to use ParallelTests::RSpec::RuntimeLogger to balance across different machines, each running more than one process?

I have 50 machines running in parallel with 4 vCPUs each, so I would like to balance the time across machines AND processes.

I can use something like https://github.com/mtsmfm/split-test to split the time between machines, but then a single file that takes 22 min won't be balanced, and 3 CPUs on the designated machine will sit idle.

We might instead use 200 machines with 1 CPU each, but that scenario still won't be covered, since one of the machines will get that outlier file.
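
One way to picture balancing across machines AND processes is to treat every process on every machine as a slot and balance files over all slots at once. The sketch below assumes per-file runtimes are already known; the file names, runtimes, and the `assign_slots` helper are all made up for illustration and are not part of parallel_tests or split-test.

```ruby
# Hypothetical sketch: balance whole files across every process slot on every machine.
MACHINES = 2
PROCESSES_PER_MACHINE = 2

# Made-up runtimes in seconds, including one 22-minute outlier.
runtimes = {
  "spec/outlier_spec.rb" => 1320,
  "spec/a_spec.rb" => 300,
  "spec/b_spec.rb" => 280,
  "spec/c_spec.rb" => 240,
  "spec/d_spec.rb" => 200,
  "spec/e_spec.rb" => 180,
}

# Greedy assignment: the longest remaining file goes to the least loaded slot.
def assign_slots(runtimes, slot_count)
  slots = Array.new(slot_count) { { files: [], total: 0 } }
  runtimes.sort_by { |file, seconds| [-seconds, file] }.each do |file, seconds|
    slot = slots.min_by { |s| s[:total] }
    slot[:files] << file
    slot[:total] += seconds
  end
  slots
end

slots = assign_slots(runtimes, MACHINES * PROCESSES_PER_MACHINE)

# Consecutive slots belong to the same machine.
slots.each_slice(PROCESSES_PER_MACHINE).with_index(1) do |machine_slots, machine|
  machine_slots.each.with_index(1) do |slot, process|
    puts "machine #{machine} process #{process}: #{slot[:total]}s #{slot[:files].join(', ')}"
  end
end
```

In this made-up data the slot holding the outlier still runs for 1320 seconds while every other slot finishes in under 500, which is the imbalance described above: balancing whole files cannot fix a single file that dominates.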

@grosser
Owner

grosser commented Jun 19, 2024

@sobrinho
Author

@grosser we ended up optimizing that big file so its runtime is close to the other files, and we used split-test; things look better now.

We still see an outlier here and there, but not nearly as often.

Random thought: why not use the junit output formatter to spit out the file and line number of each example, so they can be balanced individually?

For a well-behaved file it wouldn't change much, but outlier files (we had one at 22 min and the next at around 10 min) would be split! :)
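
A rough sketch of that idea, assuming per-example timings with file and line number are available from some report: keep well-behaved files whole and expand only outlier files into `file:line` units, which RSpec accepts as locations on the command line. The `Record` struct, the `schedulable_units` helper, the threshold, and the timings are all hypothetical.

```ruby
# Hypothetical per-example records, as they might be extracted from a JUnit-style report.
Record = Struct.new(:file, :line, :seconds)

records = [
  Record.new("spec/slow_spec.rb", 10, 700),
  Record.new("spec/slow_spec.rb", 40, 620),
  Record.new("spec/fast_spec.rb", 5, 30),
  Record.new("spec/fast_spec.rb", 20, 45),
]

# Keep well-behaved files whole; expand files slower than the threshold
# into one "file:line" unit per example so they can be balanced separately.
def schedulable_units(records, threshold)
  records.group_by(&:file).flat_map do |file, examples|
    total = examples.sum(&:seconds)
    if total > threshold
      examples.map { |example| ["#{file}:#{example.line}", example.seconds] }
    else
      [[file, total]]
    end
  end.to_h
end

# Split any file that takes longer than 10 minutes in total.
units = schedulable_units(records, 600)
units.each { |unit, seconds| puts "#{unit} => #{seconds}s" }
```

The resulting units can then be fed to whatever balancing step is already in place. Examples that depend on shared setup within a file may behave differently when run in separate processes, so this is only a starting point.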

@grosser
Owner

grosser commented Jun 27, 2024 via email
