README.md
+20-2
@@ -57,11 +57,29 @@ results got down a lot (so either I did some mistake in the previous version or
57
57
- with 8 workers: 3993
58
58
- with 15 workers: 3056
59
59
- with 25 workers: 3484
60
+
61
+
The fourth (and last) version works like this:
60
62
- spawn a worker that is responsible for iterating through the file system. For each file that it finds, increment the
61
63
number of workers to wait for and send the file handle to a new/free worker.
62
-
When all the worker ends their job send data to the main thread. To check when the job is done he just increase a counter
64
+
When a worker finishes its job it sends the data to the first worker. To check when the job is done, the first worker just increases a counter
63
65
every time it finds a new file to count, and increases another counter when a worker returns its results. When both are equal
64
-
we're done.
66
+
we're done (the first count is slightly delayed, because otherwise workers respond so fast that the program finishes right after the start: we increment the files-to-count counter to one and send the data to the worker, and it responds before the first thread sets the counter
67
+
to 2, so the program thinks it's done).
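
The premature-finish problem described above can be reproduced without any real workers by replaying the order of events. The following is only a minimal sketch in Python, not the project's code: `Tracker`, `replay`, and the event names are hypothetical, and `safe_done` shows one possible alternative to the delay (an extra "walk finished" flag), which is an assumption and not what this README implements.

```python
from dataclasses import dataclass


@dataclass
class Tracker:
    """Hypothetical completion tracker (illustration only)."""
    dispatched: int = 0          # files handed to workers so far
    completed: int = 0           # results received back from workers
    walk_finished: bool = False  # set once the file system walk is over

    def naive_done(self) -> bool:
        # The check described in the text: done when both counters match.
        return self.dispatched > 0 and self.dispatched == self.completed

    def safe_done(self) -> bool:
        # Assumed variant: also require that no more files can be dispatched.
        return self.walk_finished and self.dispatched == self.completed


def replay(events):
    """Apply events in order and record both checks after each one."""
    tracker = Tracker()
    history = []
    for event in events:
        if event == "dispatch":
            tracker.dispatched += 1
        elif event == "result":
            tracker.completed += 1
        elif event == "walk_end":
            tracker.walk_finished = True
        history.append((event, tracker.naive_done(), tracker.safe_done()))
    return history


# A fast worker answers before the second file has been dispatched.
events = ["dispatch", "result", "dispatch", "result", "walk_end"]
history = replay(events)
for row in history:
    print(row)
```

In this replay the naive check already reports completion after the first `result` event, while one file is still undiscovered. The flag-based check only reports completion once the walk has ended and every dispatched file has come back.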
68
+
TODO:
69
+
- with 8 workers: 4505
70
+
- with 15 workers: 4187
71
+
- with 25 workers: 4163
72
+
73
+
I was expecting this solution to be faster than the poller one, but maybe there is an explanation. In this fourth solution we are
74
+
sending the first file to the first worker, the second to the second, and (if we have 8 workers) the ninth to the first worker again
75
+
(we are using a round-robin-like scheme), so there is a possibility that one worker gets all the heaviest files of the project and it may
76
+
be a bottleneck.
77
+
I'm not convinced though.
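
The round-robin concern above can be illustrated with a small simulation. This is a hedged sketch with made-up file costs, not a measurement of this project. It assumes the polling version effectively hands each file to whichever worker becomes free first, and it ignores messaging overhead. The function names are hypothetical.

```python
import heapq


def round_robin_makespan(costs, n_workers):
    """Total time when file i always goes to worker i % n_workers."""
    loads = [0] * n_workers
    for i, cost in enumerate(costs):
        loads[i % n_workers] += cost
    return max(loads)


def pull_based_makespan(costs, n_workers):
    """Total time when each file goes to the worker that is free earliest."""
    free_at = [0] * n_workers  # min-heap of times at which workers become free
    heapq.heapify(free_at)
    for cost in costs:
        start = heapq.heappop(free_at)
        heapq.heappush(free_at, start + cost)
    return max(free_at)


# Made-up costs: heavy and light files alternate, so with 2 workers
# round robin sends every heavy file to the same worker.
costs = [10, 1, 10, 1, 10, 1]
rr = round_robin_makespan(costs, 2)
pull = pull_based_makespan(costs, 2)
print(rr, pull)  # 30 21
```

With this deliberately unfavourable ordering, round robin leaves one worker with all the heavy files. Real file sizes are unlikely to line up this neatly, so this shows only that the effect is possible, not that it explains the numbers above.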
78
+
79
+
It looks like the third one, which uses a polling algorithm, is the fastest, so I'll be using that in the application.
80
+
Anyway, even though these numbers were taken on my machine, I'm very happy to see that the solutions that use multiple workers
81
+
are a lot faster than the ones that use only one worker or, even worse, the one that runs in the main thread.