
[Re] Speedup Graph Processing by Graph Ordering #52

Closed
lecfab opened this issue Mar 26, 2021 · 36 comments

Comments

@lecfab

lecfab commented Mar 26, 2021

Original article: Speedup Graph Processing by Graph Ordering by Hao Wei, Jeffrey Xu Yu, Can Lu, and Xuemin Lin, in Proceedings of SIGMOD 2016

PDF URL: repository/replication.pdf
Metadata URL: repository/metadata.tex
Code URL: github repository

Scientific domain: Algorithmics
Programming language: C++, Python 3, Bash
Suggested editor: Nicolas P. Rougier

@lecfab
Author

lecfab commented Mar 26, 2021

Dear all,
This is a replication written with Lionel Tabourier and Maximilien Danisch. It includes the implementation and testing of various graph algorithms. The code is mainly in C++, and we suggest Nicolas P. Rougier as the editor.

Best regards,
Fabrice Lécuyer

@rougier
Member

rougier commented Mar 29, 2021

@lecfab Thank you for this impressive submission. I'll edit it even though I'm not too familiar with the domain. Could you suggest some reviewer names from https://rescience.github.io/board/ or external reviewers?

@rougier rougier self-assigned this Mar 29, 2021
@lecfab
Author

lecfab commented Apr 7, 2021

Hello @rougier, thanks for your reply.
After looking into ReScience reviewers, we suggest @EHadoux and @ozancaglayan.
For people outside of Rescience whose research interests are closer to our replication, we suggest Antoine Amarilli (https://a3nm.net, a3nm@a3nm.net) or Stefano Leucci (https://www.stefanoleucci.com, stefano.leucci@univaq.it).

@ozancaglayan

Thanks! I would be happy to review it.

@rougier
Member

rougier commented Apr 9, 2021

@ozancaglayan Thanks. You can start your review while I look for a second reviewer.
@EHadoux Could you review this submission?

@EHadoux

EHadoux commented Apr 9, 2021

Hey, let me have a look and I'll revert during the day!

@EHadoux

EHadoux commented Apr 11, 2021

Hey, sorry for the delay, I'd be happy to review it although I'm not familiar with the application domain. I'll refresh my memory on the procedure for reviewing.
EDIT: Do you have a link to the original paper by any chance? The scholar one doesn't work http://www1.se.cuhk.edu.hk/~lucan/paper/sigmod16-speedup.pdf

@lecfab
Author

lecfab commented Apr 11, 2021

Hello, thank you for reviewing. Here is a valid link (I updated my first message above): https://raw.githubusercontent.com/datourat/Gorder/master/paper.pdf

@ozancaglayan

ozancaglayan commented Apr 12, 2021

A preliminary review of the paper only; comments on the code and the reproducibility of the experimental pipeline will follow:

This paper replicates the results of the paper "Speedup Graph Processing by Graph Ordering" by Hao Wei,
Jeffrey Xu Yu, Can Lu, and Xuemin Lin, published in Proceedings of SIGMOD 2016. The paper is very well written and is a pleasure to read, offering thorough analyses across the different ordering methods explored in the original paper. The authors also
incorporate a 9th "Random" ordering as the least favorable, i.e. an approximately worst-case, ordering. I particularly liked the way
results are compared in Figure 6. I have very minor questions and suggestions in what follows; none of them is critical:

Questions:

  • Table 3: I have a hard time deciphering the columns here. What are -r and -mr in columns 2, 4, and 5?
    What does it mean that the "percentage columns respectively show the proportion of data that had to be retrieved beyond
    L1, L2 and L3"? Although the text clarifies these, I think it would be good to rework the table caption a bit.

  • I was curious how Fig. 5 would look if the x-axis grouped the orderings and the bars reflected the datasets. Maybe the
    plot would be easier to read, with less oscillation of the bars? Just thinking out loud; never mind if this doesn't make sense.

Other remarks:

Page 3. shortest paths

  • What does m denote here? It would be good to have it defined at the end of 2.1's beginning paragraph.
  • near the end: "this algorithms" -> "this algorithm"

Page 5. Energy formulations

  • I understand that \pi refers to a particular permutation of nodes, but what do \pi_u and \pi_v denote here with
    respect to nodes u and v? On page 6 (RCM), it is said that "\pi_u is the index of node u". Could you clarify these in the text?

  • How does the given energy function kick in for the first equations on page 5? Should that read
    e(abs(\pi_u - \pi_v)), with e() denoting the energy function, for example?

Page 10 - line 3.

  • "In a fourth of the experiments" -> In one fourth? or fourth set of?

@ozancaglayan

ozancaglayan commented Apr 12, 2021

Running the code

Steps followed

The compilation of the code is straightforward and pretty quick:

$ git clone --recurse-submodules https://github.com/lecfab/rescience-gorder.git
$ cd rescience-gorder/src
$ make

I tried to run the benchmark script on an Ubuntu 20.04.2 laptop booted with Linux kernel version 5.8.0.

$ ./run-benchmark.sh
Cache sizes for each processor of this machine:
  Processor 0:	L0-Data: 32K 	L1-Instruction: 32K 	L2-Unified: 256K 	L3-Unified: 6144K 	
  Processor 1:	L0-Data: 32K 	L1-Instruction: 32K 	L2-Unified: 256K 	L3-Unified: 6144K 	
  Processor 2:	L0-Data: 32K 	L1-Instruction: 32K 	L2-Unified: 256K 	L3-Unified: 6144K 	
  Processor 3:	L0-Data: 32K 	L1-Instruction: 32K 	L2-Unified: 256K 	L3-Unified: 6144K 	
  Processor 4:	L0-Data: 32K 	L1-Instruction: 32K 	L2-Unified: 256K 	L3-Unified: 6144K 	
  Processor 5:	L0-Data: 32K 	L1-Instruction: 32K 	L2-Unified: 256K 	L3-Unified: 6144K 	
  Processor 6:	L0-Data: 32K 	L1-Instruction: 32K 	L2-Unified: 256K 	L3-Unified: 6144K 	
  Processor 7:	L0-Data: 32K 	L1-Instruction: 32K 	L2-Unified: 256K 	L3-Unified: 6144K 	
Warning: impossible to measure cache in unknown machines; please install ocperf (see README) and edit this script to get cache measurements.
Experiments will soon start... (runtime will be measured but not cache-miss)
Results will be found in ../results/r4949 
<output stripped>

Remarks

  • The script successfully produces the runtime-measurement files and plots.
  • However, the script warns that ocperf should be installed and points us to the README file, which contains no instructions for installing ocperf. In fact, the ocperf script comes from the pmu-tools submodule, which
    is already checked out. I think the warning is misleading, because the actual problem is that no command-line
    argument identifying the machine configuration was passed to the script.

The only configuration defined is mesu, which apparently is a cluster configuration that the authors used. I gave that a try as well:

$ ./run-benchmark.sh mesu
...
../pmu-tools/ocperf stat -e task-clock,cpu-cycles,instructions,L1-dcache-loads,L1-dcache-load-misses,LLC-loads,LLC-load-misses,cycle_activity.cycles_l1d_pending,cycle_activity.cycles_l2_pending -o ../results/r2118/perf-epinion-original-nq.txt ./benchmark ../datasets/edgelist-epinion-75k-508k.txt -a nq -o ../results/r2118/time-epinion-original-nq.txt -l 10
Cannot run perf

Remarks

  • I now realize that the perf utility must be installed on the system. Could you add a simple check in the script
    to detect whether perf is installed and tell the user to install it using the distribution's package manager (for instance, something like the sketch below)? For my kernel version, the following command did the trick: apt install linux-tools-5.8.0-48-generic linux-cloud-tools-5.8.0-48-generic
  • It may actually be much clearer if you split runtime measurements and cache measurements into two different scripts. What do you think about this?
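
A minimal sketch of such a guard, assuming a Debian/Ubuntu-style system (this is not your actual script, just an illustration):

# Sketch: abort early with an installation hint if perf is missing.
if ! command -v perf >/dev/null 2>&1; then
    echo "Error: perf not found; try e.g. 'sudo apt install linux-tools-$(uname -r)'" >&2
    exit 1
fi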

Let's give the script another try after installing perf:

$ ./run-benchmark.sh mesu
../pmu-tools/ocperf stat -e task-clock,cpu-cycles,instructions,L1-dcache-loads,L1-dcache-load-misses,LLC-loads,LLC-load-misses,cycle_activity.cycles_l1d_pending,cycle_activity.cycles_l2_pending -o ../results/r2959/perf-epinion-original-nq.txt ./benchmark ../datasets/edgelist-epinion-75k-508k.txt -a nq -o ../results/r2959/time-epinion-original-nq.txt -l 10
event syntax error: '..oad-misses,cycle_activity.cycles_l1d_pending,cycle_act..'
                                  \___ parser error
Run 'perf list' for a list of valid events

 Usage: perf stat [<options>] [<command>]
...
...
Warning: dataset pokec does not exist.
Warning: dataset flickr does not exist.
Warning: dataset livejournal does not exist.
Warning: dataset wiki does not exist.
Warning: dataset gplus does not exist.
Warning: dataset pldarc does not exist.
Warning: dataset twitter does not exist.
Warning: dataset sdarc does not exist.

Creating figures for orders comparison
Relatve runtime and ranking frequency
Warning: 9 datasets miss Gorder on NQ
Warning: 9 datasets miss Gorder on BFS
Warning: 9 datasets miss Gorder on DFS
Warning: 9 datasets miss Gorder on SCC
Warning: 9 datasets miss Gorder on SP
Warning: 9 datasets miss Gorder on PR
Warning: 9 datasets miss Gorder on DS
Warning: 9 datasets miss Gorder on Kcore
Warning: 9 datasets miss Gorder on Diam
Worse score  1
Image has been saved in /home/ozan/git/rescience-gorder/results/r2959/img-speedup.pdf
Image has been saved in /home/ozan/git/rescience-gorder/results/r2959/img-ranking.pdf
CPU execution and cache stall of Gorder and Original on sdarc
Error with /home/ozan/git/rescience-gorder/results/r2959/perf-sdarc-gorder-nq.txt
Error with /home/ozan/git/rescience-gorder/results/r2959/perf-sdarc-gorder-nq.txt
Error with /home/ozan/git/rescience-gorder/results/r2959/perf-sdarc-original-nq.txt
Error with /home/ozan/git/rescience-gorder/results/r2959/perf-sdarc-original-nq.txt
nq [] []
Traceback (most recent call last):
  File "../results/gorder-cache-bars.py", line 31, in <module>
    gdata_stall.append(gperf[1] / operf[0])
IndexError: list index out of range
Cache-miss rates for pr on sdarc
Error with /home/ozan/git/rescience-gorder/results/r2959/perf-sdarc-original-pr.txt
Error with /home/ozan/git/rescience-gorder/results/r2959/perf-sdarc-original-pr.txt
Error with /home/ozan/git/rescience-gorder/results/r2959/perf-sdarc-original-pr.txt
Error with /home/ozan/git/rescience-gorder/results/r2959/perf-sdarc-original-pr.txt
Traceback (most recent call last):
  File "../results/gorder-cache-table.py", line 47, in <module>
    billions(specs[0]),         # number of L1 references
IndexError: list index out of range

Remarks

Now apparently all the perf calls are failing because the mesu config is not appropriate for the CPUs that I have.
They fail on both an Intel Xeon Silver 4116 and an Intel Skylake i5-8250 CPU. This further leads to errors in the Python scripts,
probably because no results are produced for the cache statistics.

  • Would it be difficult to provide a sane set of defaults for perf that will work on most modern machines out there and
    produce at least a subset of the results?
  • After brute-force trial and error, I removed the unsupported cycle_activity.* events from the perf calls, and perf
    began to work, but only after setting a procfs flag to -1 to allow regular users to profile, or running the script as root (see the sketch after this list). Could you add this to the README as well? Although this produced some cache-related stats, the Python scripts still failed, probably because of the missing perf events.
  • Even if the problematic events were supported on my machine and everything worked flawlessly, I see that gorder-cache-bars.py and gorder-cache-table.py actually compute the plots on the sdarc dataset. However, we are only given the epinion dataset in this repository. I thought that we would at least be able to produce all sets of plots/graphics from the paper, if only on the epinion dataset. Could you clarify this? If epinion doesn't make sense for some experiments simply because it is quite a small/simple dataset, could you document where and how to get the other datasets in the README.md file?
  • The README file is self-contained, as it explains all the code/scripts provided. However, I think the authors should add a section at the top of the README explaining the exact steps required to reproduce the results and the plots in their paper. Right now, this information is scattered across different sections of the README.md file.
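
For reference, the procfs flag in question is presumably the kernel's perf_event_paranoid setting; a sketch of what I mean (standard Linux paths, adjust as needed):

# Allow unprivileged users to use all perf events (not persistent across reboots).
sudo sh -c 'echo -1 > /proc/sys/kernel/perf_event_paranoid'
# or equivalently:
sudo sysctl -w kernel.perf_event_paranoid=-1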

NOTE: After searching on Google, I understand that the cycle_activity.* events are not supported on newer Intel CPUs. The list of events for Skylake is here. In some other posts, I see that these events may not be reliable, i.e. 1, 2. So I wonder whether there are some other events that are more robust and supported on newer CPUs. I should note that I am by no means an expert in this profiling stuff.

Comparing the results

I don't know how much information this gives, but I ran the benchmark twice: (1) on my laptop and (2) on a server. In both cases, the system load was stable and low; no desktop environment or other processes were running. I created a plot that shows all 4 runs (2x laptop, 2x server) side by side, compared to the plot in the paper, on the epinion dataset:

[figure: gorder — the 4 runs compared to the paper's results on epinion]

To me, the results look almost compatible across the board. Here are some questions:

  • Do you plot the mean or the median of the 10 repeated runtimes? Is it possible that some of the runs are outliers because of the dataset loading into RAM?
  • It's curious that for NQ, the server with more CPUs slowed down three of the algorithms. Are the implementations thread/process aware? For the paper, those algorithms apparently perform as well as Gorder, since we don't have bars in the plot, right?
  • Overall, ChDFS, InDegSort and RCM seem to be the ones whose conclusions are the least stable compared to the paper.

(I know this is just on epinion, but it can still give us some interesting insights.)

@EHadoux

EHadoux commented Apr 12, 2021

I'll get started on it this evening; I'll purposely avoid looking at @ozancaglayan's review and any answers to it.

@EHadoux

EHadoux commented Apr 13, 2021

Here is a review of the paper; the review of running the code will come shortly.


This paper is a reproduction of Speedup Graph Processing by Graph Ordering by Wei et al. In the original paper, the authors proposed an algorithm- and graph-agnostic node-ordering procedure that reduces CPU cache misses. The reproduction paper reproduces the experimental section on the same 9 algorithms and 8 datasets. It also contains an additional, smaller dataset.

Review

The paper reads nicely, addressing each point in turn in a non-confusing way, sometimes even adding more information than the original paper contains. I appreciate that the authors have made their best effort to reproduce the original implementation and, when not enough information was given, provided a close alternative instead of dropping it altogether.

Comparing all the results, the authors draw the same conclusions as in the original paper, albeit to a milder extent concerning the performance of the Gorder method. It would have been interesting to discuss the potential reasons for this difference a bit more (if possible).

Remarks

  • Each chart in Figure 5 could benefit from having vertical lines or different backgrounds for each graph; it's sometimes a bit hard to see at a glance where one graph (i.e. a group of bars) ends and another one starts.
  • Comparing Table 2 in the reproduction and Table 9 in the original paper, most lines make sense given the explanations (e.g. SlashBurn), although I am surprised that most computation times are significantly faster in the reproduction despite less performant hardware. I don't think it's a problem in itself, though, given the objective of the paper.

Questions

  • Keeping in mind what is said about the hardware (i.e. more performant in the original paper), does it mean that there is a speed/memory ratio below which this optimisation makes the improvement marginal?
  • Comparing Figure 1 in both papers, as well as both Tables 3, it looks like the reproduction implementation has fewer cache misses overall, even before reordering. Would it mean that a poorer initial implementation would benefit more from this optimisation than an already efficient one?
  • In the « limits of the visualisation », the paper mentions that the original paper « hides precise information when a runtime exceeds 1.5 times the runtime of Gorder ». Is there, for each of them, a permutation of rankings that would make the overall ranking exactly the same between the two papers? I'm not sure it is interesting though, as this only concerns the slowest methods, which we don't want to use anyway.

@rougier
Member

rougier commented Apr 16, 2021

@ozancaglayan @EHadoux Wow, many thanks for these very clean, fast and detailed reviews!!! We might break a record in time from submission to publication. @lecfab Could you address the comments and questions?

@EHadoux

EHadoux commented Apr 16, 2021

I'll add my review of the code most probably this evening.

@lecfab
Author

lecfab commented Apr 16, 2021

Hello all, thanks for the in-depth comments, I will get back to you when I update everything!

@EHadoux

EHadoux commented Apr 17, 2021

Review of the code and the instructions

General review

Past the very small number of hurdles (see below), the whole process is very smooth. The automatic generation of the charts is greatly appreciated.

I am confident that the results provided in the paper are a true depiction of the outputs of the code provided with it.

Specific remark

Download datasets in datasets/dl/ and use datasets/normalise.py to match the edgelist format. Note that epinion is readily available for quick testing.

Is the epinion file that's already there the output of normalise.py for this dataset? I assume yes, because I can't use it as an input. Also, giving the direct links to the datasets would help. I found a link to a Flickr dataset, http://konect.cc/networks/flickr-growth/, but I am not 100% sure it is the right one.

  • make doesn't work on Mac with Clang because it seems that concepts aren't available. Installing gcc through brew install gcc and modifying the Makefile solved the problem. Maybe add these instructions, or remove the concepts from the code if they are not mandatory.

  • The scripts require some Python packages (e.g. Matplotlib); providing a requirements.txt that we can just pip install would be good.

$ ./run-window.sh [REPEAT] tests different window sizes for Gorder on Flickr dataset
The script actually runs on epinion; it needs to be slightly modified to run on Flickr. However, the modification to make is already in the script, commented out. It is extremely obvious how to change it, but one needs to open the source code.

Using the provided scripts only

run-window

I used 1 as the parameter, unlike the 100 the authors reported, due to the time it takes to run, and stopped at a window size of 16384 (2^14, versus 2^20 in the paper). You can see below the output generated. We can see the downward trend; combined with the chart on epinion (downwards then upwards), this gives me confidence that the chart on Flickr would have looked like the one reported in the paper.

[figures: tune-w-flickr-pr and tune-w-epinion-pr]

run-annealing

It seems that the paper does not state which dataset has been used to generate Figure 3. I assumed it was epinion as it is the default setting in run-annealing.sh.

Although the units on the axes are different, the shapes of both graphs are very similar to the reported ones.

[screenshot of the resulting annealing plots]

run-benchmark

I cannot run it on Mac, as the script uses /proc/cpuinfo, which does not exist on Mac. I'll run it if the authors provide an alternative script that caters for Mac.

I have yet to run ./benchmark, but glancing at the code, I am confident that what has been committed was generated by the C++ code, provided the same Python/Bash scripts were used to collate the results.

@lecfab
Author

lecfab commented Apr 19, 2021

Hello!
Here are the first answers to your interesting remarks, and the corrections I made to the paper. The new version is not yet uploaded to GitHub. Feel free to react to any of my comments below!

I will work on the code and your replications as soon as possible.

First updates after review

Remarks

Ozan Caglayan's comments

  1. Could bars in figure 5 be grouped by ordering instead of by dataset?

    Nice suggestion: I tried it, and it turns out to be helpful for comparing the efficiency of the orderings. It may be misleading though, because the time scale is not the same for different datasets. I will add this figure to the supplementary material: img-speedup-grouped.pdf

  2. In simulated annealing, how is the variation of energy e related to the total energy E?

    If indices of u and v are swapped, we could write (for MinLA):

    $$e=\sum_{w\in N_u} (|\pi_v-\pi_w| - |\pi_u-\pi_w|)  +  \sum_{w\in N_v} (|\pi_u-\pi_w| - |\pi_v-\pi_w|)$$

    The first sum is the energy variation for u, and the second is for v. In each sum, the first term adds the energy due to the new index and subtracts the energy due to the old index. So e represents the change in the contribution of u and v to E. Is this any clearer? (The sketch after this list illustrates the relation on a toy graph.)

  3. Runtime and cache measurements could be separated

    Indeed, it would be clearer with separate scripts. However, the full experiment takes about a week, so we thought it was better to run it only once and measure both time and cache-miss rates.

  4. Do you plot the mean or the median of 10 repetition runtimes?

    We plot the sum of 10 repetitions (which corresponds to averaging).

  5. Can some of the 10 repetitions be outliers because they include dataset loading?

    The algorithms are repeated 10 times after loading the graph into RAM, so this should not be the issue. Yet, we had many issues with fluctuations: on our machines, repeating the same procedure could show up to 20% variability in runtime (due to process competition, cache or RAM sharing, multithreading...). As we are not specialists in these issues, we ended up running everything on an isolated node (called mesu indeed!).

    Note that epinion is even more subject to fluctuations because it is small and the algorithms only take a fraction of a second.

  6. It's curious that for NQ, the server with more CPUs slowed down the runtime of three algos. Are the implementations thread/process aware? For the paper, those algos apparently perform as good as Gorder since we don't have bars in the plot, right?

    It means that the runtimes are very similar (under 10% of variation). Note that epinion is probably too small to show the effect of a wise ordering. On my machine, NQ is also the algorithm with the most computation time (figure 1) and the least cache stall, so the fluctuations on this small dataset may be more significant than the speedup due to ordering.

    Our code uses only one thread but is not protected from other threads; that is maybe a reason why we had to use an isolated cluster to avoid heavy fluctuations. I believe the 6 bars above 2.2 are a sign of such unpredictable fluctuations.

  7. Overall, ChDFS, InDegSort and RCM seem to be the ones whose conclusions are the least stable compared to the paper.

    I'm not sure: in your results, we see that these 3 orderings have small or invisible bars in either direction. It means that they are quite close to the performance of Gorder and sometimes better. This roughly matches my results, though I think we have reached the limit of what can be analysed on epinion!
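
Regarding point 2, here is a minimal illustrative sketch (not the benchmark code) checking on a toy graph that the incremental e above equals the change of the total MinLA energy E; it assumes u and v are not adjacent (otherwise the shared (u,v) term cancels and must be excluded from both sums):

def total_energy(pi, edges):
    # E(pi) = sum over edges of |pi_u - pi_v| (MinLA)
    return sum(abs(pi[a] - pi[b]) for a, b in edges)

def swap_delta(pi, adj, u, v):
    # e = change of E when the indices of u and v are swapped
    d = sum(abs(pi[v] - pi[w]) - abs(pi[u] - pi[w]) for w in adj[u])
    d += sum(abs(pi[u] - pi[w]) - abs(pi[v] - pi[w]) for w in adj[v])
    return d

# Toy 4-cycle 0-1-2-3-0; swap the indices of the non-adjacent nodes 0 and 2.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
pi = {0: 0, 1: 1, 2: 2, 3: 3}
e = swap_delta(pi, adj, 0, 2)
new_pi = dict(pi); new_pi[0], new_pi[2] = pi[2], pi[0]
assert total_energy(new_pi, edges) - total_energy(pi, edges) == e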

Emmanuel Hadoux's comments

  1. Why are ordering times so different between the original work and the replication? (table 2)

    While coding MinLA, MinLoga, Slashburn and LDG, we had to make a lot of choices and assumptions because the original paper did not explain much. (For instance, our Slashburn is a very simplified version, quite similar to Kcore.) For this reason, the ordering times cannot be compared across the two papers. The remaining orderings (RCM, Deg, DFS) are quite standard, so comparing them is reasonable. Our Gorder was heavily based on the code provided by the original authors, so the comparison stands as well. With this restriction, our hardware is 2 to 5 times slower, which we think makes sense.

  2. Is there a speed/memory ratio below which this optimisation makes the improvement marginal?

    On most architectures, part of the data is physically closer to the processor and thus faster to retrieve. I'm really not an expert in hardware, but I would think that if processor speed is increased, most of the time will be spent on data retrieval (cache stall), which in figure 1 would make the grey bars tiny. In that case, cache optimisation is even more important. On the contrary, if the CPU is slow, big amounts of distant data can be fetched while the CPU performs an operation, so cache optimisation is not meaningful.

  3. Comparing Figure 1 in both papers, as well as both Tables 3, it looks like the reproduction implementation has fewer cache misses overall

    I think the rates across machines cannot be compared without knowing the details of hardware architecture.

  4. In the « limits of the visualisation », is there a permutation of rankings beyond x1.5 that would make the overall ranking exactly the same between the two papers?

    No: indeed, the approximation of cutting everything above x1.5 only affects the poor performers, and we see that the good performers do not rank equally in both papers. To try to be fair, we gave the same rank to all the out-of-limits orderings (if there are 10 orderings and 6 of them are beyond x1.5, we rank all 6 of them fifth) so that Fig.6 is deterministic; a small sketch of this tie-ranking rule follows below.
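
A minimal sketch of that tie-ranking rule, with made-up relative runtimes (runtime divided by the Gorder runtime) purely for illustration:

# Hypothetical relative runtimes for 10 orderings on one dataset/algorithm.
rel = {"Gorder": 1.0, "RCM": 1.1, "Deg": 1.3, "DFS": 1.4,
       "MinLA": 1.6, "MinLoga": 1.7, "LDG": 1.8, "Slashburn": 2.0,
       "Random": 2.5, "Original": 3.0}
LIMIT = 1.5

# Orderings within the limit are ranked by runtime (1, 2, 3, ...).
within = sorted((t, name) for name, t in rel.items() if t <= LIMIT)
ranks = {name: i + 1 for i, (t, name) in enumerate(within)}
# All out-of-limits orderings share the next rank (here 4 are within, so the other 6 all rank 5th).
for name, t in rel.items():
    if t > LIMIT:
        ranks[name] = len(within) + 1
print(ranks)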

Paper

Major modifications are shown in brown in the updated paper in the repository.

  • First page: add abstract and reference to replicated paper
  • Method: define $m$: number of edges of a graph
  • Method: define $\pi_u$: index of node u in arrangement pi
  • RCM: explain $\pi_u$ in the definition of the ordering
  • indicate which dataset is used for simulated annealing: epinion
  • Gorder: re-explain window tuning
  • figure 5: add to supplementary material a version grouped by ordering
  • figure 5: add absolute time information; update caption
  • figure 5: separate blocks of bars with gray highlighting
  • table 2: update caption to explain time differences
  • table 3: the column names have been copied from the original paper to facilitate comparison; update the caption to make them clearer
  • section 3.4: more discussion about the difference of performance between original and replication
  • section 3.5: proper definition of cache-miss
  • typos

Code

  • provide dataset links in readme > datasets > click on a NAME

Ozan Caglayan's comments

  • indicate how to install perf in readme > tools and versions
  • clean warning messages when several datasets are missing
  • check if perf is installed and propose install command otherwise
  • change benchmark.sh warning message
  • change figure 1 and table 3 generator when datasets are missing
  • add the procfs -1 setting into readme > tools and versions
  • benchmark: replace MACHINE with basic and advanced modes to activate different metrics. Basic is enough for Table 3 while Figure 1 requires advanced
  1. Add a section at the top of the readme with the exact steps required to reproduce the paper

    I intended the Quick guide section to achieve this; could you please suggest what to add to complete it?

Emmanuel Hadoux's comments

  • remove -fconcepts in makefile (seems harmless)
  • add pip3 install matplotlib numpy to readme
  1. I used 1 as a parameter unlike 100 as the authors reported due to the time it takes to run it, and stopped at a window size of 16384 (2^14 < 2^20 in the paper). You can see below the output generated. We can see the downward trend, combined with the chart on epinion (downwards then upwards), I have confidence the chart on Flickr would have looked like the one reported in the paper.

    If possible time-wise, it would be interesting to test with at least 3 repetitions (on flickr), to temper sudden fluctuations. I think they are responsible for the big peaks in the figure.

  2. I cannot run it on Mac as the script uses /proc/cpuinfo that does not exist on Mac.

    Could you please provide the error message that this generates? A missing processor info should not stop the script, but I'm not familiar with Mac (maybe some shell functions are different).

@lecfab
Author

lecfab commented Apr 21, 2021

@EHadoux: sorry for the Mac-related problems! For run-benchmark.sh, I think it works if you remove the whole display cache size information section (lines 4 to 21).
If you want to use other datasets, their links are given in the replication paper (in table 1). I will add them to the readme though.
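
A possible alternative to deleting those lines (just a sketch, not the actual script): guard the section so it is skipped automatically on systems without /proc/cpuinfo, such as macOS:

# Sketch: only report cache sizes when /proc/cpuinfo is available.
if [ -r /proc/cpuinfo ]; then
    grep "cache size" /proc/cpuinfo | sort | uniq -c
else
    echo "Note: /proc/cpuinfo not available; skipping the cache-size report."
fi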

@EHadoux

EHadoux commented Apr 21, 2021

@EHadoux: sorry for the Mac-related problems! For run-benchmark.sh, I think it works if you remove the whole display cache size information section (lines 4 to 21).

If you want to use other datasets, their links are given in the replication paper (in table 1). I will add them to the readme though.

Cheers for that, I'll check it soon.
For the concepts bit (I've seen you've put a ?), just in case it wasn't clear: you compile with the -fconcepts flag, which is not available on Clang. If you don't use the notion of concepts in your code (I haven't checked), you can probably remove the flag altogether and not have to bother with instructions for Mac users.

@lecfab
Author

lecfab commented Apr 25, 2021

Thanks again for all your comments, @ozancaglayan @EHadoux. We think all remarks and questions have been taken into account in this new version. Please let us know if further improvements are possible.

@rougier
Member

rougier commented Apr 26, 2021

@lecfab Thanks. @ozancaglayan @EHadoux If you're satisfied with the corrections and answers, I think we can accept the paper. I'm waiting for your go-ahead, and let me thank you again for your really nice and detailed reviews.

@ozancaglayan

Hello,

Yes, I am satisfied with the answers, thanks! Figure 5 now looks much better, and the version in the supplement is a nice addition as well. Thank you for clarifying the installation stuff in the README files and the scripts as well!

@EHadoux

EHadoux commented Apr 30, 2021

Hey, sorry for the delay, it's good to go on my side. Nice work everyone!

@rougier
Member

rougier commented May 3, 2021

@lecfab Congratulations, your paper has been accepted! I'll edit it soon and come back to you.

@rougier
Member

rougier commented May 3, 2021

I would need a link to your article sources.

@lecfab
Author

lecfab commented May 3, 2021

Thank you all for this efficient process. Here is the link to the article sources in overleaf.

@rougier
Member

rougier commented May 17, 2021

Sorry for the delay. I edited your post to remove the link to Overleaf because it is an editable link and anybody could modify your manuscript. Would it be possible for you to create a GitHub repository with the source of the article, using the latest template at https://github.com/ReScience/template? It would make my life easier for publication.

@lecfab
Author

lecfab commented May 18, 2021

https://github.com/lecfab/rescience-gorder/tree/main/paper

I filled in the files, but could not compile them into a PDF because of the following errors. I presume it will work on your side after completing the last pieces of information in metadata.yaml.

$ make
latexmk -pdf -pdflatex="xelatex -interaction=nonstopmode" -use-make article.tex
Latexmk: This is Latexmk, John Collins, 26 Dec. 2019, version: 4.67.
Latexmk: applying rule 'pdflatex'...
Rule 'pdflatex': The following rules & subrules became out-of-date: 'pdflatex'
Run number 1 of rule 'pdflatex'
Running 'xelatex -interaction=nonstopmode -recorder "article.tex"'
sh: 1: xelatex: not found
Latexmk: fls file doesn't appear to have been made.
Latexmk: Errors, so I did not complete making targets
Collected error summary (may duplicate other messages):
pdflatex: Command for 'pdflatex' gave return code 127
  Refer to 'article.log' for details
This message may duplicate earlier message.
Latexmk: Failure in processing file 'article.tex':
(Pdf)LaTeX didn't generate the expected log file 'article.log'
Latexmk: Use the -f option to force complete processing,
unless error was exceeding maximum runs, or warnings treated as errors.
make: *** [Makefile:30 : article.pdf] Erreur 12
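
The immediate cause is presumably that xelatex is not installed on my machine; if needed, on Debian/Ubuntu something like the following should provide it (untested here):

$ sudo apt install texlive-xetex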

@rougier
Member

rougier commented May 19, 2021

It's working on my side, so it is fine. Also, I forgot: can you register your code at https://www.softwareheritage.org/save-and-reference-research-software/ and post the SWHID here?

@lecfab
Author

lecfab commented May 20, 2021

swh:1:dir:e318a0ad72f81e2cb2af1ca614d1c171dd3f0909
https://archive.softwareheritage.org/swh:1:dir:e318a0ad72f81e2cb2af1ca614d1c171dd3f0909

@rougier
Member

rougier commented May 20, 2021

Perfect, thanks. Can you check the article at https://sandbox.zenodo.org/record/830143 (this is a sandboxed version not yet published)?

@lecfab
Author

lecfab commented May 23, 2021

@rougier This looks good to me thanks!

@rougier
Member

rougier commented May 28, 2021

Cool! It is now online at https://zenodo.org/record/4836230/ and will soon appear on the ReScience C website. Congratulations!

@rougier rougier closed this as completed May 28, 2021
@lecfab
Author

lecfab commented May 31, 2021

That was a very smooth process, and thank you for the in-depth reviews!
Cheers everyone,
Fabrice

@rougier
Member

rougier commented May 31, 2021

I forgot to make a PR to your repo with the publication changes. Will do in a few minutes.

@lecfab
Author

lecfab commented May 31, 2021

I validated it and updated the replication.pdf!
