[Re] Speedup Graph Processing by Graph Ordering #52
Dear all, Best regards, |
@lecfab Thank you for this impressive submission. I'll edit it even though I'm not too familiar with the domain. Could you suggest some reviewer names from https://rescience.github.io/board/ or external reviewers? |
Hello @rougier, thanks for your reply. |
Thanks! I would be happy to review it. |
@ozancaglayan Thanks. You can start your review while I look for a second reviewer. |
Hey, let me have a look and I'll get back to you during the day! |
Hey, sorry for the delay, I'd be happy to review it although I'm not familiar with the application domain. I'll refresh my memory on the procedure for reviewing. |
Hello, thank you for reviewing, here is a valid link (I updated my first message above): https://raw.githubusercontent.com/datourat/Gorder/master/paper.pdf |
A preliminary review for the paper only, comments on the code and the reproducibility of the experimental pipeline will follow: This paper replicates the results provided by the paper "Speedup Graph Processing by Graph Ordering by Hao Wei, Questions:
Other remarks:
|
Running the code
Steps followed
The compilation of the code is straightforward and pretty quick:
$ git clone --recurse-submodules https://github.com/lecfab/rescience-gorder.git
$ cd rescience-gorder/src
$ make
I tried to run the benchmark script on an Ubuntu 20.04.2 laptop booted with Linux kernel version 5.8.0.
$ ./run-benchmark.sh
Cache sizes for each processor of this machine:
Processor 0: L0-Data: 32K L1-Instruction: 32K L2-Unified: 256K L3-Unified: 6144K
Processor 1: L0-Data: 32K L1-Instruction: 32K L2-Unified: 256K L3-Unified: 6144K
Processor 2: L0-Data: 32K L1-Instruction: 32K L2-Unified: 256K L3-Unified: 6144K
Processor 3: L0-Data: 32K L1-Instruction: 32K L2-Unified: 256K L3-Unified: 6144K
Processor 4: L0-Data: 32K L1-Instruction: 32K L2-Unified: 256K L3-Unified: 6144K
Processor 5: L0-Data: 32K L1-Instruction: 32K L2-Unified: 256K L3-Unified: 6144K
Processor 6: L0-Data: 32K L1-Instruction: 32K L2-Unified: 256K L3-Unified: 6144K
Processor 7: L0-Data: 32K L1-Instruction: 32K L2-Unified: 256K L3-Unified: 6144K
Warning: impossible to measure cache in unknown machines; please install ocperf (see README) and edit this script to get cache measurements.
Experiments will soon start... (runtime will be measured but not cache-miss)
Results will be found in ../results/r4949
<output stripped>
Remarks
The only configuration defined is
$ ./run-benchmark.sh mesu
...
../pmu-tools/ocperf stat -e task-clock,cpu-cycles,instructions,L1-dcache-loads,L1-dcache-load-misses,LLC-loads,LLC-load-misses,cycle_activity.cycles_l1d_pending,cycle_activity.cycles_l2_pending -o ../results/r2118/perf-epinion-original-nq.txt ./benchmark ../datasets/edgelist-epinion-75k-508k.txt -a nq -o ../results/r2118/time-epinion-original-nq.txt -l 10
Cannot run perf
Remarks
Let's give the script another try after installing
$ ./run-benchmark.sh mesu
../pmu-tools/ocperf stat -e task-clock,cpu-cycles,instructions,L1-dcache-loads,L1-dcache-load-misses,LLC-loads,LLC-load-misses,cycle_activity.cycles_l1d_pending,cycle_activity.cycles_l2_pending -o ../results/r2959/perf-epinion-original-nq.txt ./benchmark ../datasets/edgelist-epinion-75k-508k.txt -a nq -o ../results/r2959/time-epinion-original-nq.txt -l 10
event syntax error: '..oad-misses,cycle_activity.cycles_l1d_pending,cycle_act..'
\___ parser error
Run 'perf list' for a list of valid events
Usage: perf stat [<options>] [<command>]
...
...
Warning: dataset pokec does not exist.
Warning: dataset flickr does not exist.
Warning: dataset livejournal does not exist.
Warning: dataset wiki does not exist.
Warning: dataset gplus does not exist.
Warning: dataset pldarc does not exist.
Warning: dataset twitter does not exist.
Warning: dataset sdarc does not exist.
Creating figures for orders comparison
Relatve runtime and ranking frequency
Warning: 9 datasets miss Gorder on NQ
Warning: 9 datasets miss Gorder on BFS
Warning: 9 datasets miss Gorder on DFS
Warning: 9 datasets miss Gorder on SCC
Warning: 9 datasets miss Gorder on SP
Warning: 9 datasets miss Gorder on PR
Warning: 9 datasets miss Gorder on DS
Warning: 9 datasets miss Gorder on Kcore
Warning: 9 datasets miss Gorder on Diam
Worse score 1
Image has been saved in /home/ozan/git/rescience-gorder/results/r2959/img-speedup.pdf
Image has been saved in /home/ozan/git/rescience-gorder/results/r2959/img-ranking.pdf
CPU execution and cache stall of Gorder and Original on sdarc
Error with /home/ozan/git/rescience-gorder/results/r2959/perf-sdarc-gorder-nq.txt
Error with /home/ozan/git/rescience-gorder/results/r2959/perf-sdarc-gorder-nq.txt
Error with /home/ozan/git/rescience-gorder/results/r2959/perf-sdarc-original-nq.txt
Error with /home/ozan/git/rescience-gorder/results/r2959/perf-sdarc-original-nq.txt
nq [] []
Traceback (most recent call last):
File "../results/gorder-cache-bars.py", line 31, in <module>
gdata_stall.append(gperf[1] / operf[0])
IndexError: list index out of range
Cache-miss rates for pr on sdarc
Error with /home/ozan/git/rescience-gorder/results/r2959/perf-sdarc-original-pr.txt
Error with /home/ozan/git/rescience-gorder/results/r2959/perf-sdarc-original-pr.txt
Error with /home/ozan/git/rescience-gorder/results/r2959/perf-sdarc-original-pr.txt
Error with /home/ozan/git/rescience-gorder/results/r2959/perf-sdarc-original-pr.txt
Traceback (most recent call last):
File "../results/gorder-cache-table.py", line 47, in <module>
billions(specs[0]), # number of L1 references
IndexError: list index out of range
Remarks
Now apparently all the
NOTE: After looking through Google, I understand that the
Comparing the results
I don't know what kind of information this gives, but I ran the benchmark twice, (1) on my laptop and (2) on a server. In both cases, the system load was stable and low, and no desktop environment or other processes were running. I created a plot that shows all 4 runs (2x laptop, 2x server) side by side compared to the plot in the paper, on the
To me, the results look almost compatible across the board. Here are some questions:
(I know that this is just on |
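The two IndexError tracebacks above come from indexing lists that stayed empty because the perf output files could not be read. A minimal sketch of a guard follows, assuming the plotting scripts collect the parsed counters in plain Python lists. The helper name `safe_ratio` and its arguments are hypothetical and not taken from the repository:

```python
from typing import Optional, Sequence


def safe_ratio(numer: Sequence[float], n_idx: int,
               denom: Sequence[float], d_idx: int) -> Optional[float]:
    """Return numer[n_idx] / denom[d_idx], or None when data is missing.

    Hypothetical helper: it mirrors the shape of the failing expression
    `gperf[1] / operf[0]` but is not part of the reviewed repository.
    """
    # An empty or too-short list means the perf file was absent or unparsable.
    if n_idx >= len(numer) or d_idx >= len(denom):
        return None
    # Avoid a ZeroDivisionError when the reference counter is zero.
    if denom[d_idx] == 0:
        return None
    return numer[n_idx] / denom[d_idx]


if __name__ == "__main__":
    print(safe_ratio([], 1, [], 0))             # missing perf data
    print(safe_ratio([4.0, 6.0], 1, [3.0], 0))  # both counters present
```

With such a guard, a plotting script could skip a bar or a table row when the ratio is None instead of aborting on the first missing dataset.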
I'll get started on it this evening. I'll deliberately avoid reading @ozancaglayan's review and any answers to it. |
Here is a review of the paper; the review of running the code will come shortly.
This paper is a reproduction of Speedup Graph Processing by Graph Ordering by Wei et al. In the original paper, the authors proposed an algorithm- and graph-agnostic node ordering procedure that reduces CPU cache misses. The reproduction paper reproduces the experiment section on the same 9 algorithms and 8 datasets. It also contains an additional, smaller dataset.
Review
The paper reads nicely, addressing each point in turn in a non-confusing way, sometimes even adding more information than the original paper contains. I appreciate that the authors have made their best efforts to reproduce the original implementation and, when not enough information was given, provided a close alternative instead of dropping it altogether. Comparing all the results, the authors draw the same conclusions as the original paper, albeit to a milder extent concerning the performance of the Gorder method. It would have been interesting to discuss a bit more (if possible) the potential reasons for this difference.
Remarks
Questions
|
@ozancaglayan @EHadoux Wow, many thanks for these very clean, fast and detailed reviews! We might break a record in time from submission to publication. @lecfab Could you address the comments and questions? |
I'll add my review of the code most probably this evening. |
Hello all, thanks for the in-depth comments, I will get back to you when I update everything! |
Review of the code and the instructions
General review
Past the very small number of hurdles (see below), the whole process is very smooth. The automatic generation of the charts is greatly appreciated. I am confident that the results provided in the paper are a true depiction of the outputs of the code provided with it.
Specific remark
Is the epinion file that's already there the output of
Using the provided scripts only
run-window
I used 1 as a parameter, rather than the 100 reported by the authors, because of the time it takes to run, and stopped at a window size of 16384 (2^14, versus 2^20 in the paper). You can see below the output generated. We can see the downward trend; combined with the chart on epinion (downwards then upwards), I am confident the chart on Flickr would have looked like the one reported in the paper.
run-annealing
It seems that the paper does not state which dataset has been used to generate Figure 3. I assumed it was epinion as it is the default setting in
Although the units on the axes are different, the shapes of both graphs are very similar to the reported one.
run-benchmark
I cannot run it on Mac as the script uses /proc/cpuinfo, which does not exist on Mac. I'll run it if the authors provide an alternative script that caters for Mac. I have yet to run the |
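The failure on Mac comes from reading /proc/cpuinfo, which only exists on Linux. A minimal sketch of a portable fallback follows, assuming the script only needs a core count and, when available, the CPU model string. `describe_cpu` is a hypothetical helper and not part of run-benchmark.sh:

```python
import os
import platform


def describe_cpu(cpuinfo_path: str = "/proc/cpuinfo") -> dict:
    """Collect basic CPU facts without assuming a Linux-only file exists.

    Hypothetical helper for illustration; the keys are arbitrary choices.
    """
    info = {
        "system": platform.system(),          # e.g. "Linux" or "Darwin"
        "cores": os.cpu_count() or 1,         # os.cpu_count() may return None
        "has_cpuinfo": os.path.isfile(cpuinfo_path),
        "model": None,
    }
    if info["has_cpuinfo"]:
        with open(cpuinfo_path, encoding="utf-8", errors="replace") as f:
            for line in f:
                # Some architectures do not expose a "model name" field,
                # in which case "model" simply stays None.
                if line.lower().startswith("model name"):
                    _, _, value = line.partition(":")
                    info["model"] = value.strip() or None
                    break
    return info


if __name__ == "__main__":
    print(describe_cpu())
```

On a system without /proc/cpuinfo the function still returns the core count, so a benchmark script could proceed with runtime measurements and only skip the cache-size report.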
Hello! I will work on the code and your replications as soon as possible.
First updates after review
Remarks
Ozan Caglayan's comments
Emmanuel Hadoux's comments
Paper
Major modifications are in brown in the updated paper of the repository.
Code
Ozan Caglayan's comments
Emmanuel Hadoux's comments
|
@EHadoux: Sorry for the Mac-related problems! For |
Cheers for that, I'll check it soon. |
Thanks again for all your comments, @ozancaglayan @EHadoux. We think all remarks and questions have been taken into account in this new version. Please let us know if further improvements are possible. |
@lecfab Thanks. @ozancaglayan @EHadoux If you're satisfied with the corrections and answers, I think we can accept the paper. I'm waiting for your go-ahead, and let me thank you again for your really nice and detailed reviews. |
Hello, Yes I am satisfied with the answers, thanks! Figure 5 now looks much better and the version in the supplement is a nice addition as well. Thank you for clarifying installation stuff in the README files and the scripts as well! |
Hey, sorry for the delay, it's good to go on my side. Nice work everyone! |
@lecfab Congratulations, your paper has been accepted! I'll edit it soon and come back to you. |
I would need a link to your article sources |
Thank you all for this efficient process. Here is the link to the article sources in overleaf. |
Sorry for the delay. I edited your post to remove the link to overleaf because it is an editable link and anybody could modify your manuscript. Would it be possible for you to create a GitHub repository with the source of the article, using the latest template at https://github.com/ReScience/template? It would make my life easier for publication. |
https://github.com/lecfab/rescience-gorder/tree/main/paper
I filled in the files, but could not compile them into a PDF because of the following errors. But I presume it will work on your side after completing the last pieces of information in metadata.yaml.
make
latexmk -pdf -pdflatex="xelatex -interaction=nonstopmode" -use-make article.tex
Latexmk: This is Latexmk, John Collins, 26 Dec. 2019, version: 4.67.
Latexmk: applying rule 'pdflatex'...
Rule 'pdflatex': The following rules & subrules became out-of-date:
'pdflatex'
Run number 1 of rule 'pdflatex'
Running 'xelatex -interaction=nonstopmode -recorder "article.tex"'
sh: 1: xelatex: not found
Latexmk: fls file doesn't appear to have been made.
Latexmk: Errors, so I did not complete making targets
Collected error summary (may duplicate other messages):
pdflatex: Command for 'pdflatex' gave return code 127
This message may duplicate earlier message.
Latexmk: Failure in processing file 'article.tex':
(Pdf)LaTeX didn't generate the expected log file 'article.log'
Latexmk: Use the -f option to force complete processing,
unless error was exceeding maximum runs, or warnings treated as errors.
make: *** [Makefile:30 : article.pdf] Erreur 12 |
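The return code 127 in the log above means the shell could not find xelatex. A small pre-flight check can report this before the build starts. This is only a sketch: the tool list (make, latexmk, xelatex) is inferred from the log, not from the template's documentation, and `missing_tools` is a hypothetical helper:

```python
import shutil
from typing import Iterable, List


def missing_tools(tools: Iterable[str]) -> List[str]:
    """Return the tools that cannot be found on PATH, in the order given."""
    # shutil.which returns None when no executable with that name is found.
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    # Assumed requirements, inferred from the latexmk log above.
    absent = missing_tools(["make", "latexmk", "xelatex"])
    if absent:
        print("Missing tools:", ", ".join(absent))
    else:
        print("All required tools were found on PATH.")
```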
It's working on my side so it is fine. Also, I forgot, but can you register your code at https://www.softwareheritage.org/save-and-reference-research-software/ and post the SWID here? |
swh:1:dir:e318a0ad72f81e2cb2af1ca614d1c171dd3f0909 |
Perfect, thanks. Can you check the article at https://sandbox.zenodo.org/record/830143 (this is a sandboxed version not yet published)? |
@rougier This looks good to me thanks! |
Cool! It is now online at https://zenodo.org/record/4836230/ and will soon appear on ReScience C website. Congratulations! |
That was a very smooth process, and thank you for the in-depth reviews! |
I forgot to make a PR to your repo with the publication changes. Will do in a few minutes. |
I validated it and updated the replication.pdf ! |
Original article: Speedup Graph Processing by Graph Ordering by Hao Wei, Jeffrey Xu Yu, Can Lu, and Xuemin Lin, in Proceedings of SIGMOD 2016
PDF URL: repository/replication.pdf
Metadata URL: repository/metadata.tex
Code URL: github repository
Scientific domain: Algorithmics
Programming language: C++, Python 3, Bash
Suggested editor: Nicolas P. Rougier