Inconsistent performance by setting dgl_builtin=True in GCNLayer #57
Hi @ycjing, wow, that seems unexpected. In my experience, the performance was unchanged when enabling/disabling the dgl_builtin flag, but the inference time of native DGL layers was noticeably faster. Could you provide more details?
Hi @chaitjo Thank you for the response! I appreciate it. I followed the installation instructions at https://github.com/graphdeeplearning/benchmarking-gnns/blob/master/docs/01_benchmark_installation.md. The PyTorch version is 1.3.1 and the DGL version is 0.4.2, running on one Tesla V100 GPU. To confirm and reproduce the results, I deleted my previous repo and git cloned a fresh one. After preparing the data and changing the dgl_builtin flag in benchmarking-gnns/layers/gcn_layer.py (Line 40 in ef8bd8c), I ran
python main_SBMs_node_classification.py --dataset SBM_PATTERN --gpu_id 0 --seed 41 --config 'configs/SBMs_node_clustering_GCN_PATTERN_100k.json'
and the full log is as follows:
However, when I set dgl_builtin=False, the results are consistent with those reported in the paper, which is very strange. I have not tried other datasets yet; I will try them in the coming days and see the results. Thank you again! Best,
Thanks @ycjing for bringing this up; I'll investigate this issue.
Hi @yzh119 Thank you for the response! Sure, the training log with dgl_builtin=False is as follows:
I really appreciate your help! Thank you! Best,
Hi everybody, we reimplemented some parts of the pipeline in a separate project using PyTorch Geometric and are experiencing very similar behavior: the GCN model implemented in PyTorch Geometric reaches more than 80% accuracy after only very short training. Most components of our pipeline are the same or similar (including the weighted loss / accuracy computation) besides the framework used. Generally, this seems to be indicative of some issues when running with dgl_builtin=False. Cheers,
Hi @ycjing, after digging into the codebase, I found the custom implementation in benchmarking-gnns/layers/gcn_layer.py (Lines 59 to 65 in ef8bd8c).
When dgl_builtin=False (see benchmarking-gnns/layers/gcn_layer.py, Lines 15 to 18 in ef8bd8c), this implements a message-passing module that averages the received messages. By contrast, DGL's provided built-in layer normalizes each message by the square root of the source and destination degrees.
@chaitjo Do you think it is reasonable to remove the custom implementation?
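The two aggregation schemes can be contrasted on a toy graph. This is an illustrative sketch in pure Python, not the repository's actual code: `mean_pool` mimics message averaging (the custom layer's behavior), while `sym_norm` mimics the sqrt-degree normalization of the built-in GCN layer. The function and variable names are made up for this example.

```python
import math

# Undirected toy graph as adjacency lists (node -> neighbors)
# and one scalar feature per node.
adj = {0: [1, 2], 1: [0], 2: [0]}
feat = {0: 1.0, 1: 2.0, 2: 4.0}

def mean_pool(adj, feat):
    # h_i = (1 / deg(i)) * sum_j h_j  -- plain neighbor averaging
    return {i: sum(feat[j] for j in nbrs) / len(nbrs)
            for i, nbrs in adj.items()}

def sym_norm(adj, feat):
    # h_i = sum_j h_j / sqrt(deg(i) * deg(j))  -- symmetric normalization
    deg = {i: len(nbrs) for i, nbrs in adj.items()}
    return {i: sum(feat[j] / math.sqrt(deg[i] * deg[j]) for j in nbrs)
            for i, nbrs in adj.items()}

# Node 0 has two neighbors: mean pooling gives (2 + 4) / 2 = 3.0,
# while symmetric normalization gives 2/sqrt(2) + 4/sqrt(2) ≈ 4.24.
print(mean_pool(adj, feat))
print(sym_norm(adj, feat))
```

On a graph where all nodes have the same degree the two schemes coincide up to scaling, but on graphs with varied degrees (as in the SBM datasets) they produce genuinely different models.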
Thank you for the response! I appreciate it. If my understanding is correct, the achieved 85.52% accuracy on PATTERN comes from running with dgl_builtin=True. If this is the case, the provided results in the benchmark paper might need updating, since the performances are so different.
Hi @ExpectationMax In fact, I'm trying to run the experiment on CLUSTER. This dataset has a binary feature of size 6; do you use an embedding layer for it?
Thank you in advance!
Hi everyone, first of all, thank you for this discussion and apologies for the late response. Indeed, @jermainewang's explanation is correct regarding the performance difference between the built-in GCN layer, which normalizes by the square root of the source and destination degrees, and our initial implementation, which performs mean pooling. After internal discussion, we plan to move to the DGL built-in GCN layer and to update the benchmark leaderboard along with the next release of our paper/repository.
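For reference, the propagation rule the built-in layer follows is the one from Kipf & Welling, H' = D̂^{-1/2} (A + I) D̂^{-1/2} H (weight matrix and nonlinearity omitted here). Below is a minimal pure-Python sketch of one such propagation step, with no DGL dependency; the function name and dense-matrix representation are illustrative, not taken from the repository.

```python
import math

def gcn_propagate(A, H):
    """One symmetric-normalized GCN propagation step.
    A: dense adjacency matrix (list of lists of 0/1),
    H: dense node-feature matrix (list of lists of floats).
    Self-loops are added before normalization, as in Kipf & Welling."""
    n = len(A)
    # A_hat = A + I (add self-loops)
    A_hat = [[A[i][j] + (1 if i == j else 0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in A_hat]
    d = len(H[0])
    out = [[0.0] * d for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if A_hat[i][j]:
                # scale each message by 1 / sqrt(deg_i * deg_j)
                c = A_hat[i][j] / math.sqrt(deg[i] * deg[j])
                for k in range(d):
                    out[i][k] += c * H[j][k]
    return out

# Two connected nodes with features 1.0 and 3.0: after adding self-loops,
# both have degree 2, so each output is (1.0 + 3.0) / 2 = 2.0.
print(gcn_propagate([[0, 1], [1, 0]], [[1.0], [3.0]]))
```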
Hi @chaitjo Thank you for the response! I truly appreciate it. Now I understand the whole thing. Also, thanks again for this great work! Best,
Issue fixed in the recent update. Thanks everyone!
Hi,
Thank you for the great work! This work is really wonderful. When I try to use GCN model for node classification by running:
python main_SBMs_node_classification.py --dataset SBM_PATTERN --gpu_id 0 --seed 41 --config 'configs/SBMs_node_clustering_GCN_PATTERN_100k.json'
I found that when I set dgl_builtin to False, the test accuracy is 63.77, which is consistent with the results reported in the paper; however, when I set dgl_builtin to True, the test accuracy becomes 85.56.
I do not think this behavior is normal, but I could not figure out why the performances are so different after struggling with it for some time. I would appreciate it if you could help me. Thank you! Have a nice day!
Best,
Yongcheng