This is the repository containing the source code and artifact for the PPoPP'25 paper "APT: Adaptive Parallel Training for Graph Neural Networks". To reproduce the results in the paper, please check out the `artifact_evaluation` branch for instructions.
Follow these steps to prepare and install APT with all required dependencies.
To install and use APT, the following dependencies are required. We suggest creating a new conda environment for this.
- Python >= 3.9
- CMake >= 3.27.4
- CUDA >= 11.8
- DGL >= 1.1.2
- PyTorch >= 2.0.1
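Once the environment is set up, a quick sanity check such as the following can confirm that the Python-side dependencies meet these minimum versions. This snippet is illustrative only and not part of APT:

```python
# Illustrative version check for the Python-side dependencies (not part of APT).
import sys

import dgl
import torch

print("Python :", sys.version.split()[0])   # expect >= 3.9
print("PyTorch:", torch.__version__)        # expect >= 2.0.1
print("CUDA   :", torch.version.cuda)       # expect >= 11.8 (None on CPU-only builds)
print("DGL    :", dgl.__version__)          # expect >= 1.1.2
```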
Clone the repo together with its submodules:

```bash
git clone --recurse-submodules https://github.com/kaihaoma/APT.git
```
From the root directory of this repo, build the native library:

```bash
mkdir build; cd build
cmake ..; make -j20
```
Then, from the root directory of this repo, install the Python package:

```bash
cd python; python setup.py install
```
We provide shell scripts for running both single-machine and multi-machine GNN training. See the instructions in `examples/` for details.
The graph must be partitioned and converted into the required format before APT can operate on it. We provide a script for preparing the dataset in `scripts/preprocess_dataset.py`. In particular, you will need to prepare your own dataset in advance in the binary format that can be loaded with `dgl.load_graphs()`. The script goes through the following steps (a minimal sketch of the first steps is shown after the list).
- Load your initial graph with `dgl.load_graphs()`.
- Partition the graph using `dgl.distributed.partition_graph()`, either with METIS or random partitioning.
- Calculate the ID offsets of each graph partition.
- Reorder the whole graph to make the IDs in each graph partition contiguous.
- Store the reordered graph and configs of the partitions in the output path.
- Count dry-run results (e.g., node hotness) if indicated.
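Below is a minimal sketch of the load, partition, and offset steps, using the public DGL APIs named above. The dataset path, graph name, partition count, and partition sizes are placeholders; the actual script in `scripts/preprocess_dataset.py` additionally performs the reordering, storing, and dry-run counting steps.

```python
# Minimal sketch of the first preprocessing steps; paths and sizes are placeholders.
import dgl
import torch

# Load the input graph from the binary format readable by dgl.load_graphs().
graphs, _ = dgl.load_graphs("dataset/my_graph.bin")  # placeholder path
g = graphs[0]

# Partition the graph with METIS (use part_method="random" for random partitioning).
dgl.distributed.partition_graph(
    g,
    graph_name="my_graph",   # placeholder graph name
    num_parts=4,             # placeholder number of partitions
    out_path="partitions/",  # placeholder output directory
    part_method="metis",
)

# Given the number of nodes assigned to each partition, the per-partition ID
# offsets follow from a cumulative sum; after reordering, node IDs within each
# partition are contiguous starting at its offset.
part_sizes = torch.tensor([2500, 2500, 2500, 2500])     # placeholder sizes
offsets = torch.cumsum(part_sizes, dim=0) - part_sizes  # start ID of each partition
```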
Example config files are in `npc_dataset/`.
This repo is under the MIT License; see `LICENSE` for further information.