Help working with large datasets #10076
Unanswered · GandarfTheWhite asked this question in Q&A
Replies: 0 comments
Hello, I just wanted to ask if anyone can give me tips or point me to any tutorials on how to build a graph and work with it when the dataset is extremely large (~100 GB).
I am working on node classification on the heterogeneous TwiBot-22 dataset, which comes as CSV and JSON files.
I think I need to use `Database` (`SQLiteDatabase`) and `OnDiskDataset`. I started by loading the edge list with Dask for processing, converting it into an `edge_index`, and inserting it into a `SQLiteDatabase` for each of the ten edge relation types. Could someone let me know whether this is the right approach to start with?
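In case it helps clarify what I mean, here is a minimal sketch of the pattern I have in mind, written with the plain-stdlib `sqlite3` module rather than PyG's `SQLiteDatabase` (the file layout and column names `src, rel, dst` are my assumptions about the edge list): stream the edges in chunks so the full graph never has to fit in memory, store them keyed by relation type, and read back a `[2, num_edges]` list per relation, analogous to an `edge_index`.

```python
# Sketch only: plain sqlite3 standing in for PyG's SQLiteDatabase.
# Assumed edge-list format: one "src,rel,dst" row per edge.
import csv
import io
import sqlite3

def ingest_edges(conn, rows, chunk_size=10_000):
    """Stream edge rows into SQLite in chunks, never holding them all in memory."""
    cur = conn.cursor()
    cur.execute("CREATE TABLE IF NOT EXISTS edges (rel TEXT, src INTEGER, dst INTEGER)")
    chunk = []
    for src, rel, dst in rows:
        chunk.append((rel, int(src), int(dst)))
        if len(chunk) >= chunk_size:
            cur.executemany("INSERT INTO edges VALUES (?, ?, ?)", chunk)
            chunk.clear()
    if chunk:  # flush the final partial chunk
        cur.executemany("INSERT INTO edges VALUES (?, ?, ?)", chunk)
    conn.commit()

def edge_index(conn, rel):
    """Return the edges of one relation type as [[src...], [dst...]],
    analogous to PyG's [2, num_edges] edge_index layout."""
    cur = conn.execute("SELECT src, dst FROM edges WHERE rel = ?", (rel,))
    srcs, dsts = zip(*cur.fetchall())
    return [list(srcs), list(dsts)]

# Toy demo with an in-memory database and a three-edge CSV.
data = "0,follows,1\n1,follows,2\n2,mentions,0\n"
conn = sqlite3.connect(":memory:")
ingest_edges(conn, csv.reader(io.StringIO(data)), chunk_size=2)
print(edge_index(conn, "follows"))  # [[0, 1], [1, 2]]
```

With the real dataset I would replace the toy CSV with Dask reading the actual edge files chunk by chunk, and one table (or one `SQLiteDatabase`) per relation type.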
I am working on this for my undergraduate computer science final project and am very new to GNNs, so any advice will be much appreciated.
Thank you