Update test-calibration-downloads.yml #2079

Merged
merged 7 commits into from
Feb 6, 2025

Conversation

arjunsuresh
Contributor

No description provided.

@arjunsuresh arjunsuresh requested a review from a team as a code owner February 3, 2025 00:29
Contributor

github-actions bot commented Feb 3, 2025

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

@arjunsuresh
Contributor Author

@anandhu-eng It seems we don't have automation scripts for the IGBH calibration dataset download. Can you please check?

@anandhu-eng
Contributor

anandhu-eng commented Feb 5, 2025

Hi @arjunsuresh @pgmpablo157321, is calibration dataset generation from the debug dataset supported for RGAT? As per this line, it seems to support only the full dataset.

If it is not supported, adding a GitHub test for the calibration dataset download (in full dataset mode) would be difficult, as GitHub runners have limited storage.

Edit:
reference: https://docs.github.com/en/actions/using-github-hosted-runners/using-github-hosted-runners/about-github-hosted-runners#standard-github-hosted-runners-for-public-repositories

@pgmpablo157321
Contributor

@anandhu-eng It is not supported. It doesn't make sense to generate the calibration dataset for the debug dataset. As the dataset is currently set up, you can only download the calibration dataset by downloading the whole dataset. There is no trivial solution, because the dataset is a graph, so the calibration dataset is 'connected' to the whole dataset. I see 3 options:

  • Skip the test for RGAT.
  • Add an option to create the calibration dataset for the debug setting. This would test that the debug dataset calibration can be downloaded.
  • Separate the calibration dataset from the 'full' dataset. This is not trivial: we need to analyze how to do it, and it might take some time.

@anandhu-eng
Contributor

I think we could skip this test and instead add a test that generates the calibration dataset from the full dataset on a self-hosted GitHub runner. It could run weekly. If the dataset is already present on the runner, the download step could be skipped, saving the download time.
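
A minimal sketch of the "skip the download if the dataset is already on the runner" idea, in Python. Everything here is an assumption for illustration: `ensure_dataset`, the `.complete` marker file, and the `download` callback are hypothetical names and are not part of the existing MLCommons automation scripts.

```python
from pathlib import Path
from typing import Callable


def ensure_dataset(dataset_dir: Path, download: Callable[[Path], None]) -> bool:
    """Make sure the dataset exists in dataset_dir.

    Returns True if a download was performed and False if a previously
    downloaded copy was reused.
    """
    # Hypothetical convention: a marker file written only after a
    # successful download, so a partial download is not mistaken for a cache hit.
    marker = dataset_dir / ".complete"
    if marker.exists():
        return False  # dataset already present on the runner, skip the download

    dataset_dir.mkdir(parents=True, exist_ok=True)
    download(dataset_dir)  # caller-supplied download step (assumed)
    marker.touch()
    return True
```

On a persistent self-hosted runner, the first weekly run would pay the download cost and later runs would return immediately, provided the dataset directory survives between jobs.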

@arjunsuresh arjunsuresh merged commit 99c1c5e into dev Feb 6, 2025
12 checks passed
@arjunsuresh arjunsuresh deleted the arjunsuresh-patch-4 branch February 6, 2025 22:44
@github-actions github-actions bot locked and limited conversation to collaborators Feb 6, 2025
3 participants