Upgrade CDash #19605

Closed
jwnimmer-tri opened this issue Jun 15, 2023 · 11 comments
Labels
component: continuous integration (Jenkins, CDash, mirroring of externals, website infrastructure) · priority: low · type: feature request

Comments

@jwnimmer-tri
Collaborator

From #19099 (comment):

I spoke to one of the maintainers of CDash. A new release of CDash is going to be released soon. He suggested that we wait until that is available and do an upgrade at the same time as moving to a different instance type.

@BetsyMcPhail
Contributor

BetsyMcPhail commented Sep 28, 2023

The new CDash server is set up on a smaller instance and ready to test! There have been a few successful submissions from Jenkins:

drake-ci PR: RobotLocomotion/drake-ci#244
drake PR: #20231

Test build: https://drake-jenkins.csail.mit.edu/view/Experimental/job/linux-focal-gcc-cmake-experimental-release/9312/

The plan is to create a few more test cases with failing builds, tests, etc. If that goes well, we will do the switch next week, after Jenkins has been updated.

@BetsyMcPhail
Contributor

@jwnimmer-tri, would Thursday morning (10/5) be okay to do the update? CDash will be down for a bit, but the rest of CI should be unaffected.

@jwnimmer-tri
Collaborator Author

Sounds fine. CDash is not on the critical path for anything, so it's fine to down it whenever.

@BetsyMcPhail
Contributor

We did some more testing and ran into some issues. Upgrade is on hold for now while we work through those.

@BetsyMcPhail
Contributor

Moving this issue out of "In Progress"; we are waiting for a CDash developer to have time to look at the issues.

@BetsyMcPhail
Contributor

https://open.cdash.org encountered a similar issue with a CDash background process stopping and not restarting itself. After a fix has been implemented and tested out for a few days on that instance, we will try the TRI update again.

@BetsyMcPhail
Contributor

The test CDash server has been updated and test submissions are looking good: http://ec2-34-203-13-209.compute-1.amazonaws.com/cdash/index.php?project=Drake&date=2023-11-07

The next step is to try to switch the test instance into production.

@BetsyMcPhail
Contributor

BetsyMcPhail commented Jan 5, 2024

The CDash server has been upgraded.

Still outstanding:

  • Ensure new instance is auto-backed up (@svenevs)
  • Delete old instance
  • Set up certbot auto-renewal (@BetsyMcPhail)
  • Ensure CDash logs are visible on CloudWatch (AWS)

Note: this did not fix the Sonoma CDash issues (#20718).

@svenevs
Contributor

svenevs commented Jan 5, 2024

https://us-east-1.console.aws.amazon.com/backup/home?region=us-east-1#/backupplan/details/83a5c41b-0f95-47f3-bac0-af87484543be/selection/details/3b288525-de5e-49a0-a19e-d0ce24dd97d8

New backup plan created, old one deleted. I am ready to check that box off after 5am tomorrow (aka let's confirm the backups exist next week first; they kick off at 5am every day).

The old instance has already been Stopped. Once we are happy with our backups for drake-cdash-3, we can terminate it as well.

@BetsyMcPhail
Contributor

The certbot auto-renewal and CloudWatch logging have been set up on the new CDash server. @svenevs, when you're happy with the backups, please go ahead and delete the old instance and close this issue.

@svenevs
Contributor

svenevs commented Jan 15, 2024

drake-cdash-3 is backing up as expected. drake-cdash and its associated volume have been terminated.
