Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

github-runner: fix package layout and script patches #186325

Merged
merged 1 commit into from
Aug 22, 2022
Merged

github-runner: fix package layout and script patches #186325

merged 1 commit into from
Aug 22, 2022

Conversation

cmoog
Copy link
Contributor

@cmoog cmoog commented Aug 12, 2022

Description of changes
  1. Fix fatal startup crashes caused by recent changes to the source layout.
  2. Add a default value for RUNNER_ROOT.

How to test:

nix-build -A github-runner
./result/bin/config.sh --url https://github.com/xxx/xxxx --token xxxx --ephemeral
./result/bin/run.sh
Things done
  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandbox = true set in nix.conf? (See Nix manual)
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 22.11 Release Notes (or backporting 22.05 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
    • (Release notes changes) Ran nixos/doc/manual/md-to-db.sh to update generated release notes
  • Fits CONTRIBUTING.md.

@cmoog
Copy link
Contributor Author

cmoog commented Aug 12, 2022

cc @veehaitch @newAM

@ofborg ofborg bot requested review from kfollesdal, veehaitch and newAM August 12, 2022 19:21
@ofborg ofborg bot added 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin 10.rebuild-linux: 1-10 10.rebuild-linux: 1 labels Aug 12, 2022
Copy link
Member

@newAM newAM left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Can you explain what crashes you saw? I tested the last pull-request on aarch64-linux and had no issues. If you experienced problems I would like to know what I can test in the future to catch this :)
  2. What problem are you trying to solve by setting RUNNER_ROOT? I tried to find what this variable does, but the only relevant result was this nixpkgs issue: GitHub runner failing #158970

Unfortunately with this pull-request I can no longer start a runner on aarch64-linux.

systemd[1]: Starting GitHub Actions runner...
systemd[1]: Started GitHub Actions runner.
runsvc.sh[30835]: /nix/store/97dfdd78d6rf580wj11x3knfpr1f2qpd-github-runner-2.295.0/bin/runsvc.sh: line 17: ./externals/node16/bin/node: No such file or directory
systemd[1]: github-runner.service: Main process exited, code=exited, status=127/n/a
systemd[1]: github-runner.service: Failed with result 'exit-code'.

With #186173 it works on aarch64-linux:

systemd[1]: Starting GitHub Actions runner...
systemd[1]: Started GitHub Actions runner.
runsvc.sh[31296]: Starting Runner listener with startup type: service
runsvc.sh[31296]: Started listener process, pid: 31310
runsvc.sh[31296]: Started running service
runsvc.sh[31296]: √ Connected to GitHub
runsvc.sh[31296]: Current runner version: '2.295.0'
runsvc.sh[31296]: 2022-08-13 15:09:49Z: Listening for Jobs

@cmoog
Copy link
Contributor Author

cmoog commented Aug 13, 2022

  1. Can you explain what crashes you saw? I tested the last pull-request on aarch64-linux and had no issues. If you experienced problems I would like to know what I can test in the future to catch this :)
  2. What problem are you trying to solve by setting RUNNER_ROOT? I tried to find what this variable does, but the only relevant result was this nixpkgs issue: GitHub runner failing #158970 cc @newAM
  1. Here is what happens on master on x86_64-linux. I haven't looked too closely at the NixOS module wrapper, but it looks like some of that logic may be causing different behavior for your test workflow.
$ nix-build -A github-runner
this path will be fetched (22.55 MiB download, 73.57 MiB unpacked):
  /nix/store/5ksdp3f5pkdgn3pq83s8k6361yiwy3j9-github-runner-2.295.0
copying path '/nix/store/5ksdp3f5pkdgn3pq83s8k6361yiwy3j9-github-runner-2.295.0' from 'https://cache.nixos.org'...
/nix/store/5ksdp3f5pkdgn3pq83s8k6361yiwy3j9-github-runner-2.295.0
$ ./result/bin/config.sh
touch: cannot touch '.env': Permission denied
./env.sh: line 37: .path: Permission denied
./env.sh: line 32: .env: Permission denied
./env.sh: line 32: .env: Permission denied
Unhandled exception. System.UnauthorizedAccessException: Access to the path '/nix/store/5ksdp3f5pkdgn3pq83s8k6361yiwy3j9-github-runner-2.295.0/_diag' is denied.
 ---> System.IO.IOException: Permission denied
   --- End of inner exception stack trace ---
   at System.IO.FileSystem.CreateDirectory(String fullPath)
   at System.IO.Directory.CreateDirectory(String path)
   at GitHub.Runner.Common.HostTraceListener..ctor(String logFileDirectory, String logFilePrefix, Int32 pageSizeLimit, Int32 retentionDays)
   at GitHub.Runner.Common.HostContext..ctor(String hostType, String logFile)
   at GitHub.Runner.Listener.Program.Main(String[] args)
/nix/store/5ksdp3f5pkdgn3pq83s8k6361yiwy3j9-github-runner-2.295.0/lib/config.sh: line 81: 3054140 Aborted                 (core dumped) /nix/store/5ksdp3f5pkdgn3pq83s8k6361yiwy3j9-github-runner-2.295.0/lib/Runner.Listener configure "$@"
$ ./result/bin/run.sh
cp: cannot stat '/nix/store/5ksdp3f5pkdgn3pq83s8k6361yiwy3j9-github-runner-2.295.0/lib/run-helper.sh.template': No such file or directory
/nix/store/5ksdp3f5pkdgn3pq83s8k6361yiwy3j9-github-runner-2.295.0/lib/run.sh: line 16: /nix/store/5ksdp3f5pkdgn3pq83s8k6361yiwy3j9-github-runner-2.295.0/lib/run-helper.sh: No such file or directory
Exiting runner...

$ RUNNER_ROOT=$(pwd) ./result/bin/run.sh
cp: cannot stat '/nix/store/5ksdp3f5pkdgn3pq83s8k6361yiwy3j9-github-runner-2.295.0/lib/run-helper.sh.template': No such file or directory
/nix/store/5ksdp3f5pkdgn3pq83s8k6361yiwy3j9-github-runner-2.295.0/lib/run.sh: line 16: /nix/store/5ksdp3f5pkdgn3pq83s8k6361yiwy3j9-github-runner-2.295.0/lib/run-helper.sh: No such file or directory
Exiting runner...
  1. RUNNER_ROOT is an environment variable added only in the nix build of github-runner, relevant patch code here. In the regular runner distribution, it writes runtime configuration state to a child directory of the source code. Given that /nix/store is readonly at runtime, we need to specify an alternative directory for local state. At present, RUNNER_ROOT is only set in the NixOS module for github-runner and isn't really documented anywhere else. For users trying to use the github-runner pkg derivation directly, this causes a broken-by-default experience. (When RUNNER_ROOT is unset the config script crashes immediately when it tries to write to the store). In addition to the default value provided in this PR, we should also probably document its behavior in the longDescription.

@cmoog
Copy link
Contributor Author

cmoog commented Aug 13, 2022

Unfortunately with this pull-request I can no longer start a runner on aarch64-linux.

I'm realizing now I should also test that these changes work with the NixOS module. I'll update this PR with the relevant fixes as soon as possible.

@cmoog
Copy link
Contributor Author

cmoog commented Aug 15, 2022

I undid the change that broke compatibility with the NixOS service module. I've verified that a basic usage of the service module now works again on x86_64-linux.

@ofborg ofborg bot requested a review from newAM August 15, 2022 02:16
Copy link
Member

@veehaitch veehaitch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR. This makes sense.

Comment on lines 247 to 255
# Set working dir in config.sh
sed -i '2 i cd $RUNNER_ROOT' $out/lib/env.sh
sed -i '2 i cd '$out/ $out/lib/config.sh

# Set default value for RUNNER_ROOT
sed -i '2 i export RUNNER_ROOT="$\{RUNNER_ROOT:-$HOME/.github-runner}"' $out/lib/config.sh
sed -i '3 i mkdir -p $RUNNER_ROOT' $out/lib/config.sh
sed -i '2 i export RUNNER_ROOT="$\{RUNNER_ROOT:-$HOME/.github-runner}"' $out/lib/run.sh

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd try to integrate those changes into host-context-dirs.patch which introduces the RUNNER_ROOT environment variable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that could be cleaner. What should we do about the root dir in env.sh then? It's called in config.sh before the dotnet package in invoked at all. We'd need to hardcode the runner root default in both places, which actually feels worse. For reference: https://github.com/actions/runner/blob/main/src/Misc/layoutroot/env.sh

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I think these specific calls to sed are actually quite resilient. config.sh and run.sh are essentially the public APIs of the runner package, and placing these patches in the second line of the script doesn't assume much except that it's a bash session.

Screen Shot 2022-08-15 at 10 35 58 AM

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So far, we created wrappers in bin/ for the shell scripts. As far as I can tell, makeWrapper (particularly, --set-default and --chdir) would work well here as well (though I'd move the invocations to postFixup).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the pointers. I've refactored the change to use those helper functions. Looks much cleaner now.

Given some quirks with bash quote escaping, I wasn't able to use --set-default, but --run did the trick.

@cmoog cmoog requested a review from veehaitch August 16, 2022 13:46
Copy link
Member

@newAM newAM left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The systemd service works now on x86_64-linux, and the changes look reasonable to me. @veehaitch should also review this before it gets merged though.

@bobby285271 bobby285271 added the 12.approvals: 1 This PR was reviewed and approved by one reputable person label Aug 19, 2022
@ofborg ofborg bot requested a review from newAM August 20, 2022 19:54
@cmoog
Copy link
Contributor Author

cmoog commented Aug 21, 2022

@veehaitch Like I mentioned here, all of those proposed changes would cause breakage due to their escape/quote handling. Making those proposed changes would break this PR.

Copy link
Member

@veehaitch veehaitch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@veehaitch Like I mentioned here, all of those proposed changes would cause breakage due to their escape/quote handling. Making those proposed changes would break this PR.

Sorry, I missed that!

@newAM newAM added 12.approvals: 2 This PR was reviewed and approved by two reputable people 12.approved-by: package-maintainer This PR was reviewed and approved by a maintainer listed in the package and removed 12.approvals: 1 This PR was reviewed and approved by one reputable person labels Aug 21, 2022
@veehaitch veehaitch requested a review from winterqt August 22, 2022 07:35
@winterqt
Copy link
Member

Something I'm confused about is why these errors seemingly happen inconsistently (see #187009 and #182189 (review)). Does anyone (@cmoog maybe) have any clue why this is?

@cmoog
Copy link
Contributor Author

cmoog commented Aug 22, 2022

Something I'm confused about is why these errors seemingly happen inconsistently (see #187009 and #182189 (review)). Does anyone (@cmoog maybe) have any clue why this is?

That is likely due to the code paths introduced by the service module wrapper. This PR fixes the derivation itself, which is broken on master 100% of the time.

Copy link
Member

@winterqt winterqt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you rebase so we can get rid of the merge commit? (Also, it might be worth squashing the changes down, at least a little, since the state of the first commit is drastically different than the last one. Your call, though.)

@ofborg ofborg bot requested review from newAM and veehaitch August 22, 2022 16:09
@github-actions
Copy link
Contributor

Successfully created backport PR #187909 for release-22.05.

@cmoog cmoog deleted the github-runner branch August 22, 2022 20:01
@Janik-Haag Janik-Haag added the 12. first-time contribution This PR is the author's first one; please be gentle! label Jun 12, 2023
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin 10.rebuild-linux: 1-10 10.rebuild-linux: 1 12.approvals: 2 This PR was reviewed and approved by two reputable people 12.approved-by: package-maintainer This PR was reviewed and approved by a maintainer listed in the package 12. first-time contribution This PR is the author's first one; please be gentle!
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants