Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Log a fatal before exit if node cannot be created #3375

Merged
merged 1 commit into from
Sep 9, 2024

Conversation

aaronbuchwald
Copy link
Collaborator

Why this should be merged

Currently, if AvalancheGo runs into a fatal error when initializing the node, it will return that error and propagate it to main. This will print it to stdout, but does not show up in main.log, which makes it more difficult to discover the error when using setups like tmpnet where stdout may not be collected.

For example, if you run a node and try to listen on a port that's already taken then the error is printed to stdout, but does not show up in main.log, which looks like this:

scode/.tmpnet/networks/20240909-154939.215011-morpheusvm-e2e-tests/NodeID-D8he6uMyi5bdNADZLtSMJVWao1gyfz4N4/chainData","processContextFilePath":"/home/vscode/.tmpnet/networks/20240909-154939.215011-morpheusvm-e2e-tests/NodeID-D8he6uMyi5bdNADZLtSMJVWao1gyfz4N4/process.json"}}
[09-09|15:49:39.367] INFO node/node.go:952 initializing NAT
[09-09|15:49:39.367] INFO node/node.go:970 initializing API server

How this works

This PR updates AvalancheGo app Run to log a fatal error if the node cannot be initialized correctly.

How this was tested

Tested locally running two nodes trying to listen on the same port correctly adds the error to main.log:

[09-09|12:06:48.157] INFO node/node.go:970 initializing API server
[09-09|12:06:48.158] FATAL app/app.go:74 failed to initialize node {"error": "couldn't initialize API server: listen tcp 127.0.0.1:9650: bind: address already in use"}

@marun
Copy link
Contributor

marun commented Sep 9, 2024

Is this preferable to redirecting stdout to its own log?

@aaronbuchwald
Copy link
Collaborator Author

Is this preferable to redirecting stdout to its own log?

I thought about that too and preferred this smaller change. We could do that too, we'd probably just want to pull the logger out of the node package and redirect stdout/stderr to the log from main.

marun
marun previously approved these changes Sep 9, 2024
@marun marun dismissed their stale review September 9, 2024 17:01

log.Fatal calls Exit, so any subsequent code will not be executed.

app/app.go Show resolved Hide resolved
@containerman17
Copy link
Contributor

containerman17 commented Sep 9, 2024

It worked perfectly for me!

I updated my local copy of HyperSDK with this command:

go get github.com/ava-labs/avalanchego@node-log-fatal-on-init

Then, I launched a dummy HTTP server to block port 9650:

npx -y http-server -p 9650

This is what I saw in the logs:

$ tail -n 3 /home/vscode/.tmpnet/networks/20240909-172756.69973-morpheusvm-e2e-t
ests/NodeID-2oNT4GzhNYA9GvL4rjQsMXU8HaEYhuk9T/logs/main.log 
[09-09|17:27:57.003] INFO node/node.go:952 initializing NAT
[09-09|17:27:57.003] INFO node/node.go:970 initializing API server
[09-09|17:27:57.004] FATAL app/app.go:74 failed to initialize node {"error": "couldn't initialize API server: listen tcp :9650: bind: address already in use"}

@marun
Copy link
Contributor

marun commented Sep 9, 2024

Maybe also check if the fixed port is bound (either in tmpnet or the e2e suite)? That would avoid having to check the logs when something goes wrong.

Could also have tmpnet output the main.log when a node fails to go healthy before timeout...

@aaronbuchwald
Copy link
Collaborator Author

aaronbuchwald commented Sep 9, 2024

Maybe also check if the fixed port is bound (either in tmpnet or the e2e suite)? That would avoid having to check the logs when something goes wrong.

Could also have tmpnet output the main.log when a node fails to go healthy before timeout...

Ya that's definitely the first debugging step, but when you see the output from tmpnet, you don't know why it's failing, so you have to know to check for that case.

@StephenButtolph StephenButtolph added this pull request to the merge queue Sep 9, 2024
@StephenButtolph StephenButtolph added the monitoring This primarily focuses on logs, metrics, and/or tracing label Sep 9, 2024
@StephenButtolph StephenButtolph added this to the v1.11.12 milestone Sep 9, 2024
Merged via the queue into master with commit 7c0a6c7 Sep 9, 2024
20 of 21 checks passed
@StephenButtolph StephenButtolph deleted the node-log-fatal-on-init branch September 9, 2024 22:20
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
monitoring This primarily focuses on logs, metrics, and/or tracing
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

5 participants