ec2-instance-connect open-tunnel doesn't exit after pipe is closed #9344

Open
@mattlqx

Description

Describe the bug

When aws ec2-instance-connect open-tunnel is used as a pipe, and the parent process closes the pipe after a successful connection, the aws process lingers and uses a surprising amount of CPU. Attaching strace to the process shows it waiting in a futex syscall.

Regression Issue

  • Select this option if this issue appears to be a regression.

Expected Behavior

The aws process should terminate gracefully once the parent process closes the pipe and exits.

Current Behavior

The process lingers after the parent process exits. Note the high CPU usage (84.1% in the ps output below).

matt     93039 84.1  0.2 631504 87456 pts/6    SNl  13:10   0:03 aws --debug --region us-east-1 ec2-instance-connect open-tunnel --instance-id i-xxxx --private-ip-address 172.x.x.x --instance-connect-endpoint-id eice-xxxx --max-tunnel-duration 3600

In this state, with --debug enabled, it repeatedly logs the following line to stderr:

[DEBUG] [2025-03-05T13:11:22Z] [00007e1d716006c0] [websocket] - id=0x7e1d64017240: Enqueuing outgoing frame with opcode=2(binary) length=0 fin=T
[DEBUG] [2025-03-05T13:11:22Z] [00007e1d716006c0] [websocket] - id=0x7e1d64017240: Enqueuing outgoing frame with opcode=2(binary) length=0 fin=T
[DEBUG] [2025-03-05T13:11:22Z] [00007e1d716006c0] [websocket] - id=0x7e1d64017240: Enqueuing outgoing frame with opcode=2(binary) length=0 fin=T
[DEBUG] [2025-03-05T13:11:22Z] [00007e1d716006c0] [websocket] - id=0x7e1d64017240: Enqueuing outgoing frame with opcode=2(binary) length=0 fin=T
[DEBUG] [2025-03-05T13:11:22Z] [00007e1d716006c0] [websocket] - id=0x7e1d64017240: Enqueuing outgoing frame with opcode=2(binary) length=0 fin=T
[DEBUG] [2025-03-05T13:11:22Z] [00007e1d716006c0] [websocket] - id=0x7e1d64017240: Enqueuing outgoing frame with opcode=2(binary) length=0 fin=T
[DEBUG] [2025-03-05T13:11:22Z] [00007e1d716006c0] [websocket] - id=0x7e1d64017240: Enqueuing outgoing frame with opcode=2(binary) length=0 fin=T

Attaching strace to the process shows:

strace: Process 93039 attached
futex(0x7ad7bf0, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, FUTEX_BITSET_MATCH_ANY

The ss -np output also shows that the aws process still holds established TCP connections open for the websocket.
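
One possible reading of the debug log above, which is an assumption and not something confirmed against the AWS CLI source, is that a read loop keeps forwarding zero-byte reads as empty binary frames after the pipe reaches end of input. The following is a minimal Python sketch of a relay loop that instead treats a zero-length read as the signal to stop. The names relay_until_eof and send_frame are hypothetical and do not come from the AWS CLI.

```python
import io
from typing import BinaryIO, Callable


def relay_until_eof(
    src: BinaryIO,
    send_frame: Callable[[bytes], None],
    chunk_size: int = 4096,
) -> int:
    """Forward chunks read from src through send_frame until end of input.

    Hypothetical helper for illustration only; returns the number of
    frames that were forwarded.
    """
    frames_sent = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            # An empty read means the writer closed the pipe. Stop here
            # instead of forwarding an empty frame and looping again.
            break
        send_frame(chunk)
        frames_sent += 1
    return frames_sent


if __name__ == "__main__":
    frames = []
    # io.BytesIO stands in for the pipe; frames.append stands in for
    # whatever would enqueue a websocket frame.
    relay_until_eof(io.BytesIO(b"hello world"), frames.append, chunk_size=4)
    print(frames)
```

After the loop returns, the caller would be expected to close the websocket and exit, which is the behavior described under Expected Behavior.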

Reproduction Steps

I haven't found a general reproduction case outside of my own use case: adding proxy_command support to HashiCorp's Terraform communicator (hashicorp/terraform#36643). Basically, if the aws process doesn't receive a signal telling it to clean up, it keeps running with the websocket active. I got the desired behavior in two ways: by ensuring the aws process receives a SIGHUP when the connection is cleaned up, and by adding more cleanup to the websocket code in aws. PR forthcoming.
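
The SIGHUP workaround described above can be sketched from the parent's side roughly as follows. This is a Python illustration for a POSIX system, not the Terraform change itself. The helper name close_proxy and the timeout values are assumptions made for the example.

```python
import signal
import subprocess


def close_proxy(proc: subprocess.Popen, timeout: float = 5.0) -> int:
    """Close the pipe to a proxy child and make sure it exits.

    Hypothetical helper for illustration only; returns the child's
    exit status (negative if it was terminated by a signal).
    """
    # Closing stdin is what should make a well-behaved proxy exit on its own.
    if proc.stdin is not None:
        proc.stdin.close()
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        pass

    # The child ignored end of input, so ask it to clean up with SIGHUP.
    proc.send_signal(signal.SIGHUP)
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Last resort: force termination so no process is left behind.
        proc.kill()
        return proc.wait()
```

A caller would start the proxy command with subprocess.Popen(..., stdin=subprocess.PIPE) and call close_proxy(proc) when tearing down the connection. With the fix proposed in this issue, the first wait would be expected to return without any signal being sent.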

Possible Solution

Add more cleanup to the websocket code so that the tunnel shuts down once the pipe is closed.

Additional Information/Context

No response

CLI version used

2.24.17

Environment details (OS name and version, etc.)

Ubuntu 22.04.2

Metadata

Labels

bug (This issue is a bug.), ec2-instance-connect, p3 (This is a minor priority issue)
