Tunneldigger process management broken (?): can end up with multiple tunneldigger running #148

Open
RalfJung opened this issue Jul 21, 2022 · 5 comments


@RalfJung

The latest lead in a long-standing issue seems to indicate that tunneldigger process management sometimes goes wrong, and we can end up with 2 instances of tunneldigger running (ps showing 6 tunneldigger processes, rather than the usual 3). This then leads to those 2 instances interrupting each other all the time, which is essentially a DoS attack on the gateway.

I don't know how to reproduce this, and have not actually seen these 6 tunneldigger processes myself (I never managed to get SSH onto an affected device), but this is the best lead so far. So I wonder... how could a Gluon device end up in a situation where tunneldigger runs twice?
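A quick way to check a device for the doubled process count described above might look like the sketch below. It assumes busybox-style `ps` output and that watchdog and init-script lines should be ignored. The function names and the default of 3 expected processes (taken from the "usual 3" mentioned above) are illustrative, not part of Gluon.

```shell
# Count tunneldigger processes in `ps` output read from stdin.
# Lines for the watchdog and the init script are dropped first.
# The '[t]' trick keeps this grep's own command line from matching.
count_tunneldigger() {
    grep -v -e 'tunneldigger-watchdog' -e 'init.d' | grep -c '[t]unneldigger'
}

# Compare the count against the expected number (assumed default: 3).
check_tunneldigger() {
    expected=${1:-3}
    actual=$(count_tunneldigger || true)
    if [ "$actual" -gt "$expected" ]; then
        echo "suspicious: $actual tunneldigger processes (expected $expected)"
        return 1
    fi
    echo "ok: $actual tunneldigger processes"
}
```

On a device this would be run as `ps | check_tunneldigger 3`.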

@T-X

T-X commented Jul 21, 2022

Can you maybe reproduce something like this if you try to do multiple tunneldigger restarts at the same time? Something like:

for i in `seq 1 1000`; do
  /etc/init.d/tunneldigger restart &
  /etc/init.d/tunneldigger restart &
  /etc/init.d/tunneldigger restart &
  /etc/init.d/tunneldigger restart &
  wait
# sleep 1
done

I'm wondering if the tunneldigger-watchdog micron job can sometimes result in multiple restarts being run in parallel? Just some wild guesses.
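If parallel restarts do turn out to be the cause, one possible mitigation is to serialize them behind a lock. The sketch below uses an atomic `mkdir` as the lock. The lock path and the function name are hypothetical, and this is not how the watchdog currently works.

```shell
# Hypothetical lock location; adjust to whatever tmpfs path exists on the device.
LOCKDIR=${LOCKDIR:-/var/lock/tunneldigger-restart.lock}

# Run the given command only if no other holder has the lock.
# mkdir is atomic, so only one caller can create the directory.
with_restart_lock() {
    if ! mkdir "$LOCKDIR" 2>/dev/null; then
        echo "another restart is already in progress, skipping" >&2
        return 1
    fi
    status=0
    "$@" || status=$?
    rmdir "$LOCKDIR"
    return "$status"
}
```

Usage would be `with_restart_lock /etc/init.d/tunneldigger restart`. Note that a caller killed while holding the lock leaves a stale directory behind, so a real implementation would need some cleanup on boot.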

@RalfJung
Author

RalfJung commented Aug 7, 2022

Hm, when I tried this even just with a loop count of 20, my device just reboots after a bit... nothing it prints via SSH shows any indication why.

It's a pretty weak device with very little RAM, so it's probably not good for such tests. It's the only one I have though...

@T-X

T-X commented Aug 19, 2022

You should be able to find out if it's an out-of-memory or other crash via /sys/kernel/debug/crashlog after the device has rebooted, as long as you don't power cycle it. Or via a serial console, of course. Not sure if that'd help for this issue, but maybe there could be some unexpected hints in there?
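A minimal sketch of that check after a reboot could look like this. The function name is made up, and the strings it greps for are the usual kernel OOM messages, whose exact wording varies between kernel versions.

```shell
# Classify a crashlog as "oom", "other", or "no crashlog".
# The default path is the one mentioned above.
crashlog_summary() {
    log=${1:-/sys/kernel/debug/crashlog}
    if [ ! -r "$log" ]; then
        echo "no crashlog"
        return 0
    fi
    # Typical OOM markers: "Out of memory: ..." and "... invoked oom-killer".
    if grep -qi -e 'out of memory' -e 'oom-killer' "$log"; then
        echo "oom"
    else
        echo "other"
    fi
}
```

Calling `crashlog_summary` without arguments reads the default path. Anything reported as "other" still needs a manual look at the log.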

@valcryst

We had this issue on a lot of routers, and it seems to occur after a reboot (daily reboots).
A lot of our refugee routers were also affected, so we needed a quick solution to overcome this
issue and adapted some of the old tunneldigger-watchdog code by @lcb01a.

Patched this function

https://github.com/freifunk-gluon/gluon/blob/master/package/gluon-mesh-vpn-tunneldigger/luasrc/usr/bin/tunneldigger-watchdog#L5

to

local function restart_tunneldigger()
	os.execute('logger -t tunneldigger-watchdog "Restarting Tunneldigger."')
	-- Ask the init script to stop the service first.
	os.execute('/etc/init.d/tunneldigger stop')
	os.execute('sleep 1')
	-- Force-kill any instance that survived the stop.
	os.execute('killall -KILL tunneldigger')
	-- Remove the (possibly stale) PID file before starting again.
	os.execute('rm -f /var/run/tunneldigger.mesh-vpn.pid')
	os.execute('sleep 5')
	os.execute('/etc/init.d/tunneldigger start')
end

With this change we don't have this issue anymore, but I still can't tell how the routers end up
running multiple Tunneldiggers without a PID, which seems to happen here after reboots.
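To spot the "running without a PID" state on a device, one could check whether the PID file exists and still points at a live process. This is only a sketch: the function name is hypothetical, and the default path is the one removed in the patch above.

```shell
# Return 0 if the PID file exists and names a running process, 1 otherwise.
pidfile_alive() {
    pidfile=${1:-/var/run/tunneldigger.mesh-vpn.pid}
    [ -r "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    # Reject empty or non-numeric content.
    case "$pid" in
        ''|*[!0-9]*) return 1 ;;
    esac
    # Signal 0 only tests whether the process exists (and is signalable).
    kill -0 "$pid" 2>/dev/null
}
```

If `pidfile_alive` fails while `ps` still lists tunneldigger processes, the init script has lost track of them and a plain `stop` will likely not kill them.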

@rotanid
Member

rotanid commented Dec 23, 2024

tunneldigger has been deprecated in Gluon and removed from the main branch: freifunk-gluon/gluon#3109
It is now part of the community packages repo: https://github.com/freifunk-gluon/community-packages/tree/master/ff-mesh-vpn-tunneldigger

@rotanid rotanid transferred this issue from freifunk-gluon/gluon Dec 23, 2024