This repository has been archived by the owner on Oct 27, 2020. It is now read-only.

Hardware remains powered on app restart #712

Closed
gordonm867 opened this issue Jun 5, 2019 · 6 comments · Fixed by codyduong/FTC9108-SkyStone#7

Comments


gordonm867 commented Jun 5, 2019

For safety reasons, if an OpMode does not begin stopping within a few seconds of the "stop" button being pressed, fails to initialize in a timely manner, or fails to finish stopping in a timely manner, the OpModeManagerImpl.OpModeStuckCodeMonitor.Runner.run() method sends a kill request that shuts down the RC app, which is restarted a few seconds later.

However, the monitoring class does not force the REV hub to stop powering motors. In fact, since the active OpMode is shut down by the app crash, motors stay stuck in their last state until the app relaunches. This uncontrolled period can damage the robot, the field, and nearby people, items, or other robots, and is itself a safety issue.

Ensuring that all hardware is powered off is something that is taken care of in other places in the OpModeManagerImpl class. Specifically, this code seems to shut down all powered motors:

// Zero every motor that is still being driven.
for (DcMotorSimple motor : this.hardwareMap.getAll(DcMotorSimple.class)) {
    if (motor.getPower() != 0.0) {
        motor.setPower(0.0);
    }
}

The fact that the RC app restarts itself successfully after one of these shutdowns is proof that the OpModeManagerImpl.OpModeStuckCodeMonitor.Runner.run() method is run, since the kill request is issued from within this method. Just as the code already shows a message on the phones to tell the user what is happening, it seems plausible to shut down the hardware before the app is killed, and doing so would greatly improve the safety of the SDK.
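
To make the requested ordering concrete, here is a minimal sketch in plain Java. `SimpleMotor`, `FakeMotor`, and `StuckOpModeHandler` are hypothetical names invented for illustration and are not SDK classes; the kill step is passed in as a `Runnable` rather than guessing at how the SDK terminates the app. It only demonstrates "zero the motors, then kill", and it assumes the handler can actually reach the hardware at that moment.

```java
import java.util.List;

/** Hypothetical stand-in for DcMotorSimple: only the two calls quoted above. */
interface SimpleMotor {
    double getPower();
    void setPower(double power);
}

/** In-memory motor so the sketch runs without robot hardware. */
class FakeMotor implements SimpleMotor {
    private double power;

    FakeMotor(double power) { this.power = power; }

    @Override public double getPower() { return power; }
    @Override public void setPower(double power) { this.power = power; }
}

/** Hypothetical stuck-OpMode handler: stop the motors first, then run the kill action. */
class StuckOpModeHandler {
    private final List<SimpleMotor> motors;
    private final Runnable killApp; // assumption: whatever the SDK uses to end the process

    StuckOpModeHandler(List<SimpleMotor> motors, Runnable killApp) {
        this.motors = motors;
        this.killApp = killApp;
    }

    void onStuck() {
        try {
            // Same loop as the SDK snippet quoted above.
            for (SimpleMotor motor : motors) {
                if (motor.getPower() != 0.0) {
                    motor.setPower(0.0);
                }
            }
        } finally {
            // The app is still killed even if stopping a motor throws.
            killApp.run();
        }
    }
}
```

The `finally` block is a deliberate choice: a failure while stopping hardware should not prevent the restart that the monitor already performs today.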


Windwoes commented Jun 6, 2019

So I've actually been looking into this myself. The issue is that if the user code is holding the main USB RX/TX reentrant lock for the Lynx module, then the SDK cannot squeeze in to send a failsafe command. Furthermore, even assuming the SDK was able to grab the lock, it would need to continue to hold it until the app restarts to prevent the user code from sending other rogue commands after the SDK sent the failsafe command. However, that main reentrant lock is set to force unlock itself after a timeout to prevent possible deadlock situations; thus a slightly different locking approach would be needed.

I've been experimenting with a 2-layer locking system with a "master-master" lock which the SDK can grab to forcibly hang user code during the restart.
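
For readers who want a picture of what a two-layer scheme could look like, here is a rough sketch. It is not the SDK's implementation: `GatedCommandChannel`, its method names, and the `"FAILSAFE"` string are all hypothetical, and the outer gate is modeled with a `java.util.concurrent.locks.StampedLock` purely as an illustrative choice. User commands pass through the outer gate in shared mode and then take the inner device lock; the SDK side takes the gate exclusively and never releases it, so later user commands time out instead of reaching the device.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;
import java.util.concurrent.locks.StampedLock;

/** Hypothetical command channel with an outer gate on top of the device lock. */
class GatedCommandChannel {
    private final StampedLock gate = new StampedLock();           // outer "master" layer
    private final ReentrantLock deviceLock = new ReentrantLock(); // stand-in for the USB RX/TX lock
    private final List<String> sent = new ArrayList<>();          // stand-in for the wire

    /** User-code path: returns false if the gate is closed or the wait is interrupted. */
    boolean sendUserCommand(String command, long timeoutMs) {
        long stamp;
        try {
            stamp = gate.tryReadLock(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        if (stamp == 0L) {
            return false; // gate is held exclusively: user code is locked out
        }
        try {
            transmit(command);
            return true;
        } finally {
            gate.unlockRead(stamp);
        }
    }

    /** SDK path: close the gate for good, then send the failsafe command. */
    void lockOutUserCodeAndFailsafe() {
        gate.writeLock(); // stamp discarded on purpose: the gate is never reopened
        transmit("FAILSAFE");
    }

    /** Snapshot of everything sent so far. */
    List<String> sentCommands() {
        deviceLock.lock();
        try {
            return new ArrayList<>(sent);
        } finally {
            deviceLock.unlock();
        }
    }

    private void transmit(String command) {
        deviceLock.lock();
        try {
            sent.add(command);
        } finally {
            deviceLock.unlock();
        }
    }
}
```

One limitation of this sketch: `writeLock()` waits for any in-flight user command to finish, so user code that hangs in the middle of a single command would still stall the SDK side. It only helps against code that keeps issuing new commands.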


Windwoes commented Jun 6, 2019

Also note that this is not the only problem in the SDK pertaining to rogue user code - see this other issue I had posted about


Windwoes commented Jun 6, 2019

One more thing I will add though, is that unlike MR, the Lynx module has a 2500ms timeout before entering failsafe mode. Thus, the hardware is in fact stopped at some point during the app restart, just not quite as soon as would be desirable :)
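
As an illustration of the timeout behavior described here, the following is a toy model of a keepalive watchdog. The 2500 ms figure comes from the comment above; everything else (the `FailsafeWatchdog` class, its methods, and the idea that a fresh command ends failsafe) is an assumption made for the sketch and says nothing about how the Lynx firmware is actually written. Time is passed in explicitly so the logic can be checked without sleeping.

```java
/** Hypothetical model of a device-side timeout that triggers failsafe. */
class FailsafeWatchdog {
    static final long TIMEOUT_MS = 2500; // figure quoted in the comment above

    private long lastCommandMs;

    FailsafeWatchdog(long nowMs) {
        this.lastCommandMs = nowMs;
    }

    /** Call whenever a command arrives from the controller. */
    void onCommand(long nowMs) {
        lastCommandMs = nowMs;
    }

    /** True once no command has arrived for at least TIMEOUT_MS. */
    boolean isFailsafe(long nowMs) {
        return nowMs - lastCommandMs >= TIMEOUT_MS;
    }
}
```

In this model, motors left running when the app is killed would keep running for up to 2500 ms after the last command, which is the window the original report is concerned about.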

@rgatkinson
Collaborator

MR also has a similar timeout.


Windwoes commented Oct 4, 2019

Update: A fix for this has been merged into the internal repo and will be included in the v5.3 release.


Windwoes commented Nov 2, 2019

v5.3 is out, this can be closed now
