Skip to content
This repository has been archived by the owner on Oct 27, 2020. It is now read-only.

"Restart Robot" abandons watchdog on active OpMode #709

Closed
Windwoes opened this issue May 10, 2019 · 12 comments
Closed

"Restart Robot" abandons watchdog on active OpMode #709

Windwoes opened this issue May 10, 2019 · 12 comments

Comments

@Windwoes
Copy link

Windwoes commented May 10, 2019

As I'm sure anyone who has ever written an autonomous OpMode knows, if your thread does not exit within 1000ms of a stop being requested, the watchdog growls, and the app force-crashes itself in order to terminate your rogue thread. (This is necessary because it is not possible to kill -9 a single thread).

However, I have found a serious bug in the implementation of the watchdog: it appears that the watchdog of the active OpMode is abandoned when a "Restart Robot" command is issued. This causes the RC device to perform a restart robot as if everything were normal, but the rogue thread continues to run in the background.

Here's a sample OpMode:

@TeleOp
public class RogueLoop extends LinearOpMode
{
    @Override
    public void runOpMode()
    {
        waitForStart();

        while (true) //muahaha Thread.interrupt() ain't gonna do you any good here!
        {
            long iT = System.currentTimeMillis();

            while (System.currentTimeMillis() - iT < 500)
            {
                Thread.yield();
            }

            System.out.println("Sysout from rogue loop");
        }
    }
}

Steps to reproduce:

  1. Deploy the SDK to the RC phone with the above OpMode added to the TeamCode module
  2. Ensure that Android Studio is printing logcat to the bottom pane
  3. Ensure the DS and RC phones are connected as normal
  4. Init and run the "RogueLoop" OpMode
  5. Observe that the OpMode prints a message to logcat every 500ms
  6. While the OpMode is still running, press "Restart Robot".
  7. Wait for the "Restart Robot" to complete
  8. Observe that the rogue OpMode continues to print messages to the logcat every 500ms, despite the DS and RC showing every indication that only the "StopRobot" OpMode is being run
@Windwoes
Copy link
Author

@gearsincorg will this be fixed in v5.0 too?

@gearsincorg
Copy link
Collaborator

No. I've added it as an issue, but I'm not skilled in this area to attempt a fix. (Suggestions?)
5.0 is an early-bird release, primarily for the Control Hub Pilot teams.
The hope is to have another 5.x pre-release, and then the main 5.y release in September.

@Windwoes
Copy link
Author

@gearsincorg I did some digging and it appears this issue can be fixed by adding the following as the first thing in EventLoopManager.stopEventLoop():

    if(eventLoop.getOpModeManager() != null)
    {
        eventLoop.getOpModeManager().stopActiveOpMode();
    }

@gearsincorg
Copy link
Collaborator

Excellent... I can implement this in the current private repo and see if I can replicate your results.

We appreciate your efforts to find and suggest fixes for these problem.

Phil.

@Windwoes
Copy link
Author

@gearsincorg any update on this?

@Windwoes
Copy link
Author

Also, one other thing, I'd like to revise my suggested fix to:

    if(eventLoop.getOpModeManager() != null)
    {
        eventLoop.getOpModeManager().initActiveOpMode(OpModeManagerImpl.DEFAULT_OP_MODE_NAME);
    }

@NoahAndrews
Copy link

Why the switch?

@Windwoes
Copy link
Author

Why the switch?

Because that's what's done when the DS sends the command to stop an OpMode. It doesn't call stopActiveOpMode(), instead it calls initActiveOpMode(OpModeManagerImpl.DEFAULT_OP_MODE_NAME)

@gearsincorg
Copy link
Collaborator

@FROGbots-4634 I've tested both of your proposed fixes.

The first one successfully detects the stuck loop condition and aborts the running thread the same way it would do if a stop had been requested. The DS is sent the Stuck-Loop error and the RC restarts the app and comes up in a ready to run condition.

The second suggestion did not appear to have any effect. The stuck opmode continued to run through the restart operation, and stayed in memory to permit multiple instances to execute.

After some more testing, I'll be submitting the first approach as a change to the code base.

Thanks for your input.

@Windwoes
Copy link
Author

Windwoes commented Aug 6, 2019

@gearsincorg Ok cool! I never actually tested the second suggestion..... though don't really understand why it wouldn't work, given that's what the DS does. Then again I don't quite fully understand the state machine in OpModeManagerImpl...

@Windwoes
Copy link
Author

@ftctechnh does this mean it was fixed in the internal repo?

@ftctechnh
Copy link
Owner

@FROGbots-4634
Yes. Thanks for the fix! It works well.

# for free to subscribe to this conversation on GitHub. Already have an account? #.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants