Master update hook vs callback queue #91

Merged

Conversation

psoetens
Member

This PR is supposed to fix #70 in a similar but improved way to the original proposal (I'm not going to repeat the bug description here).

There are two diagrams (made with Inkscape) that explain the new behavior; have a look at them before trying to understand the code.
(Diagrams: componenttriggering, updatestepdetail)

The main change is the addition of RunnableInterface::work( WorkReason )
and ActivityInterface::timeout(). This allows us to inform the runnable
interface of the reason why it is being executed: because of a TimeOut,
a Trigger, or IOReady. Depending on the reason, the RunnableInterface
subclass can choose to do something different. The ExecutionEngine uses this
to distinguish between message bookkeeping (Trigger) and real work to do
in updateHook() (TimeOut and IOReady).

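To make the distinction concrete, here is a minimal standalone sketch of the dispatch described above. It mirrors the WorkReason names used in this PR but does not include the real RTT headers; the exact enum namespace and the work() signature in RTT may differ.

```cpp
// Illustrative sketch only: mirrors the WorkReason dispatch described above.
// The enum values and the work() signature are taken from the PR text; the
// actual RTT declarations live in RunnableInterface and may differ in detail.
#include <iostream>

enum WorkReason { TimeOut, Trigger, IOReady };

struct MyRunner {  // stands in for a RunnableInterface subclass
    void work(WorkReason reason) {
        switch (reason) {
        case Trigger:
            // Only message/operation bookkeeping, no updateHook().
            std::cout << "process queued messages\n";
            break;
        case TimeOut:
        case IOReady:
            // Real work: the ExecutionEngine would run updateHook() here.
            std::cout << "run updateHook()\n";
            break;
        }
    }
};

int main() {
    MyRunner r;
    r.work(Trigger);  // bookkeeping only
    r.work(TimeOut);  // full update step
}
```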
In addition, when a TaskContext is Running, an EventPort will cause updateHook()
to be called upon new data or, if a callback is attached, the callback will be called instead of updateHook().
This works for periodic and non-periodic *Activity implementations.

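As an illustration of that rule, here is a hedged sketch of a component that attaches a callback to an EventPort. It is written against the RTT 2.x API as I recall it (ports()->addEventPort() with an optional callback argument); check the actual headers before relying on the exact signatures.

```cpp
#include <rtt/TaskContext.hpp>
#include <rtt/InputPort.hpp>
#include <boost/bind.hpp>
#include <string>

class Reader : public RTT::TaskContext {
public:
    RTT::InputPort<double> in;

    Reader(const std::string& name) : RTT::TaskContext(name) {
        // With a callback attached, the callback runs instead of updateHook()
        // whenever new data arrives on 'in' (while the component is Running).
        this->ports()->addEventPort("in", in,
                                    boost::bind(&Reader::onData, this, _1));
        // Without the callback argument, new data would trigger updateHook().
    }

    void onData(RTT::base::PortInterface*) {
        double sample;
        if (in.read(sample) == RTT::NewData) {
            // react to the sample here
        }
    }

    void updateHook() {
        // Periodic or triggered work; no longer invoked for every sample on
        // 'in' because a callback is attached.
    }
};
```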
Unfortunately, TaskContext::trigger() will forward to ActivityInterface::timeout(),
and the ExecutionEngine will use ActivityInterface::trigger() internally to
do the message/operation bookkeeping. This is confusing on the implementation
side, but keeps the API fully backwards compatible.

This PR implements this mechanism for each Activity kind in RTT.

The negative side effects are minimal to none, i.e. no existing code should need to be ported or changed to make use of the improvements in this PR... unless your code in updateHook() relies on the fact that it was called too often. In master-slave setups, EventPort callbacks are now executed immediately by the master, instead of waiting for a slave::update() call.

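For reference, this is the kind of master-slave setup that remark applies to. The sketch assumes the RTT 2.x extras API (SlaveActivity, TaskContext::getActivity()/setActivity()) as I recall it; verify the exact signatures against your headers.

```cpp
#include <rtt/TaskContext.hpp>
#include <rtt/extras/SlaveActivity.hpp>

void make_slave(RTT::TaskContext& master, RTT::TaskContext& slave) {
    // The slave has no thread of its own; it executes whenever the master's
    // activity runs. With this PR, the slave's EventPort callbacks are
    // executed by the master right away instead of being deferred to the
    // next slave update() call.
    slave.setActivity(new RTT::extras::SlaveActivity(master.getActivity()));
}
```
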
If you used ExecutionEngine sharing, patch 42a62e8 removed this capability.

Peter Soetens and others added 16 commits April 13, 2015 00:40
This patch is supposed to fix orocos-toolchain#70 in a similar but improved way
to the original proposal.

The main change is the addition of RunnableInterface::work( WorkReason )
and ActivityInterface::timeout(). This allows us to inform the runnable
interface of the reason why it is being executed: because of a TimeOut,
a Trigger, or IOReady. Depending on the reason, the RunnableInterface
subclass can choose to do something different. The ExecutionEngine uses this
to distinguish between message bookkeeping (Trigger) and real work to do
in updateHook() (TimeOut and IOReady).

In addition, when a TaskContext is Running, an EventPort will cause updateHook()
to be called upon new data or, if a callback is attached, the callback will be called instead of updateHook().
This works for periodic and non-periodic *Activity implementations.

Unfortunately, TaskContext::trigger() will forward to ActivityInterface::timeout(),
and the ExecutionEngine will use ActivityInterface::trigger() internally to
do the message/operation bookkeeping. This is confusing on the implementation
side, but keeps the API fully backwards compatible.

This patch implements this mechanism for each Activity kind in RTT.

The side effects are minimal to none. We now always call loop() from Thread,
and this patch does not yet detect Thread overruns (to be implemented in Activity).

Signed-off-by: Peter Soetens <peter@thesourceworks.com>
If no callback is attached, the EventPort triggers updateHook(), if a callback is attached,
this replaces that trigger with the new callback instead.

Signed-off-by: Peter Soetens <peter@thesourceworks.com>
Signed-off-by: Peter Soetens <peter@thesourceworks.com>
It did not call work(reason) on the runner, causing the EE not to be executed,
since the EE only listens for work().

Signed-off-by: Peter Soetens <peter@thesourceworks.com>
Signed-off-by: Peter Soetens <peter@thesourceworks.com>
So a timeout or IOReady also causes callbacks to be executed.

Signed-off-by: Peter Soetens <peter@thesourceworks.com>
Signed-off-by: Peter Soetens <peter@thesourceworks.com>
This will allow us to interpret what is going on in a component
and debug different callback/update cycles.

Signed-off-by: Peter Soetens <peter@thesourceworks.com>
…ty constructors and fixed compiler warnings with -Wunused-result

The missing initializers had no effect as these boolean flags are reset in start() anyway.

Signed-off-by: Johannes Meyer <johannes@intermodalics.eu>
The implementation always did a Trigger before a Timeout,
which caused double cycles.
Renamed mtrigger to mtimeout for consistency.
Made the TaskCore counters attributes.

Signed-off-by: Peter Soetens <peter@thesourceworks.com>
Signed-off-by: Peter Soetens <peter@thesourceworks.com>
In ExecutionEngine::processMessages(), this avoids a mutex lock/unlock in
the common case where there are no messages.

Signed-off-by: Peter Soetens <peter@thesourceworks.com>
Per suggestion of @meyerj, kick out the port queue and
re-use the message queue by scheduling a disposable
interface which executes the callback.

This also unifies the master-slave relation, where the
master will execute the full callback step of each slave,
according to that slave's policy (wrt dataOnPortHook() )

Signed-off-by: Peter Soetens <peter@thesourceworks.com>
Signed-off-by: Peter Soetens <peter@thesourceworks.com>
This feature was ambiguous at best and not tested. It's
assumed now that a TaskCore always has an engine. We
still check for an activity, since in theory it may
not have one.

Signed-off-by: Peter Soetens <peter@thesourceworks.com>
This requires another set of patches to fix that issue.

Signed-off-by: Peter Soetens <peter@thesourceworks.com>
@snrkiwi
Contributor

snrkiwi commented Apr 13, 2015

Patch looks good, and the diagrams look great! This patch makes an enormous positive difference in our ability to run hardware in hard real-time, and to reason about when things will happen on which cycle. Not having state machines and programs execute multiple times per cycle is a huge win, as are the fixes to the master/slave call operation semantics.

This patch is a huge, positive step forward.

@doudou
Contributor

doudou commented Apr 13, 2015

👍 It might break things on our side, but that's really really for a good cause

meyerj and others added 2 commits April 13, 2015 16:39
Broadcasting condition variables without holding the associated mutex can block in some cases.
This patch adds MutexLocks wherever the msg_cond is triggered.

From http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_cond_broadcast.html:
> The pthread_cond_broadcast() or pthread_cond_signal() functions may be called by a thread whether
> or not it currently owns the mutex that threads calling pthread_cond_wait() or pthread_cond_timedwait()
> have associated with the condition variable during their waits; however, if predictable scheduling
> behavior is required, then that mutex shall be locked by the thread calling pthread_cond_broadcast() or
> pthread_cond_signal().

This patch also clears the Thread active flag when Thread::terminate() is called and forces the thread
function to terminate in the destructor of class Activity. This solves some issues with pure virtual
method calls during the destruction phase if threads are still running and triggering each other.

Signed-off-by: Johannes Meyer <johannes@intermodalics.eu>
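
The quoted POSIX wording boils down to the following pattern, shown here with plain pthreads rather than the RTT os wrappers (an assumption of this sketch): change the predicate and broadcast while holding the mutex.

```cpp
#include <pthread.h>

static pthread_mutex_t mtx      = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  msg_cond = PTHREAD_COND_INITIALIZER;
static bool work_done = false;

void signal_done() {
    pthread_mutex_lock(&mtx);            // hold the mutex while changing the
    work_done = true;                    // predicate *and* while broadcasting,
    pthread_cond_broadcast(&msg_cond);   // for predictable scheduling
    pthread_mutex_unlock(&mtx);
}

void wait_done() {
    pthread_mutex_lock(&mtx);
    while (!work_done)                   // guard against spurious wakeups
        pthread_cond_wait(&msg_cond, &mtx);
    pthread_mutex_unlock(&mtx);
}
```
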
Fixed blocking calls to ExecutionEngine::process()
@jmachowinski
Member

Looks good.
Hm, I am wondering: can we misuse the timeout thing on the file descriptor activity to get a behaviour that is periodic and FD-driven?

@snrkiwi
Contributor

snrkiwi commented Apr 22, 2015

On Apr 22, 2015, at 05:49 AM, jmachowinski notifications@github.com wrote:

> Looks good.
> Hm, I am wondering: can we misuse the timeout thing on the file descriptor activity to get a behaviour that is periodic and FD-driven?
What are you actually trying to achieve here? If you simply want to use the FDA to get a periodic activity, then what's the benefit of that over the standard periodic Activity implementation? The timeout does not actually write to the FD; it simply wakes up the activity.
S

@jmachowinski
Member

We had the problem that the task would just stall if no data was coming in on the FD.
In this case we would just enter an exception state.
Ok, I just looked up the API of the FDA. Let's just say we never discovered the timeout
in the past 4 years...

We also had this problem in the reverse direction, if our task is port driven and we don't get
data. In this case we want to enter an exception state if we did not get data on the ports in X ms.
If I did not miss anything, this patch fixes exactly this issue.

@jmachowinski
Member

To clarify this, our use case is a task for a motor driver. We want to enter the exception state if
the hardware did not give us an alive signal for X ms. We also want to enter the exception state (and shut down the motor) if our controller stalled. Until now we always used periodic activities for this,
which delays the processing of the incoming data, and we would like to be FD- and port-driven instead.
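
A rough sketch of such a watchdog, under the assumption that the activity wakes updateHook() periodically or via the FDA timeout so that the silence check actually runs. Timing uses std::chrono and TaskContext::error() stands in for "enter an exception state"; both are illustrative choices, not something prescribed by this PR.

```cpp
#include <rtt/TaskContext.hpp>
#include <rtt/InputPort.hpp>
#include <chrono>
#include <string>

class MotorDriver : public RTT::TaskContext {
public:
    RTT::InputPort<double> command;
    std::chrono::steady_clock::time_point last_sample;
    std::chrono::milliseconds max_silence{100};  // the "X ms" above

    MotorDriver(const std::string& name) : RTT::TaskContext(name) {
        this->ports()->addEventPort("command", command);
    }

    bool startHook() {
        last_sample = std::chrono::steady_clock::now();
        return true;
    }

    void updateHook() {
        double cmd;
        if (command.read(cmd) == RTT::NewData) {
            last_sample = std::chrono::steady_clock::now();
            // drive the motor with cmd ...
        } else if (std::chrono::steady_clock::now() - last_sample > max_silence) {
            // No command/alive signal for too long: stop the hardware and
            // leave the Running state.
            this->error();
        }
    }
};
```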

meyerj added a commit to orocos/rtt_ros_integration that referenced this pull request May 12, 2015
…chain/rtt#91

Signed-off-by: Johannes Meyer <johannes@intermodalics.eu>
@meyerj meyerj added this to the 2.9 milestone Oct 27, 2015
@psoetens
Member Author

This PR is quite ok, except that in commit f266337 we merged the message queue with the port queue. This is a bad idea because there can be cases where your own port callback is called during a 'peer.foo.call()' in your code, which never happened in 2.8, since in 2.8 port callbacks are processed only after all messages are processed.

For merging this into 2.9, we should amend this patch such that the port queue is re-added, but a processPorts() function is created which then executes all callbacks in that queue, instead of the prepareforupdatehook() function. This cleans things up without changing semantics.
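
A purely hypothetical sketch of that ordering; the names processMessages()/processPorts() follow the comment above, while the queue types and everything else are made up for illustration and are not the actual ExecutionEngine code.

```cpp
#include <deque>
#include <functional>

struct EngineStepSketch {
    std::deque<std::function<void()>> messages;    // operation calls etc.
    std::deque<std::function<void()>> port_queue;  // EventPort callbacks

    void processMessages() {
        while (!messages.empty()) {
            messages.front()();    // only messages are drained here, never
            messages.pop_front();  // port callbacks (2.8 semantics)
        }
    }

    void processPorts() {
        while (!port_queue.empty()) {
            port_queue.front()();
            port_queue.pop_front();
        }
    }

    void step() {
        processMessages();  // 1. message/operation bookkeeping
        processPorts();     // 2. port callbacks
        // 3. updateHook() would run here
    }
};
```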

psoetens and others added 3 commits November 19, 2015 16:01
If we want the SM to really execute, a timeout must be given; a trigger
will only process new messages, which leads to a no-op.

Signed-off-by: Peter Soetens <peter@intermodalics.eu>
This reverts commit f266337.

Signed-off-by: Johannes Meyer <johannes@intermodalics.eu>

Conflicts:
	rtt/ExecutionEngine.cpp
	rtt/TaskContext.cpp
This is an improved version of f266337, which solves
the problem that operation calls and port callbacks could be interleaved unexpectedly.

A new method dataOnPortCallback(PortInterface*) has been added to the TaskContext
interface which can be overridden in subclasses and which will be called exactly once
per incoming sample on an EventPort by the ExecutionEngine.

The former internal methods TaskContext::dataOnPortCallback() and TaskContext::dataOnPortRemoved()
for callback management have been renamed to setDataOnPortCallback() and
removeDataOnPortCallback() for clarity.

Signed-off-by: Johannes Meyer <johannes@intermodalics.eu>
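
A sketch of how a component could override that hook. The method name and the once-per-sample behaviour follow this commit message; the exact signature, and whether the hook ends up being called dataOnPortCallback or dataOnPortHook (both names appear in this PR), should be checked against the headers of your RTT version.

```cpp
#include <rtt/TaskContext.hpp>
#include <rtt/InputPort.hpp>
#include <string>

class Filter : public RTT::TaskContext {
public:
    RTT::InputPort<double> in;

    Filter(const std::string& name) : RTT::TaskContext(name) {
        this->ports()->addEventPort("in", in);
    }

protected:
    // Called by the ExecutionEngine exactly once per incoming sample on an
    // EventPort (per the commit message above).
    void dataOnPortCallback(RTT::base::PortInterface* port) {
        double sample;
        if (in.read(sample) == RTT::NewData) {
            // react to the sample immediately
        }
    }
};
```
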
BOOST_ASSERT() is not related to the Boost Unit Test Framework and is only active in Debug builds where
NDEBUG is not defined. The equivalent of gtest's ASSERT_*(...) macros in Boost is BOOST_REQUIRE(...).
That's where the confusion came from.

Signed-off-by: Johannes Meyer <johannes@intermodalics.eu>
hamalMarino pushed a commit to Intermodalics/rtt that referenced this pull request Dec 22, 2017
hamalMarino pushed a commit to Intermodalics/rtt that referenced this pull request Dec 22, 2017
hamalMarino pushed a commit to Intermodalics/rtt that referenced this pull request Dec 26, 2017
…full periods (as before orocos-toolchain#91)

# Conflicts:
#	rtt/Activity.cpp
hamalMarino pushed a commit to Intermodalics/rtt that referenced this pull request Jan 2, 2018
hamalMarino pushed a commit to Intermodalics/rtt that referenced this pull request Jan 2, 2018
hamalMarino pushed a commit to Intermodalics/rtt that referenced this pull request Jan 2, 2018
…full periods (as before orocos-toolchain#91)

# Conflicts:
#	rtt/Activity.cpp
meyerj added 3 commits January 9, 2018 01:43
…/amend comments

Signed-off-by: Johannes Meyer <johannes@intermodalics.eu>
Signed-off-by: Johannes Meyer <johannes@intermodalics.eu>
…policies_toolchain-2.9

Fix Activity default wait period policy and behavior of ORO_WAIT_REL in rtt 2.9
meyerj added a commit that referenced this pull request Jan 16, 2018
…ack-queue into toolchain-2.9

- Merge pull request Intermodalics#2 from fix/activity_sleep_oro_wait_policies_toolchain-2.9:
  Fix Activity default wait period policy and behavior of ORO_WAIT_REL
@meyerj
Member

meyerj commented Jan 16, 2018

Merged into toolchain-2.9 again in 71cd9f6.

meyerj added a commit that referenced this pull request Apr 4, 2019
#91 for the constructor introduced in #74

This constructor was evidently not updated when rebasing the patch from an earlier version to master.
francisco-miguel-almeida pushed a commit that referenced this pull request Apr 5, 2019
#91 for the constructor introduced in #74

This constructor was evidently not updated when rebasing the patch from an earlier version to master.
francisco-miguel-almeida pushed a commit that referenced this pull request Apr 5, 2019
…full periods (as before #91)

# Conflicts:
#	rtt/Activity.cpp
francisco-miguel-almeida pushed a commit that referenced this pull request Apr 9, 2019
#91 for the constructor introduced in #74

This constructor was evidently not updated when rebasing the patch from an earlier version to master.
francisco-miguel-almeida pushed a commit that referenced this pull request Apr 9, 2019
…full periods (as before #91)

# Conflicts:
#	rtt/Activity.cpp
@meyerj
Member

meyerj commented May 8, 2019

Merging into master now. The new behavior is considered stable and has been used in the toolchain-2.9 branch for years now, with a few iterations. Other pull requests currently targeting toolchain-2.9 heavily depend on this one.
