Timer at high rate causes a high CPU usage on armhf #520

Open
lukicdarkoo opened this issue Mar 4, 2020 · 14 comments
Labels
bug Something isn't working

Comments

@lukicdarkoo

I installed ROS2 Eloquent on a Raspberry Pi Zero W and I have issues publishing messages at a high rate (~31 Hz). The problem is high CPU usage, and it is caused by calling a callback via create_timer(): even when the callback function does nothing (pass), the CPU usage is still very high. It also seems that CPU usage rises linearly with the callback rate, but I didn't save the results to support this (CPU usage is low for a timer running at a 0.5 s interval).

Required Info:

  • Operating System:
    Raspbian Buster

  • Installation type:
    from source

  • Version or commit hash:
    42cabefef332c394cfa96f08f1e35fe1cd82e6b8

  • DDS implementation:
    Fast-RTPS

  • Client library (if applicable):
    rclpy

Steps to reproduce issue

For tests I used a simple publisher example given at:
https://raw.githubusercontent.com/ros2/examples/master/rclpy/topics/minimal_publisher/examples_rclpy_minimal_publisher/publisher_member_function.py
with the following modifications:

  • timer interval is set to 32/1000 (32ms) and
  • logger method is deleted (self.get_logger().info('Publishing: "%s"' % msg.data))

Expected behavior

This is the CPU and memory usage of the equivalent C++ example running on the same device:
https://i.imgur.com/N093gQV.png

Actual behavior

The same test for the Python example:
https://i.imgur.com/wO1cmsF.png
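
For anyone who wants numbers rather than screenshots, here is a minimal sketch of a baseline measurement. It does not use rclpy at all: it only measures what a bare sleep-and-call loop costs in plain Python on the same machine, so the rclpy figure has a floor to compare against. The function name measure_timer_loop and its parameters are made up for this sketch.

```python
import time


def measure_timer_loop(period_s, duration_s, callback=lambda: None):
    """Call `callback` every `period_s` seconds for about `duration_s` seconds.

    Returns (calls, cpu_fraction), where cpu_fraction is the CPU time used by
    this process divided by the elapsed wall-clock time.
    """
    wall_start = time.monotonic()
    cpu_start = time.process_time()
    next_deadline = wall_start + period_s
    calls = 0
    while True:
        now = time.monotonic()
        if now - wall_start >= duration_s:
            break
        remaining = next_deadline - now
        if remaining > 0:
            # Sleeping does not consume process CPU time.
            time.sleep(remaining)
        callback()
        calls += 1
        next_deadline += period_s
    wall_elapsed = time.monotonic() - wall_start
    cpu_elapsed = time.process_time() - cpu_start
    return calls, cpu_elapsed / wall_elapsed


if __name__ == "__main__":
    # Same 32 ms period as in the report, with an empty callback.
    calls, fraction = measure_timer_loop(32 / 1000, 5.0)
    print(f"{calls} calls, {fraction:.1%} CPU")
```

If this loop is cheap on the device while the rclpy timer is not, the overhead is more likely in the executor and the layers below it than in Python's ability to wake up at that rate.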

@ros-discourse

This issue has been mentioned on ROS Discourse. There might be relevant details there:

https://discourse.ros.org/t/ros-2-tsc-meeting-minutes-2020-03-18/13313/1

@hidmic
Contributor

hidmic commented Mar 21, 2020

I do not have your exact hardware to reproduce this, nor a 32-bit ARM platform at hand, but I can see the significant gap between rclcpp and rclpy on both amd64 and arm64. I'll try to do some profiling next week to see if there's a clear bottleneck.

You're welcome to contribute as well. Thanks for reporting.

@hidmic
Contributor

hidmic commented Oct 21, 2020

Well, I never circled back here. @lukicdarkoo since Eloquent is going EOL in a month, we may want to focus on Foxy. Have you experienced these issues on Foxy?

@lukicdarkoo
Author

lukicdarkoo commented Oct 21, 2020

I cannot test it on a Raspberry Pi right now, but it seems to work OK on a PC:

import rclpy
from rclpy.node import Node


PUBLISH_PERIOD = 1e-3


class Test(Node):
    def __init__(self):
        super().__init__('test')
        self.timer = self.create_timer(PUBLISH_PERIOD, self.timer_callback)

    def timer_callback(self):
        pass


def main(args=None):
    rclpy.init(args=args)
    test = Test()
    rclpy.spin(test)
    test.destroy_node()
    rclpy.shutdown()


if __name__ == '__main__':
    main()

CPU reaches 100% utilization (on a PC) only when PUBLISH_PERIOD = 1e-4 (10 kHz), which seems reasonable, I guess.

@hidmic
Contributor

hidmic commented Oct 21, 2020

Will you be able to test on armhf any time soon? I'm inclined to close this issue, but I'd like to be sure it also works OK on the originally reported architecture.

@lukicdarkoo
Author

lukicdarkoo commented Oct 25, 2020

I have just tested on armhf with Foxy, and at 30 Hz the CPU utilization is around 30%. So the performance is still the same.

From my point of view, that is a big overhead. If you need a third of the CPU just to execute an empty callback at 30 Hz, it will be a "no go" for most use cases. Often we need an even higher rate and more timers. For example, we implemented our ROS2 nodes in C++ because we quickly reached full CPU utilization in Python (mostly caused by timers).
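
As a rough back-of-the-envelope check of those figures, the sketch below turns a utilization and a rate into an average CPU cost per timer tick. It assumes that CPU usage scales linearly with the callback rate, which the thread suggests but does not confirm, and it reuses the two data points reported here (about 30% at 30 Hz on armhf, 100% at 10 kHz on a PC). The function names are illustrative only.

```python
def cpu_time_per_callback(utilization, rate_hz):
    """Average CPU seconds per timer tick, assuming usage is linear in rate."""
    if rate_hz <= 0:
        raise ValueError("rate_hz must be positive")
    return utilization / rate_hz


def max_rate_hz(cost_per_callback_s, cpu_budget=1.0):
    """Highest tick rate that fits into `cpu_budget` (1.0 means one full core)."""
    if cost_per_callback_s <= 0:
        raise ValueError("cost_per_callback_s must be positive")
    return cpu_budget / cost_per_callback_s


# Data points taken from the comments in this thread.
armhf_cost = cpu_time_per_callback(0.30, 30.0)      # roughly 10 ms per tick
pc_cost = cpu_time_per_callback(1.00, 10_000.0)     # roughly 100 us per tick

if __name__ == "__main__":
    print(f"armhf: {armhf_cost * 1e3:.1f} ms per tick")
    print(f"PC:    {pc_cost * 1e6:.1f} us per tick")
    print(f"armhf ceiling: about {max_rate_hz(armhf_cost):.0f} Hz on one core")
```

Under that linear assumption, a single empty timer would saturate the Pi Zero W at around 100 Hz, and each tick costs on the order of a hundred times more CPU time than on the PC.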

However, I am not sure how many users out there are running Python ROS2 nodes on armhf. Maybe that is a reason not to put too much weight on this issue:
https://discourse.ros.org/t/potential-downgrade-of-arm32-support-to-tier-3/14136/2

If you give me some pointers I may try to debug it, but I can't guarantee anything, as I am constrained by time and we are not actively developing the project on the armhf platform anymore.

@hidmic
Contributor

hidmic commented Nov 10, 2020

> If you give me some pointers I may try to debug it, but I can't guarantee anything, as I am constrained by time and we are not actively developing the project on the armhf platform anymore.

I see. I'd look into the rclpy.executors.Executor implementation, specifically rclpy.executors.Executor._wait_for_ready_callbacks(), and what follows going down the stack (rclpy C extensions, rcl, rmw). The fact that it doesn't show up in C++ may suggest an issue with rclpy itself, though.
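
One way to start on those pointers is Python's built-in profiler. The helper below is a minimal sketch using only the standard library: it runs any callable under cProfile and returns a report sorted by cumulative time. The names profile_callable and busy_work are invented for this example, and busy_work is just a stand-in workload so the snippet runs without ROS.

```python
import cProfile
import io
import pstats


def profile_callable(func, *args, top=15, **kwargs):
    """Run `func` under cProfile and return (result, report_text).

    The report lists the `top` entries sorted by cumulative time.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


def busy_work(n):
    """Stand-in workload; replace with the code under investigation."""
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    _, report = profile_callable(busy_work, 100_000)
    print(report)
```

In a sourced ROS 2 environment, the callable could be a small function that calls rclpy.spin_once(node, timeout_sec=...) in a loop for a fixed number of iterations; frames from rclpy.executors should then appear in the report. cProfile only attributes time at the Python level, so anything spent deeper in rcl or rmw would need a native profiler.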

@hidmic added the bug (Something isn't working) label on Mar 9, 2021

@zdzhaoyong

zdzhaoyong commented Mar 8, 2022

CPU usage is high even for an empty node with no timers. When we use rosbridge with ROS2, the CPU usage is dramatically high (over 100%), while with ROS1 it is OK (below 20%). Please fix this bug, please...

@Barry-Xu-2018
Contributor

> CPU usage is high even for an empty node with no timers. When we use rosbridge with ROS2, the CPU usage is dramatically high (over 100%), while with ROS1 it is OK (below 20%). Please fix this bug, please...

Which version of ROS2 are you using?
rclpy has been refactored on top of pybind11, so I'm not sure whether this problem still exists in the latest version. If possible, please try the latest version.

@zdzhaoyong

I was using the default rclpy that ships with ROS Foxy.
I'm just trying to compile the latest version of rclpy; does it work with Foxy?

@Barry-Xu-2018
Contributor

> I'm just trying to compile the latest version of rclpy; does it work with Foxy?

Do you mean you want to use the latest rclpy code with Foxy? If so, I think the build will fail.
Is it possible to try Galactic or Rolling in your environment?

@fujitatomoya
Collaborator

> CPU usage is high even for an empty node with no timers. When we use rosbridge with ROS2, the CPU usage is dramatically high (over 100%), while with ROS1 it is OK (below 20%).

This seems to be a different problem. It would be nice to create a dedicated issue for it. If the problem can be observed when using ros1_bridge, that repository would be the place to create the issue.

@zdzhaoyong

> > I'm just trying to compile the latest version of rclpy; does it work with Foxy?
>
> Do you mean you want to use the latest rclpy code with Foxy? If so, I think the build will fail. Is it possible to try Galactic or Rolling in your environment?

I reimplemented the rosbridge functionality in C++, and now the problem is solved. I hope I can share the code if the company agrees.


6 participants