Upper_body_detection segfault, memory corruption #10

Closed
srinimd2005 opened this issue Jun 17, 2016 · 8 comments

@srinimd2005

srinimd2005 commented Jun 17, 2016

Hi, I managed to run the SPENCER tracking after a few more modifications, but while tracking is running I sometimes get the error below.
Can someone please help me resolve it? It happens only with upper_body_detection, not with HOG-based tracking.

*** Error in `/home/ezc2x-ros/cat_ws/devel/lib/rwth_upper_body_detector/upper_body_detector': double free or corruption (!prev): 0x0000000002815770 ***
[spencer/perception_internal/people_detection/rgbd_front_top/upper_body_detector-32] process has died [pid 18968, exit code -6, cmd /home/ezc2x-ros/cat_ws/devel/lib/rwth_upper_body_detector/upper_body_detector __name:=upper_body_detector __log:=/home/ezc2x-ros/.ros/log/e03e4c96-347d-11e6-a1de-0022156bbb6b/spencer-perception_internal-people_detection-rgbd_front_top-upper_body_detector-32.log].
log file: /home/ezc2x-ros/.ros/log/e03e4c96-347d-11e6-a1de-0022156bbb6b/spencer-perception_internal-people_detection-rgbd_front_top-upper_body_detector-32*.log
@srinimd2005
Author

srinimd2005 commented Jun 20, 2016

When I run the node in gdb, I see crashes like the ones below. Can someone give me a clue on how to resolve the bug?

The program being debugged has been started already.
Start it from the beginning? (y or n) 
Starting program: /home/ezc2x-ros/cat_ws/devel/lib/rwth_upper_body_detector/upper_body_detector __name:=upper_body_detector __log:=/home/ezc2x-ros/.ros/log/388125ca-36e7-11e6-9336-0022156bbb6b/spencer-perception_internal-people_detection-rgbd_front_top-upper_body_detector-32.log
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7fffdc230700 (LWP 18986)]
[New Thread 0x7fffdafd8700 (LWP 18988)]
[New Thread 0x7fffda7d7700 (LWP 18989)]
[New Thread 0x7fffd9fd6700 (LWP 18990)]
[New Thread 0x7fffd97d5700 (LWP 18995)]

Program received signal SIGSEGV, Segmentation fault.
_int_malloc (av=0x7ffff505b760 <main_arena>, bytes=136488) at malloc.c:3629
3629    malloc.c: No such file or directory.
The program being debugged has been started already.
Start it from the beginning? (y or n) 
Starting program: /home/ezc2x-ros/cat_ws/devel/lib/rwth_upper_body_detector/upper_body_detector __name:=upper_body_detector __log:=/home/ezc2x-ros/.ros/log/388125ca-36e7-11e6-9336-0022156bbb6b/spencer-perception_internal-people_detection-rgbd_front_top-upper_body_detector-32.log
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7fffdc230700 (LWP 19131)]
[New Thread 0x7fffdafd8700 (LWP 19133)]
[New Thread 0x7fffda7d7700 (LWP 19134)]
[New Thread 0x7fffd9fd6700 (LWP 19135)]
[New Thread 0x7fffd97d5700 (LWP 19140)]

Program received signal SIGABRT, Aborted.
0x00007ffff4cd3c37 in __GI_raise (sig=sig@entry=6)
    at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56  ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
The program being debugged has been started already.
Start it from the beginning? (y or n) 
Starting program: /home/ezc2x-ros/cat_ws/devel/lib/rwth_upper_body_detector/upper_body_detector __name:=upper_body_detector __log:=/home/ezc2x-ros/.ros/log/388125ca-36e7-11e6-9336-0022156bbb6b/spencer-perception_internal-people_detection-rgbd_front_top-upper_body_detector-32.log
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7fffdc230700 (LWP 19264)]
[New Thread 0x7fffdafd8700 (LWP 19266)]
[New Thread 0x7fffd27d7700 (LWP 19267)]
[New Thread 0x7fffda7d7700 (LWP 19268)]
[New Thread 0x7fffd9fd6700 (LWP 19273)]

Program received signal SIGSEGV, Segmentation fault.
0x000000000047892e in Detector::ExtractPointsInROIs(Vector<ROI>&, Matrix<int> const&, Matrix<int> const&, PointCloud const&, Matrix<int> const&) ()
No symbol "all" in current context.

Thread 6 (Thread 0x7fffd9fd6700 (LWP 19273)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
No locals.
#1  0x00007ffff707fd25 in bool boost::condition_variable::timed_wait<boost::date_time::subsecond_duration<boost::posix_time::time_duration, 1000000l> >(boost::unique_lock<boost::mutex>&, boost::date_time::subsecond_duration<boost::posix_time::time_duration, 1000000l> const&) () from /opt/ros/indigo/lib/libroscpp.so
No symbol table info available.
#2  0x00007ffff707dcad in ros::CallbackQueue::callAvailable(ros::WallDuration)
    () from /opt/ros/indigo/lib/libroscpp.so
No symbol table info available.
#3  0x00007ffff70ad9e4 in ros::internalCallbackQueueThreadFunc() ()
   from /opt/ros/indigo/lib/libroscpp.so
No symbol table info available.
#4  0x00007ffff2b8aa4a in ?? ()
   from /usr/lib/x86_64-linux-gnu/libboost_thread.so.1.54.0
No symbol table info available.
#5  0x00007ffff653d184 in start_thread (arg=0x7fffd9fd6700)
    at pthread_create.c:312
        __res = <optimized out>
        pd = 0x7fffd9fd6700
Quit
The program being debugged has been started already.
Start it from the beginning? (y or n) 
Starting program: /home/ezc2x-ros/cat_ws/devel/lib/rwth_upper_body_detector/upper_body_detector __name:=upper_body_detector __log:=/home/ezc2x-ros/.ros/log/388125ca-36e7-11e6-9336-0022156bbb6b/spencer-perception_internal-people_detection-rgbd_front_top-upper_body_detector-32.log
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7fffdc230700 (LWP 19464)]
[New Thread 0x7fffdafd8700 (LWP 19466)]
[New Thread 0x7fffda7d7700 (LWP 19467)]
[New Thread 0x7fffd9fd6700 (LWP 19468)]
[New Thread 0x7fffd97d5700 (LWP 19473)]

Program received signal SIGSEGV, Segmentation fault.
_int_malloc (av=0x7ffff505b760 <main_arena>, bytes=136488) at malloc.c:3629
3629    malloc.c: No such file or directory.

Thread 6 (Thread 0x7fffd97d5700 (LWP 19473)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
No locals.
#1  0x00007ffff707fd25 in bool boost::condition_variable::timed_wait<boost::date_time::subsecond_duration<boost::posix_time::time_duration, 1000000l> >(boost::unique_lock<boost::mutex>&, boost::date_time::subsecond_duration<boost::posix_time::time_duration, 1000000l> const&) () from /opt/ros/indigo/lib/libroscpp.so
No symbol table info available.
#2  0x00007ffff707dcad in ros::CallbackQueue::callAvailable(ros::WallDuration)
    () from /opt/ros/indigo/lib/libroscpp.so
No symbol table info available.
#3  0x00007ffff70ad9e4 in ros::internalCallbackQueueThreadFunc() ()
   from /opt/ros/indigo/lib/libroscpp.so
No symbol table info available.
#4  0x00007ffff2b8aa4a in ?? ()
   from /usr/lib/x86_64-linux-gnu/libboost_thread.so.1.54.0
No symbol table info available.
#5  0x00007ffff653d184 in start_thread (arg=0x7fffd97d5700)
    at pthread_create.c:312
        __res = <optimized out>
        pd = 0x7fffd97d5700
Quit
The program being debugged has been started already.
Start it from the beginning? (y or n) 
Starting program: /home/ezc2x-ros/cat_ws/devel/lib/rwth_upper_body_detector/upper_body_detector __name:=upper_body_detector __log:=/home/ezc2x-ros/.ros/log/388125ca-36e7-11e6-9336-0022156bbb6b/spencer-perception_internal-people_detection-rgbd_front_top-upper_body_detector-32.log
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7fffdc230700 (LWP 19597)]
[New Thread 0x7fffdafd8700 (LWP 19599)]
[New Thread 0x7fffda7d7700 (LWP 19600)]
[New Thread 0x7fffd9fd6700 (LWP 19601)]
[New Thread 0x7fffd97d5700 (LWP 19606)]

Program received signal SIGABRT, Aborted.
0x00007ffff4cd3c37 in __GI_raise (sig=sig@entry=6)
    at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56  ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.

Thread 6 (Thread 0x7fffd97d5700 (LWP 19606)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
No locals.
#1  0x00007ffff707fd25 in bool boost::condition_variable::timed_wait<boost::date_time::subsecond_duration<boost::posix_time::time_duration, 1000000l> >(boost::unique_lock<boost::mutex>&, boost::date_time::subsecond_duration<boost::posix_time::time_duration, 1000000l> const&) () from /opt/ros/indigo/lib/libroscpp.so
No symbol table info available.
#2  0x00007ffff707dcad in ros::CallbackQueue::callAvailable(ros::WallDuration)
    () from /opt/ros/indigo/lib/libroscpp.so
No symbol table info available.
#3  0x00007ffff70ad9e4 in ros::internalCallbackQueueThreadFunc() ()
   from /opt/ros/indigo/lib/libroscpp.so
No symbol table info available.
#4  0x00007ffff2b8aa4a in ?? ()
   from /usr/lib/x86_64-linux-gnu/libboost_thread.so.1.54.0
No symbol table info available.
#5  0x00007ffff653d184 in start_thread (arg=0x7fffd97d5700)
    at pthread_create.c:312
        __res = <optimized out>
        pd = 0x7fffd97d5700
Quit

@tlind
Member

tlind commented Jun 22, 2016

This is a known issue that we also experienced a few times, but we currently cannot pinpoint the exact root cause (@lucasb-eyer). Setting respawn='true' in your launch file is a temporary workaround.

@tlind changed the title from "Upper_body_detection" to "Upper_body_detection segfault, memory corruption" on Jun 22, 2016
@lucasb-eyer

I have edited your issue to fix the formatting. Please format logs and code in the future so that the issue is readable at all!

Yes, we've had this issue for years. I've tried to figure it out multiple times in the past, to no avail. So yes, the best recommendation we can give is to set respawn='true' as @tlind suggested.

I've managed to boil the problem down to a double-free of an ImagePtr, which supposedly should never happen since it's a smart pointer. I was never able to track down the first instance of it being freed, nor was I able to determine under which conditions it crashes. Also, all the safeguarding I have tried failed. I gave up on this error.
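
For readers wondering how a smart pointer can be released twice at all: one common way this happens in C++ is that two shared pointers are constructed independently from the same raw pointer, so each one keeps its own reference count. This is only a generic sketch of that pattern, not a diagnosis of this bug. `Image`, `CountingDeleter` and both function names are hypothetical and are not taken from the detector's code. The deleter here only counts calls instead of freeing memory, so the example shows the double release without actually corrupting the heap.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for an image type; not the detector's real class.
struct Image {
    int width = 0;
    int height = 0;
};

// Deleter that only counts how often it is invoked. With the default
// deleter, the second invocation would be a real double free.
struct CountingDeleter {
    int* calls;
    void operator()(Image*) const { ++*calls; }
};

// Buggy pattern: two shared_ptrs built independently from one raw pointer.
// Each gets its own control block, so each releases the object once.
int releasesWithSeparateOwners() {
    int calls = 0;
    Image img;  // stack object; the counting deleter never frees it
    {
        std::shared_ptr<Image> a(&img, CountingDeleter{&calls});
        std::shared_ptr<Image> b(&img, CountingDeleter{&calls});
        assert(a.use_count() == 1 && b.use_count() == 1);
    }
    return calls;  // the object was released twice
}

// Correct pattern: the second owner is a copy of the first, so both share
// one control block and the object is released exactly once.
int releasesWithSharedOwner() {
    int calls = 0;
    Image img;
    {
        std::shared_ptr<Image> a(&img, CountingDeleter{&calls});
        std::shared_ptr<Image> b = a;
        assert(a.use_count() == 2);
    }
    return calls;  // the object was released once
}
```

Calling `releasesWithSeparateOwners()` returns 2 and `releasesWithSharedOwner()` returns 1. Checking `use_count()` at suspicious hand-over points is one cheap way to look for this pattern, assuming the pointer type in question exposes it.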

@srinimd2005
Author

Thank you for your help. In my case the problem seems to start as soon as I move my Kinect (Xbox 360). For now I have switched from the ground plane fixed node to the ground plane really fixed node. It seems to be a little more stable now, but I still cannot move my Kinect while the RGB-D detector is running.

@lucasb-eyer

Oh, that sounds very different though. The bug that's haunting me is completely unrelated to movement, so maybe this is not what I thought it was.

You should compile in debug mode, and also comment out this line, where -O3 was hardcoded (don't ask why 😄) in order to get useful stack traces. Then, run it in gdb again, and see if it always crashes at the same code line. When you hit a crash in gdb, please also type bt and show us the output, as well as both info locals and info args. Maybe we can see what goes wrong, but I don't have too much hope as I'm not all too familiar with the code and the person who wrote it has since moved on.

@rentt

rentt commented Jul 29, 2016

Pull request #12 has fixed some bugs. The upper_body_detector has run stably for 4 hours in my testing. I am using a Kinect One.

@tlind
Member

tlind commented Aug 22, 2020

I assume that this bug has been resolved through the merge of PR #59, therefore I'll close this issue.

@tlind closed this as completed on Aug 22, 2020
@lucasb-eyer

image
