occasional segfaults in plua_gc_unreg
#1
Occasionally pilight (staging) crashes with a segfault. It can take days or even weeks before the next segfault occurs, so it is not easy to catch. My PuTTY sessions are usually dead long before then, so I attached a monitor to the Pi, ran pilight under gdb without PuTTY, and finally caught one.

It was in plua_gc_unreg in lua.c, at this statement:

Code:
   if(state->gc.list[i]->ptr == ptr) {

While I was looking for a place to add debug logging in lua.c to help find the cause, I noticed by chance that plua_clear_state has a mutex_unlock but no matching mutex_lock.

I guess that a missing mutex_lock may well be causing such rare segfaults.
 
#2
I would love to see that backtrace. The mutex is locked in plua_get_free_state.
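
As a rough illustration of that pairing (a sketch with hypothetical names and a plain pthread mutex, not the actual pilight code): one function hands out a state with its mutex already held, so the clear function only has to unlock it.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a small pool of Lua states; not the pilight structs. */
struct demo_state {
    pthread_mutex_t lock;
    int in_use;
};

#define DEMO_NR_STATES 4
static struct demo_state demo_states[DEMO_NR_STATES];
static pthread_once_t demo_once = PTHREAD_ONCE_INIT;

/* One-time initialisation of every mutex in the pool. */
static void demo_init(void) {
    for (int i = 0; i < DEMO_NR_STATES; i++) {
        pthread_mutex_init(&demo_states[i].lock, NULL);
        demo_states[i].in_use = 0;
    }
}

/* Acquire: returns a state with its mutex held, or NULL if all are busy. */
struct demo_state *demo_get_free_state(void) {
    pthread_once(&demo_once, demo_init);
    for (int i = 0; i < DEMO_NR_STATES; i++) {
        if (pthread_mutex_trylock(&demo_states[i].lock) == 0) {
            demo_states[i].in_use = 1;
            return &demo_states[i];
        }
    }
    return NULL;
}

/* Release: the matching unlock for the lock taken in demo_get_free_state. */
void demo_clear_state(struct demo_state *s) {
    s->in_use = 0;
    pthread_mutex_unlock(&s->lock);
}
```

With this layout an unlock without a visible lock in the same function is expected, as long as every caller obtained the state through the acquire function first.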

Also let me know what exactly fails: is ptr NULL, is the index i out of range, or is the state itself invalid?

I did see another point of interest, which I fixed.
 
#3
I caught the segfault! This was without the fix you made.
I will now apply that fix and try again.

Code:
[Nov 07 00:25:21:273059] pilight-daemon: DEBUG: plua_gc_unreg state->gc.nr=1, i=0, ptr=0xb4e9f520
[Nov 07 00:25:21:386240] pilight-daemon: DEBUG: rule #134 flowlog was parsed in 0.002803 seconds
[Nov 07 00:25:21:399433] pilight-daemon: DEBUG: rule #135 moistlog was parsed in 0.003315 seconds
[Nov 07 00:25:21:409439] pilight-daemon: DEBUG: rule #136 bewateringstatus_request_a was parsed in 0.000245 seconds

Thread 1 "pilight-daemon" received signal SIGSEGV, Segmentation fault.
0xb681b894 in plua_gc_unreg (L=0xb63761c0, ptr=0xb4e9f520) at /home/pi/pilight/libs/pilight/lua_c/lua.c:1401
1401                    if(state->gc.list[i]->ptr == ptr) {
(gdb) backtrace
#0  0xb681b894 in plua_gc_unreg (L=0xb63761c0, ptr=0xb4e9f520) at /home/pi/pilight/libs/pilight/lua_c/lua.c:1401
#1  0xb68134e4 in thread_free (req=0xb4e12338, status=0) at /home/pi/pilight/libs/pilight/lua_c/async.c:58
#2  0xb6779c88 in uv__queue_done (w=0xb4e12364, err=0) at /home/pi/pilight/libs/libuv/threadpool.c:318
#3  0xb6779b1c in uv__work_done (handle=0xb6fa8120 <default_loop_struct+96>) at /home/pi/pilight/libs/libuv/threadpool.c:293
#4  0xb677d058 in uv__async_io (loop=0xb6fa80c0 <default_loop_struct>, w=0xb6fa81b0 <default_loop_struct+240>, events=1)
   at /home/pi/pilight/libs/libuv/unix/async.c:118
#5  0xb67877d4 in uv__io_poll (loop=0xb6fa80c0 <default_loop_struct>, timeout=893) at /home/pi/pilight/libs/libuv/unix/linux-core.c:400
#6  0xb677de88 in uv_run (loop=0xb6fa80c0 <default_loop_struct>, mode=UV_RUN_DEFAULT) at /home/pi/pilight/libs/libuv/unix/core.c:362
#7  0x0001cde4 in main (argc=2, argv=0xbefff7c4) at /home/pi/pilight/daemon.c:3470
(gdb) frame 0
#0  0xb681b894 in plua_gc_unreg (L=0xb63761c0, ptr=0xb4e9f520) at /home/pi/pilight/libs/pilight/lua_c/lua.c:1401
1401                    if(state->gc.list[i]->ptr == ptr) {
(gdb) frame 1
#1  0xb68134e4 in thread_free (req=0xb4e12338, status=0) at /home/pi/pilight/libs/pilight/lua_c/async.c:58
58                      plua_gc_unreg(lua_thread->L, lua_thread);
(gdb) frame 2
#2  0xb6779c88 in uv__queue_done (w=0xb4e12364, err=0) at /home/pi/pilight/libs/libuv/threadpool.c:318
318       req->after_work_cb(req, err);
(gdb) frame 3
#3  0xb6779b1c in uv__work_done (handle=0xb6fa8120 <default_loop_struct+96>) at /home/pi/pilight/libs/libuv/threadpool.c:293
293         w->done(w, err);
(gdb) frame 4
#4  0xb677d058 in uv__async_io (loop=0xb6fa80c0 <default_loop_struct>, w=0xb6fa81b0 <default_loop_struct+240>, events=1)
   at /home/pi/pilight/libs/libuv/unix/async.c:118
118         h->async_cb(h);
(gdb) frame 5
#5  0xb67877d4 in uv__io_poll (loop=0xb6fa80c0 <default_loop_struct>, timeout=893) at /home/pi/pilight/libs/libuv/unix/linux-core.c:400
400               w->cb(loop, w, pe->events);
(gdb) frame 6
#6  0xb677de88 in uv_run (loop=0xb6fa80c0 <default_loop_struct>, mode=UV_RUN_DEFAULT) at /home/pi/pilight/libs/libuv/unix/core.c:362
362         uv__io_poll(loop, timeout);
(gdb) frame 7
#7  0x0001cde4 in main (argc=2, argv=0xbefff7c4) at /home/pi/pilight/daemon.c:3470
3470                    uv_run(uv_default_loop(), UV_RUN_DEFAULT);

The first line is logged right before the statement where it crashes:

Code:
logprintf(LOG_DEBUG, "plua_gc_unreg state->gc.nr=%i, i=%i, ptr=%p", state->gc.nr, i, ptr);
if(state->gc.list[i]->ptr == ptr) {

Because "state->gc.list[i]->ptr" was the likely cause of the segfault, I didn't put it in the debug log; otherwise it would have crashed in the logprintf statement instead.
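
A minimal sketch of how each part of that expression could be checked before it is dereferenced, so the log can say which part is bad. The types and names below are hypothetical and only loosely modelled on the fields in the statement above; they are not the pilight definitions.

```c
#include <stddef.h>

/* Hypothetical types; only the field names mirror the expression above. */
struct demo_gc_item {
    void *ptr;
};

struct demo_gc {
    struct demo_gc_item **list;
    int nr;
};

enum demo_gc_status {
    DEMO_GC_OK,         /* gc->list[i]->ptr can be read */
    DEMO_GC_NULL,       /* gc itself is NULL */
    DEMO_GC_LIST_NULL,  /* gc->list is NULL */
    DEMO_GC_BAD_INDEX,  /* i is outside 0..nr-1 */
    DEMO_GC_SLOT_NULL   /* gc->list[i] is NULL */
};

/* Reports the first problem that would make gc->list[i]->ptr unsafe to read. */
enum demo_gc_status demo_gc_check(const struct demo_gc *gc, int i) {
    if (gc == NULL) return DEMO_GC_NULL;
    if (gc->list == NULL) return DEMO_GC_LIST_NULL;
    if (i < 0 || i >= gc->nr) return DEMO_GC_BAD_INDEX;
    if (gc->list[i] == NULL) return DEMO_GC_SLOT_NULL;
    return DEMO_GC_OK;
}
```

Note that checks like these only catch NULL pointers and bad indexes. A pointer to memory that was already freed by another thread still looks valid here, so this narrows the cause down but cannot rule out a race.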
 
#4
Good to see that backtrace. What I am missing, however, are the actual values of the failing variables. So, with the next backtrace, please post the output of:
Code:
print state
print state->gc
print state->gc.list
print i
print state->gc.list[i]
print state->gc.list[i]->free
print state->gc.list[i]->ptr
print ptr
 
#5
Can you try my latest fix and, if it segfaults again, post the values requested in my previous post?
 
#6
Any news?
 
#7
Well, yes, but I didn't have the time to dive into it.
I got a different segfault and I left it in that state. Hopefully I can get the backtrace and the related values and post them tomorrow.
 
#8
As I understand it, you have a separate device running pilight to test its stability. Is there a way I can reproduce that setup locally so I can help search myself? In my setup everything is just stable.
 
#9
Can you please move the HTTP errors to the newly split-off thread?
https://forum.pilight.org/showthread.php?tid=3485
 
  

