[uclibc-ng-devel] bug: stdio funcs, thread cancellation with futex hold

Yann Sionneau ysionneau at kalray.eu
Thu Sep 26 17:06:51 CEST 2019


Hello,

I would like to know whether there is consensus that the deadlock this 
program runs into is a uClibc-ng bug: https://pastebin.com/11qLsTW5

I've asked people to try this on several arch/libc combinations. It works 
well (no hang) on:

* x86_64/glibc

* x86_64/musl

* armv7l/musl

I've also run the puts.c test (linked above) on 3 archs with uClibc-ng. 
It fails (hangs) on:

* or1k/uClibc-ng

* mips32r6/uClibc-ng

* k1c/uClibc-ng (port not completely published yet)

See failure logs + strace -f of what happens when hanging: 
https://mypads.framapad.org/mypads/?/mypads/group/uclibc-ng-10be37ap/pad/view/puts-issue-qt62e74n


My understanding of the issue is that:

puts(), and possibly other libc functions, take a lock ( 
https://elixir.bootlin.com/uclibc-ng/latest/source/libc/stdio/puts.c#L17 
) and end up calling write(), which is a cancellation point ( 
https://elixir.bootlin.com/uclibc-ng/latest/source/libc/stdio/_stdio.h#L150 
).

So, if a thread is canceled in deferred mode (which is the default 
one), and the cancellation is acted upon at the write() inside puts(), 
then the thread will unwind and exit without unlocking the puts lock.

Then, any other thread calling puts() will hang indefinitely (and hang 
other threads in turn if it hangs with locks held...).
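
A minimal sketch of the mechanism described above, written against a 
plain pthread mutex rather than the real stdio lock. All names here 
(demo_lock, leaky_writer, leaked_lock_state) are hypothetical, pause() 
only stands in for write() as a cancellation point, and the thread is 
left at its default cancelability settings:

```c
#include <assert.h>
#include <errno.h>      /* EBUSY, for callers checking the result */
#include <pthread.h>
#include <semaphore.h>
#include <unistd.h>

/* Hypothetical stand-in for the lock puts() takes; not the real stdio lock. */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static sem_t demo_lock_held;

/* Mimics the puts() pattern: take a lock, then enter a cancellation point. */
void *leaky_writer(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&demo_lock);
    sem_post(&demo_lock_held);          /* tell the caller the lock is held */
    pause();                            /* cancellation point standing in for write() */
    pthread_mutex_unlock(&demo_lock);   /* skipped when the thread is cancelled */
    return NULL;
}

/* Cancels the writer while it holds the lock and returns what
 * pthread_mutex_trylock() reports afterwards. */
int leaked_lock_state(void)
{
    pthread_t thread;
    int rc;

    sem_init(&demo_lock_held, 0, 0);
    rc = pthread_create(&thread, NULL, leaky_writer, NULL);
    assert(rc == 0);
    sem_wait(&demo_lock_held);          /* writer now owns demo_lock */
    pthread_cancel(thread);
    pthread_join(thread, NULL);         /* writer is gone at this point */
    sem_destroy(&demo_lock_held);
    return pthread_mutex_trylock(&demo_lock);
}
```

Since nothing releases demo_lock when the writer is cancelled, I would 
expect leaked_lock_state() to return EBUSY; a blocking 
pthread_mutex_lock() at the same place would never return, which is the 
hang seen with puts().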

My understanding of what can be done to fix this issue:

1/ Either make puts a non-cancellation point (see man 7 pthreads: puts is 
not in the list of mandatory cancellation points, only in the "may be" 
list). For instance, it is not a cancellation point in glibc.

2/ Or keep puts as a cancellation point and fix the puts code so that it 
releases the lock upon cancellation (using pthread_cleanup_push/pop for 
instance)
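
To make the two options concrete, here is a rough sketch on a plain 
pthread mutex. It is not the uClibc-ng code and not the content of the 
pastebin patches; every name in it (stream_lock, writer_cancel_disabled, 
writer_with_cleanup, trylock_after_cancel) is made up for illustration. 
Option 1 is shown under the assumption that "not a cancellation point" 
is obtained by disabling cancellation while the lock is held, which is 
only one possible way to do it:

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <unistd.h>

/* Hypothetical stand-in for the stream lock; not uClibc-ng's real lock. */
static pthread_mutex_t stream_lock = PTHREAD_MUTEX_INITIALIZER;
static sem_t lock_taken;   /* posted by the writer once it holds the lock */
static sem_t cancel_sent;  /* posted by the caller after pthread_cancel() */

/* Option 1 (one reading of it): no cancellation while the lock is held. */
void *writer_cancel_disabled(void *arg)
{
    int old_state;

    (void)arg;
    pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &old_state);
    pthread_mutex_lock(&stream_lock);
    sem_post(&lock_taken);
    sem_wait(&cancel_sent);         /* stands in for write(); cannot cancel here */
    pthread_mutex_unlock(&stream_lock);
    pthread_setcancelstate(old_state, NULL);
    pthread_testcancel();           /* pending request acted upon, lock already free */
    return NULL;
}

static void unlock_stream(void *mutex)
{
    pthread_mutex_unlock(mutex);
}

/* Option 2: stay cancellable, but register a handler that drops the lock. */
void *writer_with_cleanup(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&stream_lock);
    pthread_cleanup_push(unlock_stream, &stream_lock);
    sem_post(&lock_taken);
    pause();                        /* cancellation point standing in for write() */
    pthread_cleanup_pop(1);         /* normal path: pop and run the handler */
    return NULL;
}

/* Cancels `writer` while it holds the lock, then reports whether the lock
 * can be taken again: 0 means it was released, EBUSY means it leaked. */
int trylock_after_cancel(void *(*writer)(void *))
{
    pthread_t thread;
    int rc;

    sem_init(&lock_taken, 0, 0);
    sem_init(&cancel_sent, 0, 0);
    rc = pthread_create(&thread, NULL, writer, NULL);
    assert(rc == 0);
    sem_wait(&lock_taken);          /* writer now owns stream_lock */
    pthread_cancel(thread);
    sem_post(&cancel_sent);         /* only used by the option 1 writer */
    pthread_join(thread, NULL);

    rc = pthread_mutex_trylock(&stream_lock);
    if (rc == 0)
        pthread_mutex_unlock(&stream_lock);
    sem_destroy(&lock_taken);
    sem_destroy(&cancel_sent);
    return rc;
}
```

With either writer, trylock_after_cancel() should return 0, i.e. the 
lock is usable again after the cancellation. The difference is where the 
thread dies: with option 1 the cancellation is delayed until the lock 
has been released, with option 2 the thread is cancelled inside the 
blocking call and the cleanup handler releases the lock during unwinding.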

In case people think this is indeed a bug, here are examples of code 
fixes that I have in mind. Please don't hesitate to comment and/or 
propose something else:

1/ https://pastebin.com/ePsWJzdi

2/ https://pastebin.com/5EA4RedS


One problem with those fixes is that we need to identify all libc 
functions that take a lock and then call a cancellable function, and 
apply this kind of fix to each of them... This is not easy and a bit 
painful.


Thanks!

Regards,

-- 

Yann Sionneau


