We have stumbled upon a problem with the wifi handling in legato that I still haven't been able to pinpoint in its entirety. I hope that someone here will be able to help me figure out what the problem is.
Our setup is as follows: I have written a legato application that, upon startup, starts listening for connections on a specific port and then exposes an API to a mono application so that it can control the hardware through the legato APIs. Using this system it is, of course, also possible to control the wifi connection.
We have now stress tested the system some and found that after something like 76-124 iterations of the loop below, the legato loop stalls at the
- sleep (30)
- sleep (15)
After looking at the log file and adding some more logging, I have found that the problem seems to be in the file pa_wifi_client_ti.c. The thread WifiClientPaThreadMain seems to be unable to stop when asked to by the function pa_wifiClient_Stop(). Even though the call to fgets should be a cancellation point for the thread, the call to le_thread_Cancel() does not seem to be able to cancel it. To remedy that, I have made the IWThreadPipePtr file pointer non-blocking, which ensures that the call to fgets never blocks, and in the inner loop I test the errno value to see whether the call would have blocked. On top of this I have added a global cancellation boolean that the loop checks, so that the pa_wifiClient_Stop() function can set this variable to make the loop exit.
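For reference, here is a minimal sketch of the kind of change I mean. It is not the actual patch to pa_wifi_client_ti.c: the names StopRequested, MakeNonBlocking and ReadEventsUntilStopped are made up for this post, and the 10 ms back-off is an arbitrary choice.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical stop flag; in the PA it would be set from pa_wifiClient_Stop(). */
static atomic_bool StopRequested = false;

/* Put the descriptor behind a stdio stream into non-blocking mode. */
static int MakeNonBlocking(FILE *fp)
{
    int fd = fileno(fp);
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1) {
        return -1;
    }
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Read until EOF, a real error, or a stop request.
 * Returns the number of successful fgets() calls. */
static int ReadEventsUntilStopped(FILE *fp)
{
    char line[256];
    int count = 0;
    const struct timespec backoff = { 0, 10 * 1000 * 1000 }; /* 10 ms */

    while (!atomic_load(&StopRequested)) {
        errno = 0;
        if (fgets(line, sizeof(line), fp) != NULL) {
            count++;            /* the event line would be handled here */
            continue;
        }
        if (ferror(fp) && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            clearerr(fp);       /* nothing to read yet: clear the error, retry */
            nanosleep(&backoff, NULL);
            continue;
        }
        break;                  /* EOF or a genuine read error */
    }
    return count;
}
```

With this structure the stop function only has to set the flag and join the thread; the loop notices the flag within one back-off period instead of relying on cancellation.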
I have, by the way, also removed the call to le_event_RunLoop(). I can't see why anyone would want that thread to start listening for legato events when the main functionality of the thread has failed, and on top of that there seem to be no cancellation points in le_event_RunLoop(), meaning that the thread can never be cancelled and joined.
However, all my attempts above have done nothing but delay the point where the error occurs. Without any changes the error occurred around the 76th iteration of the test loop, and with them I have now seen 124 successful iterations.
Has anyone seen something like this, or do you have good ideas for what to look at? I am a bit stumped by the fact that, even though fgets should be a cancellation point, it helps to make it non-blocking.
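One thing worth keeping in mind: POSIX lists fgets among the functions where a cancellation point may occur, while read is one where it shall occur, so the behaviour can depend on the C library. To check this outside of legato, a small stand-alone test along these lines could be used. It uses plain pthreads rather than le_thread_Cancel(), and ReaderMain and CancelBlockedReader are names invented for this sketch.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

/* Thread body: block in fgets() on a pipe that never receives data. */
static void *ReaderMain(void *arg)
{
    FILE *fp = arg;
    char line[128];

    while (fgets(line, sizeof(line), fp) != NULL) {
        /* discard input; this sketch only cares about the blocking call */
    }
    return NULL;
}

/* Returns 1 if the blocked reader was cancelled, 0 if it exited some
 * other way, and -1 on a setup error. */
static int CancelBlockedReader(void)
{
    int fds[2];
    pthread_t tid;
    void *result = NULL;
    const struct timespec settle = { 0, 100 * 1000 * 1000 }; /* 100 ms */

    if (pipe(fds) != 0) {
        return -1;
    }
    FILE *fp = fdopen(fds[0], "r");
    if (fp == NULL) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pthread_create(&tid, NULL, ReaderMain, fp) != 0) {
        fclose(fp);
        close(fds[1]);
        return -1;
    }

    nanosleep(&settle, NULL);   /* give the thread time to block in fgets() */
    pthread_cancel(tid);
    pthread_join(tid, &result); /* hangs here if fgets() is not cancellable */

    /* The stream is deliberately left open: its state after a cancellation
     * in the middle of a read is not something this sketch relies on. */
    close(fds[1]);
    return result == PTHREAD_CANCELED ? 1 : 0;
}
```

If CancelBlockedReader() returns 1 on the target, cancellation of a thread blocked in fgets works in isolation, and the stall is more likely related to something else around the thread, such as its cancel state or what the pipe is connected to.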