Wifi handling stalls after multiple enable-connect-disable cycles

Hi

We have stumbled upon a problem with the wifi handling in legato that I still haven’t been able to pin-point in its entirety. I hope that someone here will be able to help me figure out what the problem is.

Our setup is as follows: I have written a legato application that, on startup, listens for connections on a specific port and exposes an API through which a mono application can control the hardware via the legato APIs. This of course also makes it possible to control the wifi connection.

We have now stress tested the system and found that after somewhere between 76 and 124 iterations of the loop below, the legato loop stalls at the le_wifiClient_Stop() call.

  • le_wifiClient_Start()
  • le_wifiClient_Create()
  • le_wifiClient_SetHiddenNetworkAttribute()
  • le_wifiClient_SetSecurityProtocol()
  • le_wifiClient_SetPassphrase()
  • le_wifiClient_Connect()
  • sleep (30)
  • le_wifiClient_Stop()
  • sleep (15)

After looking at the log file and adding some more logging, I have found that the problem seems to be in the file pa_wifi_client_ti.c. The thread WifiClientPaThreadMain seems to be unable to stop when asked to by the call to le_thread_Cancel() in pa_wifiClient_Stop(). Even though the call to fgets should be a cancellation point for the thread, le_thread_Cancel() does not seem to be able to cancel it. To remedy that, I have made the IWThreadPipePtr file pointer non-blocking, which ensures that the read never blocks, and in the inner loop I check errno to see whether the call would have blocked. On top of this I have added a global cancellation boolean that the loop checks, so that pa_wifiClient_Stop() can set this variable to make the loop exit.

I have, by the way, also removed the call to le_event_RunLoop(). I can’t see why anyone would want that thread to start listening for legato events when the main functionality of the thread has failed, and on top of that there seem to be no cancellation points in le_event_RunLoop(), meaning that the thread can never be joined.

However, all my attempts above have done nothing but delay the point where the error occurs. Without any changes the error occurred around the 76th iteration of the test loop, and with them I have now seen 124 successful iterations.

Has anyone seen something like this, or do you have good ideas for what to look at? I am a bit stumped by the fact that, even though fgets should be a cancellation point, it helps to make it non-blocking.

Hi,

Could you please share more details about your setup?

  • Board: mangOH Green, mangOH Red or custom board?
  • Module: WP76xx, WP77xx, WP85xx, WP75xx, …
  • Legato version: legato version
  • Firmware version: cm info firmware
  • Image details: cat /etc/legato/version

If possible, try to reproduce the issue with latest Legato 19.04 available here: