System reboots whenever applying an update

Hello all,

For some reason, whenever I apply an app update or system update, the system reboots (Legato 18.04 on WP85).

The following log snippet seems suspect to me:

Jul  5 17:07:17 swi-mdm9x15 user.info Legato:  INFO | supervisor[485]/supervisor T=main | supervisor.c SigChildHandler() 781 | Reaping unconfigured child process 1179.
Jul  5 17:07:18 swi-mdm9x15 user.info Legato:  INFO | updateDaemon[518]/updateDaemon T=main | supCtrl.c supCtrl_RestartLegato() 126 | Requesting Legato restart.

More than anything, I’m really curious where this problem came from as this appears pseudo-random from my point of view.

If anyone has any ideas it would be much appreciated.

Hello Nick,

When launching a system update (FOTA), the update daemon installs the update and then always reboots the board in order to apply the changes. This behaviour is expected when installing a complete Legato system (generally a .cwe file).

This is not the case for an app update (SOTA): the install is done at run time, with no reboot needed.

This probably explains why it seems pseudo-random from your side. However, if you notice a reboot after a SOTA job, please let me know in order to investigate further.

Best regards,

Hey @oabid,

I’m seeing this with app updates as well (I’ve been applying system updates using update from an Ubuntu environment). It’s definitely a “new” problem for me, so something changed, but I’m really not sure where. I would definitely love to get this resolved as it makes development slower.

Thanks!

Okay, let’s investigate this together. Can you please share your target configuration (type, Legato version, firmware version)?

If possible, can you provide the binaries of your app, or maybe the list of services used in the app? I’ll try to set up the same environment and launch a few SOTA jobs.

Also, if you can reproduce the reboot systematically, a full debug log during the download/install would be great.

Thanks,

Hey @oabid,

We have several Legato apps on our GitHub that produce this issue (https://github.com/brnkl). We’re running a WP85 with release 15 firmware (SWI9X15Y_07.12.14.00 r34472) with Legato 18.04.

I’ll post some more complete debug logs for the install/download by the end of the day.

Thanks!

Hey @oabid,

My apologies for letting this sit. Some logs showing an install of our GPS app can be found here: https://gist.github.com/nvandoorn/0ec6da6b4f7d5ab1ebaa385f1b088366

Thanks for the logs.
I see that the reboot is due to a fault that occurred in the sensorToCloud application after the installation of the GPS app. Here is the log:

DBUG | gpsMonitor[1289]/location T=main | location.c getLocation() 64 | Checking GPS position
INFO | gpsMonitor[1289]/location T=main | location.c getLocation() 87 | Failed to get reading... retrying in 1 seconds
INFO | watchdog[557]/watchdogDaemon T=main | watchdog.c CleanUpClosedClient() 355 | Client session closed
DBUG | digitalService[853]/framework T=main | brnkl_digital_server.c CleanupClientData() 148 | Client 0xb6fb05f4 is closed !!!
INFO | supervisor[518]/supervisor T=main | proc.c proc_SigChildHandler() 2035 | Process 'sensorToCloud' (PID: 926) has exited with exit code 1.
WRN- | _appStopClient[1290]/framework T=main | LE_FILENAME CreateSocket() 550 | Socket opened as standard i/o file descriptor 2!
*EMR* | supervisor[518]/supervisor T=main | app.c app_SigChildHandler() 3415 | Process 'sensorToCloud' in app 'sensorToCloud' faulted: Rebooting system.
*EMR* | supervisor[518]/supervisor T=main | supervisor.c framework_Reboot() 693 | Supervisor going down to trigger reboot.
DBUG | gpsMonitor[1289]/location T=main | location.c getLocation() 64 | Checking GPS position

So, my first guess is that in the .adef file of the application sensorToCloud, the behaviour in case of a fault is set to reboot: faultAction: reboot. If this is the case, I suggest reducing the severity to faultAction: restart to restart only the process. Check this page for further details: https://docs.legato.io/latest/defFilesAdef.html
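As a rough sketch of that change, assuming the app's .adef has a single processes section that runs an executable named sensorToCloud (the executable name and the layout are guesses on my part, your actual file will differ):

```
processes:
{
    run:
    {
        // Hypothetical executable name, adjust to match your app.
        ( sensorToCloud )
    }

    // Previously "faultAction: reboot", which reboots the whole target
    // whenever this process faults. "restart" restarts only the process.
    faultAction: restart
}
```

The same field also accepts restartApp, which restarts the whole app rather than only the faulted process, so that is another option if restarting a single process is not enough.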

Now, we need to dig further to get the origin of the fault in the sensorToCloud application.

Regards,

Hey @oabid,

Thanks for the suggestions. How should I go about debugging this fault? It only happens when installing apps and the logs do not seem to provide any clues.

Hey,
Any news about the fault origin on the sensorToCloud app?

Can you enable full logs on Legato (“log level debug”) and give it another try? Or, if possible, can you provide me with the app source code? I’ll run the test on my side.

Thanks,

@oabid I have taken over this project from Nick and I’m experiencing the same issue. It seems that if the .adef is set to faultAction: reboot when the update is applied, the app sees the update as a fault and reboots, which rolls everything back. Any thoughts about how to deal with this?

Hi @dbeckwith,
As Oabid suggested in a previous post, try reducing the severity to faultAction: restart so that only the process is restarted.
Check this page for further details: https://docs.legato.io/latest/defFilesAdef.html
Best Regards

@plu, thank you for the response! A couple of follow-up questions:

  1. Why restart and not restartApp?

  2. If I want to keep reboot as the faultAction, is there a signal we can intercept that would allow the app to shut down gracefully? We chose reboot because we have had issues with reliably recovering our AirVantage data sessions.

  3. Is this a known bug in the update process?

Thanks

@dbeckwith,
Can you share the app source code with me? I’ll run the test on my side.
Regards

@mehdiALL1 It wasn’t one app in particular that caused the reset. Perhaps I can share a small sample if I understand more what you are looking for.

@dbeckwith, ah, if this issue is happening with all apps, then there is something wrong in your system. I noticed that you use Legato 18.04 on the WP85, which is incompatible with the Release 15 components.

Regards

@mehdiALL1 This also happens in Legato 18.09.0 with Release 16, which we moved to. Not all apps exhibit this problem; only some custom apps will randomly do this when updating.

@dbeckwith, Release 16 is based on Legato 18.06.1, not 18.09.0, as you can check here: https://source.sierrawireless.com/resources/airprime/software/wpx5xx/wpx5xx-firmware-release-16/

Which version of Legato are your custom apps built with? (You can check with app info appName.)

For example, if your applications are built with Legato 18.04.0 and the Legato on your module is not 18.04.0, this can also cause problems. That is why it is very important to keep the versions of the components in your image and of Legato consistent.

I propose that you flash your module with Release 16 (so that the module runs Legato 18.06.1), build your apps with Legato 18.06.1, and then apply an app update. It should work without rebooting.

Regards

@mehdiALL1, the Legato version running on the WP85 is 18.09.0 and the apps are built against 18.09.0.

Are you saying that we can’t take advantage of newer Legato versions because R16 isn’t built against 18.09.0?

Thanks,
Darren

@dbeckwith, I just wanted to say that this may be the cause of your issue. You can run the test with the matching versions to check.
Regards