fdMonitor only works sporadically

Maybe you can try the test build first.

Ok, so I flashed the device with release 16.0.1 GENERIC via the Windows one-click-tool.

Then I downloaded the swiflash tool from http://downloads.sierrawireless.com/tools/swiflash/swiflash.zip and ran swiflash -m WP76xx -i yocto_wp76xx.4k_msm_serial_hs.cwe which (after the second attempt) resulted in the following log.

Then I ssh-ed onto the device but app status returned exactly the same status as before (the grayed-out apps are ours). Nothing changed.

image

Then we set up a Raspberry as a flashing host running pure Linux instead of WSL on Windows. Trying to reset the device with swiflash -m wp76xx -r raised this error message. Both USB ports on the MangOH red board are connected and the flash process starts, but it still fails to flash the image…

image

We then tried your UART_Test1 application: we removed all apps first and then flashed your app.

image

We started the app, sent several UART messages and ran into exactly the same issue: at the beginning some messages are received, and then reception suddenly stops.

From our point of view the device is clearly broken at this point, but we cannot reflash it to the default state, as described above. Also, if we stop the app and run cat < /dev/ttyHS0, the kernel panics after a few seconds; if something is sent via UART, it panics immediately and the device restarts.

Can you paste the ati8 output here?

As said before, you cannot use "swiflash -m wp76xx -r" on a new memory module with any FW, or on an old memory module after R12.

If you need to erase the userapp partition, you need to use this:

If we restart the app, UART works again (as mentioned in the original post with the watchdog).

Ah yes, you are right, I forgot about that. However, after clearing the user partition, the error remains the same.

It seems you are not using my Yocto image “yocto_wp76xx.4k_msm_serial_hs.cwe”; the Yocto build time should be December 2022…

Reflashed it

Now I don’t even get data with cat < /dev/ttyHS0 :frowning:
The baud rate was wrong. Now we get something with cat < /dev/ttyHS0. We will check whether the application also gets data.

Error still exists. Even microcom doesn’t work anymore after the application ran and was stopped again.

I don’t see the error.
Is the problem that you sometimes cannot receive UART data?

Yes, that’s exactly the problem. There’s no error shown, just nothing received anymore.

Then is the CTS pin changing state?

You can see that my app changes the state of the CTS pin via ioctl().

I don’t get the point. What do you mean? In which way does that change the current behaviour? Nothing is received after a while and I don’t think it’s related to the CTS/RTS pin, since we don’t have one.

I want to know whether ioctl() still works while the RXD pin cannot receive data.

BTW, if you use only microcom instead of the Legato application, do you still see the issue that sometimes no data can be read?

As mentioned here, microcom didn’t work properly anymore after the application ran.

As a different approach, we tried to simply run a shell script as an application:

sandboxed: true
start: manual

bundles:
{
    dir:
    {
        ${LEAF_WORKSPACE}/workspace/CL_Agent/requiredFiles/bash-uart /
    }
}
requires:
{
    dir:
    {
        /bin /bin
    }

    device:
    {
        [rw] /dev/ttyHS0 /dev/ttyHS0
    }
    
}
processes:
{
    run:
    {
        ( sh /bash-uart/bash-uart.sh )
    }
}

#!/bin/sh

stty -F /dev/ttyHS0 115200

echo "----------> started UART"
while IFS= read -r c; do
    # IFS= and the quotes keep whitespace in the received line intact
    echo "$c"
done < /dev/ttyHS0

We also copied this script manually into the home directory to check whether it works when executed directly rather than from within the app.

This was the procedure:

  1. Step: Ran the script directly with sh ./bash_uart.sh. Outcome: All data was received.
  2. Step: Started the Shell script via app (app start bash_uart). Outcome: The script was started but no content was received or displayed.
  3. Step: Stopped the app and ran sh ./bash_uart.sh again. Outcome: Sometimes it didn’t work at all, sometimes some characters were missing. After some restarts it worked again (only if executed from the home directory).

The local shell script always worked if the app was not involved.

So even with this small script, something is off. Something in the framework must end up in a bad state that can only be cleared by a hard reset.

Just wondering whether the result will be the same if you use an unsandboxed application to call the script via the system() API.

BTW, would you consider using a simple Linux application compiled with the toolchain?
Then you don’t need to involve the Legato framework.

Hi jyijyi,

here’s our current status. We implemented a separate shell script to be started with system(./uart-handler.sh). The script works if it’s executed directly by a user via the console. However, it doesn’t work if it’s started by the app. We also tried to start it via the /etc/init.d/startlegato.sh script. This way it also starts, but the UART never receives data (exactly the same result as when we start it via the app). Here’s a trimmed-down version of the script.

#!/bin/sh

stty -F /dev/ttyHS0 115200

echo "-----> UART started" > /home/root/uart.log
while read -n 1 -r c; do
    echo "$c" >> /home/root/uart.log
done < /dev/ttyHS0
echo "-----> UART closed" >> /home/root/uart.log

I can clearly see in pstree that the script is running but it also never receives data on the UART.

I asked a colleague who has more experience with Linux about the issue and he assumes that there are several mounts mapped on top of each other in the file system. Could that be a problem? Do the apps run in a separate file system or something? Because adding the uart-handler.sh directly to /etc/init.d/ and activating it with update-rc.d uart-handler.sh defaults didn’t start the script either on reboot.
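One thing that may be worth ruling out for the init.d attempt: SysV-style init calls each script with an argument such as start or stop and waits for it to return, so a script that loops forever in the foreground and ignores its argument may not behave as expected at boot. The following is only a minimal sketch of a wrapper under that assumption, not a confirmed fix. The handler path, the PID file location and the function name uart_handler_ctl are hypothetical.

```sh
#!/bin/sh
# Hypothetical init.d-style wrapper around the UART handler script.
# HANDLER and PIDFILE are assumptions; adjust them to the real setup.
HANDLER=${HANDLER:-/home/root/uart-handler.sh}
PIDFILE=${PIDFILE:-/var/run/uart-handler.pid}

uart_handler_ctl() {
    case "$1" in
        start)
            # Refuse to start if the handler is missing or not executable.
            [ -x "$HANDLER" ] || return 1
            # Run the handler in the background so that boot can continue.
            "$HANDLER" &
            echo $! > "$PIDFILE"
            ;;
        stop)
            # Nothing to do if the handler was never started.
            [ -f "$PIDFILE" ] || return 0
            kill "$(cat "$PIDFILE")" 2>/dev/null
            rm -f "$PIDFILE"
            ;;
        *)
            echo "Usage: $0 {start|stop}" >&2
            return 2
            ;;
    esac
}

# In a real init script the last line would be:
#   uart_handler_ctl "$1"
```

Note that the handler must have its executable bit set (chmod +x) for the start branch to run it.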

I have added some delay in the script.
It is working fine for me:


root@swi-mdm9x28-wp:~# cat /home/root/read_UART1.sh
#!/bin/sh
sleep 15
stty -F /dev/ttyHS0 115200

echo "-----> UART started" > /home/root/uart.log
while read -n 1 -r c; do
    echo "$c" >> /home/root/uart.log
done < /dev/ttyHS0
root@swi-mdm9x28-wp:~# cat /etc/init.d/startlegato.sh
#!/bin/sh
# Copyright (c) Sierra Wireless, Inc.
#
# Provides a hook for legato into the init scripts

if [ -e "/etc/run.env" ]; then
    source /etc/run.env
fi

FLASH_MOUNTPOINT=${FLASH_MOUNTPOINT:-/mnt/flash}
FLASH_MOUNTPOINT_LEGATO=${FLASH_MOUNTPOINT_LEGATO:-/mnt/legato}

if [ -e "${FLASH_MOUNTPOINT_LEGATO}/systems/current/read-only" ]
then
    export PATH=/legato/systems/current/bin:$PATH
    LEGATO_START=/legato/systems/current/bin/start
    LEGATO_MNT=${FLASH_MOUNTPOINT_LEGATO}
else
    LEGATO_START=${FLASH_MOUNTPOINT_LEGATO}/start
    LEGATO_MNT=${FLASH_MOUNTPOINT}/legato

    # Create mountpoint in case it doesn't already exists.
    mkdir -p ${LEGATO_MNT}
fi

case "$1" in
    start)
        echo "Legato start sequence"

        umount /legato 2>/dev/null
        mount -o bind $LEGATO_MNT /legato

        test -x $LEGATO_START && $LEGATO_START

#jyi
sh /home/root/read_UART1.sh &
        ;;

    stop)
        # Do something to stop Legato
        echo "Legato shutdown sequence"
        test -x $LEGATO_START && $LEGATO_START stop
        umount /legato
        ;;

    *)
        exit 1
        ;;

esac

echo "Finished Legato $1 Sequence"
root@swi-mdm9x28-wp:~#
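As a possible variation on the fixed sleep 15 above, the script could poll until the device node exists, with an upper bound on the wait. This is a minimal sketch only: the function name wait_for_path and the timeout value are hypothetical, and the existence of /dev/ttyHS0 does not guarantee that the UART driver is actually ready to receive data.

```sh
#!/bin/sh
# wait_for_path PATH TIMEOUT_SECONDS
# Polls once per second until PATH exists.
# Returns 0 as soon as PATH exists, 1 if the timeout expires first.
wait_for_path() {
    path=$1
    remaining=$2
    while [ ! -e "$path" ]; do
        if [ "$remaining" -le 0 ]; then
            return 1
        fi
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}

# Hypothetical use in read_UART1.sh instead of "sleep 15":
#   wait_for_path /dev/ttyHS0 15 || exit 1
#   stty -F /dev/ttyHS0 115200
```

The polling version returns as soon as the node is present, so it does not delay startup longer than necessary.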

Of course it works for you, you don’t have a broken device :wink:

So here’s what we further found out. We had this self-written file-helper (files attached) which was used by our monitoring-app to create the files which are required by other apps (as stated here: Bundle required files - #7 by HudriWudri).

The code with which we initialized the files looked like this. (The opened file was never closed, but we fixed that later on.)

static void initRequiredFiles(void)
{
    File_Open(CELL_INFO_FILE_PATH_TO, File_OpenOptionEnumWrite);
    File_Copy(EXAMPLE_FILE_PATH_FROM, HOME_ROOT_PATH "/" EXAMPLE_FILE_PATH_TO, false);
}

We think that especially the File_Copy function caused problems and for some reason broke the whole device! How could that happen??? Do you see any reason for that?

file-helper.zip (1.5 KB)

what do you mean by “broken device”?
You mean your module is different from mine?

What happens if you delete the app?

We finally found the issue. It had two causes:

  1. Something got heavily corrupted, as described here: fdMonitor only works sporadically - #37 by HudriWudri. Please have a look at that, because this must not happen!!! From my point of view, something is deeply wrong if actions like that can corrupt something permanently.
  2. Once again, the device that sends the UART data sent it too early, meaning before the Sierra’s UART was ready. If data arrived before the Sierra’s UART was ready, the UART stopped functioning properly (no data was received anymore). Hence, the watchdog stabilized things after a while (as mentioned here: fdMonitor only works sporadically). So sometimes the device simply crashed (as described here: Kernel BUG at /usr/src/kernel/kernel/time/timer.c:806! - #19 by HudriWudri) if data arrived at the same moment open was called, and sometimes it just didn’t work if data arrived before /dev/ttyHS0 was ready. This also must not happen!!! Please have a look at this and, even more important, please update the kernel! It is, as mentioned here, already 7 years old!

Linux: Linux swi-mdm9x28-wp 3.18.140 #1 PREEMPT Tue Aug 3 08:17:34 UTC 2021 armv7l GNU/Linux
Legato: 19.11.6_af84a308b18c93dd4f3ae40f63c9bc67