modemDaemon - Callback not called on data connection failure

Hello Forum,

After several days of data connection, the modemDaemon encountered the following errors:

Mar  3 04:23:50 swi-mdm9x15 user.err Legato: =ERR= | modemDaemon[23522]/swiQmi T=main | swiQmi.c swiQmi_CheckResponse() 781 | Sending QMI_WDS_START_NETWORK_INTERFACE_REQ_V01 failed: rc=0 (), resp.result=1.[0x01], resp.error=26.[0x1a]
Mar  3 04:23:50 swi-mdm9x15 user.warn Legato: -WRN- | modemDaemon[23522]/le_pa T=main | pa_mdc_qmi.c StartSession() 1707 | Data connection failure reason not available
Mar  3 04:23:50 swi-mdm9x15 user.err Legato: =ERR= | modemDaemon[23522]/modemDaemon T=main | le_mdc.c le_mdc_StartSession() 919 | Get Connection failure 45, 0, 0, 0
Mar  3 04:23:50 swi-mdm9x15 user.err Legato: =ERR= | modemDaemon[23522]/le_pa T=main | pa_mdc_qmi.c pa_mdc_StopSession() 2809 | Bad input parameter
Mar  3 04:23:50 swi-mdm9x15 user.info Legato:  INFO | modemDaemon[23522]/le_pa T=main | pa_mrc_qmi.c pa_mrc_GetNetworkRegState() 2040 | called

When this error occurs, the callback function passed to le_data_AddConnectionStateHandler is never called, even though the data connection has gone down.

Restarting Legato does not restore the connection. A reboot is needed to establish a new data connection.

Thank you for your help

Hello Spastor,

Which version are you using?

I have been evaluating Legato for only a week, but the Data Connection API has already got me thinking about what to do. I am also using the Data Connection API, but there is another API in the modem service with similar functionality, and you can’t use both. I have not noticed connection problems yet, but I fear I will face similar issues to yours, and I have no idea what to do apart from rebooting the processor. That should be the last step in error recovery, and obviously you can’t do it often.
I think you could tinker with AT commands, switching the radio on/off and restarting Legato, but I fear that could break something in the environment, so it would be nice to have a reset function in the Data Connection API that handles it for us. Simplicity is a nice thing about the Data Connection API, but it is maybe too simple, and a clean shutdown/restart function could help.

Any ideas from others? How do you prepare your program for this kind of problem?

tom

Hello Tomalex,

I use Legato 16.10.1 with my custom Linux Yocto Image.

there is another API in the modem service with similar functionality, and you can’t use both?

Could you be more precise? Do you mean using the AT platform adaptor (changing PlatformAdaptor-QmiBin-wp85 to PlatformAdaptor-AT)?

I think you may tinker with AT commands switching on/off radio and restarting Legato

No effect.
A device reboot is necessary to restart the data connection.

Thanks

Hello Spastor,

I mean you can’t use the Data Connection API together with the Modem Data Control API. You aren’t using both by any chance? Which AT commands did you use?

BR
tom

Hello Spastor,

I experienced this kind of situation once. I couldn’t inspect the situation closely, but I saw that the cm command was not working at all. I will have to implement some check on dataConnectionService/ModemService and reboot on this kind of event… Do you have an update on this problem?
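A minimal sketch of how such a check could escalate, assuming the application runs some periodic health check (for example, whether the cm tool answers or a ping succeeds) and feeds the result in. The type names, function name, and thresholds below are hypothetical and not part of any Legato API; the sketch only decides which recovery step to take and leaves the actual restart or reboot to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical recovery steps, ordered from least to most disruptive. */
typedef enum {
    RECOVERY_NONE,
    RECOVERY_RESTART_SESSION,
    RECOVERY_RESTART_LEGATO,
    RECOVERY_REBOOT
} RecoveryAction;

/* Illustrative thresholds (assumptions, tune for the product). */
#define FAILS_BEFORE_SESSION_RESTART 3u
#define FAILS_BEFORE_LEGATO_RESTART  6u
#define FAILS_BEFORE_REBOOT          10u

typedef struct {
    unsigned consecutiveFailures;  /* health checks failed in a row */
} HealthWatchdog;

/* Feed one health-check result; returns the recovery step to attempt now. */
RecoveryAction HealthWatchdog_Update(HealthWatchdog *wd, bool checkPassed)
{
    assert(wd != NULL);

    if (checkPassed) {
        wd->consecutiveFailures = 0;   /* any success clears the history */
        return RECOVERY_NONE;
    }

    wd->consecutiveFailures++;

    /* Reboot is the last resort and is requested on every failure
     * from this threshold onwards. */
    if (wd->consecutiveFailures >= FAILS_BEFORE_REBOOT) {
        return RECOVERY_REBOOT;
    }
    /* Softer steps fire once, exactly when their threshold is reached. */
    if (wd->consecutiveFailures == FAILS_BEFORE_LEGATO_RESTART) {
        return RECOVERY_RESTART_LEGATO;
    }
    if (wd->consecutiveFailures == FAILS_BEFORE_SESSION_RESTART) {
        return RECOVERY_RESTART_SESSION;
    }
    return RECOVERY_NONE;
}
```

Keeping the reboot behind a count of consecutive failures is one way to make sure it stays the last step and does not happen often; a single successful check resets the counter.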

tom

Hello,

the DCS is supposed to handle multiple types of connections. For instance it can consider the WiFi and the Modem as providers for a data connection.

Using the MDC API directly allows you to have a lower level API and more control over what you’re doing with the connection you’re requesting, but the downside is that you would re-do a lot of what the DCS does.

The error from the MDC (QMI_WDS_START_NETWORK_INTERFACE_REQ_V01 failed: rc=0 (), resp.result=1.[0x01], resp.error=26.[0x1a]) is odd and might be a firmware issue, but it seems that you should at least get a callback from the DCS to warn you that connectivity has been lost.

I’ve created an internal issue, will keep you updated.

Thanks

Hello @spastor,

do you still encounter this issue with a more recent version of the fw?

So when the connection starts to break down, the DCS handler is not called?
It should be called at least once to say that it disconnected, but then during the reconnection loop there shouldn’t be any event because there is not really any Connected<>Disconnected change of state.
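To illustrate the behaviour described here, this is a minimal sketch of edge-triggered notification: the handler fires only when the Connected/Disconnected state actually changes, so repeated failed retries stay silent. It is not the DCS implementation; ConnTracker, ConnTracker_Report and CountEvents are hypothetical names used only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical handler signature, loosely modelled on a state callback. */
typedef void (*StateHandler)(bool connected, void *ctx);

typedef struct {
    bool         connected;  /* last state reported to the handler */
    StateHandler handler;    /* may be NULL if nobody is registered */
    void        *ctx;        /* opaque pointer passed back to the handler */
} ConnTracker;

/* Report the latest observed state. The handler is called only on a
 * Connected<>Disconnected transition. Returns true if an event was raised. */
bool ConnTracker_Report(ConnTracker *t, bool nowConnected)
{
    assert(t != NULL);

    if (nowConnected == t->connected) {
        return false;  /* e.g. another failed retry while already disconnected */
    }

    t->connected = nowConnected;
    if (t->handler != NULL) {
        t->handler(nowConnected, t->ctx);
    }
    return true;
}

/* Example handler: counts how many events it receives. */
static void CountEvents(bool connected, void *ctx)
{
    (void)connected;
    (*(int *)ctx)++;
}
```

With this model, an application that sees no event at all when the link drops has missed the single Connected-to-Disconnected transition, which is the case being asked about.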

Can you confirm that the connection handler of your app is not called, even right after it just disconnected?

EDIT: Also could you provide some info from cm radio? We’re trying to reproduce the issue but so far we were not able to. I wonder if the radio conditions could be the reason we are having trouble.

Hi @CoRfr,

I’m actively following this issue because I’ve seen it too, on the FX30 running 16.10.1 (the latest Legato for the FX30). If I manage to recreate the problem, is there anything else apart from the above that would be helpful?

Hi @mahtab,

so to clarify, you’ve seen the device lose its connection, and you’re not seeing any event from this?
Otherwise I think full logs would be the most important part (or at least, as much as you can get from logread), + cm radio when it’s connected and another one when it’s trying to reconnect.

Hi CoRfr,

To be strictly accurate, what I’ve seen is that the device reports that it’s connected when it isn’t. It reports connected in the sense that no handler state change is triggered in Legato and it has an IP address on the cellular interface. It isn’t connected in the sense that it no longer has a gateway assigned to it and can’t send data out. I can see that it reports that the default gateway is empty, so somewhere it knows that it has lost the connection, but neither the state handler nor the IP address assigned to the cellular interface changes: it still thinks it has an IP address.

It happens more often on one particular network; on some networks I haven’t been able to reproduce it at all. Disconnecting the antenna, waiting, and reconnecting the antenna a few times is enough to reproduce the problem. I will try to reproduce it and get full logs as soon as I can.
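A minimal sketch of how an application could detect this mismatch itself, assuming a Linux target where the routing table is readable in the /proc/net/route format (tab-separated, hexadecimal Destination, Gateway and Flags columns). The function names are hypothetical and this is not something Legato provides; the route mask column is ignored for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Kernel route flag values as they appear in the Flags column. */
#define ROUTE_FLAG_UP      0x0001u  /* route is usable */
#define ROUTE_FLAG_GATEWAY 0x0002u  /* destination is reached via a gateway */

/* Returns true if the table read from `routes` (in /proc/net/route format)
 * contains an up default route through a non-empty gateway on `ifName`. */
bool HasDefaultRoute(FILE *routes, const char *ifName)
{
    assert(routes != NULL && ifName != NULL);

    char line[256];

    /* First line is the column header. */
    if (fgets(line, sizeof line, routes) == NULL) {
        return false;
    }

    while (fgets(line, sizeof line, routes) != NULL) {
        char iface[32];
        unsigned long dest, gateway;
        unsigned int flags;

        if (sscanf(line, "%31s %lx %lx %x", iface, &dest, &gateway, &flags) != 4) {
            continue;  /* skip lines that do not parse */
        }
        if (strcmp(iface, ifName) == 0 && dest == 0 && gateway != 0 &&
            (flags & ROUTE_FLAG_UP) && (flags & ROUTE_FLAG_GATEWAY)) {
            return true;
        }
    }
    return false;
}

/* The state described above: the handler still says "connected",
 * but the interface no longer has a default gateway. */
bool ConnectionLooksStale(bool handlerSaysConnected, bool hasDefaultRoute)
{
    return handlerSaysConnected && !hasDefaultRoute;
}
```

In an application this would be called periodically with a stream opened on /proc/net/route and the cellular interface name (rmnet0 is an assumption here), and a stale result could be treated as a disconnect even though no handler event arrived.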

Thanks

Very interesting issue, but I’m not sure it’s exactly the same thing as what @spastor encountered.
So the cellular interface rmnet0 is still UP and still has an IP?
Looking forward to the logs, that’s gonna be a fun one to debug! :slight_smile:

OK @CoRfr, in that case, here are a couple of files:

ifconfig_log.txt (1.7 KB)

logread.txt (13.5 KB)

One is a logread snippet: the modem thinks it has a data connection and, at the same time, doesn’t. You get both a data connection failure error and a "data session is already connected" error. A DNS server gets set, and then you get an empty gateway. ifconfig shows the IP address of the cellular interface, but you can’t ping it from outside and you can’t ping out from the modem.

What got the modem into this state was connecting to a UMTS network, then doing a cm radio off, then it re-registers and comes back in this state…

Glad I could make someone’s day fun ! :wink:

P.S. I have a sneaking suspicion that these problems occur when there are multiple data connections open and data handlers registered. That is also one of the conditions that exist when I see this problem.

Hello CoRfr,

Sorry for the delay, I was on leave after the birth of a child :slight_smile:

Do you still encounter this issue with a more recent version of the fw?
Not tested with a more recent version (my product is currently on the market).

I will try to schedule an update of my product

  • First test with an update of legato and the PA

My first message contained the modemDaemon application logs; from memory, the errors were identical when using the cm tool.

Best regards,
Sylvain

Hello,

Has anyone found the cause of the problem?

We also see that when the connection is lost or the operator changes, at some point the modem is unable to reconnect and freezes, and it only works again after restarting the FX30S.

BR,

Javier.