Legato IDL - pre-defined type `file` not working with Legato 17.07.x


#1

Hi,

I rebuilt my code-base with Legato 17.07.1 for a WP8548 target.

Everything builds, installs and appears to run correctly.

One of the apps provides a service for clients to obtain a read-only file descriptor to a particular file.

My .api file function prototype looks like this:

FUNCTION le_result_t GetFile
(
    file fd_out OUT
);

And a simplified service implementation looks like this:

le_result_t my_service_GetFile(int* fileOutPtr)
{
    //...
    *fileOutPtr = open(path_to_file, O_RDONLY);
    if (*fileOutPtr != -1)
    {
        return LE_OK;
    }
    return LE_FAULT;
}

This worked perfectly fine with Legato 16.10.1 and 16.10.3. However, since the update to 17.07.1 the file descriptor returned by the service to the client is always -1. Note that the file descriptor obtained within the service itself is valid - something must invalidate the file descriptor in the generated interface files.

Has support for the pre-defined file type been removed from the Legato IDL?


#2

Same thing happens with Legato 17.07.2.

Note that the issue only affects API services providing a file descriptor. Services receiving a file descriptor are unaffected.

  • file fd OUT will always set the fd pointer to -1, no matter what
  • file fd IN works correctly

I haven’t tested all the intermediate releases between 16.10.3 and 17.07.1 so I’m not sure when the issue was first introduced.

A workaround would be to pass a string with the file path to clients but I’d like to restrict them to have read-only access.
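To illustrate why a descriptor is attractive here: the access mode is fixed when the file is opened, so writes through a descriptor opened with O_RDONLY are rejected no matter which process holds it. The following is only a minimal POSIX sketch of that behaviour; fd_rejects_writes and demo_read_only are hypothetical helper names and not part of Legato.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical helper: true if a 1-byte write through fd fails with EBADF,
 * the error POSIX specifies for a descriptor that is not open for writing. */
static bool fd_rejects_writes(int fd)
{
    errno = 0;
    ssize_t written = write(fd, "x", 1);
    return written == -1 && errno == EBADF;
}

/* Demo on /dev/null: a read-only descriptor rejects the write,
 * a write-only descriptor accepts it. */
static void demo_read_only(void)
{
    int ro = open("/dev/null", O_RDONLY);
    int wo = open("/dev/null", O_WRONLY);
    assert(ro >= 0);
    assert(wo >= 0);

    bool ro_rejects = fd_rejects_writes(ro);
    bool wo_rejects = fd_rejects_writes(wo);
    assert(ro_rejects);
    assert(!wo_rejects);

    close(ro);
    close(wo);
}
```

A file path, by contrast, leaves the access mode up to whatever the client passes to open().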

Several of my apps need to work with files, ideally by passing descriptors around. However, if I should abandon the use of file fd OUT and use file paths instead, so be it.

Can anyone comment on this?


#3

Hi @raf,

There was quite a bit of work done on mkTools between these two versions, and this looks like a regression.
I’ve created an issue internally, but you can also create one on https://github.com/legatoproject/legato-af to track the resolution.

The fix won’t be in 17.07 though, and is unlikely to be in 17.08.
If you investigate and find the cause, we can integrate your change.

Thanks


#4

Hi @CoRfr,

Thanks for responding.

I did investigate a little further and found the issue to be caused by the generated _client.c api function code.
More specifically, when unpacking the out parameters, a reference to the request message is used rather than a reference to the response message when le_msg_GetFd is called.

Basically, the generated code should use _responseMsgRef like so:

*fd_outPtr = le_msg_GetFd(_responseMsgRef);    // Correct

Rather than _msgRef which will cause it to return -1:

*fd_outPtr = le_msg_GetFd(_msgRef);            // Incorrect. Returns -1

I can manually edit the auto-generated code built with 17.07.x and it works.

Unfortunately, as much as I’d like to, I’m unable to devote more time to finding the root cause at the moment.
Hopefully someone more knowledgeable with the Legato AF will track it down.

Anyway, here’s a comparison between the generated code built with 16.10.3 and 17.07.x


idl_test_client.c generated code built with Legato 16.10.3

le_result_t idl_test_GetFile
(
    int* fd_outPtr
        ///< [OUT]
)
{
    le_msg_MessageRef_t _msgRef;
    le_msg_MessageRef_t _responseMsgRef;
    _Message_t* _msgPtr;

    // Will not be used if no data is sent/received from server.
    __attribute__((unused)) uint8_t* _msgBufPtr;

    le_result_t _result;

    // Range check values, if appropriate


    // Create a new message object and get the message buffer
    _msgRef = le_msg_CreateMsg(GetCurrentSessionRef());
    _msgPtr = le_msg_GetPayloadPtr(_msgRef);
    _msgPtr->id = _MSGID_idl_test_GetFile;
    _msgBufPtr = _msgPtr->buffer;

    // Pack the input parameters


    // Send a request to the server and get the response.
    LE_DEBUG("Sending message to server and waiting for response : %ti bytes sent",
             _msgBufPtr-_msgPtr->buffer);
    _responseMsgRef = le_msg_RequestSyncResponse(_msgRef);
    // It is a serious error if we don't get a valid response from the server
    LE_FATAL_IF(_responseMsgRef == NULL, "Valid response was not received from server");

    // Process the result and/or output parameters, if there are any.
    _msgPtr = le_msg_GetPayloadPtr(_responseMsgRef);
    _msgBufPtr = _msgPtr->buffer;

    // Unpack the result first
    _msgBufPtr = UnpackData( _msgBufPtr, &_result, sizeof(_result) );


    // Unpack any "out" parameters
    *fd_outPtr = le_msg_GetFd(_responseMsgRef);

    // Release the message object, now that all results/output has been copied.
    le_msg_ReleaseMsg(_responseMsgRef);


    return _result;
}

idl_test_client.c generated code built with Legato 17.07.x

le_result_t idl_test_GetFile
(
    int* fd_outPtr
        ///< [OUT]
)
{
    le_msg_MessageRef_t _msgRef;
    le_msg_MessageRef_t _responseMsgRef;
    _Message_t* _msgPtr;

    // Will not be used if no data is sent/received from server.
    __attribute__((unused)) uint8_t* _msgBufPtr;
    __attribute__((unused)) size_t _msgBufSize;

    le_result_t _result;

    // Range check values, if appropriate


    // Create a new message object and get the message buffer
    _msgRef = le_msg_CreateMsg(GetCurrentSessionRef());
    _msgPtr = le_msg_GetPayloadPtr(_msgRef);
    _msgPtr->id = _MSGID_idl_test_GetFile;
    _msgBufPtr = _msgPtr->buffer;
    _msgBufSize = _MAX_MSG_SIZE;

    // Pack a list of outputs requested by the client.
    uint32_t _requiredOutputs = 0;
    _requiredOutputs |= ((!!(fd_outPtr)) << 0);
    LE_ASSERT(le_pack_PackUint32(&_msgBufPtr, &_msgBufSize, _requiredOutputs));

    // Pack the input parameters

    // Send a request to the server and get the response.
    LE_DEBUG("Sending message to server and waiting for response : %ti bytes sent",
             _msgBufPtr-_msgPtr->buffer);
    _responseMsgRef = le_msg_RequestSyncResponse(_msgRef);
    // It is a serious error if we don't get a valid response from the server.  Call disconnect
    // handler (if one is defined) to allow cleanup
    if (_responseMsgRef == NULL)
    {
        SessionCloseHandler(GetCurrentSessionRef(), GetClientThreadDataPtr());
    }

    // Process the result and/or output parameters, if there are any.
    _msgPtr = le_msg_GetPayloadPtr(_responseMsgRef);
    _msgBufPtr = _msgPtr->buffer;
    _msgBufSize = _MAX_MSG_SIZE;

    // Unpack the result first
    if (!le_pack_UnpackResult( &_msgBufPtr, &_msgBufSize, &_result ))
    {
        goto error_unpack;
    }

    // Unpack any "out" parameters
    if (fd_outPtr)
    {
        *fd_outPtr = le_msg_GetFd(_msgRef);
    }


    // Release the message object, now that all results/output has been copied.
    le_msg_ReleaseMsg(_responseMsgRef);


    return _result;

error_unpack:
    LE_FATAL("Unexpected response from server.");
}

Cheers,
Raf


#5

Hi @raf:

Thanks for your investigation. In legato/framework/tools/ifgen/langC/templates/pack.templ around line 180 it says:

    if ({{parameter|FormatParameterName}})
    {
        *{{parameter|FormatParameterName}} = le_msg_GetFd(_msgRef);
    }

If you change _msgRef to _responseMsgRef, does it work?

Thanks!
– Keith


#6

Hi @kdunwoody,

It does indeed - that fixed it!

Thanks for pointing me to the file. I note there have been significant changes to ifgen since 16.10.3.

I’ve created an issue at https://github.com/legatoproject/legato-af/issues/19 to track this.

Thanks,
Raf


#7

Hi again @kdunwoody @CoRfr .

I think I’ve stumbled upon another IDL file issue, but this time relating to file fd IN?

I migrated to Legato 17.09.0 from 16.10.3 about a day ago and manually implemented the ifgen fix (issue 19) for file fd OUT as per the previous posts.

Legato:                17.09.0_cf35276670208f4a6ce776bcd22775de_modified
Firmware:              SWI9X15Y_07.12.09.00 r34123 CARMD-EV-FRMWR1 2017/04/26 23:34:19
Bootloader:            SWI9X15Y_07.12.09.00 r34123 CARMD-EV-FRMWR1 2017/04/26 23:34:19
PRI PN:                9905383
PRI Rev:               01.07
Carrier PRI Name:      GENERIC
Carrier PRI Rev:       001.033_000
SKU:                   1102816
MCU Version:           001.003
LAST RESET REASON:     Unknown
RESETS COUNT:          Expected: 0	Unexpected: 0

I built my own .sdef using mksys after running bin/legs in the legato dir downloaded from https://github.com/legatoproject/legato-af/tree/17.09.0.
I did try building the system in Dev Studio 5.3 after adding 17.09.0 as a custom package but the system crashed after applying the update (possibly a toolchain related problem?).

Anyway, the test I was running basically consists of performing these operations every 5 minutes:

  • Connect to a webserver (HTTPS using libcurl) and sync system time
  • Read sensor data from an external serial device.
  • Store data on SD card
  • Connect to FTP server and upload file (with libcurl)

My test app ran fine for roughly 20 hours until I observed some odd behavior. It’s running unsandboxed.

Oct 25 06:26:26 | myService[22052]/framework T=main | my_web_client.c my_web_FtpUploadFile() 958 | Sending message to server and waiting for response : 23 bytes sent
Oct 25 06:26:26 | myService[22052]/framework T=main | LE_FILENAME unixSocket_SendMsg() 441 | Sending fd 19.
Oct 25 06:26:26 | myWeb[22051]/framework T=main | LE_FILENAME unixSocket_ReceiveMsg() 669 | Ancillary data was discarded because it couldn't fit in our buffer
... 
Oct 25 06:26:26 | myWeb[22051]/myWebServiceComponent T=main | myWebService.c FtpUploadFile() 1026 | Couldn't open file: Bad file descriptor
...

I noted several FTP Timeouts logged for a couple of hours leading up to this which would have resulted in the file size growing by ~49 bytes each time. Could this be a file size restriction imposed in a later version of Legato?
Note that I’ve set maxFileBytes: 8192K and have been able to upload a file several MB in size without issue before.

This is what the log usually looks like when it’s working.

Oct 25 23:21:29 | myService[5051]/framework T=main | my_web_client.c my_web_FtpUploadFile() 958 | Sending message to server and waiting for response : 23 bytes sent
Oct 25 23:21:29 | myService[5051]/framework T=main | LE_FILENAME unixSocket_SendMsg() 441 | Sending fd 19.
Oct 25 23:21:29 | myWeb[5050]/framework T=main | LE_FILENAME ExtractFileDescriptor() 34 | Received fd (209).
...
Oct 25 23:21:29 | myWeb[5050]/myWebServiceComponent T=main | myWebService.c FtpUploadFile() 1030 | Local file size: 49 bytes.
...

The function signature looks like this:

le_result_t my_web_FtpUploadFile(const char* filename, int fd);

And the corresponding .api definition:

DEFINE MAX_STR_LEN = 512;

FUNCTION le_result_t FtpUploadFile
(
    string filename [MAX_STR_LEN] IN,
    file fd IN
);

Additionally, the HTTPS queries started to fail and logged this in libcurl’s verbose output:

error reading ca cert file /etc/ssl/certs/ca-certificates.crt (Error while reading file.)

And the error message was: curl_easy_perform() failed: Problem with the SSL CA cert (path? access rights?)

I’m unsure if these are related to each other but they appear to be caused by file reading issues. In any case, they never appeared in 16.10.3 after months of running continuously and uninterrupted.

Nothing has changed in the .cdef or .adef files. They include all the files and libs needed as per the httpGet sample code. Do I need to add anything else or explicitly give access to certain dirs since moving from Legato 16.10.3?

Note that restarting the app resolves both issues (temporarily) until it falls over again some tens of hours later.

Any help would be much appreciated. Thanks.
Raf


#8

UPDATE:

So my test system has faulted again with the exact same issue.

I was able to get some run-time stats extracted from log timestamps.

Run-time until fault - First run

From: Tuesday, 24 October 2017 at 06:28:58
To: Wednesday, 25 October 2017 at 02:36:30

Result: 20 hours, 7 minutes and 32 seconds

Run-time until fault - Second run

From: Wednesday, 25 October 2017 at 06:55:47
To: Thursday, 26 October 2017 at 03:11:29

Result: 20 hours, 15 minutes and 42 seconds

I can also confirm that /legato/systems/current/appsWriteable/myService/etc/ssl/certs/ca-certificates.crt and the sensor data file I’m attempting to upload exist. I can open them both outside of the app.

Could this be a resource leak?

Is there something I can do to check/debug further without restarting the app (it’s still running with the issues)?

Thanks


#9

Do you have a test app that demonstrates the issue that you can post?

I took a brief look and I can’t see any obvious reason this would fail (unfortunately). I’ll file a bug in our bug tracker.


#10

As for the resource leak, it might be the case.

You could use the inspect tool to ‘observe’ a resource leak: http://legato.io/legato-docs/latest/toolsTarget_inspect.html
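A cruder, Linux-specific check that also works on a process that is still running is to count the entries under /proc/<pid>/fd; a count that keeps growing between samples suggests a descriptor leak. A minimal sketch, where count_open_fds and demo_count are hypothetical helpers and not part of Legato or the inspect tool:

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Count the descriptors a process has open by listing /proc/<pid>/fd.
 * Returns -1 if the directory cannot be read (no such pid, no permission).
 * When a process inspects itself, the total includes the descriptor that
 * opendir() holds during the scan, so compare samples rather than trusting
 * the absolute number. */
static int count_open_fds(pid_t pid)
{
    char path[64];
    snprintf(path, sizeof(path), "/proc/%ld/fd", (long)pid);

    DIR *dir = opendir(path);
    if (dir == NULL)
    {
        return -1;
    }

    int count = 0;
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL)
    {
        if (entry->d_name[0] != '.')   /* skip "." and ".." */
        {
            count++;
        }
    }
    closedir(dir);
    return count;
}

/* Demo: opening a pipe adds two descriptors, closing it removes them. */
static void demo_count(void)
{
    int before = count_open_fds(getpid());
    assert(before > 0);

    int p[2];
    int rc = pipe(p);
    assert(rc == 0);
    assert(count_open_fds(getpid()) == before + 2);

    close(p[0]);
    close(p[1]);
    assert(count_open_fds(getpid()) == before);
}
```

From a shell on the target, the same figure should be available with ls /proc/<pid>/fd | wc -l.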


#11

Thanks @kdunwoody, @CoRfr.

Sorry for the slow response - I think I finally got to the bottom of it.

So, it was a resource leak caused by my misuse of file fd IN, not another bug in mkTools.

I set up a test to continuously upload the same file and noticed that the sent fd was always the same number, but the received fd steadily incremented with each IPC API call. Things stopped working once the received fd reached 255, and subsequent attempts gave this message:

Ancillary data was discarded because it couldn't fit in our buffer.

I was only closing the file descriptor in the sender’s address space, when I really should have been closing it in the receiver’s address space. The IPC API takes care of closing the original fd in the sender’s address space.
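For anyone wondering why the receiver sees a different fd number: the log lines from unixSocket_SendMsg() suggest descriptors travel as ancillary data on a Unix domain socket, though that is my assumption about the framework internals. The sketch below uses plain POSIX sockets, not Legato code, and send_fd, recv_fd and demo_fd_passing are hypothetical names. It shows that the receiver gets a new descriptor of its own, which it must close itself. Unlike the Legato IPC layer, this sketch does not close the sender's copy.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized and aligned for exactly one descriptor. */
union fd_ctrl
{
    struct cmsghdr align;
    char buf[CMSG_SPACE(sizeof(int))];
};

/* Send fd over a connected AF_UNIX socket as SCM_RIGHTS ancillary data. */
static int send_fd(int sock, int fd)
{
    char payload = 0;   /* one ordinary byte sent alongside the control data */
    struct iovec iov = { .iov_base = &payload, .iov_len = 1 };
    union fd_ctrl ctrl;
    struct msghdr msg;

    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return (sendmsg(sock, &msg, 0) == 1) ? 0 : -1;
}

/* Receive one descriptor; returns -1 if none arrived or it was truncated. */
static int recv_fd(int sock)
{
    char payload;
    struct iovec iov = { .iov_base = &payload, .iov_len = 1 };
    union fd_ctrl ctrl;
    struct msghdr msg;
    int fd;

    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    if (recvmsg(sock, &msg, 0) != 1 || (msg.msg_flags & MSG_CTRUNC))
    {
        return -1;   /* MSG_CTRUNC: control data was truncated */
    }

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET || cmsg->cmsg_type != SCM_RIGHTS)
    {
        return -1;
    }
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}

/* Demo: pass the write end of a pipe across a socketpair. */
static void demo_fd_passing(void)
{
    int sv[2], p[2];
    char c = 0;

    int rc = socketpair(AF_UNIX, SOCK_STREAM, 0, sv);
    assert(rc == 0);
    rc = pipe(p);
    assert(rc == 0);

    rc = send_fd(sv[0], p[1]);
    assert(rc == 0);
    int received = recv_fd(sv[1]);
    assert(received >= 0);
    assert(received != p[1]);        /* a new descriptor number */

    ssize_t n = write(received, "k", 1);
    assert(n == 1);
    n = read(p[0], &c, 1);
    assert(n == 1 && c == 'k');      /* same underlying pipe */

    close(received);                 /* the receiver owns this one */
    close(p[0]);
    close(p[1]);
    close(sv[0]);
    close(sv[1]);
}
```

If close(received) is skipped on every call, the receiver accumulates one descriptor per message, which matches the steadily incrementing numbers I saw.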

The culprit was fdopen(dup(fd_in),...) in the receiver. Not sure why I was using dup() there.

Anyway, all sorted now.
Thanks again for the support.

