From: Erik R. <eri...@do...> - 2016-05-08 22:13:42
> Diving into this, it seems that the message buffer is getting
> increased in size until it hits the maximum. The reason for this is
> not yet clear to me, but I suspect something is getting desynchronized
> in the stream due to send failure, and it's reading the wrong bytes as
> the message length.

I've been investigating this further, and I can confirm that this is the case, at least in my example setup.

send_data calls send and checks the return value for errors (-1). However, it does NOT check that the number of bytes written to the send buffer matches the size of the data to send. So if the send buffer fills up (which obviously happens when sending "too fast"), only part of the data gets sent. send_data then returns happily, discarding the part of the message that was never sent. The next time send_data is called, it either fails with EAGAIN (if the send buffer is still full) or sends the beginning of a new message, thereby corrupting the stream on the receiving side.

I cannot see an easy fix for this: both the message serialization and the sending happen inside send_data, so even if send_data were to return an error code in this case, the amount of data left to write would be "forgotten". The responsibility for calling send again with the remaining data would therefore best be placed on send_data, but that would essentially make send_data blocking.

HOWEVER, I'm actually not sure that this was my original problem, since at that point I was not using liblo to send the messages; I only used it as a server. So there may be another bug lurking around, related or unrelated.

Erik
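
For illustration, the missing partial-write check could look roughly like the sketch below. This is not liblo code: send_as_much_as_possible is a hypothetical helper name, and the sketch assumes a POSIX stream socket. It loops until everything is sent, and if the socket would block it reports how many bytes went out so the caller can keep the rest instead of "forgetting" it.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper (not part of liblo): try to send all `len` bytes.
 * Returns the number of bytes handed to the kernel, which can be less
 * than `len` if a non-blocking socket's send buffer is full, or -1 on
 * a hard error. */
ssize_t send_as_much_as_possible(int fd, const char *buf, size_t len)
{
    size_t sent = 0;

    while (sent < len) {
        ssize_t n = send(fd, buf + sent, len - sent, 0);

        if (n > 0) {
            /* Partial or full write: advance past what was accepted. */
            sent += (size_t)n;
        } else if (n == -1 && errno == EINTR) {
            /* Interrupted by a signal before sending anything: retry. */
            continue;
        } else if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            /* Send buffer full: stop here. The caller must keep the
             * bytes from buf + sent onward and retry them later. */
            break;
        } else {
            /* Hard error, or an unexpected zero-byte send. */
            return -1;
        }
    }
    return (ssize_t)sent;
}
```

On a blocking socket the loop simply runs until the whole message is out. On a non-blocking socket, a return value smaller than len means the caller has to store the unsent tail and send it first, once the socket is writable again (for example after poll reports POLLOUT), before starting any new message.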