Reliability isn’t just for getting everything that was sent…
First Published Friday, 25th May 2012 02:31 pm from Real-Time Innovations (RTI) : rtihoward
The opinions expressed by this blogger and those providing comments are theirs alone, this does not reflect the opinion of Automated Trader or any employee thereof. Automated Trader is not responsible for the accuracy of any of the information supplied by this article.
I got an email from a user that basically stated that
"as a general rule, sending data with BEST_EFFORT
Reliability QOS (i.e., using nominal UDP semantics) should
provide better performance than sending data with RELIABLE
Reliability QOS on a stable, clean and thus relatively lossless
network".
Hmm, that sounds
reasonable enough. To use a reliable protocol, the
delivery protocol would have to send and process additional
network packets such as heartbeats and ACK/NACK packets, thus consuming
both network bandwidth and additional CPU cycles. This additional
overhead should make the "performance" of a
reliable connection worse than that of a "best
effort" connection. If not "worse",
it certainly shouldn't make it better…or
would it?
Well, we should define some
terms first. What is "performance"? Is
it maximum throughput? Or the latency of the data (time it takes
from sending to receiving)? Or resources being consumed while
sending at a particular data rate (CPU, network bandwidth,
memory)?
In general, when sending data slower
than the network bandwidth on a "stable, lossless
transport", Best Effort incurs no additional overhead
in CPU, network bandwidth or memory.
Of course, if you use the Reliable mode,
you'll get the same throughput performance but at a
higher "price" (overhead).
So, no, you do not get better throughput/latency using
Best Effort vs Reliable QOS when sending data below network
bandwidth limitations on networks that do not lose data packets.
You'll get the same throughput/latency
performance…just for lower
"cost".
If there is a
chance that data packets can be lost on the network no matter
what the network load is, then there is an obvious performance
difference between Best Effort and Reliable…not
necessarily in terms of throughput and latency, but in terms of
determinism vs the guaranteed receipt of all data sent in the
order sent.
With Best Effort, you may not
receive all of the data, but you will receive whatever data
was able to get through with minimal latency (more
deterministically), and no additional overhead will be incurred
even if there is data loss.
With Reliability,
the reliable protocol will be able to detect and repair lost packets
so that all of the data sent will be received in the order sent, at
the expense of additional network packets (HB, ACK/NACK) to
detect and repair lost packets, not to mention the increased CPU
and memory needed as well. But one could argue that the
"performance" of the Reliable connection is
better if less deterministic (i.e., there may be unpredictable
delays in receiving data while the system repairs missing
data).
That's all good and great
when the data rates are well below the network
bandwidth…
However, when you send
data faster than the system can handle, no matter if the network
itself is "lossless", e.g., shared memory,
data still can be lost…by DDS or the OS if not by the
network hardware.
It's easy to send
data faster than the network can handle. The data rate is calculated
as (amount of data / time). You can overwhelm a network by sending
small (1-200 byte) samples too fast. Or the same can happen by trying
to send a MB in a single write() call.
When an
application tries to send data faster than the network can
handle, data packets are lost.
In Best Effort
mode, DDS does not try to detect that it is being asked to send
data faster than the network can handle. And in Best Effort mode,
there is no mechanism to stop DDS from pushing data through the
network stack even though the network is saturated. So the
network stack and/or physical network will throw away data
packets exceeding the network bandwidth.
So,
just because the "network" is lossless
doesn't mean that there isn't a place
between App and App where data can be thrown out. The
physical network may never see a packet because the OS throws out
the data packet when the network reports that it can't
handle any more. So the packet isn't lost by the
physical network, but intentionally dropped by the OS or device
driver layer.
For example, the send socket buffer is
full, which causes the OS to throw out the data being sent before it
reaches the Ethernet card.
Or more likely,
since it usually takes more CPU to process incoming data than to
send outgoing data, sending apps can usually send much faster
than receiving apps can process, and thus the receive socket
buffer (or shared memory buffer) fills up while the CPU is busy
processing received packets. Then the Ethernet device
or the OS shared memory driver has no choice but to drop the data
packets it has received.
So how fast
is too fast? Well, assuming a "clean
network", it's when the sender tries to send
more than the total amount of data that can be buffered in the
"system" in one go…without any
delay between sends. The "system" being a
combination of the send network stack, the network itself
(including buffers in switches/routers) and the receive network
stack. The main places where significant amounts of data can be
stored are the send buffer and the receive buffer.
For RTI's shared memory driver, there is no
independent send buffer versus receive buffer vs network buffer,
there is only 1 shared memory buffer. So if you send more data
than the size of the shared memory buffer in one go, then
some part of the data will probably be lost.
Let's take the case of sending
"large data". Large data is defined as data
that is larger than the MTU (maximum transmission unit) of the
physical transport. The largest user data packet that can be sent
by UDP is 64K. So sending 1 MB of data in a single write() call
would require some mechanism, either RTI DDS's built-in
large-data fragmentation feature or a user-level software layer,
to break up the large data into smaller (MTU-sized) chunks and
send the fragments individually through the physical
network.
And usually the data
fragments are sent consecutively without any delay, which, with
today's CPU speeds, can easily exceed the maximum
network bandwidth of most networks.
e.g.,
sending 1 MB in the 1 ms that it takes a CPU to break up and send 1
MB in 64K chunks through a UDP socket requires a network that can
handle 8 Gbps. A 1 Gbps network would not be able to transmit the
data that fast.
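
To make that arithmetic concrete, here is a small Python sketch. It is
only back-of-the-envelope math: the function name is made up for
illustration, 1 MB is taken as 10**6 bytes, and the 1 ms burst time is
the figure assumed in the example above, not a measurement.

```python
def required_bandwidth_bps(payload_bytes, burst_seconds):
    """Bits per second a link must carry to move payload_bytes in burst_seconds."""
    return payload_bytes * 8 / burst_seconds


ONE_MB = 1_000_000            # assumption: 1 MB = 10**6 bytes
LINK_1_GBPS = 1_000_000_000   # a 1 Gbps link, in bits per second

# The CPU hands 1 MB to the socket in 1 ms.
needed = required_bandwidth_bps(ONE_MB, 0.001)
# Time a 1 Gbps link needs to put the same 1 MB on the wire.
wire_time_ms = ONE_MB * 8 / LINK_1_GBPS * 1000

print(f"needed: {needed / 1e9:.1f} Gbps")
print(f"1 Gbps link needs at least {wire_time_ms:.1f} ms")
```

In other words, the link needs about 8 ms to carry what the CPU
produced in 1 ms, so the rest has to sit in a buffer somewhere or be
dropped.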
With other networks, if the
large data being sent is greater than the network can buffer,
then data fragments could be lost.
e.g., 1 MB
of data in 64K chunks -> 16 data fragments are
sent. But if the shared memory buffer only holds 512 KB of data,
it's likely that the send side sends much faster than
the receive side can process, so up to 8 data fragments could be
"lost" (in the case that the send side sends
so fast that all of the fragments are "sent"
even before DDS on the receive side has a chance to take one
packet from the network).
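
The same worst case can be worked out in a few lines of Python. This is
a deliberately crude model, not how DDS or any shared memory driver is
implemented: the function and its `drained` parameter are hypothetical,
and the buffer is assumed to hold only whole fragments.

```python
def burst_outcome(sample_bytes, fragment_bytes, buffer_bytes, drained=0):
    """Return (fragments_sent, fragments_dropped) for one back-to-back burst.

    drained is how many fragments the receiver manages to take out of the
    buffer while the burst is still arriving (0 = it never got a turn).
    """
    total = -(-sample_bytes // fragment_bytes)   # ceiling division
    slots = buffer_bytes // fragment_bytes       # whole fragments the buffer holds
    accepted = min(total, slots + drained)
    return total, total - accepted


KB = 1024
sent, dropped = burst_outcome(1024 * KB, 64 * KB, 512 * KB)
print(sent, dropped)   # prints: 16 8
```

Raising `drained` shows how much the receiver would have to keep up
during the burst before nothing is dropped.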
The situation that I
just described is exactly what would happen if you try to send
too much data in Best Effort mode. There is no throttle. DDS will
push the data to the network as fast as the application sends the
data. And if the application gives DDS large data (e.g., 1 MB),
DDS will send all of the data in fragments without delay. If the
"network" loses data, then you'll
see your effective throughput either be zero (i.e., the network
is always losing the last parts of a large data sample), or be
poor (i.e., every now and then you get lucky and all
of the fragments of a large data sample do make it
through).
So, what can you do? Put in a
mechanism to limit the rate at which DDS pushes packets onto a
network to something that the network can handle. You can do this
open loop, i.e., put in arbitrary delays between sending of data
at the application layer and/or use the RTI DDS FlowController
mechanism, or closed loop, by using feedback from the receiving
side to let the send side know when it's OK to send
more data.
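
As a rough illustration of the open-loop, application-layer option,
here is a minimal Python sketch of a paced send loop. Everything in it
is an assumption for illustration: `paced_send` is a made-up helper,
`send` stands in for whatever actually writes a fragment, and the
100 MB/s cap is an arbitrary number, not a recommendation.

```python
import time


def paced_send(fragments, max_bytes_per_s, send, sleep=time.sleep):
    """Send fragments one at a time, pausing after each to hold an average rate.

    send is whatever actually writes a fragment (hypothetical here);
    sleep is injectable so the pacing can be checked without really waiting.
    """
    for fragment in fragments:
        send(fragment)
        # Pause for the time the target rate needs to absorb this fragment.
        sleep(len(fragment) / max_bytes_per_s)


# Dry run: record what would be sent and how long we would pause.
sent, pauses = [], []
fragments = [bytes(64 * 1024)] * 16      # 1 MB as sixteen 64K fragments
paced_send(fragments, 100_000_000, sent.append, pauses.append)
print(len(sent), "fragments,", round(sum(pauses) * 1000, 2), "ms of pauses")
```

Being open loop, this only helps if the chosen rate really is something
the whole path can sustain; nothing here notices if the receiver falls
behind.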
The closed-loop mechanism is
basically what you're getting with the Reliable mode.
By using a limited-sized send queue, the reliable mechanism will
block DDS from sending any more data when the send queue is full,
and only when there is feedback (ACKs) back from the receive side
(indicating that it's able to process more packets) is
DDS allowed to send more data. This is also known as
"throttling".
Yes, this
will add some amount of overhead…but using the
Reliable protocol to throttle the send rate, and thus not lose any
data due to excessive data rates, at the cost of
receiving/processing HB/ACK packets is a small price to pay compared to
sending data so fast that data is lost and then having to use the
same Reliable protocol to repair the lost packets.
So, even when using the Reliable protocol,
it's still better to tune the protocol to never send
faster than the end-to-end network can handle.
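
To see the effect of that kind of closed-loop throttling, here is a toy
tick-based simulation in Python. It is a sketch under loud assumptions,
not the RTPS protocol: there are no heartbeats or repairs, "ACK" just
means the receiver has processed a fragment, and the window is assumed
to be no larger than the receive buffer. All names are made up.

```python
from collections import deque


def simulate(total, recv_capacity, recv_per_tick, send_per_tick, window=None):
    """Tick-based toy model; returns (delivered, dropped, ticks).

    window=None models Best Effort (no limit on unacknowledged fragments).
    window=N models a Reliable writer whose send queue holds N fragments.
    Assumes positive rates and, for Reliable, window <= recv_capacity,
    so this model never needs to simulate repairs.
    """
    recv_buf = deque()
    sent = delivered = dropped = unacked = ticks = 0
    while delivered + dropped < total:
        ticks += 1
        for _ in range(send_per_tick):                 # sender's turn
            if sent == total:
                break
            if window is not None and unacked >= window:
                break                                  # send queue full: writer blocks
            sent += 1
            if len(recv_buf) < recv_capacity:
                recv_buf.append(sent)
                unacked += 1
            else:
                dropped += 1                           # receive buffer overflow
        for _ in range(recv_per_tick):                 # receiver's turn
            if not recv_buf:
                break
            recv_buf.popleft()
            delivered += 1
            unacked -= 1                               # processing doubles as the ACK
    return delivered, dropped, ticks


best_effort = simulate(16, recv_capacity=8, recv_per_tick=1, send_per_tick=16)
reliable = simulate(16, recv_capacity=8, recv_per_tick=1, send_per_tick=16, window=4)
print("best effort:", best_effort)   # (8, 8, 8)
print("reliable:   ", reliable)      # (16, 0, 16)
```

In this toy run Best Effort finishes sooner but delivers only half of
the fragments, so a 16-fragment sample would never be reassembled. The
throttled run takes twice as many ticks and delivers everything.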
In short, for large data, you're almost
guaranteeing that DDS will try to send it faster than the network
(even shared memory) can handle, and thus data will be lost. If
you're using RTI DDS's internal large data
algorithm, the data rate can be throttled using the Reliable
protocol. If your own code is breaking up the large data,
you can use arbitrary delays in your send loop. Another
open-loop approach is to use the RTI DDS FlowController mechanism,
which can be configured to limit the max send rate for a
DataWriter to a specified data rate. The FlowController can also
be used by the RTI DDS internal large data algorithm.
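
One common way to build a rate limiter of that kind is a token bucket.
The Python sketch below is a generic version for illustration only. It
is not the RTI FlowController API or its configuration, and the class
and parameter names are made up.

```python
class TokenBucket:
    """Generic token bucket: allows bursts up to burst_bytes, then rate_bytes_per_s."""

    def __init__(self, rate_bytes_per_s, burst_bytes):
        self.rate = rate_bytes_per_s
        self.burst = burst_bytes
        self.tokens = burst_bytes   # start with a full bucket
        self.last = 0.0             # time of the previous refill, in seconds

    def try_send(self, nbytes, now):
        """Return True and spend tokens if nbytes may be sent at time now."""
        # Refill for the time elapsed since the last call, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False


bucket = TokenBucket(rate_bytes_per_s=1000, burst_bytes=500)
print(bucket.try_send(500, now=0.0))   # True: the initial burst is allowed
print(bucket.try_send(100, now=0.0))   # False: bucket is empty
print(bucket.try_send(500, now=0.5))   # True: 0.5 s at 1000 B/s refilled 500 bytes
```

A caller that gets False would hold the fragment and try again later,
which is what keeps the average rate at or below the configured limit.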
For those of you who have used TCP for transferring MBs
and MBs of data without ever having to worry about this
issue…well, TCP internally breaks up data into MTU-sized
chunks and uses a reliable protocol for data transfer and limited
buffer (queue) sizes so that it doesn't send data
faster than the network can handle. You don't actually
get to choose if you want to send Best Effort or Reliable,
it's always Reliable. And it's hard to tune
TCP to work under abnormal conditions. And the MTU size is
usually based on the MTU of Ethernet (around 1500
bytes).
So, in conclusion, sending data using
Best Effort QOS may not provide the best
performance…especially if peak data rates are greater
than the network data rate. You can see this on the highways of
California…during rush hour, there are metering lights
at the on-ramps that regulate when cars get to enter the
highway. With the metering lights, the network, aka highway, can
be run at higher effective throughput. Without this type of
regulation, driving in the SF Bay area or LA during rush hour
would be more of a madhouse than it is.




