Experiment on Oct 10
We use nettest to measure round-trip time and throughput.
Dublin is running kernel 2.4.10 with the tulip driver compiled as a module.
We use the default infrastructure.
All of the experiments below are part of experiment 2.
1. PingPong test
pingserver [port [interface]]
where
port is the port number the server is to listen on (default or 0
value means let the system choose a port).
Value must be a positive integer.
interface is the interface the server is to listen on (default or 0 value
means all interfaces on this node).
Value must be an IP address in dotted decimal notation or
a DNS name.
pingclient port [interface [iterations [size]]]
where
port is the port number on which the server is listening (required).
interface is the interface to use to get to the server (default is
the DNS name of the node on which the client is running).
iterations is the number of messages the client is to send to the server
(default is 100).
size is the number of bytes in each message (default is 1, maximum
is 1,000,000).
What does the result data mean?
The server immediately sends back whatever it receives from the client.
On each iteration, the client sends size bytes of data and waits to
receive that same data back. The elapsed time, measured on the client
side, covers iterations rounds of sending and receiving size bytes.
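Given that definition, the per-iteration round-trip time falls out of the elapsed time directly. A minimal sketch (the sample numbers are taken from the 32 KB re-test measurements later in this page):

```python
def rtt_us(elapsed_sec, iterations):
    """Average round-trip time in microseconds: each iteration is one
    send of size bytes plus one receive of the echoed data."""
    return elapsed_sec / iterations * 1_000_000

# e.g. 5.97 s elapsed for 1000 iterations -> 5970 us per round trip
print(rtt_us(5.97, 1000))
```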
The actual data sent from the NIC is a little more than iterations*size
bytes because of the TCP header, IP header, and Ethernet framing, so the
actual throughput is a little higher than the number computed here.
My explanation of the round-trip time variation
There are a few obvious "jumps" in the RTT as the request size grows.
When the request size is small, only one TCP segment ==> one IP packet ==> one Ethernet frame is involved in the communication, so the RTT is almost constant: it equals the time to send one Ethernet frame. We can see this from the first significant jump at request size 2048, where the RTT is almost twice what it is at request size 1024. We know that the maximum Ethernet frame size is 1526 bytes (1500 for data, 26 for the Ethernet wrapper).
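The frame-count argument above can be sketched numerically. Assuming an MSS of 1460 bytes (the 1500-byte Ethernet payload minus 20 bytes each for the IP and TCP headers, as computed in the overhead section at the end of this page):

```python
import math

MSS = 1460  # 1500-byte Ethernet payload minus 20-byte IP and 20-byte TCP headers

def frames_needed(request_size):
    """Minimum number of full-size Ethernet frames needed to carry one request."""
    return math.ceil(request_size / MSS)

print(frames_needed(1024))  # 1 frame
print(frames_needed(2048))  # 2 frames -> roughly doubles the RTT
```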
The second obvious jump happens when the request size is 9 KB and 10 KB. See the data table above, but why?
The third significant jump happens when the request size is 64 KB.
It's pretty interesting that:
When the request size is 32 KB, the RTT is much less than it is at 16 KB or 64 KB.
When the request size is 64 KB, the RTT is not as stable as in the other cases; it fluctuates.
Re-do the experiments with request sizes of 32 KB and 64 KB

Re-test of round-trip time when request size is 32 KB and 64 KB

Request Size (KB) | Elapsed time for 1000 iterations (sec), runs 1st-15th | AVG (sec) | Var | RTT (us)
32 | 5.97 5.98 5.97 5.97 5.97 5.97 5.97 5.97 5.97 5.97 5.97 5.97 5.97 5.97 5.96 | 5.97 | 0.001333 | 5970
64 | 17.31 19.94 15.60 31.26 33.24 20.00 17.10 17.40 20.02 15.33 15.31 19.62 16.29 16.15 20.00 | 19.638 | 3.550933 | 19638
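The AVG and RTT columns can be reproduced from the raw runs. A sketch using the 32 KB row (the Var column is not recomputed here, since its scaling convention is not stated):

```python
runs_32k = [5.97, 5.98, 5.97, 5.97, 5.97, 5.97, 5.97, 5.97,
            5.97, 5.97, 5.97, 5.97, 5.97, 5.97, 5.96]

avg = sum(runs_32k) / len(runs_32k)   # mean elapsed time for 1000 iterations
rtt_us = avg / 1000 * 1_000_000       # one round trip, in microseconds
print(round(avg, 2), round(rtt_us))   # 5.97 5970
```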
The result of running tcpdump on Dublin shows that
the fluctuation of the round-trip time is NOT caused by TCP packet retransmission. See the experiments here.
2. Blast Test
For the blast test, the number of iterations times the message size, divided by the elapsed time, gives the throughput in bytes per second.
blastserver [port [interface]]
where
port is the port number the server is to listen on (default or 0
value means let the system choose a port).
Value must be a positive integer.
interface is the interface the server is to listen on (default or 0 value
means all interfaces on this node).
Value must be an IP address in dotted decimal notation or
a DNS name.
blastclient port [interface [iterations [size]]]
where
port is the port number on which the server is listening (required).
interface is the interface to use to get to the server (default is
the DNS name of the node on which the client is running).
iterations is the number of messages the client is to send to the server
(default is 100).
size is the number of bytes in each message (default is 1, maximum
is 1,000,000).
What does the result data mean?
The server swallows whatever it receives; when the transfer is done it
sends 1 byte back to finish the connection.
Here the client does not read back the data it sent, so the throughput
should be a little higher than in the ping-pong test.
The actual data sent from the NIC is a little more than iterations*size
bytes because of the TCP header, IP header, and Ethernet framing, so the
actual throughput is a little higher than the number computed here.
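The throughput column in the table below is just iterations * size / elapsed time; a minimal sketch, checked against the 2-byte row (1,000,000 iterations, 1.64 s average elapsed time):

```python
def throughput(iterations, size, elapsed_sec):
    """Bytes per second delivered by the blast client."""
    return iterations * size / elapsed_sec

# 2-byte messages, 1000000 iterations, 1.64 s elapsed
print(round(throughput(1_000_000, 2, 1.64), 2))  # 1219512.2
```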
Blast Test on Experiment 2
(elapsed times are for the listed number of iterations, in seconds)

Tested Size (byte) | LOG2(Size) | Iterations | 1st | 2nd | 3rd | 4th | 5th | AVG | Var | Throughput (Byte/sec)
1 | 0 | 1000000 | 1.70 | 1.69 | 1.69 | 1.69 | 1.69 | 1.692 | 0.003 | 591016.59 |
2 | 1 | 1000000 | 1.64 | 1.64 | 1.64 | 1.64 | 1.64 | 1.64 | 0.000 | 1219512.20 |
4 | 2 | 1000000 | 1.43 | 1.43 | 1.44 | 1.44 | 1.43 | 1.434 | 0.005 | 2789400.28 |
8 | 3 | 1000000 | 1.62 | 1.63 | 1.62 | 1.62 | 1.61 | 1.62 | 0.004 | 4938271.61 |
16 | 4 | 1000000 | 1.92 | 1.93 | 1.90 | 1.92 | 1.91 | 1.916 | 0.009 | 8350730.69 |
32 | 5 | 1000000 | 2.87 | 2.88 | 2.87 | 2.87 | 2.88 | 2.874 | 0.005 | 11134307.59 |
64 | 6 | 1000000 | 5.73 | 5.73 | 5.73 | 5.72 | 5.73 | 5.728 | 0.003 | 11173184.36 |
128 | 7 | 100000 | 1.15 | 1.15 | 1.14 | 1.15 | 1.15 | 1.148 | 0.003 | 11149825.78 |
256 | 8 | 100000 | 2.29 | 2.29 | 2.29 | 2.29 | 2.29 | 2.29 | 0.000 | 11179039.30 |
512 | 9 | 100000 | 4.46 | 4.46 | 4.46 | 4.46 | 4.46 | 4.46 | 0.000 | 11479820.63 |
1024 | 10 | 10000 | 0.90 | 0.90 | 0.89 | 0.89 | 0.89 | 0.894 | 0.005 | 11454138.70 |
2048 | 11 | 10000 | 1.75 | 1.75 | 1.81 | 1.75 | 1.75 | 1.762 | 0.019 | 11623155.51 |
3072 | 11.58496 | 10000 | 2.61 | 2.62 | 2.62 | 2.62 | 2.62 | 2.618 | 0.003 | 11734148.20 |
4096 | 12 | 10000 | 3.49 | 3.50 | 3.50 | 3.49 | 3.50 | 3.496 | 0.005 | 11716247.14 |
5120 | 12.32193 | 10000 | 4.39 | 4.39 | 4.39 | 4.39 | 4.38 | 4.388 | 0.003 | 11668185.96 |
6144 | 12.58496 | 10000 | 5.23 | 5.23 | 5.23 | 5.23 | 5.23 | 5.23 | 0.000 | 11747609.94 |
7168 | 12.80735 | 10000 | 6.10 | 6.10 | 6.10 | 6.10 | 6.09 | 6.098 | 0.003 | 11754673.66 |
8192 | 13 | 10000 | 6.99 | 6.99 | 6.99 | 6.99 | 6.99 | 6.99 | 0.000 | 11719599.43 |
9216 | 13.16993 | 10000 | 7.85 | 7.85 | 7.85 | 7.85 | 7.85 | 7.85 | 0.000 | 11740127.39 |
10240 | 13.32193 | 10000 | 8.71 | 8.71 | 8.71 | 8.71 | 8.71 | 8.71 | 0.000 | 11756601.61 |
16384 | 14 | 100000 | 139.40 | 139.37 | 139.43 | 139.36 | 139.36 | 139.384 | 0.025 | 11754577.28 |
32768 | 15 | 100000 | 278.78 | 278.72 | 278.78 | 278.72 | 278.71 | 278.742 | 0.030 | 11755673.71 |
65536 | 16 | 100000 | 557.36 | 557.28 | 557.25 | 557.27 | 557.25 | 557.282 | 0.031 | 11759934.83 |
98304 | 16.58496 | 100000 | 835.75 | 835.81 | 835.69 | 835.76 | 835.70 | 835.742 | 0.038 | 11762481.72 |
131072 | 17 | 100000 | 1120.42 | 1114.50 | 1114.48 | 1114.58 | 1114.41 | 1115.678 | 1.897 | 11748192.58 |
Conclusion on the blast test:
When the request size is very small, the TCP, IP, and Ethernet layers pad and wrap the data, so the throughput is low. As the request size increases, the throughput increases and converges to the full bandwidth. (Currently, the maximum throughput we measured is 11762481.72 Byte/sec.)
Theoretical max bandwidth calculation:
We know that 1 Mb/sec in the Ethernet specification is 1,000,000 bits/sec, 1 KB = 1024 bytes, and 1 byte = 8 bits.
Fast Ethernet: 100 Mb/sec = 10^8 bits/sec = 12,500,000 bytes/sec.
For overhead detail see the following:
1. Ethernet overhead
The maximum payload ratio is 1500/1526.
The minimum payload ratio is 1/72 (1 data byte plus 45 bytes of padding and the 26-byte wrapper).
2. IP layer overhead
The minimum IP header is 20 bytes. Although the header length is variable, the 16-bit "Total Length" field tells us that
the maximum IP data payload is 64 KB - 20 bytes = 65516 bytes.
3. TCP layer overhead
The minimum TCP header is also 20 bytes.
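Putting the three overheads together gives an upper bound on TCP payload throughput over Fast Ethernet. A sketch assuming full-size frames (1460 bytes of TCP payload per 1526-byte frame) and ignoring ACK traffic; the result sits just above the measured maximum of 11762481.72 Byte/sec:

```python
LINE_RATE = 12_500_000        # Fast Ethernet: 100 Mb/s = 12,500,000 bytes/s
FRAME = 1526                  # maximum Ethernet frame (1500 data + 26 wrapper)
TCP_PAYLOAD = 1500 - 20 - 20  # Ethernet data minus IP and TCP headers

# Ethernet payload efficiency bounds from point 1 above
print(f"{1500 / 1526:.4f}")   # best case: 0.9830
print(f"{1 / 72:.4f}")        # worst case: 0.0139

# Upper bound on TCP payload throughput with full-size frames
max_tcp_throughput = LINE_RATE * TCP_PAYLOAD / FRAME
print(int(max_tcp_throughput))  # 11959370 bytes/s
```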
For the detailed Raw Experiment data, see this file