
High number of dropped packets

PostPosted: Jan 29th, '14, 17:00
by linuxdad
I am experiencing a high number of dropped packets.

Code:
eth0      Link encap:Ethernet  HWaddr 00:30:48:78:EB:A2
          inet addr:66.207.133.227  Bcast:66.207.133.239  Mask:255.255.255.240
          inet6 addr: fe80::230:48ff:fe78:eba2/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:27536158 errors:0 dropped:4903591 overruns:0 frame:0
          TX packets:27652836 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:20511743045 (19.1 GiB)  TX bytes:12477776521 (11.6 GiB)
          Interrupt:18 Memory:ca000000-ca020000
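
For scale, the counters above correspond to a double-digit drop rate. A quick back-of-the-envelope check (assuming, as is usual for these interface counters, that "dropped" is counted separately from the received total):

```shell
# RX counters copied from the ifconfig output above; the percentage
# assumes "dropped" is counted separately from "packets".
awk -v rx=27536158 -v drop=4903591 \
    'BEGIN { printf "%.1f%% of inbound packets were dropped\n", 100 * drop / (rx + drop) }'
```

That is roughly one inbound packet in seven, so this is well worth chasing down.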


A review of the netstat -s output reveals the following:

Code:
TcpExt:
    53 invalid SYN cookies received
    9352 resets received for embryonic SYN_RECV sockets
    66 ICMP packets dropped because they were out-of-window
    375335 TCP sockets finished time wait in fast timer
    527 packets rejects in established connections because of timestamp
    1800069 delayed acks sent
    519 delayed acks further delayed because of locked socket
    Quick ack mode was activated 101698 times
    1 times the listen queue of a socket overflowed
    403 SYNs to LISTEN sockets ignored
    36841 packets directly queued to recvmsg prequeue.
    4248230 packets directly received from backlog
    938207 packets directly received from prequeue
    1499078918 packets header predicted
    6089 packets header predicted and directly queued to user
    2576496 acknowledgments not containing data received
    1491457183 predicted acknowledgments
    294 times recovered from packet loss due to fast retransmit
    9856 times recovered from packet loss due to SACK data
    15 bad SACKs received
    Detected reordering 7 times using FACK
    Detected reordering 8 times using SACK
    Detected reordering 5 times using reno fast retransmit
    Detected reordering 25 times using time stamp
    9 congestion windows fully recovered
    30 congestion windows partially recovered using Hoe heuristic
    TCPDSACKUndo: 1237
    3942 congestion windows recovered after partial ack
    TCPLostRetransmit: 352
    105 timeouts after reno fast retransmit
    379 timeouts after SACK recovery
    230 timeouts in loss state
    15071 fast retransmits
    1691 forward retransmits
    3936 retransmits in slow start
    303332 other TCP timeouts
    TCPRenoRecoveryFail: 71
    448 sack retransmits failed
    102193 DSACKs sent for old packets
    1258 DSACKs sent for out of order packets
    6565 DSACKs received
    11 DSACKs for out of order packets received
    20761 connections reset due to unexpected data
    4712 connections reset due to early user close
    8383 connections aborted due to timeout
    TCPSACKDiscard: 5
    TCPDSACKIgnoredOld: 18
    TCPDSACKIgnoredNoUndo: 1000
    TCPSpuriousRTOs: 115
    TCPSackShifted: 76265
    TCPSackMerged: 39968
    TCPSackShiftFallback: 36181
    TCPDeferAcceptDrop: 7476
    TCPRcvCoalesce: 654917
    TCPOFOQueue: 710692
    TCPOFOMerge: 1250
    TCPChallengeACK: 918
    TCPSYNChallenge: 659
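
With this many counters, it can help to filter a netstat -s dump down to just the loss-related lines. A sketch, run here against a saved sample of the output above rather than the live command (pipe the real `netstat -s` through the same grep):

```shell
# Keep only loss-related lines; the here-doc sample stands in for live
# `netstat -s` output (note the first sample line is filtered out).
grep -Ei 'retransmit|timeout|drop|abort' <<'EOF'
    1800069 delayed acks sent
    294 times recovered from packet loss due to fast retransmit
    15071 fast retransmits
    303332 other TCP timeouts
    8383 connections aborted due to timeout
EOF
```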


Any suggestions on how to eliminate the communication issues?

Re: High number of dropped packets

PostPosted: Jan 30th, '14, 02:44
by doktor5000
What driver is used for eth0, and on what hardware? The output of the following would be appreciated:
Code:
lspcidrake -v | grep NET

Also, you could take a look at http://prefetch.net/blog/index.php/2011 ... x-servers/
Using iperf would also be a good way to find out where the drops come from.
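
A minimal iperf run might look like the following (iperf2 syntax; 192.0.2.10 is a placeholder address, and both hosts need iperf installed). The UDP mode is the interesting one here, because the client-side report includes lost/total datagrams:

```shell
# On the receiving machine: start an iperf server
iperf -s

# On the sending machine: a 100 Mbit/s UDP stream for 10 seconds;
# the final report shows lost/total datagrams, i.e. actual packet loss
iperf -c 192.0.2.10 -u -b 100M -t 10
```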

I'll spare the obvious hint to check for good-quality Cat5e/Cat6 cables.

Additionally recommended for reading:
http://www.novell.com/support/kb/doc.php?id=7007165
http://sysadmin-notepad.blogspot.de/201 ... onfig.html

Re: High number of dropped packets

PostPosted: Jan 30th, '14, 17:12
by linuxdad
The cables should be fairly new, but I got a new batch and will go out and replace a few of them today.

But the interesting part is that the first link you provided pointed to dropwatch. Where do we get that tool? It looks like it existed on Mageia 2, but not on Mageia 3?

Re: High number of dropped packets

PostPosted: Jan 30th, '14, 17:22
by wintpe
Often this can be caused by TCP segmentation offload in some drivers.

You can try turning the offloads off with:

Code:
ethtool -K eth0 tso off
ethtool -K eth0 rx off
ethtool -K eth0 tx off
ethtool -K eth0 sg off

If that does not help, reboot or set the options above back to on.

We had issues with Broadcom NICs in this way.


Code:
ethtool -S eth0

will also show counters similar to what netstat is reporting.
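
Driver counter names vary, so filtering the output for the usual suspects keeps it readable. A sketch against a saved sample (the counter names and values here are illustrative; pipe the real `ethtool -S eth0` through the same grep):

```shell
# Filter an ethtool -S dump for drop/error counters; the names are
# driver-specific, so the pattern is only a best guess.
grep -Ei 'drop|err|miss|fifo' <<'EOF'
     rx_packets: 27536158
     tx_packets: 27652836
     rx_dropped: 4903591
     rx_fifo_errors: 0
     rx_missed_errors: 0
EOF
```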

regards peter

Re: High number of dropped packets

PostPosted: Jan 31st, '14, 01:30
by linuxdad
Can we get Dropwatch for Mageia 3? I saw that it was once available, but cannot find the tool now.

Re: High number of dropped packets

PostPosted: Jan 31st, '14, 14:21
by doktor5000
Well, it's not in our SVN, not even in the obsolete tree.
But you can easily use the Fedora package: http://pkgs.org/search/?query=dropwatch&type=smart

But even then, it requires a flip in the kernel config, which we don't have by default:
Code:
[doktor5000@Mageia3 ~]$ dropwatch
Unable to find NET_DM family, dropwatch can't work
Cleaning up on socket creation error
[doktor5000@Mageia3 ~]$ zgrep CONFIG_NET_DROP /proc/config.gz
# CONFIG_NET_DROP_MONITOR is not set



http://rhn.redhat.com/errata/RHSA-2013-1264.html wrote:
* The realtime kernel was not built with the CONFIG_NET_DROP_WATCH kernel
configuration option enabled. As such, attempting to run the dropwatch
command resulted in the following error:

Unable to find NET_DM family, dropwatch can't work
Cleaning up on socket creation error

With this update, the realtime kernel is built with the
CONFIG_NET_DROP_WATCH option, allowing dropwatch to work as expected.
(BZ#979417)


FWIW, this would be the usual way to file package requests: https://wiki.mageia.org/en/How_to_repor ... ge_request
Will probably not happen for Mageia 3 and also not for Mageia 4 due to the required kernel config change.

Re: High number of dropped packets

PostPosted: Feb 1st, '14, 00:15
by jiml8
In the vast majority of cases, large numbers of dropped packets are due to bad cables or connections. Regardless of cable quality, avoid running cables close to highly inductive devices such as electric motors. Right behind bad cables, bad NICs are the next most common source of dropped packets.

All other sources of dropped packets are far less likely. I have personally encountered drops caused by packet reordering (which can happen in some upstream routers, particularly Juniper routers), and by software bugs and race conditions. If your networking stack is fully stock, these causes are hugely improbable, but if you have any third-party or custom elements in your stack, they are entirely possible. Speaking as someone who develops software that plugs deep into the networking stack, I can tell you I have been through all of this.

Presuming yours is a stock Linux installation, then your problem is almost certainly cabling or NIC related.