
High number of dropped packets

PostPosted: Jan 29th, '14, 17:00
by linuxdad
I am experiencing a high number of dropped packets.

Code:
eth0      Link encap:Ethernet  HWaddr 00:30:48:78:EB:A2
          inet addr:66.207.133.227  Bcast:66.207.133.239  Mask:255.255.255.240
          inet6 addr: fe80::230:48ff:fe78:eba2/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:27536158 errors:0 dropped:4903591 overruns:0 frame:0
          TX packets:27652836 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:20511743045 (19.1 GiB)  TX bytes:12477776521 (11.6 GiB)
          Interrupt:18 Memory:ca000000-ca020000
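
For scale, the counters above correspond to a double-digit drop rate. A quick back-of-the-envelope check (assuming, as is usual for these interface counters, that "dropped" is counted separately from the received total):

```shell
# RX counters copied from the ifconfig output above; the percentage
# assumes "dropped" is counted separately from "packets".
awk -v rx=27536158 -v drop=4903591 \
    'BEGIN { printf "%.1f%% of inbound packets were dropped\n", 100 * drop / (rx + drop) }'
```

That is roughly one inbound packet in seven, so this is well worth chasing down.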


A review of the netstat -s output reveals the following:

Code:
TcpExt:
    53 invalid SYN cookies received
    9352 resets received for embryonic SYN_RECV sockets
    66 ICMP packets dropped because they were out-of-window
    375335 TCP sockets finished time wait in fast timer
    527 packets rejects in established connections because of timestamp
    1800069 delayed acks sent
    519 delayed acks further delayed because of locked socket
    Quick ack mode was activated 101698 times
    1 times the listen queue of a socket overflowed
    403 SYNs to LISTEN sockets ignored
    36841 packets directly queued to recvmsg prequeue.
    4248230 packets directly received from backlog
    938207 packets directly received from prequeue
    1499078918 packets header predicted
    6089 packets header predicted and directly queued to user
    2576496 acknowledgments not containing data received
    1491457183 predicted acknowledgments
    294 times recovered from packet loss due to fast retransmit
    9856 times recovered from packet loss due to SACK data
    15 bad SACKs received
    Detected reordering 7 times using FACK
    Detected reordering 8 times using SACK
    Detected reordering 5 times using reno fast retransmit
    Detected reordering 25 times using time stamp
    9 congestion windows fully recovered
    30 congestion windows partially recovered using Hoe heuristic
    TCPDSACKUndo: 1237
    3942 congestion windows recovered after partial ack
    TCPLostRetransmit: 352
    105 timeouts after reno fast retransmit
    379 timeouts after SACK recovery
    230 timeouts in loss state
    15071 fast retransmits
    1691 forward retransmits
    3936 retransmits in slow start
    303332 other TCP timeouts
    TCPRenoRecoveryFail: 71
    448 sack retransmits failed
    102193 DSACKs sent for old packets
    1258 DSACKs sent for out of order packets
    6565 DSACKs received
    11 DSACKs for out of order packets received
    20761 connections reset due to unexpected data
    4712 connections reset due to early user close
    8383 connections aborted due to timeout
    TCPSACKDiscard: 5
    TCPDSACKIgnoredOld: 18
    TCPDSACKIgnoredNoUndo: 1000
    TCPSpuriousRTOs: 115
    TCPSackShifted: 76265
    TCPSackMerged: 39968
    TCPSackShiftFallback: 36181
    TCPDeferAcceptDrop: 7476
    TCPRcvCoalesce: 654917
    TCPOFOQueue: 710692
    TCPOFOMerge: 1250
    TCPChallengeACK: 918
    TCPSYNChallenge: 659
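
With this many counters, it can help to filter a netstat -s dump down to just the loss-related lines. A sketch, run here against a saved sample of the output above rather than the live command (pipe the real `netstat -s` through the same grep):

```shell
# Keep only loss-related lines; the here-doc sample stands in for live
# `netstat -s` output (note the first sample line is filtered out).
grep -Ei 'retransmit|timeout|drop|abort' <<'EOF'
    1800069 delayed acks sent
    294 times recovered from packet loss due to fast retransmit
    15071 fast retransmits
    303332 other TCP timeouts
    8383 connections aborted due to timeout
EOF
```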


Any suggestions on how to eliminate the communication issues?

Re: High number of dropped packets

PostPosted: Jan 30th, '14, 02:44
by doktor5000
What driver is used for eth0, and on what hardware? The output of the following would be appreciated:
Code:
lspcidrake -v | grep NET

Also, you could take a look at http://prefetch.net/blog/index.php/2011 ... x-servers/
Using iperf would also be a good way to find out where the drops come from.
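
A minimal iperf run might look like the following (iperf2 syntax; 192.0.2.10 is a placeholder address, and both hosts need iperf installed). The UDP mode is the interesting one here, because the client-side report includes lost/total datagrams:

```shell
# On the receiving machine: start an iperf server
iperf -s

# On the sending machine: a 100 Mbit/s UDP stream for 10 seconds;
# the final report shows lost/total datagrams, i.e. actual packet loss
iperf -c 192.0.2.10 -u -b 100M -t 10
```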

I'll spare the obvious hint to check for good-quality Cat5e/Cat6 cables.

Additionally recommended for reading:
http://www.novell.com/support/kb/doc.php?id=7007165
http://sysadmin-notepad.blogspot.de/201 ... onfig.html

Re: High number of dropped packets

PostPosted: Jan 30th, '14, 17:12
by linuxdad
The cables should be fairly new, but I got a new batch and will go out and replace a few of them today.

But the interesting part is that the first link you provided pointed to dropwatch. Where do we get that tool? It looks like it existed on Mageia 2, but not on Mageia 3?

Re: High number of dropped packets

PostPosted: Jan 30th, '14, 17:22
by wintpe
Often this can be caused by TCP segmentation offload in some drivers.

You can try turning the offloads off with:

Code:
ethtool -K eth0 tso off
ethtool -K eth0 rx off
ethtool -K eth0 tx off
ethtool -K eth0 sg off

If that does not help, reboot or set the options above back to on.

We had issues with Broadcom NICs in this way.


Code:
ethtool -S eth0

will also show counters similar to what netstat is reporting.
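
Driver counter names vary, so filtering the output for the usual suspects keeps it readable. A sketch against a saved sample (the counter names and values here are illustrative; pipe the real `ethtool -S eth0` through the same grep):

```shell
# Filter an ethtool -S dump for drop/error counters; the names are
# driver-specific, so the pattern is only a best guess.
grep -Ei 'drop|err|miss|fifo' <<'EOF'
     rx_packets: 27536158
     tx_packets: 27652836
     rx_dropped: 4903591
     rx_fifo_errors: 0
     rx_missed_errors: 0
EOF
```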

regards peter

Re: High number of dropped packets

PostPosted: Jan 31st, '14, 01:30
by linuxdad
Can we get Dropwatch for Mageia 3? I saw that it was once available, but cannot find the tool now.

Re: High number of dropped packets

PostPosted: Jan 31st, '14, 14:21
by doktor5000
Well, it's not in our SVN, not even in the obsolete tree.
But you can easily use the Fedora package: http://pkgs.org/search/?query=dropwatch&type=smart

But even then, it requires a flip in the kernel config, which we don't have by default:
Code:
[doktor5000@Mageia3 ~]$ dropwatch
Unable to find NET_DM family, dropwatch can't work
Cleaning up on socket creation error
[doktor5000@Mageia3 ~]$ zgrep CONFIG_NET_DROP /proc/config.gz
# CONFIG_NET_DROP_MONITOR is not set



http://rhn.redhat.com/errata/RHSA-2013-1264.html wrote:
* The realtime kernel was not built with the CONFIG_NET_DROP_WATCH kernel
configuration option enabled. As such, attempting to run the dropwatch
command resulted in the following error:

Unable to find NET_DM family, dropwatch can't work
Cleaning up on socket creation error

With this update, the realtime kernel is built with the
CONFIG_NET_DROP_WATCH option, allowing dropwatch to work as expected.
(BZ#979417)


FWIW, this would be the usual way to file package requests: https://wiki.mageia.org/en/How_to_repor ... ge_request
Will probably not happen for Mageia 3 and also not for Mageia 4 due to the required kernel config change.

Re: High number of dropped packets

PostPosted: Feb 1st, '14, 00:15
by jiml8
In the vast majority of cases, large numbers of dropped packets are due to bad cables or connections. Regardless of cable quality, avoid running cables close to highly inductive devices such as electric motors. Right behind bad cables, bad NICs are the next most common source of dropped packets.

All other sources of dropped packets are far less likely. I have personally encountered drops caused by packet reordering (which can happen in some upstream routers, particularly Juniper routers), and by software bugs and race conditions. If your networking stack is fully stock, these causes are hugely improbable, but if you have any third-party or custom elements in your stack, they are entirely possible. Speaking as someone who develops software that plugs deep into the networking stack, I can tell you I have been through all of this.

Presuming yours is a stock Linux installation, then your problem is almost certainly cabling or NIC related.