First, know your network. Know the topology. If you have a network in place where, by design, the majority of the variables are deterministic and not left to chance, the task gets a lot simpler. With this out of the way, the following are the different outputs and avenues that might shed further light on what is happening, and why.
a. While the problem is occurring AND while it is not occurring, get the output of the following commands:
• console# show process cpu (check IpMapForwarding, bcmRx)
• console# show ip stats (check ICMP)
• console# show arp (Total vs Peak Entries, Cache size, Age time, Dynamic renew mode)
• console# show bridge address-table
• console# show spanning-tree (Portfast, State, STP mode, Root bridge)
• console# show interfaces status (Duplex Drops, speed, flow control)
• console# show interfaces counters (for any two interfaces experiencing problems) (Packet Loss/ drop)
• console# show statistics ethernet (for any two interfaces experiencing problems)
There are some further devshell commands that greatly aid in troubleshooting. However, I shall not post them on the blog, as they should only be executed under the guidance of a Level 3 Escalations rep/agent. Improper use of them can potentially impact your system.
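To make the "occurring vs. not occurring" comparison from step a easier, the two captures of the same command can be diffed. The following is a minimal sketch using Python's standard `difflib`; the function name `diff_captures` and the sample counter lines are made up for illustration and do not reflect the switch's real output format.

```python
import difflib

def diff_captures(baseline: str, during_problem: str) -> list[str]:
    """Return only the lines that differ between two captures of one command.

    Lines prefixed "- " exist only in the baseline capture,
    lines prefixed "+ " exist only in the capture taken during the problem.
    """
    delta = difflib.ndiff(baseline.splitlines(), during_problem.splitlines())
    # ndiff also emits unchanged lines ("  ") and hint lines ("? "); drop those.
    return [line for line in delta if line.startswith(("+ ", "- "))]

if __name__ == "__main__":
    # Hypothetical capture text, not real switch output.
    healthy = "Port 1 drops 0\nPort 2 drops 0"
    broken = "Port 1 drops 0\nPort 2 drops 57"
    for line in diff_captures(healthy, broken):
        print(line)
```

Saving each command's output to a file per state (healthy, broken) and diffing them this way keeps the focus on what actually changed.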
b. At any time, get the output of the following command (just once) on a switch that had the problem and has not been rebooted:
• console# show logging
c. Network – General description/Topology.
d. Others :
• Show ip, show ip interface (vlan/management)
2. Spanning tree:
Does the log file show many topology changes? You should not see any topology changes in a stable network. Are all the edge ports set to portfast? What other switches (brand and model) are in your network?
Make sure all edge ports are set to portfast. This will greatly reduce the TCNs on your network and therefore reduce latency and dropped packets. Portfast not only transitions ports to the forwarding state more quickly; a less recognized feature is that it defines a port as an edge port, thereby suppressing the generation of any TCNs from it.
When a client is disconnected and reconnected, the switchport will send out an STP TCN BPDU. This TCN will cause the switch to fast-age all of its bridge table entries. When the switch’s bridge address table is empty, it is forced to flood its unicast traffic out to all ports until the bridge table is repopulated. The solution to this behavior is to enable “port-fast”. This stops the TCN BPDUs from being sent from the port when a client or server is disconnected / reconnected.
A topology change occurs when a switch either moves a port into the Forwarding state or moves a port from the Forwarding or Learning states into the Blocking state. In other words, a port on an active switch comes up or goes down.
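The rule above, combined with the earlier point that portfast (edge) ports do not generate TCNs, can be written down as a small predicate. This is only a sketch of the rule as described in this post, not switch firmware logic; the state names and function names are assumptions chosen for illustration.

```python
def is_topology_change(old_state: str, new_state: str) -> bool:
    """Apply the rule from the text to a single port state transition."""
    # A port moving into the Forwarding state counts as a topology change.
    if new_state == "forwarding" and old_state != "forwarding":
        return True
    # So does a port leaving Forwarding or Learning for Blocking.
    if old_state in ("forwarding", "learning") and new_state == "blocking":
        return True
    return False

def generates_tcn(old_state: str, new_state: str, edge_port: bool) -> bool:
    """Edge (portfast) ports are treated as never generating a TCN."""
    return not edge_port and is_topology_change(old_state, new_state)

if __name__ == "__main__":
    # A PC powering up on a non-portfast access port versus a portfast port.
    print(generates_tcn("learning", "forwarding", edge_port=False))
    print(generates_tcn("learning", "forwarding", edge_port=True))
```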
When a switch detects a change, it sends a TCN message towards the Root Bridge. Upon receiving the TCN message, the Root Bridge sends a BPDU with the TCN flag set towards all switches in the network. On receiving these BPDUs, the switches shorten their bridging table aging times from the default (300 seconds) to the Forward Delay value (default 15 seconds). At this point, they don’t know how the topology has changed; they only know to force fairly recent bridging table entries to age out. (This causes recently idle entries to be flushed, leaving only the actively transmitting stations in the table.)
A switch flushes its CAM table each time you turn on or turn off a PC that is connected to it. This happens only if “portfast” is not enabled on the access port. Notice that this type of topology change is mostly cosmetic. No actual topology change occurred because none of the switches had to change port states to reach the Root Bridge.
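The effect of the shortened aging time can be illustrated with a toy model of a bridging table. The sketch below assumes each entry is described only by how long its station has been idle; the 300 and 15 second values are the defaults quoted above, and the function name and MAC addresses are hypothetical.

```python
DEFAULT_AGING_S = 300   # normal bridging table aging time (default from the text)
FORWARD_DELAY_S = 15    # aging time used after a TCN (default from the text)

def surviving_entries(idle_seconds: dict[str, float], tcn_received: bool) -> set[str]:
    """Return the MAC addresses that stay in the table.

    idle_seconds maps a MAC address to the time since it last sent a frame.
    """
    limit = FORWARD_DELAY_S if tcn_received else DEFAULT_AGING_S
    return {mac for mac, idle in idle_seconds.items() if idle < limit}

if __name__ == "__main__":
    table = {
        "00:00:5e:00:53:01": 5,    # actively transmitting
        "00:00:5e:00:53:02": 60,   # idle for a minute
        "00:00:5e:00:53:03": 400,  # idle past the normal aging time
    }
    print(sorted(surviving_entries(table, tcn_received=False)))
    print(sorted(surviving_entries(table, tcn_received=True)))
```

With the TCN flag set, the station that has been idle for a minute is aged out even though it would normally stay in the table for another four minutes.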
At first, this doesn’t seem like a major problem because the PC link state affects only the “newness” of the CAM table contents. If CAM table entries are flushed as a result, they probably will be learned again. This becomes a problem when every user PC is considered. Now every time any PC in the network powers up or down, every switch in the network must age out CAM table entries. Also remember that when a switch doesn’t have a CAM entry for a destination, the packet must be flooded out all its ports. Flushed tables mean more unknown unicasts, which mean more broadcasts or flooded packets throughout the network.
Plan and assign STP root manually, as well as other priorities.
3. Flow control, Duplex Drops:
a. Enabling flow control on all switches and end stations in your network will prevent the switch from dropping packets during times of oversubscription. Double-check that the end stations have flow control enabled as well. Obviously, if there is chronic oversubscription, you will see packet drops whether flow control is enabled or not.
b. Check Interface counters for any loss reported. Check for duplex/speed drops, i.e. any auto-neg failures.
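Since raw counter values accumulate since the last clear or reboot, what matters is how much they grow between two snapshots. The following sketch assumes the counters have already been copied by hand into dictionaries; the counter names used here are hypothetical and are not the labels the switch prints.

```python
def counter_deltas(before: dict[str, int], after: dict[str, int]) -> dict[str, int]:
    """Return the counters that increased between two snapshots, with their growth."""
    return {
        name: after[name] - before[name]
        for name in after
        if name in before and after[name] > before[name]
    }

if __name__ == "__main__":
    # Hypothetical counter names and values for one interface.
    snapshot_1 = {"rx_discards": 10, "tx_discards": 0, "fcs_errors": 3}
    snapshot_2 = {"rx_discards": 950, "tx_discards": 0, "fcs_errors": 3}
    print(counter_deltas(snapshot_1, snapshot_2))
```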
4. CPU utilization
Check CPU utilization, in particular the IpMapForwarding task.
The IpMapForwarding task should be below 10% for normal routing. Anything higher means that software routing is occurring. Possible causes are a missing default route, an unknown host, unresolved next-hop gateways, or STP TCNs. The bcmRx task is the frame receive task, and its value can vary slightly depending on the number of protocols enabled. This process also reports high if a loop occurs at layer 2.
In many situations, the underlying reason for high CPU utilization is software-based routing. The CPU always receives a small number of exception packets, but a continuous flow of a large number of them may indicate a configuration issue or a network event. For the majority of packets, the switch performs the forwarding function in hardware (ASIC) at a very high rate. Software switching occurs when traffic cannot be processed in hardware and the exception packets are forwarded to the CPU for further analysis. Any packet that cannot be forwarded by looking at the L2/L3 tables has to be processed by the CPU. For example, VLAN and CoS are part of the Ethernet header, whereas DSCP (L3 QoS) is part of the IP header, so how such a packet is handled depends on the QoS policy, if any.
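The 10% rule of thumb is easy to apply mechanically once the per-task percentages have been read off the `show process cpu` output. The sketch below deliberately does not parse the command output, since its exact layout is not shown here; it assumes the values have been entered into a dictionary by hand, and the function name is hypothetical.

```python
IPMAP_LIMIT_PERCENT = 10.0  # threshold quoted in the text for normal routing

def software_routing_suspected(task_cpu: dict[str, float]) -> bool:
    """Flag the switch if IpMapForwarding is at or above the 10% rule of thumb."""
    return task_cpu.get("IpMapForwarding", 0.0) >= IPMAP_LIMIT_PERCENT

if __name__ == "__main__":
    # Hypothetical readings taken while the problem is occurring.
    readings = {"IpMapForwarding": 37.5, "bcmRx": 12.0}
    print(software_routing_suspected(readings))
```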
Common reasons for high CPU utilization due to process-switched packets are:
1) Packets that are copied to the CPU, but the original packets are switched in hardware. An example is Host MAC address learning.
2) A high number of spanning-tree port instances, or control packets such as BPDUs.
3) ICMP redirects; routing packets on the same interface
4) Layer 2 forwarding loops
5) Excessive ARP and DHCP traffic.
6) Packets that use IP header options
7) Packets that require a response from the layer3 device (TTL is expired, MTU is exceeded, fragmentation is needed, …)
8) Multicast traffic
9) Routing updates
10) Layer 2 discovery protocols (CDP, LLDP)
5. ARP Issues.
a. Cam vs ARP Aging Time :
The problem is usually caused by, once again, the switch routing in software. This is usually because a local host has aged out of the L2 table but still exists in the ARP table, principally because of accelerated aging after a spanning tree topology change (BPDU TCNs) or because the host has gone silent for some time. The L2 (MAC address and VLAN) information ages out but the ARP information (IP address to MAC address binding) remains.
Set the ARP timeout equal to the MAC address aging time (arp timeout 300), or set the bridge aging time greater than the ARP timeout (bridge aging-time 1230). The ARP timeout defaults to 1200 and the bridge aging time defaults to 300.
- ARP table is Layer3 address to Layer2 address resolution. MAC Address Table is Layer2 address to interface binding.
- What can happen is that when a MAC address ages out of the MAC table, the packets are forwarded to the CPU until the MAC is relearned. On a busy network with a large number of clients and mostly routed traffic this can show up as high latency and dropped packets.
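The timer relationship described above boils down to one comparison: a MAC entry can disappear while its ARP entry is still alive whenever the bridge aging time is shorter than the ARP timeout. A minimal sketch, using the default and suggested values from this section (the function name is hypothetical):

```python
def mac_can_age_out_before_arp(arp_timeout: int, bridge_aging: int) -> bool:
    """True if a silent host's MAC entry expires while its ARP entry still exists."""
    return bridge_aging < arp_timeout

if __name__ == "__main__":
    # (ARP timeout, bridge aging time): defaults, then the two suggested fixes.
    for arp, bridge in [(1200, 300), (300, 300), (1200, 1230)]:
        print(arp, bridge, mac_can_age_out_before_arp(arp, bridge))
```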
LAN switches use forwarding tables, such as Layer 2/CAM tables, to direct traffic to specific ports based on the VLAN number and the destination MAC address of the frame. When there is no entry that corresponds to the destination MAC of the frame in the incoming VLAN, the (unicast) frame is sent to all forwarding ports within the respective VLAN. This causes flooding.
On Cisco, the default ARP table aging time is 4 hours, while the CAM holds entries for only 5 minutes. The switch sends a frame out to all forwarding ports within the respective VLAN when the destination MAC address has aged out of the CAM table. You need a CAM aging timer greater than or equal to the ARP timeout in order to prevent unicast flooding.
If you have 1024+ hosts, you may need to reduce the timeout so that entries expire faster and are less likely to fill the ARP table as quickly (arp timeout 15). However, lowering the ARP timeout would most likely increase CPU load. The CPU impact of the ARP timeout comes from having to send an ARP request and wait for the response at expiry time. This scales directly with the number of hosts in the various layer 2 domains. ~30 ARP/sec will cause problems.
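The forwarding decision that leads to unicast flooding can be modelled in a few lines. This is a simplified sketch, not how any particular ASIC implements it: the CAM table is a plain dictionary keyed by (VLAN, MAC), and the names `egress_ports`, `cam` and `vlan_ports` are assumptions for illustration.

```python
def egress_ports(
    cam: dict[tuple[int, str], int],
    vlan_ports: dict[int, set[int]],
    vlan: int,
    dst_mac: str,
    ingress_port: int,
) -> set[int]:
    """Return the ports a unicast frame is sent out of."""
    known_port = cam.get((vlan, dst_mac))
    if known_port is not None:
        # Entry present: the frame goes to exactly one port.
        return {known_port}
    # Entry missing (for example aged out): flood to every other port in the VLAN.
    return vlan_ports[vlan] - {ingress_port}

if __name__ == "__main__":
    cam = {(10, "00:00:5e:00:53:01"): 3}
    vlan_ports = {10: {1, 2, 3, 4}}
    print(egress_ports(cam, vlan_ports, 10, "00:00:5e:00:53:01", ingress_port=1))
    print(egress_ports(cam, vlan_ports, 10, "00:00:5e:00:53:99", ingress_port=1))
```

The second lookup shows the cost of an aged-out entry: one frame becomes three copies on a four-port VLAN.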
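A rough back-of-the-envelope estimate shows the trade-off. The sketch below assumes every host's entry is refreshed exactly once per timeout period and that refreshes are spread evenly over time, which is a simplification; real traffic is burstier. The function name is hypothetical.

```python
def arp_refresh_rate(hosts: int, arp_timeout_s: int) -> float:
    """Approximate steady-state ARP requests per second needed to refresh all entries."""
    return hosts / arp_timeout_s

if __name__ == "__main__":
    for timeout in (1200, 300, 15):
        rate = arp_refresh_rate(1024, timeout)
        print(f"timeout {timeout:>4}s: about {rate:.1f} ARP/sec")
```

Under these assumptions, 1024 hosts with a 15 second timeout come to roughly 68 ARP/sec, well above the ~30 ARP/sec mark mentioned above, while the default of 1200 stays below 1 ARP/sec. That is the CPU cost to weigh against freeing table space faster.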
b. Arp Table Hash Collision.
This can happen when the ARP table gets close to full. Freeing up ARP table entries faster (“no arp dynamic renew” and “arp timeout 300”) should alleviate this problem. Check the ARP cache size (arp cachesize 1024).
Hash collisions occur if the ARP table gets too close to full, or in some cases because of certain clustering implementations. If ARP table space is exhausted, the switch starts routing in software, which can cause it to run out of resources and eventually crash. Normally, the switch is continuously learning the MAC addresses of various hosts if they are not yet in the CAM table. A large number of MAC table flushes and subsequent relearning means the switch sends packets to the CPU for host address learning, which results in high CPU utilization.
c. ICMP Redirects/Unreachable :
Routing packets on the same interface, or traffic ingress and egress on the same L3 interface, can result in an ICMP redirect by the switch. If the switch knows that the next hop device to the ultimate destination is in the same subnet as the sending device, the switch generates ICMP redirect to the source. So, the router/switch sends the original packet back out onto the VLAN (packet has traversed the fabric twice now at this point) to the correct destination and sends the ICMP redirect back to the original host (now you have a third packet traversing the VLAN). The implications this has are:
- Each data packet goes through the VLAN (your switch’s back-plane etc.) twice. So double your bandwidth processing there.
- Redirect packets themselves consume bandwidth/link.
- Each ICMP redirect costs your router a CPU interrupt and processing time. Some routers don’t route-cache same-interface packets, which might mean the data packets get process-switched as well.
This effect may become especially pronounced if there are a lot of packets traversing the fabric, for example during large data transfers or heavy streams.
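The condition that triggers a redirect, as described above, is that the sender and the next hop sit in the same subnet on the interface the packet arrived on. It can be checked with Python's standard `ipaddress` module. This is a sketch of that one condition only, not of a router's full redirect logic; the function name and addresses are illustrative.

```python
import ipaddress

def would_send_redirect(source_ip: str, next_hop_ip: str, interface_network: str) -> bool:
    """True if source and next hop are both inside the L3 interface's subnet."""
    network = ipaddress.ip_network(interface_network)
    return (
        ipaddress.ip_address(source_ip) in network
        and ipaddress.ip_address(next_hop_ip) in network
    )

if __name__ == "__main__":
    # Host and a second router on the same VLAN subnet: redirect expected.
    print(would_send_redirect("192.0.2.10", "192.0.2.254", "192.0.2.0/24"))
    # Next hop reached through a different subnet: no redirect.
    print(would_send_redirect("192.0.2.10", "198.51.100.1", "192.0.2.0/24"))
```

Running the static routes on a VLAN interface through a check like this highlights hosts that should point directly at the other gateway instead of at the switch.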
Disable the features:
• No ip redirects (globally as well as on main/concerned VLANs )
• No ip icmp unreachables ( globally as well as per interface )
6. Microsoft NLB clustering.
The problem here is that the ARP response from the NLB cluster has a different source MAC address than the HW address that is in the ARP response packet. Therefore the NLB MAC address is never learned at layer 2. Another issue with Microsoft NLB clustering is that if it is set to use multicast MAC addresses, the switch will not put the multicast address in the ARP table. To fix these issues, set NLB to use unicast addresses and put a static ARP entry in the switch with the correct unicast MAC address.
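The mismatch described above can be confirmed from a packet capture by comparing the Ethernet source address with the sender hardware address inside the ARP payload. The sketch below assumes an untagged Ethernet II frame carrying a standard IPv4-over-Ethernet ARP packet; the function names and the MAC addresses in the example are made up for illustration.

```python
import struct

ETHERTYPE_ARP = 0x0806

def arp_sender_mismatch(frame: bytes) -> bool:
    """True if the Ethernet source MAC differs from the ARP sender hardware address."""
    eth_src = frame[6:12]
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != ETHERTYPE_ARP:
        raise ValueError("not an ARP frame")
    # ARP header is 8 bytes; the sender hardware address follows it.
    arp_sha = frame[22:28]
    return eth_src != arp_sha

def build_arp_reply(eth_src: bytes, arp_sha: bytes) -> bytes:
    """Build a minimal ARP reply frame for testing (addresses other than MACs zeroed)."""
    eth_dst = bytes.fromhex("020000000001")
    # htype=1 (Ethernet), ptype=0x0800 (IPv4), hlen=6, plen=4, oper=2 (reply)
    arp_header = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 2)
    return (eth_dst + eth_src + struct.pack("!H", ETHERTYPE_ARP)
            + arp_header + arp_sha + bytes(4) + bytes(6) + bytes(4))

if __name__ == "__main__":
    nic_mac = bytes.fromhex("020000000002")      # hypothetical adapter MAC
    cluster_mac = bytes.fromhex("0200000000aa")  # hypothetical cluster MAC
    print(arp_sender_mismatch(build_arp_reply(nic_mac, cluster_mac)))
```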
A multipath issue occurs when the network path from client to server is different from the path from server to client. Eventually the MAC addresses age out of the intermediate switches and software routing can occur. To check this, do a traceroute from the client to the server and then from the server to the client. If the paths are different, you may have a multipath issue.
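Comparing the two traceroutes can be reduced to checking whether one path is the reverse of the other. One caveat: traceroute reports interface addresses, which normally differ per direction even on the same path, so this sketch assumes each hop has first been mapped by hand to a device name. The function name and device names are hypothetical.

```python
def paths_are_asymmetric(client_to_server: list[str], server_to_client: list[str]) -> bool:
    """True if the return path is not simply the forward path reversed.

    Both arguments are ordered lists of device names, one per hop.
    """
    return client_to_server != list(reversed(server_to_client))

if __name__ == "__main__":
    forward = ["access-sw1", "core-sw1", "access-sw2"]
    same_way_back = ["access-sw2", "core-sw1", "access-sw1"]
    other_way_back = ["access-sw2", "core-sw2", "access-sw1"]
    print(paths_are_asymmetric(forward, same_way_back))
    print(paths_are_asymmetric(forward, other_way_back))
```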