When an Ethernet device becomes overloaded, flow control allows it to send PAUSE requests to the devices sending it data so that the overload can clear. If flow control is not enabled and an overload occurs, the device drops packets. Dropping packets hurts performance far more than pausing does.
802.3x flow control is implemented per link, not per flow. The problem it is intended to solve is input buffer congestion on oversubscribed full-duplex links that cannot handle wire-rate input. Flow control was originally invented to prevent packet drops by switches that ran at less than media speed. At that time the usual method of control was back-pressure, which could also substantially lower overall throughput on the segments being flow controlled.
- Pause Frames
When the Rx FIFO queue of a port's receive side (Rx) fills to the high water mark, the port's transmit side (Tx) starts to generate pause frames. The remote device is expected to stop or reduce its transmission of packets for the interval specified in the pause frame. If the Rx side clears the queue or reaches the low water mark within this interval, Tx sends a special pause frame with an interval of zero (0x0), which lets the remote device resume transmitting. If the Rx side is still working through the queue when the interval expires, Tx sends a new pause frame with a new interval value.
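The high/low water mark behavior described above can be sketched as a toy model. This is only an illustration, not how any particular switch implements it: the `RxPort` class, its field names, and the water mark values are hypothetical. The one fixed fact it relies on is that the 802.3x pause interval is a 16-bit count of quanta, each 512 bit times long.

```python
from dataclasses import dataclass, field
from typing import List

PAUSE_QUANTUM_BITS = 512  # one pause quantum = 512 bit times (IEEE 802.3x)


def pause_duration_seconds(quanta: int, link_bps: float) -> float:
    """Convert a pause frame's 16-bit interval field into seconds."""
    if not 0 <= quanta <= 0xFFFF:
        raise ValueError("pause interval is a 16-bit field")
    return quanta * PAUSE_QUANTUM_BITS / link_bps


@dataclass
class RxPort:
    """Toy port: the Tx side emits pause frames on behalf of the Rx FIFO."""
    high_water: int               # hypothetical threshold, in frames
    low_water: int                # hypothetical threshold, in frames
    pause_quanta: int = 0xFFFF    # assumed interval placed in each pause frame
    depth: int = 0                # current Rx FIFO occupancy
    paused_peer: bool = False     # True while the peer is asked to hold off
    sent: List[int] = field(default_factory=list)  # intervals of frames sent

    def _send_pause(self, quanta: int) -> None:
        self.sent.append(quanta)
        self.paused_peer = quanta != 0

    def enqueue(self, frames: int = 1) -> None:
        """Frames arrive; ask the peer to pause at the high water mark."""
        self.depth += frames
        if self.depth >= self.high_water and not self.paused_peer:
            self._send_pause(self.pause_quanta)

    def dequeue(self, frames: int = 1) -> None:
        """Frames are processed; release the peer at the low water mark."""
        self.depth = max(0, self.depth - frames)
        if self.paused_peer and self.depth <= self.low_water:
            self._send_pause(0)  # interval 0x0: the peer may resume at once

    def interval_expired(self) -> None:
        """The pause interval ran out while the queue is still being drained."""
        if self.paused_peer and self.depth > self.low_water:
            self._send_pause(self.pause_quanta)


port = RxPort(high_water=8, low_water=2)
port.enqueue(8)          # reaches the high water mark -> pause frame
port.interval_expired()  # still congested -> a new pause frame
port.dequeue(6)          # down to the low water mark -> zero-interval frame
print(port.sent)         # [65535, 65535, 0]
```

On a 1 Gb/s link, `pause_duration_seconds(0xFFFF, 1e9)` works out to roughly 33.6 ms, which is the longest pause a single frame can request at that speed.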
If Rx-No-Pkt-Buff is zero or does not increment while the TxPauseFrames counter increments, the local switch is generating pause frames and the remote end is obeying them, so the Rx FIFO queue drains.
If both Rx-No-Pkt-Buff and TxPauseFrames increment, the remote end is disregarding the pause frames (it does not support flow control) and continues to send traffic. To work around this, manually configure speed and duplex and, if required, disable flow control. Errors of this type on an interface point to a traffic problem caused by oversubscribed ports.
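The two counter patterns above amount to a small decision table. A minimal sketch follows, assuming the counter deltas have already been collected between two polls; the function name and the returned strings are made up for illustration, and the cases where TxPauseFrames does not increment are not covered by the text, so they are only labeled as such.

```python
def diagnose_pause_counters(rx_no_pkt_buff_delta: int,
                            tx_pause_frames_delta: int) -> str:
    """Interpret the change in Rx-No-Pkt-Buff and TxPauseFrames between polls."""
    if rx_no_pkt_buff_delta < 0 or tx_pause_frames_delta < 0:
        raise ValueError("counter deltas must be non-negative")
    if tx_pause_frames_delta > 0 and rx_no_pkt_buff_delta == 0:
        # Pause frames go out and no buffer drops occur: the peer is obeying.
        return "peer honors pause frames"
    if tx_pause_frames_delta > 0 and rx_no_pkt_buff_delta > 0:
        # Pause frames go out but buffers still overflow: the peer keeps sending.
        return "peer ignores pause frames"
    # TxPauseFrames is not incrementing; the text does not describe this case.
    return "no pause frames sent"


print(diagnose_pause_counters(0, 120))   # peer honors pause frames
print(diagnose_pause_counters(45, 120))  # peer ignores pause frames
```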
What Flow Control is not:
- Not intended to solve the problem of steady-state overloaded networks or links.
- Not intended to address a lack of network capacity. Properly used, flow control can be a useful tool for short-term overloads on a single link.
- Not intended to provide end-to-end flow control. End-to-end mechanisms, typically at the Transport Layer, address such issues. The most common example is the TCP window, which provides end-to-end flow control between source and destination for individual L3/L4 flows.
What would happen if Flow Control were not available?
For Ethernet, packets would continue to be sent to the receiving port, but there would be no room to store them temporarily. The receiving port would simply discard these incoming packets. Ethernet itself does not retransmit; it is left to upper-layer protocols such as TCP/IP to have those “lost” packets retransmitted. However, it takes time to determine that packets have been dropped, request retransmission of the missing packets, and then actually send them.
Flow Control – Where to use it, and where not.
- Edge of a network
Flow control fits where GE-attached servers operate at less than wire speed and the link only needs to be paused for a short time, typically measured in microseconds. Individual clients can be held off without potentially affecting large areas of the network. Flow control can be useful, for example, if the uplink is being swamped by individual clients. CoS/QoS will become more important here over time.
Flow Control is very important to a well-designed, high-performance iSCSI Ethernet infrastructure.
On many networks, there can be an imbalance in traffic between the devices that send and the devices that receive. This is often the case in SAN configurations in which many servers (initiators) communicate with storage devices (targets). If senders transmit data simultaneously, they may exceed the throughput capacity of the receiver. When this occurs, the receiver may drop packets, forcing senders to retransmit the data after a delay. Although this does not result in any loss of data, latency increases because of the retransmissions, and I/O performance degrades. Flow Control can help eliminate this problem: the receiver asks the senders to pause briefly, which lets it process its backlog so it can later resume accepting input. The delay introduced by this action is dramatically less than the overhead caused by TCP/IP packet retransmission.
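To put rough numbers on the imbalance described above, the following sketch estimates how long a receiver's buffer lasts when several initiators send at once. The link rates and the 512 KB buffer are assumptions chosen purely for illustration, and the function name is hypothetical; real port buffers and traffic patterns vary widely.

```python
from typing import Optional, Sequence


def seconds_until_overflow(sender_rates_bps: Sequence[float],
                           receiver_rate_bps: float,
                           buffer_bytes: int) -> Optional[float]:
    """Time for the combined senders to fill the receiver's buffer.

    Returns None when the receiver keeps up and the buffer never fills.
    """
    excess_bps = sum(sender_rates_bps) - receiver_rate_bps
    if excess_bps <= 0:
        return None
    return buffer_bytes * 8 / excess_bps


# Assumed scenario: four initiators at 1 Gb/s each, one 1 Gb/s target port
# with a 512 KB input buffer.
t = seconds_until_overflow([1e9] * 4, 1e9, 512 * 1024)
print(f"{t * 1e3:.2f} ms")  # 1.40 ms
```

Under these assumptions the buffer fills in well under two milliseconds, after which packets are dropped unless the senders have been paused.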
Switches should always be set to auto-negotiate flow control unless Support specifies otherwise. In Cisco terminology, this means using the “desired” setting. If the switch is capable of both sending and receiving pause frames (called symmetric flow control), enable negotiation in both directions (send desired and receive desired). If the switch only supports receiving pause frames (asymmetric flow control), then enable negotiation for receive only (receive desired).
On EqualLogic PS Series arrays, auto-negotiation for asymmetric flow control is always enabled.
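The switch-side rule above (negotiate in both directions when the switch can send and receive pause frames, receive only otherwise) can be written down as a small helper. This is a sketch only: the function and its parameter names are hypothetical, and it deliberately returns the per-direction setting rather than any vendor's command syntax, which differs between platforms.

```python
from typing import Dict


def flow_control_settings(can_send_pause: bool,
                          can_receive_pause: bool) -> Dict[str, str]:
    """Pick the negotiation setting for each direction of a switch port."""
    return {
        # "desired" means auto-negotiate; "off" means the direction is unused.
        "send": "desired" if can_send_pause else "off",
        "receive": "desired" if can_receive_pause else "off",
    }


print(flow_control_settings(True, True))   # {'send': 'desired', 'receive': 'desired'}
print(flow_control_settings(False, True))  # {'send': 'off', 'receive': 'desired'}
```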
- Core of Network
Using flow control in the core is actually more detrimental than helpful. Flow control in the core can cause congestion in sections of the network that otherwise would not be congested. If particular links are constantly congested, there is most likely a problem with the current implementation of the network. The right solution is to redesign the network with additional capacity, reduce the load, or provide appropriate end-to-end QoS to ensure critical traffic can get through.
The best way to handle any potential congestion in the backbone is CoS/QoS controls. Prioritizing packets through multiple queues provides far more sophisticated traffic control (such as targeting specific application packet types) than an all-or-nothing, or even a throttled, form of flow control.
QoS cannot operate properly if a switch sends PAUSE frames, because this slows all of that port's traffic, including any traffic that may have high priority.
When you enable QoS on the switch, the port buffers are carved into one or more individual queues. Each queue has one or more drop thresholds associated with it. The combination of multiple queues within a buffer, and the drop thresholds associated with each queue, allows the switch to make intelligent decisions when faced with congestion. Traffic sensitive to delay and jitter, such as VoIP packets, can be moved to the head of the queue for transmission, while other less important or less sensitive traffic can be buffered or dropped. Ingress and egress scheduling are always based on the CoS value associated with the frame. By default, higher CoS values are mapped to higher queue numbers. CoS 5 traffic, typically associated with VoIP, is mapped to the strict priority queue, if present.
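The queueing behavior described above can be sketched in a few lines. This is a simplified model under stated assumptions, not any switch's actual scheduler: the `PortQueues` class, the four-queue layout, and the single tail-drop threshold per queue are hypothetical, and for brevity every queue is served in strict order from highest to lowest, whereas real hardware typically shares bandwidth among the non-priority queues.

```python
from collections import deque
from typing import Deque, List, Optional

VOICE_COS = 5  # CoS value the text associates with VoIP


class PortQueues:
    """Toy port buffer split into queues, each with one drop threshold."""

    def __init__(self, num_queues: int = 4, drop_threshold: int = 64,
                 strict_priority: bool = True) -> None:
        if num_queues < (2 if strict_priority else 1):
            raise ValueError("not enough queues for this layout")
        self.queues: List[Deque[str]] = [deque() for _ in range(num_queues)]
        self.drop_threshold = drop_threshold
        self.strict_priority = strict_priority

    def queue_for(self, cos: int) -> int:
        """Map a CoS value (0-7) to a queue; higher CoS -> higher queue."""
        if not 0 <= cos <= 7:
            raise ValueError("CoS is a 3-bit value")
        top = len(self.queues) - 1
        if self.strict_priority:
            if cos == VOICE_COS:
                return top            # the strict priority queue
            normal = top              # queues 0 .. top-1 carry everything else
        else:
            normal = len(self.queues)
        return cos * normal // 8

    def enqueue(self, cos: int, frame: str) -> bool:
        """Buffer a frame, or drop it if its queue is at the threshold."""
        q = self.queues[self.queue_for(cos)]
        if len(q) >= self.drop_threshold:
            return False              # tail drop
        q.append(frame)
        return True

    def dequeue(self) -> Optional[str]:
        """Transmit from the highest-numbered non-empty queue."""
        for q in reversed(self.queues):
            if q:
                return q.popleft()
        return None


port = PortQueues(num_queues=4, drop_threshold=2)
port.enqueue(0, "bulk-1")
port.enqueue(0, "bulk-2")
print(port.enqueue(0, "bulk-3"))  # False: the low queue hit its threshold
port.enqueue(5, "voice")
print(port.dequeue())             # voice
```

Even though the voice frame arrived last, it leaves first, and only the low-priority queue experienced a drop. A PAUSE frame, by contrast, would have held back all of this traffic equally.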