Tuesday, September 16, 2008

Random Early Detection Gateways for Congestion Avoidance (LBL '93)

The paper presents another congestion avoidance algorithm, Random Early Detection (RED), for gateways. Its main goals are to minimize delay (while keeping throughput high), avoid global synchronization, and tolerate bursty traffic. It's cheap, using a single FIFO queue with no per-connection state.

RED operates around the knee. It uses two thresholds, "min" and "max", on the average queue size to define three operating modes: mark nothing, mark randomly, and mark everything. All the interesting parts are in the second mode: the gateway has to prepare for congestion but at the same time accommodate bursty traffic, which means it has to let packets onto the queue under certain conditions while ultimately keeping the average queue length as small as possible.
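
To make the three modes concrete, here's a minimal sketch of the per-packet decision in Python. The parameter names (min_th, max_th, w_q, max_p) follow the paper, but the values are made up, and I'm omitting the paper's refinements (the count-based spacing of marks and the idle-time adjustment of the average):

import random

min_th = 5      # "min" threshold on the average queue size (packets)
max_th = 15     # "max" threshold on the average queue size (packets)
max_p  = 0.02   # marking probability as avg approaches max_th
w_q    = 0.002  # weight of the exponentially weighted moving average

avg = 0.0       # running average of the queue size

def on_packet_arrival(queue_len):
    """Return True if the arriving packet should be marked (or dropped)."""
    global avg
    avg = (1 - w_q) * avg + w_q * queue_len   # update the average queue size
    if avg < min_th:
        return False                          # mode 1: mark nothing
    if avg >= max_th:
        return True                           # mode 3: mark everything
    # mode 2: mark with a probability that grows linearly from 0 to max_p
    p_b = max_p * (avg - min_th) / (max_th - min_th)
    return random.random() < p_b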

Random drop is better than drop-tail: it has a chance to drop packets from connections that are "hogging" the bandwidth, whereas drop-tail is forced to drop whichever packets arrive last.
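
A toy calculation (mine, not the paper's) of why random marking tends to hit the hog: if every arriving packet is marked with the same probability, a connection's expected share of marks matches its share of arrivals. The traffic mix below is hypothetical:

import random

arrival_share = {"hog": 0.7, "mouse1": 0.15, "mouse2": 0.15}  # made-up mix
marks = {conn: 0 for conn in arrival_share}
for _ in range(100000):
    # draw the connection of the arriving packet according to its share
    conn = random.choices(list(arrival_share), weights=list(arrival_share.values()))[0]
    if random.random() < 0.02:        # same marking probability for everyone
        marks[conn] += 1
print(marks)   # the hog ends up with roughly 70% of the marks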

RED supports both marking and dropping packets, as explicit and implicit congestion signals respectively, and treats the two as equally legitimate implementations. I wonder what the discussion over this is. What advantages and disadvantages does each have? Has anyone measured the effectiveness of each approach?

RED is basically a compromise between implementation cost and features. On one hand, we have FQ and CSFQ, which maintain per-connection queues and state. On the other, FIFO drop-tail is the lower bound in cost, but it cannot accommodate bursty traffic. By introducing randomness and acting on the average queue size rather than the instantaneous one, RED can let the queue grow occasionally (ideally during bursts).
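
To see how acting on the average lets bursts through, here's a rough numerical sketch (my numbers, not the paper's): a short spike barely moves the average, while a sustained backlog eventually pushes it past min_th:

w_q, min_th = 0.002, 5
avg = 0.0

burst = [20] * 50 + [0] * 50          # brief spike, then the queue drains
sustained = [20] * 2000               # long-lived backlog

for q in burst:
    avg = (1 - w_q) * avg + w_q * q
print("after a short burst: avg = %.2f (still below min_th)" % avg)

for q in sustained:
    avg = (1 - w_q) * avg + w_q * q
print("after sustained load: avg = %.2f (well above min_th)" % avg)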

This paper should be kept on the list. It's nice to see the comparison between RED and the other algorithms we read before. The paper is cleanly written overall, with a clear focus and in-depth details of the RED algorithm. The simulation results, again, were on artificially constructed models. By now, I would imagine simulating a real network workload is quite difficult; no paper we have read so far has done it.

The way they show the absence of global synchronization doesn't really work for me. Instead of showing the average utilization, they should show the variance of the average utilization over time.
