Monday, September 8, 2008

Analysis of the Increase and Decrease Algorithms for Congestion Avoidance in Computer Networks

The paper poses a (linear) control-system view of the congestion avoidance problem. It attempts to formally explain the reasoning behind additive increase and multiplicative decrease for distributed network flow control. In fact, the paper proves that additive increase with multiplicative decrease is the only linear control that converges to both efficiency and fairness, without truncation.
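As a minimal sketch of the rule the paper analyzes, here is the AIMD update for a single user reacting to one bit of feedback. The parameter values (alpha = 1, beta = 0.5) are illustrative choices of mine, not taken from the paper:

```python
ALPHA = 1.0   # additive increase term (illustrative value)
BETA = 0.5    # multiplicative decrease factor (illustrative value)

def aimd_update(load, congested):
    """Return the user's next load given the router's binary feedback."""
    if congested:           # feedback bit = 1: back off multiplicatively
        return BETA * load
    return load + ALPHA     # feedback bit = 0: probe upward additively

w = 10.0
w = aimd_update(w, congested=True)   # 10.0 -> 5.0
w = aimd_update(w, congested=False)  # 5.0 -> 6.0
print(w)                             # 6.0
```

The asymmetry is the whole point: additive steps raise everyone's load equally, while multiplicative cuts remove more from whoever holds more, which is what drives the system toward fairness.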

The paper assumes a very simple feedback mechanism: binary signals, set by the routers, that advise each user to increase or decrease its load. It uses the concepts of efficiency and fairness to define the feasible region (the optimal state), and convergence and distributedness to measure the practicality of different increase-decrease control parameters.

The quality of the paper is superb. The authors do a very good job of illustrating the research problem (e.g. using the "knee" and the "cliff" to show the difference between congestion avoidance and congestion control), and of setting up four metrics to quantify their results: efficiency, fairness, convergence, and distributedness, each of which is modeled formally with mathematical equations. The modeling equations and their proofs are elegant and understandable.
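The fairness metric in particular is easy to state in code. This is Jain's fairness index as defined in the paper, (sum x)^2 / (n * sum x^2), which is 1 when all users get equal shares and approaches 1/n when one user takes everything:

```python
def jain_fairness(loads):
    """Jain's fairness index: 1.0 for equal shares, 1/n at worst."""
    n = len(loads)
    s = sum(loads)
    return s * s / (n * sum(x * x for x in loads))

print(jain_fairness([1, 1, 1, 1]))   # 1.0  (perfectly fair)
print(jain_fairness([4, 0, 0, 0]))   # 0.25 (one user hogs the link)
```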

There is a short discussion toward the end comparing linear and nonlinear controls. Linear control has the upper hand in simplicity and in robustness to system parameters such as the maximum number of users, but it trades off "responsiveness" against "smoothness": it is hard to both reach the optimal point quickly and avoid overshooting it. Nonlinear control, on the other hand, has more flexibility in achieving the optimal load; however, it has many more parameters to configure, which leads to less robustness.

It is interesting to note that the oscillation mentioned in the paper does not come from delay, as is common in feedback systems. It comes from the lack of central coordination: the sum of the individual changes overshoots what is needed, because the router sends the same control signal to everyone in the network. This shows the beauty of a control that is totally stateless (except that each client must remember its own request load).
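A toy simulation (my own construction, not from the paper) makes both effects visible: two users share a link of capacity C, the "router" broadcasts the same binary feedback to both, and AIMD makes the total oscillate around C while the gap between the two shares shrinks toward zero. All constants here are illustrative:

```python
C, ALPHA, BETA = 10.0, 1.0, 0.5   # link capacity and AIMD parameters (illustrative)
x1, x2 = 8.0, 1.0                 # deliberately unequal starting loads

for _ in range(50):
    congested = (x1 + x2) > C     # one shared feedback bit for everyone
    if congested:
        x1, x2 = BETA * x1, BETA * x2   # cuts the larger share more
    else:
        x1, x2 = x1 + ALPHA, x2 + ALPHA # raises both shares equally

# Additive steps preserve the gap x1 - x2; each multiplicative cut halves it,
# so the allocation converges toward fairness even though total load oscillates.
print(x1 - x2 < 0.1)
```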

Throughout, the paper takes a very simplistic view of the network (e.g. a relatively static pool of users). We all know that is not the case in practice. I would want to see what "peculiarities" arise in real network flows. What are the effects of network elements that behave differently (e.g. bursty vs. regular traffic)?
