Monday, September 29, 2008

MACAW: A Media Access Protocol for Wireless LAN's (UCB)

Finding the right MAC protocol for wireless is not a solved problem. The paper is interested in finding a good protocol for the wireless LAN at Xerox PARC. It has a multi-cell topology, with each cell consisting of one base station and multiple nodes (the paper calls them "pads"). CSMA isn't suitable, and the paper illustrates this with the "hidden" and "exposed" terminal phenomena. The root problem is that a wireless transmission interferes with nodes around the receiver rather than the sender, so with CSMA, implicit carrier sensing at the sender is not enough to prevent collisions.

MACA is the predecessor of MACAW. MACA adds two new control signals (RTS and CTS) to help synchronize channel access. These help a node communicate its state to nearby nodes, e.g. "I am prepared to receive from someone" or "I just started a transmission with someone", which tells the others when to back off. MACA assumes nodes exchange RTS and CTS to establish connectivity. But what happens in a rapidly mobile environment where new nodes continuously come into the cell? What is the effect of a node not hearing these control signals?

MACA is still a conservative algorithm. Getting more throughput and adding properties like fairness is where MACAW comes in. It adds three new control signals, Data-Send (DS), ACK, and RRTS, to increase the efficiency of MACA. These signals bring even tighter synchronization between nodes and decrease potential contention, but at the cost of sending more control messages. That seems expensive in wireless: the power overhead of setting up a transmission is already demanding. I also don't think measuring throughput alone is the right metric; throughput per unit of power may be a better way to go.

In addition, MACAW enhances MACA's binary exponential backoff (BEB) in two ways. The original BEB has fairness and starvation problems. MACAW first uses a multiplicative-increase, linear-decrease (MILD) policy to adjust the backoff counter (a rough sketch below). Besides a very witty name, it's similar in spirit to AIMD. Second, it lets both the receiver and the sender adjust the backoff counter when a transmission succeeds.
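The 1.5x increase and the decrease-by-one come from the paper; the bounds and function names below are mine, so take this as a sketch of the policy rather than the paper's exact mechanism (MACAW also copies backoff values between stations, which I'm leaving out):

    BO_MIN, BO_MAX = 2, 64  # illustrative bounds, not the paper's exact values

    def on_failed_rts(backoff):
        # Multiplicative increase: a failed RTS (no CTS heard) grows the
        # backoff interval by 1.5x, capped at the maximum.
        return min(int(backoff * 1.5), BO_MAX)

    def on_success(backoff):
        # Linear decrease: a successful exchange shrinks the backoff by one
        # step, never below the minimum.
        return max(backoff - 1, BO_MIN)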

The paper is a nice short read and shows a lot of the corner cases in designing a wireless link protocol. It's definitely a very involved process. On the other hand, it would also be nice to combine this reading with something about practicality issues like power consumption, environmental hazards, and other deployment challenges.


Sunday, September 28, 2008

A Comparison of Mechanisms for Improving TCP Performance over Wireless Links

Transport for wireless is different from wired, and it requires extra care to avoid a performance penalty. Wired transport assumes most packet loss and performance degradation is due to congestion, which is not quite true for wireless: the wireless channel is much more prone to random bit errors. TCP deals with packet loss by invoking congestion control (backing off, i.e. decreasing the window size, and timing out), which is inappropriate here and decreases performance.

The paper surveys three main approaches to this problem. The first is to do nothing special at the link; they call these end-to-end (E2E) schemes. The second is link-layer (LL), which retransmits locally (the snoop agent is the TCP-aware variant) and operates on a much shorter timescale than TCP, so it tries to repair non-congestion losses before TCP notices them. The last is a split-connection scheme, which terminates the sender's connection at the base station and uses a separate connection over the wireless hop. The base station acts as a buffer and decouples the sender and receiver; thus loss on the receiver side won't necessarily slow down the sender, and vice versa.

Along with each approach there are enhancement techniques to improve performance. Selective acknowledgments (SACK) are one, providing ACK information beyond the cumulative ACK. Explicit loss notification (ELN) is another, giving information about the cause of a packet loss. Within the SACK paradigm, SMART is a variant that uses a bitmap to record the recent ACK history. SACK and ELN can be added to each of the three general schemes.
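A toy illustration of the bitmap idea (the function and representation are mine, not the paper's): the sender tracks which packets beyond the cumulative ACK are known to have arrived and retransmits only the gaps.

    def find_holes(cum_ack, delivered):
        """delivered[i] is True if packet cum_ack + 1 + i is known to have
        arrived out of order; the remaining gaps are retransmit candidates."""
        return [cum_ack + 1 + i for i, ok in enumerate(delivered) if not ok]

    # Cumulative ACK = 10; packets 12 and 14 arrived out of order:
    print(find_holes(10, [False, True, False, True]))  # -> [11, 13]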

The paper measures the throughput of each scheme against the peak (error-free) throughput (1.5 Mbps for the LAN and 1.35 Mbps for the WAN). They find LL + SMART + TCP-AWARE to be the best against bit errors. This TCP-awareness isn't explained in the paper; it would be nice to see what it really does. In the end, it comes down to how good these protocols are at preventing TCP from timing out and shrinking its window. There is also a question of false negatives: how well do these protocols detect real congestion? I would like to see these experiments extended to a more general setting with both bit errors and congestion.

Tuesday, September 23, 2008

Scaling Internet Routers Using Optics (Stanford '03)

The paper goes very in-depth into designing a 100Tb/s router. It argues for a multi-stage over a single-stage design, even though single-stage is simpler. Specifically, it proposes adding a load-balancing stage that randomizes the input packets before they enter the VOQs, and Full Ordered Frames First (FOFF) to correct the resulting mis-sequencing of packets.

The concept of partitioning a switch into multiple stages is very novel. It's similar to pipelining in a circuit: specializing individual stages of work increases throughput. However, partitioning a switch happens at a more macro level. I wonder if there is any formal study of how to do the partitioning. What are the different ways to partition, and what factors drive the division?

The paper also suggests decentralized schedulers, which increases scalability and fault tolerance, but it's not clear how this is done. The discussion about the use of optics and MEMS is also too brief for me. I would be interested in reading some follow-up papers on these topics.

I agree that the paper lacks results and evaluation. It focuses a lot on certain design details, which makes it hard to understand. This is the first paper I've read for this class where I felt lost, which happens very often in my other classes =). On the upside, I did get a chance to google many things I didn't know about router design.

Monday, September 22, 2008

A Fast Switched Backplane for a Gigabit Switched Router ('97)

Router design takes quite a bit of hardware organization. The internals of a router are divided into forwarding decision, backplane, and output-link scheduling. The forwarding decision looks up the outgoing port and updates the packet header (TTL, checksum, next-hop MAC address, etc.). Output-link scheduling handles flow or priority control to meet QoS, while the backplane does the actual transfer from an input to its outgoing port. The paper analyzes the design of the backplane.

Traditionally, the backplane was built using a bus; this is called a "shared backplane". It's known to be non-scalable because at most one port can transmit at any given time. Using a crossbar switch is much more scalable, since there are multiple connections between the set of inputs and outputs, so packets can be transmitted simultaneously. Time is divided into slots, and between slots the switch is reconfigured, which means packets need to be segmented into fixed-length cells to fit the slots. This sounds similar to the comparison between CSMA and TDMA, only at the circuit level.

A crossbar-switched backplane faces three scenarios that may block a packet from getting through. The first is head-of-line (HOL) blocking, which comes from keeping a single FIFO queue per input: the packet at the head of the queue prevents the ones behind it from participating, even if their output ports are free. Input and output blocking are caused by contention in accessing the hardware.

The way to deal with HOL blocking is to move away from a single FIFO and use virtual output queueing (VOQ). Each input maintains a queue per output port and sorts incoming packets into the corresponding queue; a scheduler then chooses which VOQs get to enter the crossbar. In a sense, it adds an additional multiplexing stage. A simple scheduling algorithm called iSLIP is used to schedule these VOQs. iSLIP is two-sided (a handshake of requests, grants, and accepts), with rotating-priority (round-robin) arbiters on both the output and input sides; a crude sketch is below.
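Here's a single-iteration sketch of the request-grant-accept handshake as I understand it (real iSLIP runs multiple iterations per time slot and only updates the pointers on the first one):

    def islip_round(voq, grant_ptr, accept_ptr):
        """voq[i][j] is True when input i has a cell queued for output j.
        Returns the (input, output) pairs matched this time slot."""
        n = len(voq)
        # Grant: each output picks the requesting input at or after its
        # round-robin pointer.
        grants = [None] * n
        for j in range(n):
            for k in range(n):
                i = (grant_ptr[j] + k) % n
                if voq[i][j]:
                    grants[j] = i
                    break
        # Accept: each input picks the granting output at or after its
        # round-robin pointer; pointers advance only past an accepted grant.
        matches = []
        for i in range(n):
            granting = {j for j in range(n) if grants[j] == i}
            for k in range(n):
                j = (accept_ptr[i] + k) % n
                if j in granting:
                    matches.append((i, j))
                    accept_ptr[i] = (j + 1) % n
                    grant_ptr[j] = (i + 1) % n
                    break
        return matches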

I wish the VOQ material were illustrated more in the paper; sometimes it's hard to picture how these things are implemented. More importantly, the paper doesn't give a sense of what the overheads and costs are, even though it seems to focus heavily on implementation.

The paper gives a rough overall picture of the trend in router design and gives beginners like me a good sense of how queuing is done in routers. I recommend keeping this paper, especially if you'll need to read the optics paper.

Thursday, September 18, 2008

Fundamental Design Issues for the Future Internet (Shenker '94)

The internet has always been structured to give "best-effort" service: it delivers packets in the most efficient or fair manner, from the protocol design point of view. However, the author argues that the best network design should be sensitive to application requirements, which may differ across applications. The overall performance of the network is measured by the sum of the utility functions of the apps, where each utility function takes delivery delay as a parameter (i.e. each app has a different sensitivity to delay).
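Roughly, if s_i is the service (here, delay) delivered to application i and U_i is its utility function, the objective the paper works with is to maximize the total utility, something like:

    V = \sum_i U_i(s_i)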

The way to maximize these utility functions is to extend the existing "best-effort" service model by incorporating other kinds of services. For example, the paper mentioned elastic, hard RT, delay-adaptive RT, and rate-adaptive RT. There are potentially more that would even depend on parameters other than delay.

There is a simple analysis suggesting how to extend the service model. It shows that running multiple homogeneous single-service networks is less efficient than running one heterogeneous multi-service network. This seems counter-intuitive at first, since a homogeneous setting would seem to offer more optimization opportunities (more assumptions can be made). In terms of RT and non-RT scheduling, this suggests the two actually complement each other: for example, datagram service can fill up the bandwidth left over by RT traffic.

The choice between admission control and overprovisioning is a concern for future network design. On one hand, overprovisioning is costly and still can't guarantee application performance under unexpected load. On the other, admission control is nice because one can avoid overload (or congestion) altogether.

I like the paper. It presents the big ideas for future network design. Although the analysis was quite simplistic, it does illustrate the design issues.

Supporting Real-Time Applications in an Integrated Services Packet Network: Architecture and Mechanism (MIT)

The paper introduces the idea of an Integrated Services Packet Network (ISPN), which allows real-time (RT) and non-RT traffic to share bandwidth. It classifies three types of services: guaranteed, predicted, and datagram. Guaranteed is real-time service with a hard delivery deadline; clients need to declare a usage rate before being admitted into the network. Predicted is more like "soft" real-time, where the delay bound is adaptive and adjusted based on previously experienced delay. Datagram is relatively time-insensitive but is reserved some minimal share of bandwidth (10% is chosen in the paper) to ensure progress.

Jitter is a main problem for RT services, and many factors lead to it. Multiplexing packets from different sources, in particular, induces jitter: packets get delayed by deliveries belonging to other, unrelated sources. This is the commonly known side effect of sharing. Interestingly, the author suggests that jitter can be minimized if sharing is prioritized.

Isolation is necessary for hard real-time service, but isolation is expensive. Providing an integrated service can be seen as a tradeoff between cost and performance. At the same time, users can easily switch between RT and non-RT service, which may allow future applications to have more dynamic and finer-grained timing behavior.

For soft RT and non-RT service, the author proposes FIFO+, which incorporates the concept of priority into FIFO to minimize the maximum delay (jitter). It orders packets by their expected rather than actual arrival time, where the expected arrival time reflects how far ahead of or behind schedule a packet has fallen (a sketch below).
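My simplified mental model of FIFO+, with made-up names: each hop measures how much a packet's queueing delay deviates from the average for its class, accumulates that offset in the packet, and downstream hops serve packets in order of expected rather than actual arrival.

    def fifo_plus_key(arrival_time, lateness):
        # A packet that has fallen behind schedule upstream (positive
        # lateness) is treated as if it arrived earlier, so it gets served
        # sooner and can catch up.
        return arrival_time - lateness

    def update_lateness(lateness, my_delay, avg_class_delay):
        # After service at this hop, record how far the packet's delay
        # deviated from the class average.
        return lateness + (my_delay - avg_class_delay)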

Priority can be set at different granularities. The common way is to associate a priority number with each connection or link. In a multihop environment, the paper suggests that a packet's priority can change at each hop, which leads to more dynamic and flexible scheduling, since packets can now "accelerate" to catch up with their deadline.

The paper is nice in that it shows a new dimension, timing, in network service. It also shows how to incorporate this new "stricter" service with the traditional services.

Tuesday, September 16, 2008

Congestion Control for High Bandwidth-Delay Product Networks ('02)

The paper proposes the Explicit Control Protocol (XCP), which aims to address issues TCP is not good at. For example, TCP is not very efficient at filling up the available bandwidth (it doesn't increase fast enough) in high bandwidth-delay environments, and it has a bias against high-latency flows because its throughput is inversely proportional to RTT.

XCP is an end-point AND gateway solution: explicit congestion control information is exchanged between the end hosts and routers. The fairness controller (FC) and the efficiency controller (EC) are purposely separated for simplicity and modularity (separation of concerns); however, the FC apportions the aggregate feedback produced by the EC, so there is potentially a fixed-point issue here (i.e., should the EC's response affect the FC's?). Also, allocation in terms of window size seems problematic: what if the windows handed out are larger than the flows need?
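For reference, the efficiency controller's aggregate feedback (as I read the paper) is proportional to the spare bandwidth minus a penalty for standing queue. This sketch uses the paper's stability constants but leaves out the per-packet apportioning that the fairness controller does:

    ALPHA, BETA = 0.4, 0.226  # stability constants from the paper

    def aggregate_feedback(capacity, input_rate, persistent_queue, avg_rtt):
        # Total change in traffic (in bytes) the router wants over the next
        # control interval: positive hands out window, negative takes it back.
        spare = capacity - input_rate
        return ALPHA * avg_rtt * spare - BETA * persistent_queue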

The simulation results for XCP were unrealistically good. This is significant in that it shows how much we can improve by having more control information, and how far off TCP is from this "upper bound". A big problem, however, is trust: the routers have to trust the end-points to cooperate by sending proper header info.

Random Early Detection Gateways for Congestion Avoidance (LBL '93)

The paper presents another congestion avoidance algorithm for gateways, called Random Early Detection (RED). Its main goals are to minimize delay (while keeping throughput high), avoid global synchronization, and remain tolerant of bursty traffic. It's cheap, using a single FIFO queue with no per-connection state.

RED operates around the knee. It uses thresholds "min" and "max" on the average queue length to define three operating modes: mark nothing, mark randomly, mark everything. All the interesting parts are in the second mode: RED has to prepare for congestion but at the same time allow bursty traffic, which means it has to let packets onto the queue under certain conditions while ultimately keeping the average queue length as small as possible (a rough sketch below).
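A minimal sketch of the per-arrival logic as I understand it (the parameters are illustrative, and the paper also spreads marks out using a count since the last mark, which I omit):

    import random

    W_Q, MIN_TH, MAX_TH, MAX_P = 0.002, 5, 15, 0.02  # illustrative settings

    def red_on_arrival(avg, queue_len):
        """Returns the updated average queue length and what to do with the
        arriving packet: enqueue it or mark/drop it."""
        avg = (1 - W_Q) * avg + W_Q * queue_len   # EWMA of the queue length
        if avg < MIN_TH:
            return avg, "enqueue"                 # mark nothing
        if avg < MAX_TH:                          # mark randomly
            p = MAX_P * (avg - MIN_TH) / (MAX_TH - MIN_TH)
            return avg, "mark" if random.random() < p else "enqueue"
        return avg, "mark"                        # mark everything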

Random drop is better than drop-tail: it has a chance of dropping packets from connections that are "hogging" the bandwidth, whereas drop-tail is forced to drop latecomers.

RED implements both marking and dropping packets, as explicit and implicit congestion signals respectively, and treats the two as equally legitimate implementations. I wonder what the discussion over this is. What advantages or disadvantages does each have? Has anyone measured the effectiveness of each approach?

RED is basically a compromise between implementation cost and features. On one end, we have FQ and CSFQ, which maintain per-connection queues and state. On the other, FIFO drop-tail is the lower bound in cost but is not friendly to bursty traffic. By introducing randomness, RED can allow the queue to grow occasionally (preferably at times of bursty traffic).

This paper should be kept on the list. It's nice to see the comparison between RED and the other algorithms we read before. The paper is cleanly written overall, with a clear focus and in-depth details of the RED algorithm. The simulation results, again, were on artificially made models; by now, I would imagine simulating a real network workload is quite difficult, since no paper we've read so far has done it.

The way they show the absence of global synchronization isn't really working for me. Instead of showing the average utilization, they should show the variance of the utilization over time.

Wednesday, September 10, 2008

Core-Stateless Fair Queuing: Achieving Approximately Fair Bandwidth Allocations in High Speed Networks (CMU '98)

Core-Stateless Fair Queuing (CSFQ) is an approximation of FQ, designed to lessen the expense of implementing the full-blown version. CSFQ operates over an "island" of routers (the inner routers are called core routers, and the perimeter ones edge routers), where flow classification needs to be done only once, at the edges. Core routers merely use and propagate the flow information carried in packet labels. Like the other paper, CSFQ is compared with a handful of alternatives, in this case both the hypothetical upper bound and cheaper implementations that do away with per-flow state or multiple buffers.

CSFQ does not keep separate per-flow buffers; instead the edges keep a small amount of per-flow state (an estimated arrival rate) and the core computes a drop probability for each incoming packet (sketched below). In essence, it achieves similar desired effects of providing protection and being fair. It's also sensitive to the burstiness of a flow, which is a good thing (i.e. bursty flows can get good turnaround time).
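The core-router drop rule, as I understand it (alpha is the router's running estimate of the fair share, and the label is the flow's arrival rate stamped at the edge):

    def csfq_drop_prob(label_rate, fair_share):
        # Flows at or below the fair share are never dropped; faster flows
        # are thinned so their forwarded rate comes out near the fair share.
        if label_rate <= fair_share:
            return 0.0
        return 1.0 - fair_share / label_rate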

It's great that a probabilistic method minimizes the state we have to keep. Now I see more reason why packet streams end up with gaps: dropping packets randomly and regularly in the middle of a stream adds a corresponding burden on the endpoints to deal with these random "holes". We can see this as pushing complexity out of the network and into the end-points.

Although the paper addresses the cost issue with FQ, it doesn't give full details on where the bottleneck is; I would like to see more on that in terms of implementation. I am also skeptical about the core-edge separation, since heterogeneity generally impedes scalability. CSFQ also assumes that a flow goes through a fixed set of routers, which doesn't map well to dynamic routing.

The paper explicitly talks about the concept of "punishment", which seems controversial. Basically, users who are nice are rewarded with more bandwidth while the babbling idiots get their share cut back. This assumes that everyone is equal, but in reality there are stratifications: class, authority, etc. Should users with superior hardware and technology not be rewarded? Should everyone use the network the same way?

Analysis and Simulation of a Fair Queueing Algorithm (PARC '90)

It describes a conceptual queuing model for network gateways called fair queuing (FQ), which aims to provide protection between user loads at a router. It proposes separate queues to hold different user streams, transmitted in a round-robin fashion. FQ is contrasted with first-come-first-serve (FCFS) in terms of fairness, throughput, efficiency (i.e. number of dropped packets), and promptness (average RTT). The two are simulated with rather unrealistic but isolated (understandably so) scenarios.

FQ is found to be more fair to users. The bulk of the work is in defining what is "fair" and what counts as a "user"; it turns out varying these definitions gives very different implications. For example, allocating by packet count favors users sending large packets over those sending small ones. The paper defines fairness in terms of a bit-by-bit (rather than packet-by-packet) round-robin among users (a sketch of the emulation below), and a user as a source-destination conversation.
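The bit-by-bit emulation boils down to a "finish number" per packet (notation simplified from the paper): the gateway always transmits the queued packet whose last bit would finish earliest in an ideal bit-by-bit round-robin.

    def finish_number(prev_finish, round_number, packet_bits):
        # A newly arriving packet finishes after either the previous packet
        # of its conversation or the current round, whichever is later,
        # plus its own length in bits.
        return max(prev_finish, round_number) + packet_bits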

FQ rewards well-behaved users. Ill-behaved users end up overflowing their own queues, increasing their number of retransmits. This means the flow-control algorithms at the end hosts are encouraged to play nice and not create congestion, since they would only be hurting themselves. The paper shows that the Jacobson-Karels-style flow control with slow-start + AIMD out-performs other, not-as-nice flow controls. But what counts as nice enough, the paper didn't say...

One thing FQ adds is a trust boundary between gateways and end hosts: one misbehaving end host may not influence the others. It separates flow control (end hosts) from queuing (gateways), and in effect the gateways become "smarter" at resolving congestion problems on their own.

Monday, September 8, 2008

Analysis of the Increase and Decrease Algorithms for Congestion Avoidance in Computer Networks

The paper takes a (linear) control-system view of the congestion avoidance problem. It attempts to formally explain the reasoning behind additive increase and multiplicative decrease for distributed network flow control. In fact, the paper proves that additive increase with multiplicative decrease is the only linear control that converges to both efficiency and fairness, without truncation.

The paper assumes a very simple feedback mechanism, basically a binary signal tagged on by the routers that advises each user to increase or decrease its load. It uses the concepts of efficiency and fairness to define the feasible (optimal) region, and convergence and distributedness to measure the practicality of different increase-decrease control parameters (a toy simulation below).
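A toy two-user version of the feedback loop (parameters are mine): every interval both users get the same binary congestion signal, and over time the allocations converge toward each other while oscillating around full utilization.

    def aimd_step(loads, capacity, a=1.0, b=0.5):
        # Additive increase when the shared link is underloaded,
        # multiplicative decrease when the binary signal says "congested".
        if sum(loads) > capacity:
            return [x * b for x in loads]
        return [x + a for x in loads]

    loads = [1.0, 10.0]
    for _ in range(200):
        loads = aimd_step(loads, capacity=20)
    print(loads)  # the two loads end up nearly equal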

The quality of the paper is superb. The authors did a very good job in illustrating the research problem (i.e. using "cliff" and "knee" to show the difference between congestion avoidance and control), and also in setting up the 4 metrics to quantify their results. They used fairness, convergence, distributedness, and efficiency, each of which is modeled formally using mathematical equations. The modeling equations and their proofs were quite elegant and understandable.

There is a short discussion toward the end comparing linear and nonlinear controls. Linear control has the upper hand in simplicity and robustness to system parameters like the maximum number of users, but trades off "responsiveness" and "smoothness": basically, it is hard to balance reaching the optimal point quickly against not overshooting it. Non-linear control, on the other hand, has more flexibility in achieving the optimal load; however, there are many more parameters to configure, which leads to less robustness.

It is interesting to note that the oscillation mentioned in the paper does not come from delay, as is common with feedback systems, but from the lack of central coordination (the sum of individual changes overshoots what is needed). This results from the routers sending the same control signal to everyone in the network. It shows the beauty of the control being totally stateless (except that clients need to remember their own load).

Throughout, the paper has a very simplistic view of the network (e.g. a relatively static pool of users), which we all know is not the case. I would like to see what "peculiarities" arise in real network flows. What are the effects of having network elements that behave differently (bursty, regular, etc.)?

Thursday, September 4, 2008

Congestion Avoidance and Control (UCB)

This describes a series of enhancements to pre-90's TCP to deal with collapses due to congestion. The author, Van Jacobson, is big in improving TCP's performance in many ways (e.g. he also came up with TCP header compression). The techniques described are significant and are said to have saved the internet from being unusable at the time. The root cause of the congestion problem is framed as a failure to reach equilibrium (the "conservation of packets" is violated). Three techniques are proposed: slow-start, round-trip-time variance estimation, and a congestion avoidance algorithm.

Slow-start requires newcomers or restarters to begin with a small window and gradually (exponentially) increase it toward saturation. RTT variance estimation gives better tolerance for high-delay connections so that spurious retransmissions are minimized (and the network won't become even more congested); a sketch of the estimator is below. Finally, their congestion avoidance is basically additive increase and multiplicative decrease of the window size.
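A sketch of the mean/deviation estimator (gains are the commonly cited 1/8 and 1/4; the exact constants and timer granularity in the paper differ in detail):

    def update_rto(srtt, rttvar, sample, g=0.125, h=0.25):
        # Track both the smoothed RTT and its mean deviation, and set the
        # retransmit timeout from both, instead of a fixed multiple of the mean.
        err = sample - srtt
        srtt = srtt + g * err
        rttvar = rttvar + h * (abs(err) - rttvar)
        rto = srtt + 4 * rttvar
        return srtt, rttvar, rto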

It was quite clever to make the network signals implicit rather than some bit in the header: a timeout tells an end-point to back off or restart after exceeding a certain threshold, and a successful ACK triggers a window increase. All of this keeps the protocol simple and stateless. However, does that mean the window size is constantly adjusting? One major focus of the paper is reaching equilibrium, and constantly pushing for an increase and then chopping the window in half does seem a little wasteful. Toward the end, the paper has a small discussion about keeping the drop rate low, which was not totally clear to me.

The paper should be kept on the list. It shows many features we see in today's TCP and illustrates their impact (i.e. what it was like without them).

On Inferring Autonomous System Relationships in the Internet (UMass)

The paper is based on a very nifty idea: one can use BGP routing table entries to infer AS relationships based on the provider-customer-peer-sibling business model. The inference algorithms presented are heuristic-based (a toy version below). The method is applied to data from the Route Views Project (http://www.routeviews.org) and its results were verified with high accuracy (>90%) against one particular ISP, AT&T.
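A toy version of the degree-based heuristic (much simplified; the real algorithm works across many paths and also infers peer and sibling links): the highest-degree AS on a path is taken as the top provider, edges climbing toward it are customer-to-provider, and edges after it are provider-to-customer.

    def annotate_path(as_path, degree):
        """as_path is a list of AS numbers; degree maps an AS to how many
        neighbors it has in the routing tables."""
        top = max(range(len(as_path)), key=lambda i: degree[as_path[i]])
        rel = {}
        for i in range(len(as_path) - 1):
            a, b = as_path[i], as_path[i + 1]
            rel[(a, b)] = "customer->provider" if i < top else "provider->customer"
        return rel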

It first paints a nice overall picture of the structure of inter-domain routing, then goes into very in-depth details of the BGP protocol and the role of centralized efforts, such as the IRR and ARIN, in inter-network routing. The introduction and background sections were very helpful in bringing context and understanding of the different agencies in the network, although I feel the author overused formal proofs and notation quite a bit. The overall value of this paper is seeing that the way networks are connected really does conform to the underlying business model. I think the paper should stay -- but only under the condition that it is used to bring context rather than to spank students with unnecessary formalism.

Most of the research mentioned in the related work seems to be reverse engineering: discovering connectivity and bypassing the administrative/policy layer that BGP imposes. It suggests that BGP could have been designed better in the first place, to export connectivity information and support these analyses more directly.

Wednesday, September 3, 2008

Interdomain Internet Routing (MIT)

This lecture gives an overview of the BGP implementation. BGP is an FSM-based protocol that runs over TCP. It assumes independently administered entities called autonomous systems (ASes). BGP distinguishes two kinds of links (AS-to-AS and intra-AS), which run different kinds of sessions, eBGP and iBGP. It keeps track of multiple attributes (ASPATH, LOCAL_PREF, MED, etc.) to select among multiple route records; a simplified sketch of that selection is below.
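A simplified sketch of the route-selection order (the real decision process has more tie-breakers, e.g. eBGP over iBGP and IGP cost to the next hop, and MED strictly only compares routes from the same neighboring AS):

    def route_rank(route):
        # Prefer the highest LOCAL_PREF, then the shortest ASPATH,
        # then the lowest MED.
        return (-route["local_pref"], len(route["aspath"]), route["med"])

    candidates = [
        {"local_pref": 100, "aspath": [7018, 701], "med": 10},
        {"local_pref": 100, "aspath": [3356], "med": 50},
    ]
    best = min(candidates, key=route_rank)  # picks the shorter-ASPATH route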

BGP focuses on making the internet a fully decentralized system, allowing routes to be learned from multiple networks. It's designed for reachability and scalability. Its route damping (basically a delay timer) helps bring stability and protects against DoS attacks.

Since the protocol is designed for independently administered networks, making it flexible through configuration parameters seems like a natural thing to do. However, these parameters, along with the damping and the way iBGP is organized, all seem pretty hacky. The paper wasn't quite clear on defining or showing consistency for the protocol, which suggests there is probably a problem with determinacy. There is, as the author claims, still a lot of research to be done in this and other areas of BGP.

Economic factors are a big part of BGP. It is designed to allow administrative domains (ISPs) to freely decide which routes to advertise or accept. This is really the essence of BGP: it overlays reachability on top of connectivity.

Overall the paper was informative about real-world factors in protocol design but spent a little too much time, I thought, on certain implementation details. It's well-complemented by the second paper, which gives a broader context and clearer motivation for BGP.

Tuesday, September 2, 2008

THE DESIGN PHILOSOPHY OF THE DARPA INTERNET PROTOCOLS

The paper outlines and prioritizes the original design goals for the early Internet architecture. The main focus "was to develop an effective technique for multiplexed utilization of existing interconnected networks", with survivability and support for diverse services and networks high on the list. The second-level goals, in order of importance:

1. Internet communication must continue despite loss of networks or gateways.
2. The Internet must support multiple types of communications service.
3. The Internet architecture must accommodate a variety of networks.
4. The Internet architecture must permit distributed management of its resources.
5. The Internet architecture must be cost effective.
6. The Internet architecture must permit host attachment with a low level of effort.
7. The resources used in the internet architecture must be accountable.

I personally like this paper. It shows in detail how the chosen architecture relates to each of the design goals. I think the paper should stay on the reading list: it gives a lot of historical facts and perspective, and it gives me continuity in learning why some networking designs are the way they are.

Toward the end, it basically advocates for the datagram as the basic building block because it allows greater flexibility and efficiency. This is a "most-likely" conclusion that can be drawn from the end-to-end arguments (especially since the two papers came from the same author, David Clark).