Monday, September 29, 2008

MACAW: A Media Access Protocol for Wireless LAN's (UCB)

Finding the right MAC protocol for wireless is not a solved problem. The paper is interested in finding a good protocol for the wireless LAN at Xerox PARC. It has a multi-cell topology, with each cell consisting of one base station and multiple nodes (the paper calls them "pads"). CSMA isn't suitable, and the paper illustrates this with the "hidden" and "exposed" terminal phenomena. The root problem is that contention happens at the receiver, not the sender: a transmission can interfere with receivers the sender cannot hear. With CSMA, implicitly sensing the carrier at the sender is therefore not enough to prevent collisions.

MACA is the predecessor of MACAW. MACA adds two control signals (RTS and CTS) to help synchronize channel access. These let nodes announce their state to nearby nodes: "I am prepared to receive from someone", "I just started a transmission with someone", and so on, which tells other nodes when to back off. MACA assumes nodes hear each other's RTS and CTS exchanges. But what happens in a rapidly mobile environment where new nodes continuously come into the cell? What is the effect of a node missing these control signals?
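
Here's a minimal sketch of the deferral rules as I understand them (the class, names, and timing parameters are mine, not the paper's):

    # Sketch of MACA's deferral rules: a node that overhears control
    # traffic meant for others stays quiet long enough for the
    # exchange that traffic implies.

    class MacaNode:
        def __init__(self, name):
            self.name = name
            self.quiet_until = 0.0   # time before which we must not transmit

        def on_overhear(self, frame_type, now, cts_time, data_time):
            if frame_type == "RTS":
                # An RTS means a CTS may come back; defer for that long.
                self.quiet_until = max(self.quiet_until, now + cts_time)
            elif frame_type == "CTS":
                # A CTS means data will follow; defer for the whole data
                # transfer. This is what defeats the hidden-terminal case.
                self.quiet_until = max(self.quiet_until, now + data_time)

        def can_send(self, now):
            return now >= self.quiet_until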

Of course, MACA is a conservative algorithm. MACAW comes in to achieve more throughput and add things like fairness. It adds three new control signals, Data-Send (DS), ACK, and RRTS, to increase the efficiency of MACA. These signals bring even tighter synchronization between nodes and reduce potential contention, but they do it by sending more control messages. That seems expensive in wireless: the power overhead of setting up each transmission is already demanding. Also, I don't think throughput alone is the right metric; throughput per unit of power may be a better way to go.

In addition, MACAW enhances MACA's binary exponential backoff (BEB) in two ways, since the original BEB has fairness and starvation problems. First, MACAW uses a multiplicative-increase-linear-decrease (MILD) policy to adjust the backoff counter; besides having a very witty name, it's similar in spirit to AIMD. Second, it allows both the receiver and the sender to adjust the backoff counter when a transmission succeeds.
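
A quick sketch of MILD next to plain BEB (I believe the paper's increase factor is 1.5x; the bounds here are placeholders):

    # MILD vs. binary exponential backoff: MILD ramps up gently on
    # failure and creeps back down on success instead of resetting.

    BO_MIN, BO_MAX = 2, 64

    def mild_update(backoff, success):
        if success:
            return max(BO_MIN, backoff - 1)         # linear decrease
        return min(BO_MAX, int(backoff * 1.5))      # multiplicative increase

    def beb_update(backoff, success):
        if success:
            return BO_MIN                           # reset all the way down
        return min(BO_MAX, backoff * 2)             # double on collision

If I remember right, the backoff value is also copied into packet headers so that nodes in a cell converge to a shared estimate of congestion rather than each learning it separately.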

The paper is a nice short read and shows a lot of the corner cases in designing a wireless link protocol. It's definitely a very involved process. On the other hand, it would also be nice to combine this reading with something about practical issues like power consumption, environmental hazards, and other deployment challenges.


Sunday, September 28, 2008

A Comparison of Mechanisms for Improving TCP Performance over Wireless Links

Transport for wireless is different from wired, and it requires extra care to avoid a performance penalty. Wired protocols assume most packet loss and performance degradation is due to congestion, which is not quite true for wireless: the wireless medium is far more prone to random bit errors. TCP deals with packet loss by invoking congestion control, backing off by shrinking its window and timing out. For non-congestion losses this is inappropriate and decreases performance.
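
A toy model of the mismatch (not real TCP code, just the Reno-style reaction as I remember it): the sender can't tell why a segment was lost, so every loss shrinks the window.

    # Why random wireless losses hurt TCP: the sender reacts to
    # *every* loss as if it were congestion. Returns (cwnd, ssthresh).

    def on_loss(cwnd, ssthresh, timeout):
        if timeout:
            # Retransmission timeout: collapse to one segment.
            return 1, max(cwnd // 2, 2)
        # Fast retransmit (3 dup acks): halve the window.
        return max(cwnd // 2, 2), max(cwnd // 2, 2)

    # Even a ~1% random bit-error loss rate triggers this path
    # constantly, even when the bottleneck link is sitting idle.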

The paper surveys three main approaches to this problem. The first is to handle everything at the end hosts; the paper calls this end-to-end (E2E). The second is link-layer (LL): the base station retransmits locally over the wireless link, running at a faster timescale than TCP, and tries to repair non-congestion losses before TCP ever notices them. The last is a split-connection scheme that terminates the TCP connection at the base station and opens a second connection over the wireless hop. This uses the base station as a buffer and decouples the sender and receiver; loss on the receiver side won't necessarily slow down the sender, and vice versa.
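
A cartoon of the TCP-aware link-layer (snoop) idea, with the link functions stubbed out (all names here are mine):

    # The base station caches segments heading for the wireless host,
    # retransmits them locally on a duplicate ack, and hides that dup
    # ack from the fixed sender so its congestion control never fires.

    def forward_to_mobile(segment): pass     # stand-ins for the real links
    def forward_to_sender(ack): pass
    def retransmit_to_mobile(segment): pass

    class SnoopAgent:
        def __init__(self):
            self.cache = {}        # seq -> cached segment
            self.last_ack = -1

        def on_data(self, seq, segment):
            self.cache[seq] = segment
            forward_to_mobile(segment)

        def on_ack(self, ack):
            if ack > self.last_ack:            # new ack: clean cache, pass on
                self.last_ack = ack
                for seq in [s for s in self.cache if s <= ack]:
                    del self.cache[seq]
                forward_to_sender(ack)
            else:                              # dup ack: likely wireless loss
                retransmit_to_mobile(self.cache.get(ack + 1))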

Along with each approach there are enhancement techniques to improve performance. Selective acks (SACK) are one: they provide extra ack information beyond the cumulative ack. Explicit loss notification (ELN) is another, which gives information about the cause of a packet loss. Within the SACK paradigm, SMART is a special kind of SACK that uses a bitmap to track the recent ack history. SACK and ELN can be added to each of the three general schemes.
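
Roughly, the sender can reconstruct a bitmap of what got through from SMART-style acks, which carry the cumulative ack plus the sequence number of the segment that triggered the ack (a sketch with made-up numbers):

    # Rebuilding a receive bitmap from SMART-style acks.

    def update_bitmap(bitmap, cum_ack, triggering_seq):
        # Everything at or below the cumulative ack has arrived.
        for seq in range(cum_ack + 1):
            bitmap[seq] = True
        # The triggering segment arrived too, even if there's a hole
        # below it; anything unmarked in between is a loss candidate.
        bitmap[triggering_seq] = True
        return bitmap

    bitmap = update_bitmap([False] * 10, cum_ack=3, triggering_seq=7)
    holes = [s for s, ok in enumerate(bitmap) if not ok]   # -> [4, 5, 6, 8, 9]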

The paper measures the throughput of each scheme against the peak (error-free) throughput (1.5 Mbps for LAN and 1.35 Mbps for WAN). The authors find LL + SMART + TCP-AWARE to be the best against bit errors. This TCP-awareness isn't explained much in the paper; it would be nice to see what it really does. In the end, it comes down to how good these protocols are at preventing TCP from timing out and shrinking its window. There is also a question of false negatives: how well do these protocols detect real congestion? I would like to see these experiments extended to a more general setting with bit errors plus congestion.

Tuesday, September 23, 2008

Scaling Internet Routers Using Optics (Stanford '03)

The paper goes very in-depth in designing a 100Tb/s router. It argues for a multi-stage over a single-stage design, even though single-stage is simpler. Specifically, the paper proposes adding a load-balancing stage that randomizes the input packets before they enter the VOQs, and it proposes Full Ordered Frames First (FOFF) to correct the resulting mis-sequencing of packets.
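
My own toy rendering of the two-stage idea (simplified; the port count and names are made up):

    # Stage 1 sprays arriving packets across the N middle ports
    # round-robin, regardless of destination; stage 2 then switches
    # each middle port's VOQs to the real outputs.

    from collections import deque

    N = 4
    voq = [[deque() for _ in range(N)] for _ in range(N)]  # [middle][output]
    spray = 0   # round-robin pointer for the load-balancing stage

    def stage1_arrive(packet, output_port):
        global spray
        voq[spray][output_port].append(packet)   # uniform spreading
        spray = (spray + 1) % N

    # Because spreading ignores destinations, cells of one flow can
    # leave different middle ports out of order -- exactly the
    # mis-sequencing that FOFF has to repair.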

The concept of partitioning a switch into multiple stages is very novel. It's similar to pipelining in a circuit: specializing individual stages of work increases throughput. However, partitioning a switch happens at a more macro level. I wonder if there is any formal study of how to design the partitioning. What are the different ways to do it? What factors drive the division?

The paper also suggests decentralized schedulers, which increase scalability and fault tolerance, but it's not clear how this is done. The discussion of the use of optics and MEMS is also too brief for me. I would be interested in reading some follow-up papers on these topics.

I agree that the paper lacks results and evaluation. It focuses a lot on certain design details, which makes it hard to understand. This is the first paper I've read for this class where I felt lost, which happens very often in my other classes =). On the up side, I did get a chance to google many things I didn't know about router design.

Monday, September 22, 2008

A Fast Switched Backplane for a Gigabit Switched Router ('97)

Router design takes quite a bit of hardware organization. The internals of a router are divided into the forwarding decision, the backplane, and the output-link scheduler. The forwarding decision processes the packet header (looking up the next hop, decrementing the TTL, updating the checksum, etc.). Output-link scheduling handles flow or priority control to meet QoS, while the backplane does the actual transfer from the input to the outgoing port. The paper analyzes the design of the backplane.

Traditionally, the backplane was built as a bus; this is called a "shared backplane". It's known to be non-scalable because at most one port can transmit at any given time. Using a crossbar switch is much more scalable, since there can be multiple simultaneous connections between the set of inputs and the set of outputs, so packets can be transmitted in parallel. Time is divided into slots, and between slots the switch is reconfigured. This means packets need to be segmented into fixed-length cells to fit the slots. This sounds similar to the comparison between CSMA and TDMA, only at the circuit level.

A crossbar-switched backplane faces three kinds of blocking. The first is head-of-line (HOL) blocking, which comes from scheduling FIFO queues at the inputs: the packet at the head of the queue blocks the ones behind it, even when their output ports are free. Input and output blocking are caused by contention for access to the hardware itself.

The way to deal with HOL blocking is to move away from a single FIFO and use something called virtual output queueing (VOQ). Each input maintains one queue per output port and sorts incoming packets into the corresponding queue; a scheduler then chooses which VOQ at each input gets to enter the crossbar. In a sense, it adds an additional multiplexing stage. A simple scheduling algorithm called iSLIP is used to schedule these VOQs. iSLIP is a two-sided handshake (request, grant, accept), with rotating round-robin pointers arbitrating on both the output side and the input side.
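
Here's how I picture one request-grant-accept iteration over the VOQs (simplified; real iSLIP runs several iterations per slot and has careful pointer-update rules):

    # One iteration of an iSLIP-style match over VOQs.

    def pick_rr(candidates, ptr, n):
        # First candidate at or after the round-robin pointer, wrapping.
        return min(candidates, key=lambda x: (x - ptr) % n)

    def islip_iteration(voq_occupied, grant_ptr, accept_ptr, n):
        # Request: input i requests every output j whose VOQ is non-empty.
        grants = {}
        for j in range(n):
            requesters = [i for i in range(n) if voq_occupied[i][j]]
            if requesters:
                # Grant: output j grants the requester nearest its pointer.
                i = pick_rr(requesters, grant_ptr[j], n)
                grants.setdefault(i, []).append(j)
        # Accept: each input accepts the granted output nearest its
        # own pointer, yielding a conflict-free input->output match.
        return {i: pick_rr(outs, accept_ptr[i], n) for i, outs in grants.items()}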

I wish the VOQ details were illustrated more in the paper. Sometimes it's hard to picture how these things are implemented. More importantly, the paper doesn't give a sense of what the overhead and cost are, even though it seems to focus heavily on implementation.

The paper gives a rough overall picture of the trend in router design, and it gives beginners like me a good sense of how queueing is done in routers. I recommend keeping this paper, especially if you'll need to read the optics paper.

Thursday, September 18, 2008

Fundamental Design Issues for the Future Internet (Shenker '94)

The internet has always been structured to give "best-effort" service: it delivers packets in the most efficient or fair manner, from the protocol design point of view. However, the author argues that the best network design should be sensitive to application requirements, which differ from app to app. The overall performance of the network is measured by the sum of the utility functions of the apps, where each utility function takes delivery delay as its parameter (i.e. each app has a different sensitivity to delay).

The way to maximize these utility functions is to extend the existing "best-effort" service model by incorporating other kinds of services. For example, the paper mentions elastic, hard real-time (RT), delay-adaptive RT, and rate-adaptive RT services. There are potentially more that would depend on parameters other than delay.
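
Roughly, the utility-vs-delay shapes look like this (toy curves of my own, just to make the classes concrete):

    # Illustrative utility-vs-delay curves for the service classes.

    def u_elastic(delay):
        # Elastic apps (file transfer): always benefit from lower
        # delay, with diminishing returns.
        return 1.0 / (1.0 + delay)

    def u_hard_rt(delay, bound=0.1):
        # Hard real-time: full utility within the delay bound, none after.
        return 1.0 if delay <= bound else 0.0

    def u_adaptive_rt(delay, bound=0.1):
        # Delay-adaptive real-time: degrades gracefully past the bound.
        return 1.0 if delay <= bound else max(0.0, 1.0 - (delay - bound))

    # The design goal is then to maximize sum(u_i(delay_i)) over all
    # apps, not to treat every packet identically.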

There is a simple analysis suggesting how to extend the service model. It shows that running multiple homogeneous service networks side by side is less efficient than one network offering heterogeneous services. This seems counter-intuitive at first, since there should be more optimization opportunities in a homogeneous setting (more assumptions can be made). In terms of RT and non-RT scheduling, this suggests the two actually complement each other: for example, datagram service can fill up the bandwidth left over by RT traffic.

The choice between admission control and overprovisioning is a concern for future network design. On one hand, with overprovisioning alone, overload can still degrade application performance. On the other hand, admission control is nice because it can avoid overloading (congestion) altogether, at the price of turning some flows away.

I like the paper. It presents the big ideas for future network design. Although the analysis is quite simplistic, it does illustrate the design issues.

Supporting Real-Time Applications in an Integrated Services Packet Network: Architecture and Mechanism (MIT)

The paper introduces the idea of an Integrated Services Packet Network (ISPN), which allows real-time (RT) and non-RT traffic to share bandwidth. It classifies three types of services: guaranteed, predicted, and datagram. Guaranteed is real-time service with a hard delivery deadline; clients need to declare a usage rate before being admitted into the network. Predicted is more like "soft" real-time, where the delay bound is adaptive, adjusted from previously experienced delay. Datagram is relatively time-insensitive but is reserved some minimal share of bandwidth (10% is the number chosen in the paper) to ensure progress.
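
The admission check this implies is simple (a sketch; the capacity number is made up, the 10% floor is the paper's):

    # Guaranteed flows declare a rate up front; a minimum slice of
    # bandwidth stays reserved so datagram traffic keeps making progress.

    LINK_CAPACITY = 10_000_000       # say, a 10 Mb/s link
    DATAGRAM_SHARE = 0.10            # the 10% floor chosen in the paper

    admitted_rates = []

    def admit(declared_rate):
        budget = LINK_CAPACITY * (1 - DATAGRAM_SHARE)
        if sum(admitted_rates) + declared_rate <= budget:
            admitted_rates.append(declared_rate)
            return True
        return False                 # reject: would break the guarantees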

Jitter is a main problem in RT services, and many things cause it. In particular, the network induces jitter by multiplexing packets from different sources: packets get delayed by the delivery of packets from other, unrelated sources. This is the well-known side-effect of sharing. Interestingly, the author suggests that jitter can be minimized if sharing is prioritized.

Isolation is unavoidable for hard real-time services, but isolation is expensive, so providing an integrated service can be seen as a tradeoff between cost and performance. At the same time, users can easily switch between RT and non-RT service, which may allow future applications to have more dynamic and finer-grained timing behavior.

For soft RT and non-RT service, the author proposes FIFO+, which incorporates a notion of priority into FIFO to minimize the maximum delay (jitter). It orders packets by expected rather than actual arrival time, where the expected arrival time reflects how far ahead of or behind schedule a packet is.
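
A sketch of FIFO+ as I read it (the running-average weight and the data structures are my guesses):

    # Each hop tracks the mean queueing delay for the class, stamps
    # each packet with how far ahead of or behind that mean it is, and
    # downstream hops order by this "expected" arrival time.

    import heapq, itertools
    from dataclasses import dataclass

    @dataclass
    class Packet:
        offset: float = 0.0    # accumulated (delay - average) so far

    class FifoPlusQueue:
        def __init__(self):
            self.heap = []
            self.avg_delay = 0.0
            self.tie = itertools.count()   # tie-breaker for the heap

        def enqueue(self, packet, arrival_time):
            # Expected arrival = actual arrival minus accumulated excess
            # delay: an unlucky packet (offset > 0) sorts earlier and
            # gets a chance to catch up with its deadline.
            expected = arrival_time - packet.offset
            heapq.heappush(self.heap, (expected, next(self.tie), arrival_time, packet))

        def dequeue(self, now):
            _, _, arrival, packet = heapq.heappop(self.heap)
            delay = now - arrival
            self.avg_delay = 0.9 * self.avg_delay + 0.1 * delay   # running mean
            packet.offset += delay - self.avg_delay
            return packet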

Priority can be set at different granularities. The common way is to associate a priority number with each connection. In a multihop environment, the paper suggests that a packet's priority can change at each hop. This leads to more dynamic and flexible scheduling, since packets can now "accelerate" to catch up with their deadlines.

The paper is nice in that it shows a new dimension, timing, in network service. It also shows how to incorporate this new, stricter kind of service alongside the traditional ones.

Tuesday, September 16, 2008

Congestion Control for High Bandwidth-Delay Product Networks ('02)

The paper proposes the Explicit Control Protocol (XCP), which aims to address issues TCP handles poorly. For example, TCP is not very efficient at filling up the available bandwidth in high bandwidth-delay environments (its window doesn't increase fast enough), and it is biased against high-latency links because its throughput is inversely proportional to RTT.

XCP is an end-point AND gateway solution: explicit congestion control information is exchanged between the end hosts and the routers. The fairness controller (FC) and the efficiency controller (EC) are purposely separated for simplicity and modularity (separation of concerns): the EC computes an aggregate feedback for the link, which the FC then apportions among flows. There is potentially a fixed-point issue here (should the FC's allocation feed back into the EC's computation?). Also, allocation in terms of window size seems problematic: what if flows' windows are larger than they need?
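
For reference, the efficiency controller's aggregate feedback has a simple form (the control law is from the paper; the exact constants are from memory, so treat them as approximate):

    # XCP's efficiency controller, computed once per control interval.

    ALPHA, BETA = 0.4, 0.226

    def aggregate_feedback(avg_rtt, capacity, input_traffic, queue):
        spare = capacity - input_traffic          # S: spare bandwidth
        return ALPHA * avg_rtt * spare - BETA * queue

    # The fairness controller then divvies this up AIMD-style:
    # positive feedback is split equally per flow, negative feedback
    # is taken proportionally to each flow's current rate -- which is
    # where my fixed-point worry above comes in.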

The simulation results for XCP look almost unrealistically good. This is significant in that it shows how much can be gained from having more control information, and how far TCP is from the "upper bound". A big problem, however, is trust: the routers have to trust the end-points to cooperate by sending proper header info.