Friday, October 17, 2008

TAG: a Tiny AGgregation Service for Ad-Hoc Sensor Networks (UCB)

The paper describes TAG, a data aggregation service that runs on TinyOS. It argues that aggregation is so central to wireless sensor networks (WSNs) that it should be structured as a service. That centrality is also what justifies placing it inside the network in light of the end-to-end argument. TAG uses SQL-style queries as the user front end, which is nice because it leverages the aggregation research already done in the database community.

The paper has a small section on a taxonomy of aggregates, which briefly classifies some aggregate functions by their characteristics. I would like to see more on how this information is applied. Perhaps a type checker in the query compiler could use it to keep users from abusing the service. Or, more likely, a query optimizer could use the extra information to specialize a given aggregate function.
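To make the type-checker idea concrete, here is a minimal sketch of what I have in mind. It is not from the paper: the names `AggregateTraits`, `TRAITS`, and `check_query` are hypothetical, the `multipath_routing` flag is an assumed query setting, and the trait entries are my own reading of the standard definitions of these aggregates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AggregateTraits:
    duplicate_sensitive: bool  # does a duplicated reading change the answer?
    partial_state: str         # "distributive", "algebraic", or "holistic"


# Hypothetical trait table (my own entries, not copied from the paper).
TRAITS = {
    "MAX": AggregateTraits(duplicate_sensitive=False, partial_state="distributive"),
    "COUNT": AggregateTraits(duplicate_sensitive=True, partial_state="distributive"),
    "AVERAGE": AggregateTraits(duplicate_sensitive=True, partial_state="algebraic"),
    "MEDIAN": AggregateTraits(duplicate_sensitive=True, partial_state="holistic"),
}


def check_query(aggregate, multipath_routing=False):
    """Return a list of warnings for a query; an empty list means no complaints."""
    traits = TRAITS[aggregate]
    warnings = []
    if multipath_routing and traits.duplicate_sensitive:
        # A reading that arrives over two paths would be counted twice.
        warnings.append(f"{aggregate} may double-count readings under multipath routing")
    if traits.partial_state == "holistic":
        # Holistic aggregates need all the values, so nothing is saved in-network.
        warnings.append(f"{aggregate} cannot be compressed into a small partial state")
    return warnings


print(check_query("MAX", multipath_routing=True))
print(check_query("MEDIAN", multipath_routing=True))
```

A checker like this would reject or flag a query before it is ever injected into the network, which is much cheaper than discovering the problem from bad results.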

The basic mechanism of TAG is to build a hierarchical routing structure (a tree) for each user query. The tree is then used to propagate and aggregate data bottom-up, with the root being the node that interacts with the user. At each level of the tree, nodes use "epochs" to set up sampling intervals and perform the query-specific aggregation. The paper acknowledges that the sampling intervals are imprecise, affected by timing jitter and other timing problems, so the authors assume only loose timing semantics.
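The bottom-up aggregation can be sketched in a few lines. This is a toy model for a single epoch, not TAG's implementation: the tree, the readings, and the function names (`init`, `merge`, `evaluate`, `aggregate`) are all made up for illustration, and I use AVERAGE with a (sum, count) partial state as the example aggregate.

```python
def init(reading):
    """Turn one sensor reading into a partial state (sum, count)."""
    return (reading, 1)


def merge(a, b):
    """Combine two partial states into one."""
    return (a[0] + b[0], a[1] + b[1])


def evaluate(state):
    """Compute the final AVERAGE from a partial state at the root."""
    return state[0] / state[1]


def aggregate(node, children, readings):
    """Return the partial state for the subtree rooted at `node`."""
    state = init(readings[node])
    for child in children.get(node, []):
        # Each child reports a single partial state for its whole subtree.
        state = merge(state, aggregate(child, children, readings))
    return state


# Hypothetical routing tree and one epoch of readings.
children = {"root": ["a", "b"], "a": ["c", "d"]}
readings = {"root": 20.0, "a": 22.0, "b": 18.0, "c": 25.0, "d": 15.0}

print(evaluate(aggregate("root", children, readings)))  # 20.0
```

The point is that every node forwards one small record to its parent regardless of how many nodes sit below it.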

One problem with a tree structure is fault tolerance. A tree is inherently prone to single points of failure. If a node high in the hierarchy fails, much more data is lost than when a node lower down fails, because the contribution of its whole subtree disappears with it. Security is also a concern. A malicious node can pretend to be very near the root and have great influence on the aggregate result.
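To put a rough number on the fault-tolerance concern, the sketch below counts how many nodes' readings go missing when a given node fails. It assumes the simplest case, where nothing is rerouted and the failed node's entire subtree is cut off for that epoch. The tree and the names `subtree_size` and `lost_fraction` are hypothetical.

```python
def subtree_size(node, children):
    """Number of nodes in the subtree rooted at `node`, including itself."""
    return 1 + sum(subtree_size(c, children) for c in children.get(node, []))


def lost_fraction(failed, children, root="root"):
    """Fraction of the network's readings lost if `failed` stops forwarding."""
    return subtree_size(failed, children) / subtree_size(root, children)


# Hypothetical routing tree: root -> a, b; a -> c, d; c -> e
children = {"root": ["a", "b"], "a": ["c", "d"], "c": ["e"]}

print(subtree_size("a", children))   # 4 nodes lost if "a" fails
print(subtree_size("e", children))   # 1 node lost if the leaf "e" fails
print(lost_fraction("a", children))
```

In this toy tree, losing "a" removes four of the six readings, while losing the leaf "e" removes only one.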

The SQL interface is a nice idea because it gives applications more flexibility than the traditional approach of hard-coding the query logic into a single app. In-network aggregation minimizes message exchange. However, the challenge seems to be reliability. How much confidence can we place in the data extracted from the network using TAG, given its limitations on the sampling rate?
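As a back-of-the-envelope check on the message savings, the sketch below compares two idealized schemes for one epoch: forwarding every raw reading hop by hop to the root, versus having each non-root node send a single aggregated record to its parent. The tree and function names are hypothetical, and the model ignores losses, retransmissions, and the cost of building the tree.

```python
def depths(children, root):
    """Map each node to its hop distance from the root."""
    depth = {root: 0}
    stack = [root]
    while stack:
        node = stack.pop()
        for child in children.get(node, []):
            depth[child] = depth[node] + 1
            stack.append(child)
    return depth


def messages_without_aggregation(children, root):
    """Each raw reading is relayed once per hop on its way to the root."""
    return sum(depths(children, root).values())


def messages_with_aggregation(children, root):
    """Each non-root node sends exactly one partial state to its parent."""
    return len(depths(children, root)) - 1


# Hypothetical routing tree: root -> a, b; a -> c, d; c -> e
children = {"root": ["a", "b"], "a": ["c", "d"], "c": ["e"]}

print(messages_without_aggregation(children, "root"))  # 9
print(messages_with_aggregation(children, "root"))     # 5
```

The gap grows with the depth of the tree, since deep nodes pay one message per hop in the first scheme but only one message in total in the second.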

