World Wide Web
Full Paper
On Network-Aware Clustering of Web Clients
Balachander Krishnamurthy (AT&T Research)
Jia Wang (Cornell University)
Identifying the groups of clients that are responsible for a significant portion of a Web site's requests can be helpful to both the Web site and the clients. In a Web application, it is beneficial to move content closer to groups of clients that are responsible for large subsets of requests to an origin server. We introduce clusters: groupings of clients that are close together topologically and likely to be under common administrative control. We identify clusters using a "network-aware" method, based on information available from BGP routing table snapshots.
Experimental results show that our entirely automated approach is able to identify clusters for 99.9% of the clients in a wide variety of Web server logs. Sampled validation results show that the identified clusters meet the proposed validation tests in over 90% of the cases. An efficient self-corrective mechanism increases the applicability and accuracy of our initial approach and makes it adaptive to network dynamics. In addition to being able to detect unusual access patterns made by spiders and (suspected) proxies, our proposed method is useful for content distribution and proxy positioning, and applicable to other problems such as server replication and network management.
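
As an illustration of the general idea, the following is a minimal sketch of grouping client addresses by network prefix. The abstract states only that clusters are derived from BGP routing table snapshots; the use of longest-prefix matching here, the function name `cluster_clients`, and the sample prefixes and addresses (taken from reserved documentation ranges) are all assumptions made for this example, not details confirmed by the abstract.

```python
import ipaddress
from collections import defaultdict


def cluster_clients(client_ips, prefixes):
    """Group client IPs by the longest prefix that contains them.

    Hypothetical helper: `prefixes` stands in for the prefix entries
    one might extract from a BGP routing table snapshot.
    Clients that match no prefix are collected under the key None.
    """
    networks = [ipaddress.ip_network(p) for p in prefixes]
    clusters = defaultdict(list)
    for ip_text in client_ips:
        ip = ipaddress.ip_address(ip_text)
        # Keep only prefixes of the same IP version that contain this address.
        matches = [n for n in networks if n.version == ip.version and ip in n]
        if matches:
            # Assumption: the most specific (longest) prefix names the cluster.
            best = max(matches, key=lambda n: n.prefixlen)
            clusters[str(best)].append(ip_text)
        else:
            clusters[None].append(ip_text)
    return dict(clusters)


# Made-up prefixes and clients, for illustration only.
sample_prefixes = ["192.0.2.0/24", "198.51.100.0/24", "198.51.100.128/25"]
sample_clients = [
    "192.0.2.10",
    "192.0.2.77",
    "198.51.100.200",
    "198.51.100.5",
    "203.0.113.9",
]
result = cluster_clients(sample_clients, sample_prefixes)
```

In this sketch, `198.51.100.200` falls under the more specific `/25` prefix rather than the enclosing `/24`, and `203.0.113.9` matches no prefix and lands in the unclustered group. The linear scan over prefixes is chosen for readability; a real routing table would call for a trie or similar structure.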