Methodology for estimating interdomain web traffic demand

Date

2004

Authors

Feldmann, Anja
Maennel, Olaf Manuel
Kammenhuber, Nils
Maggs, Bruce
De Prisco, Roberto
Sundaram, Ravi

Type

Conference paper

Citation

Proceedings of the 4th ACM SIGCOMM conference on Internet Measurement; Taormina, Sicily, Italy, 2004 / pp. 322-335

Statement of Responsibility

Anja Feldmann, Nils Kammenhuber, Olaf Maennel, Bruce Maggs, Roberto De Prisco, and Ravi Sundaram

Conference Name

Internet Measurement Conference (2004 : Sicily, Italy)

Abstract

This paper introduces a methodology for estimating interdomain Web traffic flows between all clients worldwide and the servers belonging to over one thousand content providers. The idea is to use the server logs from a large Content Delivery Network (CDN) to identify client downloads of content provider (i.e., publisher) Web pages. For each of these Web pages, a client typically downloads some objects from the content provider, some from the CDN, and perhaps some from third parties such as banner advertisement agencies. The sizes and sources of the non-CDN downloads associated with each CDN download are estimated separately by examining Web accesses in packet traces collected at several universities. The methodology produces a (time-varying) interdomain HTTP traffic demand matrix pairing several hundred thousand blocks of client IP addresses with over ten thousand individual Web servers. When combined with geographical databases and routing tables, the matrix can be used to provide (partial) answers to questions such as “How do Web access patterns vary by country?”, “Which autonomous systems host the most Web content?”, and “How stable are Web traffic flows over time?”.
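
The estimation step described in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that the packet traces have already been reduced to a per-publisher ratio of non-CDN bytes per CDN byte for each server; the paper's actual estimation procedure is not specified in this record and may differ. The function name `estimate_demand`, the record layout, the address blocks, and the hostnames are all hypothetical.

```python
from collections import defaultdict


def estimate_demand(cdn_log, ratios):
    """Build a (client block, server) -> estimated bytes mapping.

    cdn_log: iterable of (client_block, publisher, cdn_bytes) tuples,
             a hypothetical simplification of CDN server log entries.
    ratios:  {publisher: {server: non_cdn_bytes_per_cdn_byte}},
             assumed to have been derived from packet traces.
    """
    demand = defaultdict(float)
    for client_block, publisher, cdn_bytes in cdn_log:
        # Scale each CDN download by the trace-derived ratio to estimate
        # the bytes fetched from each non-CDN server for the same page.
        for server, ratio in ratios.get(publisher, {}).items():
            demand[(client_block, server)] += cdn_bytes * ratio
    return dict(demand)


# Illustrative inputs only (documentation address blocks, example domains).
cdn_log = [
    ("192.0.2.0/24", "news.example", 1000),
    ("192.0.2.0/24", "news.example", 500),
    ("198.51.100.0/24", "shop.example", 2000),
]
ratios = {
    "news.example": {"www.news.example": 2.0, "ads.example": 0.5},
    "shop.example": {"www.shop.example": 1.5},
}

matrix = estimate_demand(cdn_log, ratios)
for (client_block, server), est_bytes in sorted(matrix.items()):
    print(client_block, server, est_bytes)
```

Each key of `matrix` pairs a client address block with a Web server, mirroring the shape of the demand matrix the abstract describes. Computing one such matrix per time window would give a time-varying version.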

School/Discipline

School of Mathematical Sciences

Description

Copyright 2004 ACM
