arXiv:1905.08767
arXiv Year 2019 Peer-reviewed
Web Security · Privacy

The Blind Men and the Internet: Multi-Vantage Point Web Measurements

Jordan Jueckstock, Shaown Sarker, Peter Snyder, Panagiotis Papadopoulos, Matteo Varvello, Benjamin Livshits, Alexandros Kapravelos
Publication year: 2019
Venue: arXiv
Type: Preprint

Problem

In this paper, we design and deploy a synchronized multi-vantage point web measurement study to explore the comparability of web measurements across vantage points (VPs). We describe in reproducible detail the system with which we performed synchronized crawls on the Alexa top 5K domains from four distinct network VPs: research university, cloud datacenter, residential network, and Tor gateway proxy.
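The idea of a synchronized crawl can be illustrated with a small sketch. This is not the authors' system: it is a single-process stand-in, assuming one worker thread per vantage point and a hypothetical `crawl_stub` function in place of a real browser visit. In a real deployment the four VPs are separate machines on separate networks, so the barrier below would be replaced by some form of cross-host coordination.

```python
import threading
import time

# Labels taken from the text; everything else here is an illustrative assumption.
VANTAGE_POINTS = ["university", "cloud", "residential", "tor"]


def crawl_stub(vp, url):
    """Hypothetical placeholder for a real page visit from one vantage point."""
    return {"vp": vp, "url": url, "started": time.monotonic()}


def synchronized_visit(url, vps=VANTAGE_POINTS):
    """Visit the same URL from every VP worker, released at the same moment."""
    barrier = threading.Barrier(len(vps))
    results = {}
    lock = threading.Lock()

    def worker(vp):
        barrier.wait()  # block until every VP worker is ready
        record = crawl_stub(vp, url)
        with lock:
            results[vp] = record

    threads = [threading.Thread(target=worker, args=(vp,)) for vp in vps]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    records = synchronized_visit("https://example.com/")
    starts = [r["started"] for r in records.values()]
    print(sorted(records), "start spread (s):", max(starts) - min(starts))
```

Releasing all workers together keeps the visits to a given page close in time, so that differences between VPs are less likely to come from the page itself changing between visits.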

Approach

Apart from the expected poor results from Tor, we observed no shocking disparities across VPs, but we did find significant impact from the residential VP's reliability and performance disadvantages. We also found subtle but distinct indicators that some third-party content consistently avoided crawls from our cloud VP.

Results

In summary, we infer that cloud VPs do fail to observe some content of interest to security and privacy researchers, who should consider augmenting cloud VPs with alternate VPs for cross-validation. Our results also imply that the added visibility provided by residential VPs over university VPs is marginal compared to the infrastructure complexity and network fragility they introduce.
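One way to picture the cross-validation suggested here is a set comparison over per-page observations. The sketch below is an assumption-laden illustration, not the paper's analysis: the data layout, the function name `missing_from_vp`, the `min_pages` threshold, and the example hosts are all hypothetical. It reports third-party hosts that every other VP saw on a page but the target VP did not, counted across pages.

```python
from collections import Counter


def missing_from_vp(observations, target_vp, min_pages=2):
    """Find third-party hosts that all other VPs saw but target_vp missed.

    observations: {page: {vp_name: set_of_third_party_hosts}} (assumed layout).
    Returns {host: number_of_pages_where_it_was_missing} for hosts missing
    on at least min_pages pages.
    """
    counts = Counter()
    for page, by_vp in observations.items():
        if target_vp not in by_vp:
            continue  # the target VP never loaded this page; nothing to compare
        others = [hosts for vp, hosts in by_vp.items() if vp != target_vp]
        if not others:
            continue
        seen_by_all_others = set.intersection(*others)
        for host in seen_by_all_others - by_vp[target_vp]:
            counts[host] += 1
    return {host: n for host, n in counts.items() if n >= min_pages}


if __name__ == "__main__":
    # Hypothetical observations, invented purely for illustration.
    observations = {
        "news.example": {
            "university": {"ads.example", "cdn.example"},
            "cloud": {"cdn.example"},
            "residential": {"ads.example", "cdn.example"},
        },
        "shop.example": {
            "university": {"ads.example"},
            "cloud": set(),
            "residential": {"ads.example", "tracker.example"},
        },
    }
    print(missing_from_vp(observations, "cloud"))  # {'ads.example': 2}
```

Requiring agreement among all other VPs and a minimum page count is a deliberately conservative choice: it filters out hosts that are missing only because of one-off load failures rather than consistent avoidance of a particular VP.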

Cite this paper (BibTeX)
@TechReport{arxiv190508767,
  title = "{The Blind Men and the Internet: Multi-Vantage Point Web Measurements}",
  author = "Jordan Jueckstock and Shaown Sarker and Peter Snyder and Panagiotis Papadopoulos and Matteo Varvello and Benjamin Livshits and Alexandros Kapravelos",
  year = "2019",
  month = may,
  institution = "arXiv",
  number = "arXiv:1905.08767",
}