Program

This is the conference schedule for IMC 2018. Watch the archived video from the conference: Thursday, November 1 and Friday, November 2. Wednesday will be added soon.

  • 08:30AM - 09:00AM - Light Breakfast
  • 09:00AM - 09:30AM - Welcome, opening remarks
    • On the Origins of Memes by Means of Fringe Web Communities  long - distinguished paper award
      Savvas Zannettou (Cyprus University of Technology), Tristan Caulfield (University College London), Jeremy Blackburn (University of Alabama at Birmingham), Emiliano De Cristofaro (University College London), Michael Sirivianos (Cyprus University of Technology), Gianluca Stringhini (Boston University), Guillermo Suarez-Tangil (King's College London)
      Abstract: Internet memes are increasingly used to sway and possibly manipulate public opinion, thus prompting the need to study their propagation, evolution, and influence across the Web. In this paper, we detect and measure the propagation of memes across multiple Web communities, using a processing pipeline based on perceptual hashing and clustering techniques, and a dataset of 160M images from 2.6B posts gathered from Twitter, Reddit, 4chan’s Politically Incorrect board (/pol/), and Gab over the course of 13 months. We group the images posted on fringe Web communities (/pol/, Gab, and The_Donald subreddit) into clusters, annotate them using meme metadata obtained from Know Your Meme, and also map images from mainstream communities (Twitter and Reddit) to the clusters. Our analysis provides an assessment of the popularity and diversity of memes in the context of each community, showing, e.g., that racist memes are extremely common in fringe Web communities. We also find a substantial number of politics-related memes on both mainstream and fringe Web communities, supporting media reports that memes might be used to enhance or harm politicians. Finally, we use Hawkes processes to model the interplay between Web communities and quantify their reciprocal influence, finding that /pol/ substantially influences the meme ecosystem with the number of memes it produces, while The Donald has a higher success rate in pushing them to other communities.
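      The pipeline described in the abstract above rests on perceptual hashing: visually similar images hash to nearby bit strings, so clustering reduces to grouping hashes by Hamming distance. Below is a minimal, illustrative sketch of that idea in Python (using the imagehash and Pillow packages); the threshold and the greedy clustering step are assumptions for illustration, not the authors' implementation.

```python
# A minimal sketch of grouping images by perceptual hash, in the spirit of the
# pipeline described above; the threshold and the clustering step are
# illustrative assumptions, not the authors' implementation.
import glob

import imagehash                 # pip install imagehash
from PIL import Image            # pip install Pillow

HAMMING_THRESHOLD = 8            # assumed cut-off for "visually similar"

def phash_images(paths):
    """Compute a 64-bit perceptual hash for every image that can be opened."""
    hashes = {}
    for path in paths:
        try:
            hashes[path] = imagehash.phash(Image.open(path))
        except OSError:
            continue             # skip unreadable or truncated files
    return hashes

def cluster_by_hash(hashes):
    """Greedy single-pass clustering: an image joins the first cluster whose
    representative hash is within HAMMING_THRESHOLD bits."""
    clusters = []                # list of (representative_hash, member_paths)
    for path, h in hashes.items():
        for rep, members in clusters:
            if h - rep <= HAMMING_THRESHOLD:   # ImageHash subtraction = Hamming distance
                members.append(path)
                break
        else:
            clusters.append((h, [path]))
    return clusters

if __name__ == "__main__":
    imgs = phash_images(glob.glob("images/*.jpg"))
    for rep, members in cluster_by_hash(imgs):
        print(rep, len(members))
```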
    • Following Their Footsteps: Characterizing Account Automation Abuse and Defenses  long
      Louis F. DeKoven (UC San Diego), Trevor Pottinger (Facebook), Stefan Savage and Geoffrey M. Voelker (UC San Diego), Nektarios Leontiadis (Facebook)
      Abstract: Online social networks routinely attract abuse from for-profit services that offer to artificially manipulate a user's social standing. In this paper, we examine five such services in depth, each advertising the ability to inflate their customer's standing on the Instagram social network. We identify the techniques used by these services to drive social actions, and how they are structured to evade straightforward detection. We characterize the dynamics of their customer base over several months and show that they are able to attract a large clientele and generate over $1M in monthly revenue. Finally, we construct controlled experiments to disrupt these services and analyze how different approaches to intervention (i.e., transparent interventions such as blocking abusive services vs. more opaque approaches such as deferred removal of artificial actions) can drive different reactions and thus provide distinct trade-offs for defenders.
    • Needle in a Haystack: Tracking Down Elite Phishing Domains in the Wild  long
      Ke Tian, Steve T.K. Jan, Hang Hu, Danfeng Yao, and Gang Wang (Virginia Tech)
      Abstract: Today's phishing websites are constantly evolving to deceive users and evade detection. In this paper, we perform a measurement study on squatting phishing domains where the websites impersonate trusted entities not only at the page content level but also at the web domain level. To search for squatting phishing pages, we scanned five types of squatting domains over 224 million DNS records and identified 657K domains that are likely impersonating 702 popular brands. We then build a novel machine learning classifier to detect phishing pages among both the web and mobile pages under the squatting domains. A key novelty is that our classifier is built on a careful measurement of evasive behaviors of phishing pages in practice. We introduce new features from visual analysis and optical character recognition (OCR) to overcome the heavy content obfuscation from attackers. In total, we discovered and verified 1,175 squatting phishing pages. We show that these phishing pages are used for various targeted scams, and are highly effective at evading detection. More than 90% of them successfully evaded popular blacklists for at least a month.
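      One of the squatting types the paper scans for is typo-squatting, where a domain label is a near-miss of a popular brand. The sketch below illustrates that single idea with a plain edit-distance similarity check; the brand list, the 0.75 cut-off, and the label extraction are illustrative assumptions, not the authors' scanner or classifier.

```python
# Illustrative sketch: flag candidate domains that look like typo-squats of
# popular brands using edit-distance similarity. The brand list, label
# extraction, and 0.75 similarity cut-off are assumptions for illustration.
from difflib import SequenceMatcher

BRANDS = ["paypal", "facebook", "amazon", "apple"]   # example brand list

def base_label(domain):
    """Return a crude registrable label, e.g. 'paypa1' from 'paypa1-login.com'."""
    return domain.lower().split(".")[0].split("-")[0]

def likely_squat(domain, threshold=0.75):
    """Return the brand a domain appears to imitate, or None."""
    label = base_label(domain)
    for brand in BRANDS:
        ratio = SequenceMatcher(None, label, brand).ratio()
        if ratio >= threshold and label != brand:
            return brand
    return None

for d in ["paypa1.com", "faceb00k-support.net", "example.org"]:
    print(d, "->", likely_squat(d))
```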
  • 10:45AM - 11:30AM - Morning Break
    • A Large Scale Study of Data Center Network Reliability  long
      Justin Meza (CMU and Facebook, Inc.), Tianyin Xu (UIUC and Facebook, Inc.), Kaushik Veeraraghavan (Facebook, Inc.), Onur Mutlu (ETH Zürich and CMU)
      Abstract: The ability to tolerate, remediate, and recover from network incidents (caused by device failures and fiber cuts, for example) is critical for building and operating highly-available web services. Achieving fault tolerance and failure preparedness requires system architects, software developers, and site operators to have a deep understanding of network reliability at scale, along with its implications on the software systems that run in data centers. Unfortunately, little has been reported on the reliability characteristics of large scale data center network infrastructure, let alone its impact on the availability of services powered by software running on that network infrastructure. This paper fills the gap by presenting a large scale, longitudinal study of data center network reliability based on operational data collected from the production network infrastructure at Facebook, one of the largest web service providers in the world. Our study covers reliability characteristics of both intra and inter data center networks. For intra data center networks, we study seven years of operation data comprising thousands of network incidents across two different data center network designs, a cluster network design and a state-of-the-art fabric network design. For inter data center networks, we study eighteen months of recent repair tickets from the field to understand reliability of Wide Area Network (WAN) backbones. In contrast to prior work, we study the effects of network reliability on software systems, and how these reliability characteristics evolve over time. We discuss the implications of network reliability on the design, implementation, and operation of large scale data center systems and how it affects highly-available web services. We hope our study forms a foundation for understanding the reliability of large scale network infrastructure, and inspires new reliability solutions to network incidents.
    • Predictive Analysis in Network Function Virtualization  short
      Zhijing Li (UCSB), Zihui Ge, Ajay Mahimkar, and Jia Wang (AT&T Labs - Research), Ben Y. Zhao and Haitao Zheng (University of Chicago), Joanne Emmons and Laura Ogden (AT&T)
      Abstract: Recent deployments of Network Function Virtualization (NFV) architectures have gained tremendous traction. While virtualization introduces benefits such as lower costs and easier deployment of network functions, it adds additional layers that reduce transparency into faults at lower layers. To improve fault analysis and prediction for virtualized network functions (VNF), we envision a runtime predictive analysis system that runs in parallel with existing reactive monitoring systems to provide network operators with timely warnings against faulty conditions. In this paper, we propose a deep learning based approach to reliably identify anomaly events from NFV system logs, and perform an empirical study using 18 consecutive months (2016-2018) of real-world deployment data on virtualized provider edge routers. Our deep learning models, combined with customization and adaptation mechanisms, can successfully identify anomalous conditions that correlate with network trouble tickets. Analyzing these anomalies can help operators to optimize trouble ticket generation and processing rules in order to enable fast, or even proactive actions against faulty conditions.
    • Cloud Datacenter SDN Monitoring: Experiences and Challenges  short
      Arjun Roy (UC San Diego), Deepak Bansal, David Brumley, Harish Kumar Chandrappa, Parag Sharma, and Rishabh Tewari (Microsoft Corporation), Behnaz Arzani (Microsoft Research), Alex C. Snoeren (UC San Diego)
      Abstract: Cloud customers require highly reliable and performant leased datacenter infrastructure to deliver quality service for their users. It is thus critical for cloud providers to quickly detect and mitigate infrastructure faults. While much is known about managing faults that arise in the datacenter physical infrastructure (i.e., network and server equipment), comparatively little has been published regarding management of the logical overlay networks frequently employed to provide strong isolation in multi-tenant datacenters. We present a first look into the nuances of monitoring these “virtualized” networks through the lens of a large cloud provider. We describe challenges to building cloud-based fault monitoring systems, and use the output of a production system to illuminate how virtualization impacts multi-tenant datacenter fault management. We show that interactions between the virtualization, tenant software, and lower layers of the network fabric both simplify and complicate different aspects of fault detection and diagnosis efforts.
  • 12:30PM - 02:00PM - Lunch
    • An Empirical Analysis of the Commercial VPN Ecosystem  long
      Mohammad Taha Khan (UIC), Joe DeBlasio (UC San Diego), Chris Kanich (UIC), Geoffrey M. Voelker and Alex C. Snoeren (UC San Diego), Narseo Vallina-Rodriguez (IMDEA Networks Institute/ICSI)
      Abstract: Global Internet users are increasingly relying on virtual private network (VPN) services to preserve their privacy, circumvent censorship, and access geo-filtered content. However, due to the opaque nature of VPN clients and the lack of technical expertise on the part of the vast majority of users, individuals have limited means with which to verify the particular claims of privacy, security, or even geographical presence made by a given VPN service. We design an active measurement system to test various infrastructural and privacy aspects of VPN services and evaluate 62 commercial providers. Our results suggest that while commercial VPN services seem, on the whole, less likely to intercept or tamper with user traffic than other, previously studied forms of traffic proxying, almost all of them leak user traffic—perhaps inadvertently—through a variety of means. We also find that a non-trivial fraction of VPN providers transparently proxy traffic, and many misrepresent the physical location of their vantage points: 5–30% of the vantage points, representing 10% of the providers we study, appear to be hosted on servers located in countries other than those advertised to users.
    • How to Catch when Proxies Lie: Verifying the Physical Locations of Network Proxies with Active Geolocation  long
      Zachary Weinberg (Carnegie Mellon University), Shinyoung Cho (SUNY Stony Brook), Nicolas Christin and Vyas Sekar (Carnegie Mellon University), Phillipa Gill (University of Massachusetts - Amherst)
      Abstract: Internet users worldwide rely on commercial network proxies both to conceal their true location and identity, and to control their apparent location. Their reasons range from mundane to security-critical. Proxy operators offer no proof that their advertised server locations are accurate. IP-to-location databases tend to agree with the advertised locations, but there have been many reports of serious errors in such databases. In this study we estimate the locations of 2269 proxy servers from ping-time measurements to hosts in known locations, combined with AS and network information. These servers are operated by seven proxy services, and, according to the operators, spread over 222 countries and territories. Our measurements show that one-third of them are definitely not located in the advertised countries, and another third might not be. Instead, they are concentrated in countries where server hosting is cheap and reliable (e.g. Czech Republic, Germany, Netherlands, UK, USA). In the process, we address a number of technical challenges with applying active geolocation to proxy servers, which may not be directly pingable, and may restrict the types of packets that can be sent through them, e.g. forbidding `traceroute`. We also test three geolocation algorithms from previous literature, plus two variations of our own design, at the scale of the whole world.
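      The core constraint behind this kind of active geolocation is that a round-trip time to a landmark bounds how far away the target can be, since signals cannot outrun the speed of light in fibre. The sketch below checks whether a claimed location is consistent with a set of ping measurements; the 100 km/ms bound, the landmark coordinates, and the RTT values are illustrative assumptions, not the paper's calibrated model.

```python
# Minimal sketch of the core constraint behind ping-based geolocation: an RTT
# to a landmark bounds how far the target can be from it. The propagation
# bound, landmark coordinates, and RTTs are made-up example values.
from math import radians, sin, cos, asin, sqrt

KM_PER_MS_ONE_WAY = 100.0        # assumed propagation bound in fibre (~2/3 c)

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))

def consistent_with_measurements(candidate, measurements):
    """candidate = (lat, lon); measurements = [((lat, lon), rtt_ms), ...].
    The candidate is feasible only if it lies inside every landmark's disc."""
    for (lm_lat, lm_lon), rtt_ms in measurements:
        max_km = (rtt_ms / 2.0) * KM_PER_MS_ONE_WAY
        if haversine_km(candidate[0], candidate[1], lm_lat, lm_lon) > max_km:
            return False
    return True

# Example: is a proxy advertised as being in Prague consistent with these RTTs?
landmarks = [((52.52, 13.40), 8.0),    # Berlin landmark, 8 ms RTT
             ((48.85, 2.35), 22.0)]    # Paris landmark, 22 ms RTT
print(consistent_with_measurements((50.08, 14.44), landmarks))
```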
    • An Empirical Study of the I2P Anonymity Network and its Censorship Resistance  long
      Nguyen Phong Hoang (Stony Brook University), Panagiotis Kintis and Manos Antonakakis (Georgia Institute of Technology), Michalis Polychronakis (Stony Brook University)
      Abstract: Tor and I2P are well-known anonymity networks used by many individuals to protect their online privacy and anonymity. Tor's centralized directory services facilitate the understanding of the Tor network, as well as the measurement and visualization of its structure through the Tor Metrics project. In contrast, I2P does not rely on centralized directory servers, and thus obtaining a complete view of the network is challenging. In this work, we conduct an empirical study of the I2P network, in which we measure properties including population, churn rate, router type, and the geographic distribution of I2P peers. We find that there are currently around 32K active I2P peers in the network on a daily basis. Of these peers, 14K are located behind NAT or firewalls. Using the collected network data, we examine the blocking resistance of I2P against a censor that wants to prevent access to I2P using address-based blocking techniques. Despite the decentralized characteristics of I2P, we discover that a censor can block more than 95% of peer IP addresses known by a stable I2P client by operating only 10 routers in the network. This amounts to severe network impairment: a blocking rate of more than 70% is enough to cause significant latency in web browsing activities, while blocking more than 90% of peer IP addresses can make the network unusable. Finally, we discuss the security consequences of the network being blocked, and directions for potential approaches to make I2P more resistant to blocking.
    • Understanding Tor Usage with Privacy-Preserving Measurement  long
      Akshaya Mani (Georgetown University), T Wilson Brown (UNSW Canberra Cyber, University of New South Wales), Rob Jansen and Aaron Johnson (U.S. Naval Research Laboratory), Micah Sherr (Georgetown University)
      Abstract: The Tor anonymity network is difficult to measure because, if not done carefully, measurements could risk the privacy (and potentially the safety) of the network’s users. Recent work has proposed the use of differential privacy and secure aggregation techniques to safely measure Tor, and preliminary proof-of-concept prototype tools have been developed in order to demonstrate the utility of these techniques. In this work, we significantly enhance two such tools—PrivCount and Private Set-Union Cardinality—in order to support the safe exploration of new types of Tor usage behavior that have never before been measured. Using the enhanced tools, we conduct a detailed measurement study of Tor covering three major aspects of Tor usage: how many users connect to Tor and from where do they connect, with which destinations do users most frequently communicate, and how many onion services exist and how are they used. Our findings include that Tor has ∼8 million daily users, a factor of four more than previously believed. We also find that ∼40% of the sites accessed over Tor have a torproject.org domain name, ∼10% of the sites have an amazon.com domain name, and ∼80% of the sites have a domain name that is included in the Alexa top 1 million sites list. Finally, we find that ∼90% of lookups for onion addresses are invalid, and more than 90% of attempted connections to onion services fail.
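      The privacy-preserving tools mentioned above rely on adding calibrated noise to counts before anything is published. The toy sketch below shows only the Laplace-noise ingredient of such a design; it is not the PrivCount or Private Set-Union Cardinality protocol (both also use secure aggregation across multiple servers), and the epsilon and counts are made-up examples.

```python
# Toy sketch of differentially private counting: each party adds Laplace noise
# to its local count before aggregation. Illustration only; not the actual
# PrivCount protocol, which also relies on secure aggregation.
import math
import random

def laplace_noise(scale):
    """Draw Laplace(0, scale) noise via inverse-CDF sampling."""
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))

def noisy_count(true_count, sensitivity=1.0, epsilon=0.1):
    """Add Laplace(sensitivity / epsilon) noise: epsilon-DP for a counting query."""
    return true_count + laplace_noise(sensitivity / epsilon)

# Three hypothetical relays report noisy local counts; the aggregate stays close
# to the true total while hiding any individual relay's exact value.
local_counts = [1200, 340, 875]
reports = [noisy_count(c) for c in local_counts]
print(sum(reports), "vs true", sum(local_counts))
```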
  • 03:40PM - 04:10PM - Afternoon Break
    • Characterizing the Internet Host Population Using Deep Learning: A Universal and Lightweight Numerical Embedding  long
      Armin Sarabi and Mingyan Liu (University of Michigan)
      Abstract: In this paper, we present a framework to characterize Internet hosts using deep learning, using Internet scan data to produce numerical and lightweight (low-dimensional) representations of hosts. To do so we first develop a novel method for extracting binary tags from structured texts, the format of the scan data. We then use a variational autoencoder, an unsupervised neural network model, to construct low-dimensional *embeddings* of our high-dimensional binary representations. We show that these lightweight embeddings retain most of the information in our binary representations, while drastically reducing memory and computational requirements for large-scale analysis. These embeddings are also *universal*, in that the process used to generate them is unsupervised and does not rely on specific applications. This universality makes the embeddings broadly applicable to a variety of learning tasks whereby they can be used as input features. We present two such examples, (1) detecting and predicting malicious hosts, and (2) unmasking hidden host attributes, and compare the trained models in their performance, speed, robustness, and interpretability. We show that our embeddings can achieve high accuracy (>95%) for these learning tasks, while being fast enough to enable host-level analysis at scale.
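      The first step described above, turning structured scan output into binary tags, can be pictured with a small example: flatten each nested record into presence and "path=value" tags, which then form the high-dimensional binary vector fed to the autoencoder. The record and tagging rules below are illustrative assumptions, not the authors' exact scheme.

```python
# Sketch of extracting binary tags from a structured scan record (e.g. a JSON
# document from an Internet-wide scanner). The record and tagging rules are
# illustrative assumptions, not the authors' exact scheme.
def extract_tags(record, prefix=""):
    """Flatten a nested dict into tags; numeric leaves are tagged only by
    their path so the representation stays binary."""
    tags = set()
    for key, value in record.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            tags |= extract_tags(value, prefix=path + ".")
        elif isinstance(value, (bool, int, float)):
            tags.add(path)                       # presence-only tag
        else:
            tags.add(f"{path}={value}")          # categorical tag
    return tags

host = {
    "p80": {"http": {"server": "nginx", "status_code": 200}},
    "p22": {"ssh": {"version": "2.0", "banner": "OpenSSH_7.4"}},
}
print(sorted(extract_tags(host)))
```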
    • Clusters in the Expanse: Understanding and Unbiasing IPv6 Hitlists  long
      Oliver Gasser and Quirin Scheitle (Technical University of Munich), Pawel Foremski (Institute of Theoretical and Applied Informatics, Polish Academy of Sciences), Qasim Lone and Maciej Korczynski (Grenoble Alps University), Stephen D. Strowes (RIPE NCC), Luuk Hendriks (University of Twente), Georg Carle (Technical University of Munich)
      Abstract: Network measurements are an important tool in understanding the Internet. Due to the expanse of the IPv6 address space, exhaustive scans as in IPv4 are not possible for IPv6. In recent years, several studies have proposed the use of target lists of IPv6 addresses, called IPv6 hitlists. In this paper, we show that addresses in IPv6 hitlists are heavily clustered. We present novel techniques that allow IPv6 hitlists to be pushed from quantity to quality. We perform a longitudinal active measurement study over 6 months, targeting more than 50 M addresses. We develop a rigorous method to detect aliased prefixes, which identifies 1.5 % of our prefixes as aliased, pertaining to about half of our target addresses. Using entropy clustering, we group the entire hitlist into just 6 distinct addressing schemes. Furthermore, we perform client measurements by leveraging crowdsourcing. To encourage reproducibility in network measurement research and to serve as a starting point for future IPv6 studies, we publish source code, analysis tools, and data.
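      Entropy clustering, as used above, summarizes a set of addresses by how much each address position varies. A minimal sketch of that fingerprint for IPv6, computing Shannon entropy per nybble position, is shown below; the sample addresses are made up, and the actual study clusters these fingerprints rather than just printing them.

```python
# Minimal sketch of a per-nybble entropy fingerprint over IPv6 addresses, the
# kind of signal entropy clustering of hitlists builds on. Sample addresses
# are made-up examples.
import ipaddress
from collections import Counter
from math import log2

def nybbles(addr):
    """Return the 32 hex nybbles of a fully expanded IPv6 address."""
    return ipaddress.IPv6Address(addr).exploded.replace(":", "")

def nybble_entropy(addresses):
    """Shannon entropy (in bits) at each of the 32 nybble positions."""
    columns = list(zip(*(nybbles(a) for a in addresses)))
    fingerprint = []
    for col in columns:
        counts = Counter(col)
        total = len(col)
        fingerprint.append(-sum((c / total) * log2(c / total) for c in counts.values()))
    return fingerprint

prefix_addrs = ["2001:db8::1", "2001:db8::2", "2001:db8::a5c3", "2001:db8::ff"]
print([round(h, 2) for h in nybble_entropy(prefix_addrs)])
```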
    • In the IP of the Beholder: Strategies for Active IPv6 Topology Discovery  long
      Robert Beverly (Naval Postgraduate School), Ramakrishnan Durairajan (University of Oregon), David Plonka (Akamai Technologies), Justin P. Rohrer (Naval Postgraduate School)
      Abstract: Existing methods for active topology discovery within the IPv6 Internet largely mirror those of IPv4. In light of the large and sparsely populated address space, in conjunction with aggressive ICMPv6 rate limiting by routers, this work develops a different approach to Internet-wide IPv6 topology mapping. We adopt randomized probing techniques in order to distribute probing load, minimize the effects of rate limiting, and probe at higher rates. Second, we extensively analyze the efficiency and efficacy of various IPv6 hitlists and target generation methods when used for topology discovery, and synthesize new target lists based on our empirical results to provide both breadth (coverage across networks) and depth (to find potential subnetting). Employing our probing strategy, we discover more than 1.3M IPv6 router interface addresses from a single vantage point. Finally, we share our prober implementation, synthesized target lists, and discovered IPv6 topology results.
  • 06:00PM - 07:30PM - Reception and Posters
  • 08:30AM - 09:00AM - Light Breakfast
    • Impact of Device Performance on Mobile Internet QoE  short
      Mallesham Dasari, Santiago Vargas, Arani Bhattacharya, Aruna Balasubramanian, Samir R. Das, and Michael Ferdman (Stony Brook University)
      Abstract: A large fraction of users in developing regions use relatively inexpensive, low-end smartphones. However, the impact of device capabilities on the performance of mobile Internet applications has not been explored. To bridge this gap, we study the QoE of three popular applications – Web browsing, video streaming, and video telephony – for different device parameters. Our results demonstrate that the performance of Web browsing is much more sensitive to low-end hardware than that of video applications, especially video streaming. This is because the video applications exploit specialized coprocessors/accelerators and thread-level parallelism on multi-core mobile devices. Even low-end devices are equipped with needed coprocessors and multiple cores. In contrast, Web browsing is largely influenced by clock frequency, but it uses no more than two cores. This makes the performance of Web browsing more vulnerable on low-end smartphones. Based on the lessons learned from studying video applications, we explore offloading Web computation to a coprocessor. Specifically, we explore the offloading of regular expression computation to a DSP coprocessor and show an improvement of 18% in page load time while saving energy by a factor of four.
    • A First Look at SIM-Enabled Wearables in the Wild  short
      Harini Kolamunna (The University of New South Wales, Sydney, Australia), Ilias Leontiadis and Diego Perino (Telefonica R & D, Barcelona, Spain), Suranga Seneviratne and Kanchana Thilakarathna (The University of Sydney, Sydney, Australia), Aruna Seneviratne (The University of New South Wales, Sydney, Australia)
      Abstract: Recent advances are driving wearables towards stand-alone devices with cellular network support (e.g., the SIM-enabled Apple Watch Series 3). Nonetheless, little has been studied about SIM-enabled wearable traffic in ISP networks to gain customer insights and to understand traffic characteristics. In this paper, we characterize the network traffic of several thousand SIM-enabled wearable users in a large European mobile ISP. We present insights on user behavior, application characteristics such as popularity and usage, and wearable traffic patterns. We observed a 9% increase in SIM-enabled wearable users over a five month observation period. However, only 34% of such users actually generate any network transaction. Our analysis also indicates that SIM-enabled wearable users are significantly more active in terms of mobility, data consumption and frequency of app usage compared to the remaining customers of the ISP, who are mostly equipped with a smartphone. Finally, wearable apps directly communicate with third parties such as advertisement and analytics networks, similarly to smartphone apps.
    • Mobility Support in Cellular Networks: A Measurement Study on Its Configurations and Implications  long
      Haotian Deng, Chunyi Peng, Ans Fida, Jiayi Meng, and Y. Charlie Hu (Purdue)
      Abstract: In this paper, we conduct the first global-scale measurement study to unveil how 30 mobile operators manage mobility support in their carrier networks. Using a novel, device-centric tool, MMLab, we are able to crawl runtime configurations without assistance from operators. Using handoff configurations from 32,000+ cells and >18,700 handoff instances, we uncover how policy-based handoffs work in practice. We further study how the configuration parameters affect the handoff performance and user data access. Our study exhibits three main points regarding handoff configurations. 1) Operators deploy extremely complex and diverse configurations to control how handoff is performed. 2) The setting of handoff configuration values affects data performance in a rational way. 3) While giving better control granularity over handoff procedures, such diverse configurations also lead to unexpected negative compound effects on performance and efficiency. Moreover, our study of mobility support through a device-side approach gives valuable insights to network operators, mobile users and the research community.
    • Beyond Google Play: A Large-Scale Comparative Study of Chinese Android App Markets  long
      Haoyu Wang (Beijing University of Posts and Telecommunications), Zhe Liu and Jingyue Liang (Peking University), Narseo Vallina-Rodriguez (IMDEA Networks Institute and ICSI), Yao Guo (Peking University), Li Li (Monash University), Juan Tapiador (Universidad Carlos III de Madrid), Jingcun Cao (Indiana University Bloomington), Guoai Xu (Beijing University of Posts and Telecommunications)
      Abstract: China is one of the largest Android markets in the world. As Chinese users cannot access Google Play to buy and install Android apps, a number of independent app stores have emerged and compete in the Chinese app market. Some of the Chinese app stores are pre-installed vendor-specific app markets (e.g., Huawei, Xiaomi and OPPO), whereas others are maintained by large tech companies (e.g., Baidu, Qihoo 360 and Tencent). The nature of these app stores and the content available through them vary greatly, including their trustworthiness and security guarantees. As of today, the research community has not studied the Chinese Android ecosystem in depth. To fill this gap, we present the first large-scale comparative study that covers more than 6 million Android apps downloaded from 16 Chinese app markets and Google Play. We focus our study on catalog similarity across app stores, their features, publishing dynamics, and the prevalence of various forms of misbehavior (including the presence of fake, cloned and malicious apps). Our findings also suggest heterogeneous developer behavior across app stores, in terms of code maintenance, use of third-party services, and so forth. Overall, Chinese app markets perform substantially worse at taking active measures to protect mobile users and legitimate developers from deceptive and abusive actors, showing a significantly higher prevalence of malware, fake, and cloned apps than Google Play.
  • 10:20AM - 11:00AM - Morning Break
    • Characterizing the deployment and performance of multi-CDNs  short
      Rachee Singh, Arun Dunna, and Phillipa Gill (University of Massachusetts, Amherst)
      Abstract: Pushing software updates to millions of geographically diverse clients is an important technical challenge for software providers. In this paper, we characterize how content delivery networks (CDNs) are used to deliver software updates of two prominent operating systems (Windows and iOS), over a span of 3 years. We leverage a data set of DNS and ping measurements from 9,000 RIPE Atlas clients, distributed across 206 countries, to understand regional and temporal trends in the use of multiple CDNs for delivering OS updates. We contrast two competing methodologies for distributing OS updates employed by Microsoft and Apple: the majority of Microsoft clients download Windows updates from their local ISP, whereas 90% of Apple clients access iOS updates from Apple's own network. We find an approximate improvement of 70 ms in the latency observed by clients in Asia and Africa when accessing content from edge caches in local ISPs. Additionally, Microsoft provides lower latencies to its clients in developing regions by directing them to Akamai's rich network of edge caches. We also observe that clients in developing regions accessing Windows updates from Level 3 get poor latencies arising from the absence of Level 3's footprint in those regions.
    • Dissecting Apple's Meta-CDN during an iOS Update  short
      Jeremias Blendin and Fabrice Bendfeldt (Technische Universität Darmstadt), Ingmar Poese (Benocs), Boris Koldehofe (Technische Universität Darmstadt), Oliver Hohlfeld (RWTH Aachen University)
      Abstract: Content delivery networks (CDN) contribute more than 50% of today’s Internet traffic. Meta-CDNs, an evolution of centrally controlled CDNs, promise increased flexibility by multihoming content. So far, efforts to understand the characteristics of Meta-CDNs focus mainly on third-party Meta-CDN services. A common, but unexplored, use case for Meta-CDNs is to use the CDN's mapping infrastructure to form self-operated Meta-CDNs integrating third-party CDNs. These CDNs assist in the build-up phase of a CDN’s infrastructure or mitigate capacity shortages by offloading traffic. This paper investigates the Apple CDN as a prominent example of self-operated Meta-CDNs. We describe the involved CDNs, the request-mapping mechanism, and show the cache locations of the Apple CDN using measurements of more than 800 RIPE Atlas probes worldwide. We further measure its load-sharing behavior by observing a major iOS update in Sep. 2017, a significant event potentially reaching up to an estimated 1 billion iOS devices. Furthermore, by analyzing data from a European Eyeball ISP, we quantify third-party traffic offloading effects and find third-party CDNs increase their traffic by 438% while saturating seemingly unrelated links.
    • Understanding Video Management Planes  long
      Zahaib Akhtar (University of Southern California), Yun Seong Nam (Purdue University), Jessica Chen (University of Windsor), Ramesh Govindan (University of Southern California), Ethan Katz-Bassett (Columbia University), Sanjay Rao (Purdue University), Jibin Zhan and Hui Zhang (Conviva)
      Abstract: While Internet video control and data planes have received much research attention, little is known about the video management plane. In this paper, using data from more than a hundred video publishers spanning two years, we characterize the video management plane and its evolution. The management plane shows significant diversity with respect to video packaging, playback device support, and CDN use, and current trends suggest increasing diversity in some of these dimensions. This diversity adds complexity to management, and we show that the complexity of many management tasks is sub-linearly correlated with the number of hours a publisher’s content is viewed. Moreover, today each publisher runs an independent management plane, and this practice can lead to sub-optimal outcomes for syndicated content, such as redundancies in CDN storage and loss of control for content owners over delivery quality.
    • Advancing the Art of Internet Edge Outage Detection  long
      Philipp Richter (MIT / Akamai), Ramakrishna Padmanabhan and Neil Spring (University of Maryland), Arthur Berger (Akamai / MIT), David Clark (MIT)
      Abstract: Measuring reliability of edge networks in the Internet is difficult due to the size and heterogeneity of networks, the rarity of outages, and the difficulty of finding vantage points that can accurately capture such events at scale. In this paper, we use logs from a major CDN, detailing hourly request counts from address blocks. We discovered that in many edge address blocks, devices, collectively, contact the CDN every hour over weeks and months. We establish that a sudden temporary absence of these requests indicates a loss of Internet connectivity of those address blocks, events we call disruptions. We develop a disruption detection technique and present broad and detailed statistics on 1.5M disruption events over the course of a year. Our approach reveals that disruptions do not necessarily reflect actual service outages, but can be the result of prefix migrations. Major natural disasters are clearly represented in our data as expected; however, a large share of detected disruptions correlate well with planned human intervention during scheduled maintenance intervals, and are thus unlikely to be caused by external factors. Cross-evaluating our results we find that current state-of-the-art active outage detection over-estimates the occurrence of disruptions in some address blocks. Our observations of disruptions, service outages, and different causes for such events yield implications for the design of outage detection systems, as well as for policymakers seeking to establish reporting requirements for Internet services.
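      The detection idea is that an address block which has contacted the CDN every hour for a long stretch, and then suddenly goes silent for several hours, has very likely lost connectivity. A toy sketch of that rule is below; the one-week history window and three-hour gap are illustrative parameters, not the paper's exact thresholds.

```python
# Toy sketch of disruption detection from hourly request counts: flag a block
# when a long run of nonzero hours is followed by several zero hours. The
# window lengths are illustrative parameters, not the paper's exact ones.
def find_disruptions(hourly_counts, history_hours=168, min_gap_hours=3):
    """hourly_counts: list of request counts, one per hour, for one address block.
    Returns (start_hour, end_hour) index pairs of candidate disruptions."""
    disruptions = []
    i = history_hours
    while i < len(hourly_counts):
        window = hourly_counts[i - history_hours:i]
        if all(c > 0 for c in window) and hourly_counts[i] == 0:
            j = i
            while j < len(hourly_counts) and hourly_counts[j] == 0:
                j += 1
            if j - i >= min_gap_hours:
                disruptions.append((i, j))
            i = j
        else:
            i += 1
    return disruptions

# One week of steady activity followed by a five-hour outage.
counts = [12] * 168 + [0] * 5 + [11] * 20
print(find_disruptions(counts))   # -> [(168, 173)]
```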
  • 12:20PM - 12:30PM - Discussion: Proposal for a Reproducibility Track
  • 12:30PM - 02:00PM - Lunch
    • 403 Forbidden: A Global View of CDN Geoblocking  long
      Allison McDonald and Matthew Bernhard (University of Michigan), Luke Valenta (University of Pennsylvania), Benjamin VanderSloot and Will Scott (University of Michigan), Nick Sullivan (Cloudflare), J. Alex Halderman and Roya Ensafi (University of Michigan)
      Abstract: We report the first wide-scale measurement study of server-side geographic restriction, or geoblocking, a phenomenon in which server operators intentionally deny access to users from particular countries or regions. Many sites practice geoblocking due to legal requirements or other business reasons, but excessive blocking can needlessly deny valuable content and services to entire national populations. To help researchers and policymakers understand this phenomenon, we develop a semi-automated system to detect instances where whole websites were rendered inaccessible due to geoblocking. By focusing on detecting geoblocking capabilities offered by large CDNs and cloud providers, we can reliably distinguish the practice from dynamic anti-abuse mechanisms and network-based censorship. We apply our techniques to test for geoblocking across the Alexa Top 10K sites from thousands of vantage points in 177 countries. We then expand our measurement to a sample of CDN customers in the Alexa Top 1M. We find that geoblocking occurs across a broad set of countries and sites. We observe geoblocking in nearly all countries we study, with Iran, Syria, Sudan, Cuba, and Russia experiencing the highest rates. These countries experience particularly high rates of geoblocking for finance and banking sites, likely as a result of U.S. economic sanctions. We also verify our measurements with data provided by Cloudflare, and find our observations to be accurate.
    • Where The Light Gets In: Analyzing Web Censorship Mechanisms in India  long
      Tarun Kumar Yadav, Akshat Sinha, Devashish Gosain, Piyush Kumar Sharma, and Sambuddho Chakravarty (IIIT Delhi)
      Abstract: This paper presents a detailed study of Internet censorship in India. We consolidated a list of potentially blocked websites from various public sources to assess censorship mechanisms used by nine major ISPs. To begin with, we demonstrate that existing censorship detection tools like OONI are grossly inaccurate. We thus developed various techniques and heuristics to correctly assess censorship and study the underlying mechanisms involved in these ISPs. At every step we corroborated our findings manually to test the efficacy of our approach, a step largely ignored by others. We fortify our findings by adjudging the coverage and consistency of censorship infrastructure, broadly in terms of the average number of network paths and requested domains the infrastructure censors. Our results indicate a clear disparity among the ISPs in how they install censorship infrastructure. For instance, in the Idea network we observed censorious middleboxes on over 90% of our tested intra-AS paths, whereas for Vodafone it is as low as 2.5%. We conclude our research by devising our own novel anti-censorship strategies that do not depend on third-party tools (such as proxies, Tor, and VPNs). We managed to anti-censor all blocked websites in all ISPs under test.
    • BGP Communities: Even more Worms in the Routing Can  long
      Florian Streibelt and Franziska Lichtblau (Max Planck Institute for Informatics), Robert Beverly (Naval Postgraduate School), Anja Feldmann (Max Planck Institute for Informatics), Cristel Pelsser (University of Strasbourg), Georgios Smaragdakis (TU Berlin), Randy Bush (Internet Initiative Japan)
      Abstract: BGP communities are a mechanism widely used by operators to manage policy, mitigate attacks, and engineer traffic; e.g. to drop unwanted traffic, filter announcements, adjust local preference, and prepend paths to influence peer selection. Unfortunately, we show that BGP communities can be exploited by remote parties to influence routing in unintended ways. The BGP community-based vulnerabilities we expose are enabled by a combination of complex policies, error-prone configurations, a lack of cryptographic integrity and authenticity over communities, and the wide extent of community propagation. Due in part to their ill-defined semantics, BGP communities are often propagated far further than a single routing hop, even though their intended scope is typically limited to nearby ASes. Indeed, we find 14% of transit ASes forward received BGP communities onward. Given the rich inter-connectivity of transit ASes, this means that communities effectively propagate globally. As a consequence, remote adversaries can use BGP communities to trigger remote blackholing, steer traffic, and manipulate routes even without prefix hijacking. We highlight examples of these attacks via scenarios that we tested and measured both in the lab as well as in the wild. While we suggest what can be done to mitigate such ill effects, it is up to the Internet operations community whether to take up the suggestions.
    • A First Joint Look at DoS Attacks and BGP Blackholing in the Wild  short
      Mattijs Jonker and Aiko Pras (University of Twente), Alberto Dainotti (CAIDA, UC San Diego), Anna Sperotto (University of Twente)
      Abstract: BGP blackholing is an operational countermeasure that builds upon the capabilities of BGP to achieve DoS mitigation. Although empirical evidence of blackholing activities are documented in literature, a clear understanding of how blackholing is used in practice when attacks occur is still missing. This paper presents a first joint look at DoS attacks and BGP blackholing in the wild. We do this on the basis of two complementary data sets of DoS attacks, inferred from a large network telescope and DoS honeypots, and on a data set of blackholing events. All data sets span a period of three years, thus providing a longitudinal overview of operational deployment of blackholing during DoS attacks.
  • 03:30PM - 04:00PM - Afternoon Break
    • Is the Web Ready for OCSP Must Staple?  long
      Taejoong Chung and Jay Lok (Northeastern University), Balakrishnan Chandrasekaran (Max-Planck-Institut für Informatik), David Choffnes (Northeastern University), Dave Levin (University of Maryland), Bruce Maggs (Duke University and Akamai Technologies), Alan Mislove (Northeastern University), John Rula (Akamai Technologies), Nick Sullivan (Cloudflare), Christo Wilson (Northeastern University)
      Abstract: TLS, the de facto cryptographic protocol for securing communications in the Internet, relies on a hierarchy of certificates that bind names to public keys. Naturally, ensuring that the communicating parties are using only valid certificates is fundamental to benefiting from the security of TLS. To this end, most certificates and clients support OCSP, a protocol for querying a certificate’s revocation status and confirming that it is still valid. Unfortunately, however, OCSP has been criticized for its slow performance, unreliability, soft-failures, and privacy issues. To address these issues, the OCSP Must-Staple certificate extension was introduced, which requires web servers to provide OCSP responses to clients during the TLS handshake, making revocation checks free for clients. Whether all of the players in the web’s PKI are ready to support OCSP Must-Staple, however, still remains an open question. In this paper, we take a broad look at the web’s PKI and determine if all components involved—namely, certificate authorities, web server administrators, and web browsers—are ready to support OCSP Must-Staple. We find that each component does not yet fully support OCSP Must-Staple: OCSP responders are still not fully reliable, and most major web browsers and web server implementations do not fully support OCSP Must-Staple. On the bright side, only a few players need to do something to make it possible for web admins to switch. Thus, we believe a much wider deployment of OCSP Must-Staple is an achievable goal.
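      A certificate asserts OCSP Must-Staple by carrying a TLS Feature extension that includes status_request. The sketch below shows one way to check for that extension with the Python cryptography package; the certificate path is a placeholder, and this is only an illustration, not the paper's measurement toolchain.

```python
# Sketch of checking whether a certificate asserts OCSP Must-Staple, i.e.
# whether it carries a TLS Feature extension containing status_request.
# Requires the 'cryptography' package (>= 3.1); the PEM path is a placeholder.
from cryptography import x509
from cryptography.x509 import ExtensionNotFound, TLSFeature, TLSFeatureType

def has_must_staple(pem_bytes):
    cert = x509.load_pem_x509_certificate(pem_bytes)
    try:
        ext = cert.extensions.get_extension_for_class(TLSFeature)
    except ExtensionNotFound:
        return False
    return TLSFeatureType.status_request in ext.value

with open("example-cert.pem", "rb") as f:      # placeholder path
    print(has_must_staple(f.read()))
```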
    • The Rise of Certificate Transparency and Its Implications on the Internet Ecosystem  short
      Quirin Scheitle and Oliver Gasser (Technical University of Munich (TUM)), Theodor Nolte (HAW Hamburg), Johanna Amann (ICSI/Corelight/LBNL), Lexi Brent (The University of Sydney), Georg Carle (Technical University of Munich (TUM)), Ralph Holz (The University of Sydney), Thomas C. Schmidt (HAW Hamburg), Matthias Wählisch (FU Berlin)
      Abstract: In this paper, we analyze the evolution of Certificate Transparency (CT) over time and explore the implications of exposing certificate DNS names from the perspective of security and privacy. We find that certificates in CT logs have seen exponential growth. Website support for CT has also constantly increased, with now 33% of established connections supporting CT. With the increasing deployment of CT, there are also concerns of information leakage due to all certificates being visible in CT logs. To understand this threat, we introduce a CT honeypot and show that data from CT logs is being used to identify targets for scanning campaigns only minutes after certificate issuance. We present and evaluate a methodology to learn and validate new subdomains from the vast number of domains extracted from CT logged certificates.
    • Coming of Age: A Longitudinal Study of TLS Deployment  long - distinguished paper award
      Platon Kotzias (IMDEA Software Institute), Abbas Razaghpanah (Stony Brook University), Johanna Amann (ICSI/Corelight/LBNL), Kenneth G. Paterson (Royal Holloway, University of London), Narseo Vallina-Rodriguez (IMDEA Networks / International Computer Science Institute), Juan Caballero (IMDEA Software Institute)
      Abstract: The Transport Layer Security (TLS) protocol is the de-facto standard for encrypted communication on the Internet. However, it has been plagued by a number of different attacks and security issues in recent years. Addressing these attacks requires changes to the protocol, to server- or client-software, or to all of them. In this paper we conduct the first large-scale longitudinal study examining the evolution of the TLS ecosystem over the last six years. We place a special focus on the ecosystem's evolution in response to high-profile attacks. For our analysis, we use a passive measurement dataset with more than 319.3B connections since February 2012, and an active dataset that contains TLS and SSL scans of the entire IPv4 address space since August 2015. To identify the evolution of specific clients we also create what is, to our knowledge, the largest TLS client fingerprint database to date, consisting of 1,684 fingerprints. We observe that the ecosystem has shifted significantly since 2012, with major changes in which cipher suites and TLS extensions are offered by clients and accepted by servers having taken place. Where possible, we correlate these with the timing of specific attacks on TLS. At the same time, our results show that while clients, especially browsers, are quick to adopt new algorithms, they are also slow to drop support for older ones. We also encounter significant amounts of client software that probably unwittingly offer unsafe ciphers. We discuss these findings in the context of long tail effects in the TLS ecosystem.
  • 06:00PM - 10:00PM - Banquet
  • 08:30AM - 09:00AM - Light Breakfast
    • Tracing Cross Border Web Tracking  long - distinguished paper award
      Costas Iordanou (Universidad Carlos III de Madrid / Technical University (TU) Berlin), Georgios Smaragdakis (TU Berlin), Ingmar Poese (BENOCS), Nikolaos Laoutaris (Data Transparency Lab & Eurecat)
      Abstract: A tracking flow is a flow between an end user and a Web tracking service. We develop an extensive measurement methodology for quantifying at scale the amount of tracking flows that cross data protection borders, be it national ones, or international, such as the EU28 border within which the General Data Protection Regulation (GDPR) applies. Our methodology uses a browser extension to fully render advertising and tracking code, various lists and heuristics to extract well known trackers, reverse DNS to get all the IP ranges of trackers, and state-of-the-art geolocation. We employ our methodology on a dataset from 350 real users of the browser extension over a period of more than four months, and then generalize our results by analyzing billions of Web tracking flows from more than 60 million broadband and mobile users from 4 large European ISPs. We show that the majority of tracking flows cross national borders in Europe but, contrary to popular belief, are pretty well confined within the larger GDPR jurisdiction. Simple DNS redirection and PoP mirroring can increase national confinement while sealing almost all tracking flows within Europe. Last, and regrettably, we show that cross-border tracking is prevalent even in sensitive and hence protected data categories and groups including health, sexual orientation, minors, and others.
    • How Tracking Companies Circumvented Ad Blockers Using WebSockets  short
      Muhammad Ahmad Bashir, Sajjad Arshad, Engin Kirda, William Robertson, and Christo Wilson (Northeastern University)
      Abstract: In this study of 100,000 websites, we document how Advertising and Analytics (A&A) companies have used WebSockets to bypass ad blocking, exfiltrate user tracking data, and deliver advertisements. Specifically, our measurements investigate how a long-standing bug in Chrome’s (the world’s most popular browser) chrome.webRequest API prevented blocking extensions from being able to interpose on WebSocket connections. We conducted large-scale crawls of top publishers before and after this bug was patched in April 2017 to examine which A&A companies were using WebSockets, what information was being transferred, and whether companies altered their behavior after the patch. We find that a small but persistent group of A&A companies use WebSockets, and that several of them engaged in troubling behavior, such as browser fingerprinting, exfiltrating the DOM, and serving advertisements, that would have circumvented blocking due to the Chrome bug.
    • Who Knocks at the IPv6 Door? Detecting IPv6 Scanning  short
      Kensuke Fukuda (NII/Sokendai), John Heidemann (USC/ISI)
      Abstract: DNS backscatter detects Internet-wide activity by looking for common reverse DNS lookups at authoritative DNS servers that sit high in the DNS hierarchy. Both DNS backscatter and monitoring of address space (network telescopes or darknets) can detect scanning in IPv4, but with IPv6's vastly larger address space, network telescopes become much less effective. This paper shows how to generalize DNS backscatter to IPv6. IPv6 requires new classification rules, but these reveal large network services, from cloud providers and CDNs to specific services such as NTP and mail. Backscatter also identifies router interfaces, suggesting traceroute-based topology studies. We identify 16 scanners per week from DNS backscatter, as confirmed from backbone traffic observations and blacklists, and after elimination of benign services, classify another 95 originators in backscatter as potential scanners. Our work also confirms that IPv6 appears to be less carefully monitored than IPv4.
    • A Long Way to the Top: Significance, Structure, and Stability of Internet Top Lists  long - community contribution award
      Quirin Scheitle (Technical University of Munich (TUM)), Oliver Hohlfeld (RWTH Aachen University), Julien Gamba (IMDEA, Universidad Carlos III de Madrid), Jonas Jelten (Technical University of Munich (TUM)), Torsten Zimmermann (RWTH Aachen University), Stephen D. Strowes (RIPE NCC), Narseo Vallina-Rodriguez (IMDEA Networks Institute / ICSI)
      Abstract: A broad range of research areas including Internet measurement, privacy, and network security rely on lists of target domains to be analysed; researchers make use of target lists for reasons of necessity or efficiency. The popular Alexa list of one million domains is a widely used example. Despite their prevalence in research papers, the soundness of top lists has seldom been questioned by the community: little is known about the lists’ creation, representativity, potential biases, stability, or overlap between lists. In this study we survey the extent, nature, and evolution of top lists used by research communities. We assess the structure and stability of these lists, and show that rank manipulation is possible for some lists. We also reproduce the results of several scientific studies to assess the impact of using a top list at all, which list specifically, and the date of list creation. We find that (i) top lists generally overestimate results compared to the general population by a significant margin, often even an order of magnitude, and (ii) some top lists have surprising change characteristics, causing high day-to-day fluctuation and leading to result instability. We conclude our paper with specific recommendations on the use of top lists, and how to interpret results based on top lists with caution.
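      One simple way to quantify the day-to-day stability discussed above is the Jaccard overlap between consecutive daily snapshots of a list. The sketch below computes it; the file names and the "rank,domain" CSV layout are assumptions about the snapshot format, not part of the paper.

```python
# Small sketch of a stability measure for top lists: the Jaccard overlap
# between two daily snapshots. File names and the "rank,domain" CSV layout
# are assumptions about the snapshot format.
import csv

def load_domains(path):
    """Read a 'rank,domain' CSV and return the set of domains."""
    with open(path, newline="") as f:
        return {row[1] for row in csv.reader(f) if len(row) >= 2}

def jaccard(a, b):
    return len(a & b) / len(a | b) if (a or b) else 1.0

day1 = load_domains("top-1m-2018-10-01.csv")   # placeholder snapshot files
day2 = load_domains("top-1m-2018-10-02.csv")
print(f"day-to-day Jaccard overlap: {jaccard(day1, day2):.3f}")
```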
  • 10:20AM - 11:00AM - Morning Break
    • Pushing the Boundaries with bdrmapIT: Mapping Router Ownership at Internet Scale  long
      Alexander Marder (University of Pennsylvania), Matthew Luckie (University of Waikato), Amogh Dhamdhere, Bradley Huffaker, and kc claffy (CAIDA / UC San Diego), Jonathan M. Smith (University of Pennsylvania)
      Abstract: Two complementary approaches to mapping network boundaries from traceroute paths recently emerged [27,31]. Both approaches apply heuristics to inform inferences extracted from traceroute measurement campaigns. bdrmap [27] used targeted traceroutes from a specific network, alias resolution probing techniques, and AS relationship inferences, to infer the boundaries of that specific network and the other networks attached at each boundary. MAP-IT [31] tackled the ambitious challenge of inferring all AS-level network boundaries in a massive archived collection of traceroutes launched from many different networks. Both were substantial contributions to the state-of-the-art, and inspired a collaboration to explore the potential to combine the approaches. We present and evaluate bdrmapIT, the result of that exploration, which yielded a more complete, accurate, and general solution to this persistent and central challenge of Internet topology research. bdrmapIT achieves 91.8%-98.8% accuracy when mapping AS boundaries in two Internet-wide traceroute datasets, vastly improving on MAP-IT's coverage without sacrificing bdrmap's ability to map a single network. The bdrmapIT source code is available at https://git.io/fAsI0.
    • Multilevel MDA-Lite Paris Traceroute  long
      Kevin Vermeulen (Sorbonne Université), Stephen D. Strowes (RIPE NCC), Olivier Fourmaux and Timur Friedman (Sorbonne Université)
      Abstract: Since its introduction in 2006-2007, Paris Traceroute and its Multipath Detection Algorithm (MDA) have been used to conduct well over a billion IP level multipath route traces from platforms such as M-Lab. Unfortunately, the MDA requires a large number of packets in order to trace an entire topology of load balanced paths between a source and a destination, which makes it undesirable for platforms that otherwise deploy Paris Traceroute, such as RIPE Atlas. In this paper we present a major update to the Paris Traceroute tool. Our contributions are: (1) MDA-Lite, an alternative to the MDA that significantly cuts overhead while maintaining a low failure probability; (2) Fakeroute, a simulator that enables validation of a multipath route tracing tool’s adherence to its claimed failure probability bounds; (3) multilevel multipath route tracing, with, for the first time, a Traceroute tool that provides a router-level view of multipath routes; and (4) surveys at both the IP and router levels of multipath routing in the Internet, showing, among other things, that load balancing topologies have increased in size well beyond what has been previously reported as recently as 2016. The data and the software underlying these results are publicly available.
    • O Peer, Where Art Thou? Uncovering Remote Peering Interconnections at IXPs  long
      George Nomikos, Vasileios Kotronis, and Pavlos Sermpezis (FORTH, Greece), Petros Gigis (FORTH & University of Crete, Greece), Lefteris Manassakis (FORTH, Greece), Christoph Dietzel (TU Berlin/DE-CIX), Stavros Konstantaras (AMS-IX, Netherlands), Xenofontas Dimitropoulos (FORTH & University of Crete, Greece), Vasileios Giotsas (Lancaster University, England)
      Abstract: Internet eXchange Points (IXPs) are Internet hubs that mainly provide the switching infrastructure to interconnect networks and exchange traffic. While the initial goal of IXPs was to bring together networks residing in the same city or country, and thus keep local traffic local, this model is gradually shifting. Many networks connect to IXPs without having physical presence at their switching infrastructure. This practice, called Remote Peering, is changing the Internet topology and economy, and has become the subject of a contentious debate within the network operators' community. However, despite the increasing attention it attracts, the understanding of the characteristics and impact of remote peering is limited. In this work, we introduce and validate a heuristic methodology for discovering remote peers at IXPs. We (i) identify critical remote peering inference challenges, (ii) infer remote peers with high accuracy (>95%) and coverage (93%) per IXP, and (iii) characterize different aspects of the remote peering ecosystem by applying our methodology to 30 large IXPs. We observe that remote peering is a significantly common practice in all the studied IXPs; for the largest IXPs, remote peers account for 40% of their member base. We also show that today, IXP growth is mainly driven by remote peering, which contributes two times more than local peering.
    • Three Bits Suffice: Explicit Support for Passive Measurement of Internet Latency in QUIC and TCP  short
      Piet De Vaere, Tobias Bühler, Mirja Kühlewind, and Brian Trammell (ETH Zurich)
      Abstract: Passive measurement is a commonly used approach for measuring round trip time (RTT), as it reduces bandwidth overhead compared to large-scale active measurements. However, passive RTT measurement is limited to transport-specific approaches, such as those that utilize Transmission Control Protocol (TCP) timestamps. Furthermore, the continuing deployment of encrypted transport protocols such as QUIC hides the information used for passive RTT measurement from the network. In this work, we introduce the latency spin signal as a light-weight, transport-independent and explicit replacement for TCP timestamps for passive latency measurement. This signal supports per-flow, single-point and single direction passive measurement of end-to-end RTT using just three bits in the transport protocol header, leveraging the existing dynamics of the vast majority of Internet-deployed transports. We show how the signal applies to measurement of both TCP and QUIC through implementation of the signal in endpoint transport stacks. We also provide a high-performance measurement implementation for the signal using the Vector Packet Processing (VPP) framework. Evaluations on emulated networks and in an Internet testbed demonstrate the viability of the signal, and show that it is resistant to even large amounts of loss or reordering on the measured path.
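      The spin signal works because a single bit, echoed end to end, flips its value once per round trip; an on-path observer watching one direction can therefore estimate RTT from the time between flips. The sketch below illustrates that observer logic on a made-up packet trace; the full signal described in the paper uses three bits and additional filtering to handle loss and reordering.

```python
# Toy sketch of spin-bit observation: the time between spin-bit flips seen in
# one direction of a flow estimates the end-to-end RTT. The packet trace is a
# made-up example (timestamp in seconds, spin bit value).
def rtt_samples_from_spin(packets):
    """packets: iterable of (timestamp, spin_bit) observed in one direction.
    Returns a list of RTT estimates in seconds, one per spin-bit transition."""
    samples = []
    last_flip_time = None
    prev_spin = None
    for ts, spin in packets:
        if prev_spin is not None and spin != prev_spin:
            if last_flip_time is not None:
                samples.append(ts - last_flip_time)
            last_flip_time = ts
        prev_spin = spin
    return samples

trace = [(0.000, 0), (0.010, 0), (0.032, 1), (0.040, 1),
         (0.065, 0), (0.070, 0), (0.097, 1)]
print(rtt_samples_from_spin(trace))   # -> roughly [0.033, 0.032]
```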
  • 12:30PM - 02:00PM - Lunch
    • Analyzing Ethereum's Contract Topology  short
      Lucianna Kiffer (Northeastern University), Dave Levin (University of Maryland), Alan Mislove (Northeastern University)
      Abstract: Ethereum is the second most valuable cryptocurrency today, with a current market cap of over $68B. What sets Ethereum apart from other cryptocurrencies is that it uses its blockchain to store not only a record of transactions, but also smart contracts and a history of calls made to those contracts. Thus, Ethereum represents a new form of distributed system: one where users can implement contracts that can provide functionality such as voting protocols, crowdfunding projects, betting agreements, and many more. However, despite the massive investment, little is known about how contracts in Ethereum are actually created and used. In this paper, we examine how contracts in Ethereum are created, and how users and contracts interact with one another. We modify the geth client to log all such interactions, and find that contracts today are three times more likely to be created by other contracts than they are by users, and that over 60% of contracts have never been interacted with. Additionally, we obtain the bytecode of all contracts and look for similarity; we find that less than 10% of user-created contracts are unique, and less than 1% of contract-created contracts are unique. Clustering the contracts based on code similarity reveals even further similarity. These results indicate that there is substantial code re-use in Ethereum, suggesting that bugs in such contracts could have widespread impact on the Ethereum user population.
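      The uniqueness and clustering results rest on comparing contract bytecode. A back-of-the-envelope version of that comparison is sketched below: exact duplicates can be counted by hashing the bytecode, and a crude notion of near-similarity can be obtained from Jaccard similarity over bytecode shingles. This is an illustrative stand-in, not the clustering method used in the paper, and the `contracts` input (address mapped to hex bytecode) is an assumed data structure.
```python
import hashlib
from typing import Dict, Set

def unique_fraction(contracts: Dict[str, str]) -> float:
    """Fraction of contracts whose bytecode is not byte-for-byte shared
    with any other contract (input maps address -> hex bytecode)."""
    digests = [hashlib.sha256(code.encode()).hexdigest() for code in contracts.values()]
    counts: Dict[str, int] = {}
    for d in digests:
        counts[d] = counts.get(d, 0) + 1
    return sum(1 for d in digests if counts[d] == 1) / len(digests)

def shingles(code: str, n: int = 8) -> Set[str]:
    """Set of length-n substrings of the hex bytecode, a crude code shingle."""
    return {code[i:i + n] for i in range(len(code) - n + 1)}

def jaccard(a: str, b: str) -> float:
    """Jaccard similarity of two contracts' bytecode shingles (1.0 = identical sets)."""
    sa, sb = shingles(a), shingles(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 1.0
```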
    • Measuring Ethereum Network Peers  long
      Seoung Kyun Kim, Zane Ma, Siddharth Murali, Joshua Mason, Andrew Miller, and Michael Bailey (University of Illinois at Urbana-Champaign)
      Abstract: Ethereum, the second-largest cryptocurrency, valued at a peak of $138 billion in 2018, is a decentralized, Turing-complete computing platform. Although the stability and security of Ethereum---and blockchain systems in general---have been widely studied, most analysis has focused on application-level features of these systems such as cryptographic mining challenges, smart contract semantics, or block mining operators. Little attention has been paid to the underlying peer-to-peer (P2P) networks that are responsible for information propagation and that enable blockchain consensus. In this work, we develop NodeFinder to measure this previously opaque network at scale and illuminate the properties of its nodes. We analyze the Ethereum network from two vantage points: a three-month-long view of nodes on the P2P network, and a single-day snapshot of the Ethereum Mainnet peers. We uncover a noisy DEVp2p ecosystem in which fewer than half of all nodes contribute to the Ethereum Mainnet. Through a comparison with other previously studied P2P networks including BitTorrent, Gnutella, and Bitcoin, we find that Ethereum differs in both network size and geographical distribution.
    • Digging into Browser-based Crypto Mining  short
      Jan Rüth, Torsten Zimmermann, Konrad Wolsing, and Oliver Hohlfeld (RWTH Aachen University)
      Abstract: Mining is the foundation of blockchain-based cryptocurrencies such as Bitcoin, rewarding the miner for finding blocks of new transactions. The Monero currency, in contrast to Bitcoin with its specialized hardware (ASICs), enables mining with standard hardware, paving the way for in-browser mining as a new revenue model for website operators. In this work, we study the prevalence of this new phenomenon. We identify and classify mining websites in 138M domains and present a new fingerprinting method which finds up to a factor of 5.7 more miners than publicly available block lists. Our work identifies and dissects Coinhive as the major browser-mining stakeholder. Further, we present a new method to associate mined blocks in the Monero blockchain to mining pools and uncover that Coinhive currently contributes 1.18% of mined blocks, having turned over 1,293 Moneros in June 2018.
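      The gap between block-list coverage and fingerprinting is easy to appreciate with a toy detector: matching page content against a handful of known miner signatures (which is essentially what block lists do) only catches services whose script names or APIs are already listed, whereas the paper fingerprints the mining code itself. The sketch below is a block-list-style string matcher for illustration only; the signature list is a small assumed sample and this is not the paper's detection method.
```python
from typing import List

# A few publicly known in-browser mining indicators; deliberately incomplete,
# which is exactly why signature-based lists undercount miners.
SIGNATURES = (
    "coinhive.min.js",
    "CoinHive.Anonymous",
    "coin-hive.com",
    "cryptonight.wasm",
)

def mining_signatures_found(html: str) -> List[str]:
    """Return the known miner signatures that appear in a page's HTML/JS source."""
    lowered = html.lower()
    return [sig for sig in SIGNATURES if sig.lower() in lowered]

if __name__ == "__main__":
    page = '<script src="https://coinhive.com/lib/coinhive.min.js"></script>'
    print(mining_signatures_found(page))  # ['coinhive.min.js']
```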
  • 03:00PM - 03:30PM - Afternoon Break
    • When the Dike Breaks: Dissecting DNS Defenses During DDoS  long
      Giovane C. M. Moura (SIDN Labs/TU Delft), John Heidemann (USC/Information Sciences Institute), Moritz Müller (SIDN Labs/University of Twente), Ricardo de O. Schmidt (University of Passo Fundo), Marco Davids (SIDN Labs)
      Abstract: The Internet's Domain Name System (DNS) is a frequent target of Distributed Denial-of-Service (DDoS) attacks, but such attacks have had very different outcomes---some attacks have disabled major public websites, while the external effects of other attacks have been minimal. While on one hand the DNS protocol is relatively simple, the _system_ has many moving parts, with multiple levels of caching and retries and replicated servers. This paper uses controlled experiments to examine how these mechanisms affect DNS resilience and latency, exploring both the client side's DNS _user experience_, and server-side traffic. We find that, for about 30% of clients, caching is not effective. However, when caches are full, they allow about half of the clients to ride out server outages that last less than the cache lifetime; caching and retries together allow up to half of the clients to tolerate DDoS attacks that last longer than the cache lifetime and cause 90% query loss, and almost all clients to tolerate attacks resulting in 50% packet loss. While clients may get service during an attack, tail latency increases for clients. For servers, retries during DDoS attacks increase normal traffic up to $8\times$. Our findings about caching and retries help explain why users see service outages from some real-world DDoS events, but minimal visible effects from others.
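      A rough intuition for why caching carries roughly half of the clients through short outages can be had from a toy model: if a client cached the record at a uniformly random instant within one TTL before the outage begins, the cached entry outlives an outage of duration d with probability max(0, 1 - d/TTL). This is an illustrative back-of-the-envelope calculation, not the paper's controlled-experiment methodology, and the TTL and outage durations below are assumed example values.
```python
def cache_survival_probability(outage_s: float, ttl_s: float) -> float:
    """Probability that a cached record outlives an outage of `outage_s` seconds,
    assuming the record was cached at a uniformly random instant within the
    last `ttl_s` seconds before the outage started."""
    if ttl_s <= 0:
        return 0.0
    return max(0.0, 1.0 - outage_s / ttl_s)

if __name__ == "__main__":
    ttl = 3600  # an example one-hour TTL
    for outage in (600, 1800, 3600, 7200):
        p = cache_survival_probability(outage, ttl)
        print(f"{outage:>5d}s outage, {ttl}s TTL -> {p:.0%} of caches still valid")
```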
    • Comments on DNS Robustness  short
      Mark Allman (ICSI)
      Abstract: The Domain Name System (DNS) maps human-friendly names into the network addresses necessary for network communication. Therefore, the robustness of the DNS is crucial to the general operation of the Internet. As such, the DNS protocol and architecture were designed to facilitate structural robustness within the system. For instance, a domain can depend on authoritative nameservers in several topologically disparate datacenters to aid robustness. However, the actual operation of the system need not utilize these robustness tools. In this paper we provide an initial analysis of the structural robustness of the DNS ecosystem over the last nine years.
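      The structural robustness the abstract refers to, such as authoritative nameservers spread across disparate networks, can be probed from the outside. The sketch below is a minimal example using the third-party dnspython library: it resolves a zone's NS set and groups the nameservers' IPv4 addresses by /24 prefix, a zone whose nameservers all share one prefix having little of the topological diversity the paper studies. The /24 grouping is a simplistic stand-in for the paper's more careful analysis.
```python
from typing import Dict, List

import dns.resolver  # third-party: pip install dnspython

def ns_prefix_diversity(zone: str) -> Dict[str, List[str]]:
    """Map each authoritative nameserver of `zone` to the /24 prefixes of its
    IPv4 addresses, as a crude indicator of topological diversity."""
    prefixes: Dict[str, List[str]] = {}
    for ns in dns.resolver.resolve(zone, "NS"):
        name = str(ns.target)
        try:
            addrs = [str(a) for a in dns.resolver.resolve(name, "A")]
        except dns.resolver.NoAnswer:
            addrs = []
        prefixes[name] = sorted({".".join(a.split(".")[:3]) + ".0/24" for a in addrs})
    return prefixes

if __name__ == "__main__":
    for ns, nets in ns_prefix_diversity("example.com").items():
        print(ns, nets)
```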
    • From Deletion to Re-Registration in Zero Seconds: Domain Registrar Behaviour During the Drop  short
      Tobias Lauinger and Ahmet Salih Buyukkayhan (Northeastern University), Abdelberi Chaabane (Nokia Bell Labs), William Robertson and Engin Kirda (Northeastern University)
      Abstract: When desirable Internet domain names expire, they are often re-registered in the very moment the old registration is deleted, in a highly competitive and resource-intensive practice called domain drop-catching. To date, there has been little insight into the daily time period when expired domain names are deleted, and the race to re-registration that takes place. In this paper, we show that .com domains are deleted in a predictable order, and propose a model to infer the earliest possible time a domain could have been re-registered. We leverage this model to characterise at a precision of seconds how fast certain types of domain names are re-registered. We show that 9.5% of deleted domains are re-registered immediately, with a delay of zero seconds. Domains not taken immediately by the drop-catch services are often re-registered later, with different behaviours over the following seconds, minutes and hours. Since these behaviours imply different effort and price points, our methodology can be useful for future work to explain the uses of re-registered domains.
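      Measuring a "zero-second" re-registration amounts to comparing the instant a domain is deleted from the zone with the creation timestamp of its new registration. The sketch below assumes those two timestamps are already available (for example, a deletion instant inferred from a deletion-order model of the kind the paper proposes, and a creation time from WHOIS); the timestamp sources, the example values, and the coarse delay buckets are assumptions for illustration, not the paper's methodology.
```python
from datetime import datetime, timezone

def reregistration_delay_s(deleted_at: datetime, recreated_at: datetime) -> float:
    """Seconds between a domain's inferred deletion instant and the creation
    time of its new registration."""
    return (recreated_at - deleted_at).total_seconds()

def delay_bucket(delay_s: float) -> str:
    """Coarse categories echoing the 'seconds, minutes and hours' behaviours."""
    if delay_s <= 0:
        return "immediate (zero seconds)"
    if delay_s < 60:
        return "within seconds"
    if delay_s < 3600:
        return "within minutes"
    return "hours or later"

if __name__ == "__main__":
    deleted = datetime(2018, 6, 1, 18, 2, 37, tzinfo=timezone.utc)    # from a deletion-order model
    recreated = datetime(2018, 6, 1, 18, 2, 37, tzinfo=timezone.utc)  # e.g., WHOIS creation timestamp
    print(delay_bucket(reregistration_delay_s(deleted, recreated)))
```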
    • LDplayer: DNS Experimentation at Scale  long
      Liang Zhu and John Heidemann (University of Southern California)
      Abstract: DNS has evolved over the last 20 years, improving in security and privacy and broadening the kinds of applications it supports. However, this evolution has been slowed by the large installed base and the wide range of implementations. The impact of changes is difficult to model due to complex interactions between DNS optimizations, caching, and distributed operation. We suggest that experimentation at scale is needed to evaluate changes and facilitate DNS evolution. This paper presents LDplayer, a configurable, general-purpose DNS experimental framework that enables DNS experiments to scale in several dimensions: many zones, multiple levels of DNS hierarchy, high query rates, and diverse query sources. LDplayer provides high-fidelity experiments while meeting these requirements through its distributed DNS query replay system, methods to rebuild the relevant DNS hierarchy from traces, and efficient emulation of this hierarchy on minimal hardware. We show that a single DNS server can correctly emulate multiple independent levels of the DNS hierarchy while providing correct responses as if they were independent. We validate that our system can replay DNS root traffic with tiny error ($\pm8ms$ quartiles in query timing and $\pm 0.1\%$ difference in query rate). We show that our system can replay queries at 87k queries/s, more than twice the normal DNS root traffic rate. LDplayer's trace replay has the unique ability to evaluate important design questions with confidence that we capture the interplay of caching, timeouts, and resource constraints. As an example, we demonstrate the memory requirements of a DNS root server with all traffic running over TCP and TLS, and identify performance discontinuities in latency as a function of client RTT.
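      At its simplest, the kind of trace replay LDplayer scales up means re-issuing recorded queries against a test server while preserving their original spacing. The sketch below is a toy single-threaded replayer using the third-party dnspython library, nothing like LDplayer's distributed, high-rate system; the trace format (relative time offset, query name, query type) and the target server address are assumptions made for the example.
```python
import time

import dns.exception  # third-party: pip install dnspython
import dns.message
import dns.query

def replay(trace, server: str, port: int = 53) -> None:
    """Replay (offset_seconds, qname, qtype) tuples over UDP, preserving
    the inter-query timing recorded in the trace."""
    start = time.monotonic()
    for offset, qname, qtype in trace:
        delay = offset - (time.monotonic() - start)
        if delay > 0:
            time.sleep(delay)  # wait until this query's original send time
        query = dns.message.make_query(qname, qtype)
        try:
            dns.query.udp(query, server, port=port, timeout=2.0)
        except dns.exception.Timeout:
            pass  # a real replayer would log the lost response

if __name__ == "__main__":
    sample_trace = [(0.00, "example.com.", "A"),
                    (0.05, "example.org.", "AAAA"),
                    (0.20, "example.net.", "NS")]
    replay(sample_trace, "127.0.0.1")  # a test resolver/authoritative under our control
```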
  • 04:50PM - 05:00PM - Conference Concludes