Monday, 23rd October 2023
- 19:00 - 22:00 - Reception at École de technologie supérieure (Pavilion E - Maison des étudiants)
Tuesday, 24th October 2023
- 08:00 - 08:30 - Breakfast
- 08:30 - 09:00 - Opening Remarks
- 09:00 - 10:00 - Keynote: "Enterprises, William Gibson, & the Art of Internet Measurement" by Vern Paxson (Corelight)
- 10:00 - 11:00 - Break and posters
- 11:00 - 12:00 - Session 1: Replication
- Replication (Session Chair: Anja Feldmann)
- Omar Darwich (LAAS-CNRS), Hugo Rimliger (Sorbonne Université), Milo Dreyfus (Sorbonne Université), Matthieu Gouel (Sorbonne Université), Kevin Vermeulen (LAAS-CNRS). Abstract: IP geolocation is one of the most widely used forms of metadata for IP addresses, and despite almost twenty years of effort from the research community, the reality is that there is no accurate, complete, up-to-date, and explainable publicly available dataset for IP geolocation. We argue that a central reason for this state of affairs is the impressive results from prior publications, both in terms of accuracy and coverage: up to street level accuracy and locating millions of IP addresses with a few hundred vantage points in months. We believe the community would substantially benefit from a public baseline dataset and code. To encourage future research in IP geolocation, we replicate two geolocation techniques and evaluate their accuracy and coverage. We show that we can neither use the first technique to obtain the previously claimed street level accuracy, nor the second to geolocate millions of IP addresses on today’s Internet and with publicly available measurement infrastructure. In addition to this reappraisal, we re-evaluate the fundamental insights that led to these prior results, as well as provide new insights and recommendations to help the design of future geolocation techniques. All of our code and data are publicly available to support reproducibility.
- Savvas Kastanakis (Lancaster University), Vasileios Giotsas (Cloudflare), Ioana Livadariu (Simula Metropolitan), Neeraj Suri (Lancaster University). Abstract: In 2003, Wang and Gao presented an algorithm to infer and characterize routing policies, as this knowledge could be valuable in predicting and debugging routing paths. They used their algorithm to measure the phenomenon of selectively announced prefixes, in which ASes announce their prefixes to specific providers to manipulate incoming traffic. Since 2003, the Internet has evolved from a hierarchical graph to a flat and dense structure. Despite 20 years of extensive research since that seminal work, the impact of these topological changes on routing policies is still blurred. In this paper we conduct a replicability study of the Wang and Gao paper to shed light on the evolution and the current state of selectively announced prefixes. We show that selective announcements are persistent, not only across time, but also across networks. Moreover, we observe that neighbors of different AS relationships may be assigned the same local preference values, and path selection is not as heavily dependent on AS relationships as it used to be. Our results highlight the need for BGP policy inference to be conducted as a high-periodicity process to account for the dynamic nature of AS connectivity and the derived policies.
- Abstract: We replicate the paper "When to Use and When Not to Use BBR: An Empirical Analysis and Evaluation Study" by Cao et al., published at IMC 2019, with a focus on the relative goodput of TCP BBR and TCP CUBIC for a range of bottleneck buffer sizes, bandwidths, and delays. We replicate the experiments performed by the original authors on two large-scale open-access testbeds to validate the conclusions of the paper. We further extend the experiments to BBRv2. We package the experiment artifacts and make them publicly available so that others can repeat and build on this work.
- Alessandro Finamore (Huawei Technologies France SASU), Wang Chao (Huawei Technologies France SASU), Jonatan Krolikowski (Huawei Technologies France SAS), Jose M. Navarro (Huawei Technologies France SASU), Fuxing Chen (Huawei Technologies France SASU), Dario Rossi (Huawei Technologies France SASU). Abstract: Over the last years we have witnessed a renewed interest in Traffic Classification (TC), captivated by the rise of Deep Learning (DL). Yet, the vast majority of TC literature lacks code artifacts, performance assessments across datasets, and reference comparisons against Machine Learning (ML) methods. Among those works, a recent study from IMC’22 [16] is worthy of attention since it adopts recent DL methodologies (namely, few-shot learning, self-supervision via contrastive learning, and data augmentation) that are appealing for networking as they enable learning from a few samples and transferring across datasets. The main result of [16] on the UCDAVIS19, ISCX-VPN and ISCX-Tor datasets is that, with such DL methodologies, 100 input samples are enough to achieve very high accuracy using an input representation called “flowpic” (i.e., a per-flow 2D histogram of the packet size evolution over time). In this paper (i) we reproduce [16] on the same datasets and (ii) we replicate its most salient aspect (the importance of data augmentation) on three additional public datasets (MIRAGE-19, MIRAGE-22 and UTMOBILENET21). While we confirm most of the original results, we also found a ≈20% accuracy drop on some of the investigated scenarios due to a data shift in the original dataset that we uncovered. Additionally, our study validates that the data augmentation strategies studied in [16] perform well on other datasets too. In the spirit of reproducibility and replicability we make all artifacts (code and data) available to the research community at https://tcbenchstack.github.io/tcbench/
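For context on the "flowpic" representation mentioned above, here is a minimal, illustrative sketch of building one from a flow's packet trace with NumPy. The image dimension and clipping thresholds below are assumptions for illustration, not the settings used in [16] or in the replication.

```python
import numpy as np

def flowpic(pkt_times, pkt_sizes, dim=32, max_time=15.0, max_size=1500):
    """Build a "flowpic": a 2D histogram of packet sizes over time for one flow.

    pkt_times are offsets (seconds) from the flow start; pkt_sizes are in bytes.
    """
    hist, _, _ = np.histogram2d(
        np.clip(pkt_times, 0, max_time),   # x-axis: time within the flow
        np.clip(pkt_sizes, 0, max_size),   # y-axis: packet size
        bins=dim,
        range=[[0, max_time], [0, max_size]],
    )
    return hist  # a dim x dim image-like array, usable as CNN input
```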
- 12:00 - 14:00 - Lunch
- 14:00 - 15:15 - Session 2: Routing
- Routing (Session Chair: Alberto Dainotti)
- Bradley Huffaker (UC San Diego / CAIDA), Romain Fontugne (IIJ Research Lab), Alexander Marder (UC San Diego / CAIDA), kc Claffy (UC San Diego / CAIDA). Abstract: Recent geopolitical events have elevated interest in understanding which networks play dominant roles with respect to a specific country. We explore two existing BGP-based metrics for quantifying the role of Internet networks (autonomous systems) in the global routing system: AS-level customer cone (CC) size and AS hegemony (AH), a metric of path betweenness centrality. The focus of our study is adapting the global AS Customer Cone and AS Hegemony metrics to country-specific metrics by restricting the input data to destination prefixes in that country. We analyze the impact of this sometimes substantial downsampling of existing public routing data on the robustness of resulting statistics and rankings of ASes based on these statistics. We apply two approaches to analyzing the stability of the sample-based rankings, yielding a generalizable method to assess when it is safe to use sample-based rankings for estimating the top-ranked ASes in a country. Conversely, our method can provide an indication of how many additional BGP vantage points in a country are required to yield reliable domestic rankings. We apply our country-specific metrics to case studies of Australia, Japan, Russia, and the United States, demonstrating the potential to facilitate studies of concentration and interdependence in telecommunications markets in the face of multiple forces driving the Internet infrastructure's evolution. We make our data set available for other researchers to explore and advance interdisciplinary studies of political economy and Internet topology.
- Thomas Krenc (UC San Diego / CAIDA), Matthew Luckie (UC San Diego / CAIDA), Alexander Marder (UC San Diego / CAIDA), kc Claffy (UC San Diego / CAIDA). Abstract: BGP communities allow operators to influence routing decisions made by other networks (action communities) and to annotate their network's routing information with metadata such as where each route was learned or the relationship the network has with their neighbor (information communities). BGP communities also help researchers understand complex Internet routing behaviors. However, there is no standard convention for how operators assign community values, and significant efforts to scalably infer community meanings have ignored this high-level classification. We discovered that doing so comes at significant cost in accuracy, of both inference and validation. To advance this narrow but powerful direction in Internet infrastructure research, we design and validate an algorithm to execute this first fundamental step: inferring whether a BGP community is action or information. We applied our method to 78,480 community values observed in public BGP data for May 2023. Validating our inferences (24,376 action and 54,104 informational communities) against available ground truth (6,259 communities), we find that our method classified 96.5% correctly. We found that the precision of a state-of-the-art location community inference method increased from 68.2% to 94.8% with our classifications. We publicly share our code, dictionaries, inferences, and datasets to enable the community to benefit from them.
- Weitong Li (Virginia Tech), Zhexiao Lin (University of California, Berkeley), Md. Ishtiaq Ashiq (Virginia Tech), Emile Aben (RIPE NCC), Romain Fontugne (IIJ Research Lab), Amreesh Phokeer (Internet Society), Taejoong “Tijay” Chung (Virginia Tech). Abstract: The Resource Public Key Infrastructure (RPKI) is a system to add security to Internet routing. In recent years, the publication of Route Origin Authorization (ROA) objects, which bind IP prefixes to their legitimate origin ASN, has been rapidly increasing. However, ROAs are effective only if the routers use them to verify and filter invalid BGP announcements, a process called Route Origin Validation (ROV). There are many proposed approaches to measure the status of ROV in the wild, but they are limited in scalability or accuracy. In this paper, we present RoVista, an ROV measurement framework that leverages the IP-ID side channel and in-the-wild RPKI-invalid prefixes. With over 16 months of longitudinal measurement, RoVista successfully covers more than 27K ASes where 56.1% of ASes have derived benefits from ROV, although the percentage of fully protected ASes remains relatively low at 9.4%. In order to validate our findings, we have also sought input from network operators. We then evaluate the security impact of current ROV deployment and reveal misconfigurations that will weaken the protection of ROV. Lastly, we compare RoVista with other approaches and conclude with a discussion of our findings and limitations.
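The IP-ID side channel that RoVista relies on assumes reflectors whose IP-ID values come from a single, globally incrementing counter. The sketch below only shows how such a counter could be sampled with Scapy (assuming the scapy package and raw-socket privileges); it is not RoVista's actual probing or inference logic.

```python
from scapy.all import ICMP, IP, sr1  # assumes scapy and root/raw-socket privileges

def ipid_samples(target: str, probes: int = 5) -> list[int]:
    """Collect IP-ID values from a host's responses to simple ICMP echo probes."""
    ids = []
    for _ in range(probes):
        resp = sr1(IP(dst=target) / ICMP(), timeout=2, verbose=False)
        if resp is not None:
            ids.append(resp[IP].id)
    return ids

# A strictly increasing sequence with small gaps suggests a shared global
# IP-ID counter, the property this class of side-channel techniques depends on.
```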
- Taha Albakour (TU Berlin), Oliver Gasser (Max Planck Institute for Informatics), Robert Beverly (Center for Measurement and Analysis of Network Data), Georgios Smaragdakis (Delft University of Technology). Abstract: The Internet architecture has facilitated a multi-party, distributed, and heterogeneous physical infrastructure where routers from different vendors connect and inter-operate via IP. Such vendor heterogeneity can have important security and policy implications. For example, a security vulnerability may be specific to a particular vendor and implementation, and thus will have a disproportionate impact on particular networks and paths if exploited. From a policy perspective, governments are now explicitly banning particular vendors, or have threatened to do so. Despite these critical issues, the composition of router vendors across the Internet remains largely opaque. Remotely identifying router vendors is challenging due to their strict security posture, indistinguishability due to code sharing across vendors, and noise due to vendor mergers. We make progress in overcoming these challenges by developing LFP, a tool that improves the coverage, accuracy, and efficiency of router fingerprinting as compared to the current state-of-the-art. We leverage LFP to characterize the degree of router vendor homogeneity within networks and the regional distribution of vendors. We then take a path-centric view and apply LFP to better understand the potential for correlated failures and fate-sharing. Finally, we perform a case study on inter- and intra-United States data paths to explore the feasibility of making vendor-based routing policy decisions, i.e., whether it is possible to avoid a particular vendor given the current infrastructure.
- Ben Du (UC San Diego), Katherine Izhikevich (UC San Diego), Sumanth Rao (UC San Diego), Gautam Akiwate (Stanford University), Cecilia Testart (Georgia Tech/MIT), Alex C. Snoeren (UC San Diego), kc Claffy (UC San Diego / CAIDA). Abstract: The Internet Routing Registry (IRR) is a set of distributed databases used by networks to register routing policy information and to validate messages received in the Border Gateway Protocol (BGP). First deployed in the 1990s, the IRR remains the most widely used database for routing security purposes, despite the existence of more recent alternatives. Yet, the IRR lacks a strict validation standard and the limited coordination across different database providers can lead to inaccuracies. Moreover, it has been reported that bad actors have begun to register false records in the IRR to bypass operators' defenses when launching attacks on the Internet routing system, such as BGP hijacks. In this paper, we provide a longitudinal analysis of the IRR over the span of 1.5 years. We develop a workflow to identify irregular IRR records that contain conflicting information compared to different routing data sources. We identified 34,199 irregular route objects in the largest IRR database and found 6,373 to be potentially suspicious. We curated a list of 315 suspicious route objects and released it to the public.
- 15:15 - 16:15 - Session 3: Web 1
- Web (Session Chair: David Choffnes)
- Jiahui HE (Hong Kong University of Science and Technology (GZ)), Haris Bin Zia (Queen Mary University of London), Ignacio Castro (Queen Mary University of London), Aravindh Raman (Telefonica Research), Nishanth Sastry (University of Surrey, UK), Gareth Tyson (Hong Kong University of Science and Technology (GZ), Queen Mary University of London). Abstract: The acquisition of Twitter by Elon Musk has spurred controversy and uncertainty among Twitter users. The move raised both praise and concerns, particularly regarding Musk's views on free speech. As a result, a large number of Twitter users have looked for alternatives to Twitter. Mastodon, a decentralized micro-blogging social network, has attracted the attention of many users and the general media. In this paper, we analyze the migration of 136,009 users from Twitter to Mastodon. We inspect the impact that this has on the wider Mastodon ecosystem, particularly in terms of user-driven pressure towards centralization. We further explore factors that influence users to migrate, highlighting the effect of users' social networks. Finally, we inspect the behavior of individual users, showing how they utilize both Twitter and Mastodon in parallel. This leads us to build classifiers to explore if migration is predictable. Through a feature analysis, we evaluate the factors that most effectively predict a user's decision to migrate.
- The Prevalence of Single Sign-On on the Web: Towards the Next Generation of Web Content Measurement (Short). Calvin Ardi (University of Southern California/Information Sciences Institute), Matt Calder (Meta / Columbia University). Abstract: Much of the content and structure of the Web remains inaccessible to evaluate at scale because it is gated by user accounts. This limitation restricts researchers to examining only a superficial layer of a website: the landing page or public search-indexable pages. In this work we take first steps toward an approach that exposes more of the web by automating website access using popular Single Sign-On (SSO) providers. SSO providers such as Google and Facebook enable users to authenticate across many sites using a single account, eliminating the need to create and manage massive numbers of individual accounts.
- Jingyuan Zhu (University of Michigan), Anish Nyayachavadi (University of Michigan), Jiangchen Zhu (Columbia University), Vaspol Ruamviboonsuk (Microsoft), Harsha V. Madhyastha (University of Southern California). Abstract: The web is littered with millions of links which previously worked but no longer do. When users encounter any such broken link, they resort to looking up an archived copy of the linked page. But, for a sizeable fraction of these broken links, no archived copies exist. Even if a copy exists, it often poorly approximates the original page, e.g., any functionality on the page which requires the client browser to communicate with the page’s backend servers will not work, and even the latest copy will be missing updates made to the page’s content after that copy was captured. To address this situation, we observe that broken links are often merely a result of website reorganizations; the linked page still exists on the same site, albeit at a different URL. Therefore, given a broken link, our system FABLE attempts to find the linked page’s new URL by learning and exploiting the pattern in how the old URLs for other pages on the same site have transformed to their new URLs. We show that our approach is significantly more accurate and efficient than prior approaches which rely on stability in page content over time. FABLE increases the fraction of dead links for which the corresponding new URLs can be found by 50%, while reducing the median delay incurred in identifying the new URL for a broken link from over 40 seconds to less than 10 seconds.
- Kaiyan Liu (George Mason University), Nan Wu (George Mason University), Bo Han (George Mason University). Abstract: By combining various emerging technologies, mobile extended reality (XR) blends the real world with virtual content to create a spectrum of immersive experiences. Although Web-based XR can offer attractive features such as better accessibility and cross-platform compatibility, its performance may not be on par with its standalone counterpart. As a low-level bytecode, WebAssembly has the potential to drastically accelerate Web-based XR by enabling near-native execution speed. However, little is known about how well Web-based XR performs with WebAssembly acceleration. To bridge this crucial gap, we conduct a first-of-its-kind systematic and empirical study to analyze the performance of Web-based XR expedited by WebAssembly on four diverse platforms with five different browsers. Our measurement results reveal that although WebAssembly can accelerate different XR tasks in various contexts, there remains a substantial performance disparity between Web-based and standalone XR. We hope our findings can foster the realization of an immersive Web that is accessible to a wider audience.
- 16:15 - 16:45 - Break and posters
- 16:45 - 17:30 - Session 4: Web 2
- Web (Session Chair: Hamed Haddadi)
- Ali Rasaii (Max Planck Institute for Informatics), Devashish Gosain (KU Leuven), Oliver Gasser (Max Planck Institute for Informatics). Abstract: Privacy regulations have led to many websites showing cookie banners to their users. Usually, cookie banners present the user with the option to “accept” or “reject” cookies. Recently, a new form of paywall-like cookie banner has taken hold on the Web, giving users the option to either accept cookies (and consequently user tracking) or buy a paid subscription for a tracking-free website experience. In this paper, we perform the first completely automated analysis of cookiewalls, i.e., cookie banners acting as a paywall. We find cookiewalls on 0.6% of all queried 45k websites. Moreover, cookiewalls are deployed to a large degree on European websites, e.g., for Germany we see cookiewalls on 8.5% of top 1k websites. Additionally, websites using cookiewalls send 6.4 times more third-party cookies and 42 times more tracking cookies to visitors, compared to regular cookie banner websites. We also uncover two large Subscription Management Platforms used on hundreds of websites, which provide website operators with easy-to-setup cookiewall solutions. Finally, we plan to publish tools, data, and code to foster reproducibility and further studies.
- A Longitudinal Study of Vulnerable Client-side Resources and Web Developers' Updating Behaviors (Long). Kyungchan Lim (University of Tennessee, Knoxville), Yonghwi Kwon (University of Virginia), Doowon Kim (University of Tennessee, Knoxville). Abstract: Modern websites rely on various client-side web resources, such as JavaScript libraries, to provide end-users with rich and interactive web experiences. Unfortunately, anecdotal evidence shows that improperly managed client-side resources could open up attack surfaces that adversaries can exploit. However, there is a lack of systematic understanding of the security impact of client-side resources. In this paper, we conduct a longitudinal (four-year) measurement study of the security practices and implications of client-side resources (e.g., JavaScript libraries and Adobe Flash) across the Web. Specifically, we first collect a large-scale dataset of 157.2M webpages of Alexa Top 1M websites for four years in the wild. Analyzing the dataset, we find that an average of 41.2% of websites (in each of the four years) carry at least one vulnerable client-side resource (e.g., JavaScript or Adobe Flash). We also reveal that vulnerable JavaScript library versions are frequently observed in the wild, suggesting a concerning level of lagging update practice. On average, we observe a 531.2-day window of vulnerability across 25,337 websites, measured from the release of security patches for the unpatched client-side resources. Furthermore, we manually investigate the fidelity of CVE (Common Vulnerabilities and Exposures) reports on client-side resources, leveraging PoC (Proof of Concept) code. We find that 13 CVE reports (out of 27) have incorrect vulnerable version information, which may impact security-related tasks such as security updates.
- John Pegioudis (FORTH & University of Crete), Emmanouil Papadogianakis (FORTH & University of Crete), Nicolas Kourtellis (Telefonica Research), Evangelos P. Markatos (FORTH & University of Crete), Panagiotis Papadopoulos (FORTH). Abstract: Contemporary browsers constitute a critical component of our everyday interactions with the Web. Similar to a small but powerful operating system, a browser is responsible for fetching and running web apps locally, on the user’s (mobile) device. Even though in the last few years there has been increased interest in tools and mechanisms to block potentially malicious behaviours of web domains against users’ privacy (e.g., ad blockers, incognito browsing mode, etc.), it is still unclear whether the user can browse the Web in private. In this paper, we analyse the natively generated network traffic of 15 mobile browser apps under different configurations, in an attempt to investigate whether users are capable of browsing the Web privately, without sharing their browsing history with remote servers. To achieve this, we develop Panoptes: a novel framework for instrumenting and monitoring separately the mobile browser traffic that is generated (i) by the web engine and (ii) natively by the mobile app. By crawling a set of websites via Panoptes and analyzing the native traffic of browsers, we find that there are browsers that (i) persistently track their users, and (ii) report to remote servers geolocated outside of the EU, in violation of GDPR, the exact page and content the user is browsing at that moment. Finally, we see browsers communicating with third-party ad servers while leaking PII and device identifiers.
- 18:30 - 21:00 - Student Dinner at McGill Faculty Club
Wednesday, 25th October 2023
- 08:00 - 09:00 - Breakfast
- 09:00 - 10:00 - Keynote #2 by Vincent Gautrais
- 10:00 - 10:30 - Break
- 10:30 - 12:00 - Session 1: Security and DNS
- Security (Session Chair: Gautam Akiwate)
- Fenglu Zhang (Tsinghua University), Yunyi Zhang (National University of Defense Technology), Baojun Liu (Tsinghua University), Eihal Alowaisheq (King Saud University), Lingyun Ying (QI-ANXIN Technology Research Institute), Xiang Li (Tsinghua University), Zaifeng Zhang (360 Security Technology Inc.), Ying Liu (Tsinghua University), Haixin Duan (Tsinghua University; Quancheng Laboratory), Min Zhang (National University of Defense Technology). Abstract: Leveraging DNS for covert communications is appealing since most networks allow DNS traffic, especially traffic directed toward renowned DNS hosting services. Unfortunately, most DNS hosting services overlook domain ownership verification, enabling miscreants to host undelegated DNS records of a domain they do not own. Consequently, miscreants can conduct covert communication through such undelegated records for whitelisted domains on reputable hosting providers. In this paper, we shed light on the emerging threat posed by undelegated records and demonstrate their exploitation in the wild. To the best of our knowledge, this security risk has not been studied before. We conducted a comprehensive measurement to reveal the prevalence of the risk. In total, we observed 1,580,925 unique undelegated records that are potentially abused. We further observed that a considerable portion of these records are associated with malicious behaviors. By utilizing threat intelligence and malicious traffic collected by a malware sandbox, we extracted malicious IP addresses from 25.41% of these records, spanning 1,369 Tranco top 2K domains and 248 DNS hosting providers, including Cloudflare and Amazon. Furthermore, we discovered that the majority of the identified malicious activities are Trojan-related. Moreover, we conducted case studies on two malware families (Dark.IOT and Specter) that exploit undelegated records to obtain C2 servers, in addition to masquerading SPF records to conceal SMTP-based covert communication. We also provide mitigation options for different entities. As a result of our disclosure, several popular hosting providers have taken action to address this issue.
- Guannan Liu (Virginia Tech), Lin Jin (University of Delaware), Shuai Hao (Old Dominion University), Yubao Zhang (University of Delaware), Daiping Liu (University of Delaware), Angelos Stavrou (Virginia Tech), Haining Wang (Virginia Tech). Abstract: Non-Existent Domain (NXDomain) is a type of Domain Name System (DNS) error response, indicating that the queried domain name does not exist and cannot be resolved. Unfortunately, little research has focused on understanding why and how NXDomain responses are generated, utilized, and exploited. In this paper, we conduct the first comprehensive and systematic study on NXDomain by investigating its scale, origin, and security implications. Utilizing a large-scale passive DNS database, we identify 146,363,745,785 NXDomains queried by DNS users between 2014 and 2022. Within these 146 billion NXDomains, 91 million of them hold historic WHOIS records, of which 5.3 million are identified as malicious domains, including about 2.4 million blocklisted domains, 2.8 million DGA (Domain Generation Algorithm) based domains, and 90 thousand squatting domains targeting popular domains. To gain more insights into the usage patterns and security risks of NXDomains, we register 19 carefully selected NXDomains in the DNS database, each of which received more than ten thousand DNS queries per month. We then deploy a honeypot for our registered domains and collect 5,925,311 incoming queries over 6 months, from which we discover that 5,186,858 and 505,238 queries are generated from automated processes and web crawlers, respectively. Finally, we perform extensive traffic analysis on our collected data and reveal that NXDomains can be misused for various purposes, including botnet takeover, malicious file injection, and residue trust exploitation.
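As a small aside, distinguishing NXDomain responses from other resolution outcomes is straightforward with a stub resolver library; the sketch below uses dnspython as an assumed tool, not the passive-DNS pipeline the paper builds on.

```python
import dns.resolver  # assumes dnspython is installed

def is_nxdomain(name: str) -> bool:
    """Return True if an A query for `name` yields an NXDomain response."""
    try:
        dns.resolver.resolve(name, "A")
        return False
    except dns.resolver.NXDOMAIN:
        return True
    except dns.resolver.NoAnswer:
        return False  # the name exists, it just has no A record
```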
- Yevheniya Nosyk (Université Grenoble Alpes), Maciej Korczyński (Université Grenoble Alpes), Andrzej Duda (Université Grenoble Alpes). Abstract: The Domain Name System (DNS) relies on response codes to confirm successful transactions or indicate anomalies. Yet, the response codes are not sufficiently fine-grained to pinpoint the root causes of resolution failures. RFC 8914 (Extended DNS Errors or EDE) addresses the problem by defining a new extensible registry of error codes and serving them inside the OPT resource record. In this paper, we show that four major DNS resolver vendors and three large public DNS resolvers support this standard, but do not agree in 94% of our test cases. We reveal that Cloudflare DNS is the most precise in indicating various DNS misconfigurations via the EDE mechanism, so we use it to perform a large-scale analysis of more than 303M registered domain names. We show that 17.7M of them trigger extended error codes. Lame delegations and DNSSEC validation failures are the most common problems encountered.
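To illustrate the EDE mechanism the paper measures, the following hedged sketch queries a public resolver and extracts any RFC 8914 options from the response. It assumes dnspython ≥ 2.1 (which parses EDE options) and uses Cloudflare's 1.1.1.1 only as an example resolver; it is not the authors' measurement tooling.

```python
import dns.edns
import dns.message
import dns.query

def extended_errors(qname: str, resolver_ip: str = "1.1.1.1"):
    """Return any Extended DNS Error (EDE, RFC 8914) options in a resolver's response."""
    query = dns.message.make_query(qname, "A", use_edns=0, want_dnssec=True)
    response = dns.query.udp(query, resolver_ip, timeout=3)
    return [opt for opt in response.options
            if opt.otype == dns.edns.OptionType.EDE]

# A domain with a broken DNSSEC chain, for example, may yield an EDE such as
# "DNSSEC Bogus" from resolvers that implement the standard.
```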
- Zane Ma (Georgia Institute of Technology), Aaron Faulkenberry (Georgia Institute of Technology), Thomas Papastergiou (Georgia Institute of Technology), Zakir Durumeric (Stanford University), Michael D. Bailey (Georgia Institute of Technology), Angelos D. Keromytis (Georgia Institute of Technology), Fabian Monrose (Georgia Institute of Technology), Manos Antonakakis (Georgia Institute of Technology). Abstract: TLS certificates, which provide server authentication for a majority of today's email servers (i.e., STARTTLS) and websites (i.e., HTTPS), primarily function as a trusted mapping between a domain name and a cryptographic keypair. To enable scalable authentication and avoid constant certificate re-issuance, certificate authorities (CAs) produce certificates that are valid for up to 398 days. This static, name-to-key caching mechanism belies a complex reality: a tangle of dynamic infrastructure surrounding domains, servers, cryptographic keys, and CAs. When any of these operations changes, the authentication information attested by a certificate can become stale and no longer accurately reflect real-world operations. In this work, we examine the broader phenomenon of certificate invalidation events. We first taxonomize the functions of information contained within a certificate and then identify how changes to underlying operations can affect the accuracy of different information categories, leading to a stale certificate. We discover three classes of security-relevant invalidation events that enable a third party to impersonate a domain outside of their control. Utilizing large-scale certificate and domain datasets, we quantify these precarious scenarios and find that they have affected over 15K domains per day, on average. Unfortunately, modern certificate revocation provides little recourse, so we examine the impact of reducing certificate lifetimes and estimate a potential 75% time decrease in precarious access to valid TLS keys if the current 398-day limit is reduced to 90 days.
- The CVE Wayback Machine: Measuring Coordinated Disclosure from Exploits Against 2 Years of Zero-Days (Long). Eric Pauley (University of Wisconsin–Madison), Paul Barford (University of Wisconsin–Madison), Patrick McDaniel (University of Wisconsin–Madison). Abstract: Real-world software security depends on coordinated vulnerability disclosure (CVD) from researchers, a process that the community has continually sought to measure and improve. Yet, CVD practices are only as effective as the data that informs them, with deep and representative data collection remaining frustratingly elusive. In this paper, we leverage Kappa, a cloud-based interactive Internet telescope, to build statistical models of vulnerability lifecycles, bridging the data gap in over 20 years of CVD research. By analyzing application-layer Internet scanning traffic seen by our new vantage point over two years, we identify real-world exploitation timelines for 63 emergent threats. We bring this data together with six additional conventional datasets to build a complete birth-to-death model of these vulnerabilities, the most complete analysis of vulnerability lifecycles to date. Our statistical analysis reaches three key recommendations: (1) CVD across diverse vendors shows lower effectiveness than previously thought, (2) intrusion detection systems are underutilized to provide protection for critical vulnerabilities, and (3) existing data sources of CVD can be augmented by novel approaches to Internet measurement. In this way, our vantage point offers new opportunities to improve the CVD process, achieving a safer software ecosystem in practice.
- 12:00 - 14:00 - Lunch
- 14:00 - 15:15 - Session 2: Security
- Security (Session Chair: Hamed Haddadi)
- Jingjing Wang (Beijing University of Posts and Telecommunications), Liu Wang (Beijing University of Posts and Telecommunications), Feng Dong (Huazhong University of Science and Technology), Haoyu Wang (Huazhong University of Science and Technology). Abstract: VirusTotal is the most widely used online scanning service in both academia and industry. However, it is known that the results returned by antivirus engines are often inconsistent and change over time. The intrinsic dynamics of VirusTotal labeling have prompted researchers to investigate the characteristics of label dynamics for more effective use. However, these studies are generally limited in terms of the size and diversity of the datasets used in the measurements, which poses threats to many of their conclusions. In this paper, we perform an extraordinarily large-scale study to re-measure the label dynamics of VirusTotal. Our dataset involves all the scan data in VirusTotal over a 14-month period, including over 571 million samples and 847 million reports in total. With this large dataset, we are able to revisit many issues related to the label dynamics of VirusTotal, including the prevalence of label dynamics/silence, the characteristics across file types, the impact of label dynamics on common label aggregation methods, the stabilization patterns of labels, etc. Our measurement reveals observations that were previously unknown to the research community and are even inconsistent with previous research. We believe that our findings can help researchers advance their understanding of the VirusTotal ecosystem.
- Sayak Saha Roy (The University of Texas at Arlington), Unique Karanjit (The University of Texas at Arlington), Shirin Nilizadeh (The University of Texas at Arlington). Abstract: Free Website Building services (FWBs) provide individuals with a cost-effective and convenient way to create a website without requiring advanced technical knowledge or coding skills. However, malicious actors often abuse these services to host phishing websites. In this work, we propose FreePhish, a scalable framework to continuously identify phishing websites that are created using FWBs. Using FreePhish, we were able to detect and characterize more than 31.4K phishing URLs that were created using 17 unique free website builder services and shared on Twitter and Facebook over a period of six months. We find that FWBs provide attackers with several features that make it easier to create and maintain phishing websites at scale while simultaneously evading anti-phishing countermeasures. Our study indicates that anti-phishing blocklists and browser protection tools have significantly lower coverage and higher detection time against FWB phishing attacks compared to regular (self-hosted) phishing websites. While our prompt disclosure of these attacks helped some FWBs remove them, we found several others that were slow at removal or did not remove them at all, with the same also being true for Twitter and Facebook. Finally, we also provide FreePhish as a free Chromium web extension that can be utilized to prevent end-users from accessing potential FWB-based phishing attacks.
- Cristian Munteanu (Max Planck Institute for Informatics), Said Jawad Saidi (Max Planck Institute for Informatics), Oliver Gasser (Max Planck Institute for Informatics), Georgios Smaragdakis (Delft University of Technology), Anja Feldmann (Max Planck Institute for Informatics). Abstract: Honeypots have been used for decades to detect, monitor, and understand attempts of unauthorized use of information systems. Previous studies focused on characterizing the spread of malware, e.g., Mirai and other attacks, or proposed stealthy and interactive architectures to improve honeypot efficiency. In this paper, we present insights and benefits gained from collaborating with an operational honeyfarm, i.e., a set of honeypots distributed around the globe with centralized data collection. We analyze data of about 400 million sessions over a 15-month period, gathered from a globally distributed honeyfarm consisting of 221 honeypots deployed in 55 countries. Our analysis unveils stark differences among the activity seen by the honeypots: some are contacted millions of times while others only observe a few thousand sessions. We also analyze the behavior of scouters and intruders of these honeypots. Again, some honeypots report orders of magnitude more interactions with command execution than others. Still, diversity is needed since even if we focus on the honeypots with the highest visibility, they see only a small fraction of the intrusions, including only 5% of the files. Thus, although around 2% of intrusions are visible by most of the honeypots in our honeyfarm, the rest are only visible to a few. We conclude with a discussion of the findings of this work.
- Evolving Bots: The New Generation of Comment Bots and their Underlying Scam Campaigns in YouTube (Long). Seung Ho Na (KAIST), Sumin Cho (KAIST), Seungwon Shin (KAIST). Abstract: This paper presents a pioneering investigation into a novel form of scam advertising method on YouTube, termed “social scam bots” (SSBs). These bots have evolved to emulate benign user behavior by posting comments and engaging with other users, oftentimes appearing prominently among the top-rated comments. We analyzed the YouTube video comments and proposed a method to identify SSBs and extract the underlying scam domains. Our study revealed 1,134 SSBs promoting 72 scam campaigns responsible for infecting 31.73% of crawled videos. Further investigation revealed that SSBs exhibit advances that surpass traditional bots. Notably, they demonstrated understanding of the target audience by aligning scam campaigns with the specific video content, effectively leveraging the YouTube recommendation algorithm. We monitored these SSBs over a period of six months, enabling us to evaluate the effectiveness of YouTube's mitigation efforts. We uncovered various strategies employed by SSBs to evade mitigation attempts, including a novel strategy called “self-engagement,” aimed at boosting the ranking of their comments. By shedding light on the phenomenon of SSBs and their evolving tactics, our study aims to raise awareness and contribute to the prevention of these malicious actors, ultimately fostering a safer online platform.
- Liz Izhikevich (Stanford University), Manda Tran (Stanford University), Michalis Kallitsis (Merit Network, Inc.), Aurore Fass (Stanford University, CISPA Helmholtz Center for Information Security), Zakir Durumeric (Stanford University). Abstract: Cloud computing has dramatically changed service deployment patterns. In this work, we analyze how attackers identify and target cloud services in contrast to traditional networks and network telescopes. Using a diverse set of cloud honeypots in 5 providers and 23 countries as well as 2 educational networks and 1 network telescope, we analyze how IP address assignment, geography, network, and service-port selection influence what services are targeted in the cloud. We find that scanners that target cloud compute are selective: they avoid scanning networks without legitimate services and they discriminate between geographic regions. Further, attackers mine Internet-service search engines to find exploitable services and, in some cases, they avoid targeting IANA-assigned protocols, causing researchers to misclassify up to 16% of traffic on select ports. Based on our results, we derive recommendations for researchers and operators.
- 15:15 - 15:45 - Break
- 15:45 - 17:00 - Session 3: Security and Privacy
- Security and Privacy (Session Chair: Liz Izhikevich)
- Daniel Wagner (DE-CIX / Max Planck Institute for Informatics), Sahil Ashish Ranadive (Georgia Institute of Technology), Harm Griffioen (Delft University of Technology), Michalis Kallitsis (Merit Network, Inc.), Alberto Dainotti (Georgia Institute of Technology), Georgios Smaragdakis (Delft University of Technology), Anja Feldmann (Max Planck Institute for Informatics). Abstract: Unsolicited traffic sent to advertised network space that does not host active services provides insights about misconfigurations as well as potentially malicious activities including the spread of botnets, DDoS campaigns, and exploitation of vulnerabilities. Network telescopes have been used for many years to monitor such unsolicited traffic. Unfortunately, they are limited by the available address space for such tasks and, thus, limited to specific geographic and/or network regions. In this paper, we argue that telescopes do not need dedicated address space. Rather, it suffices to focus on address space that is unlikely to be in use. Indeed, we observe that large parts of the advertised IPv4 address space are neither hosting users nor services. Thus, when such space is detected, the traffic destined to it is unsolicited background radiation. We refer to telescopes that capture such traffic as meta-telescopes. By using central network vantage points we identify the largest and most distributed meta-telescopes to date, consisting of more than 350k /24 blocks in more than 7k ASes. Using background radiation from these meta-telescopes we highlight that unsolicited traffic differs by network/geographic region as well as by network type. Finally, we discuss our experiences and the challenges of operating meta-telescopes in the wild.
- Stefan Czybik (Technische Universität Berlin), Micha Horlboge (Technische Universität Berlin), Konrad Rieck (Technische Universität Berlin). Abstract: The Sender Policy Framework (SPF) is a basic mechanism for authorizing the use of domains in email. In combination with other mechanisms, it serves as a cornerstone for protecting users from forged senders. In this paper, we investigate the configuration of SPF across the Internet. To this end, we analyze SPF records from 12 million domains, representing the largest measurement of SPF to date. Our analysis shows a growing adoption, with 56.5% of the examined domains providing SPF records. However, we also uncover notable security issues: First, 2.9% of the SPF records have errors, undefined content or ineffective rules, undermining the intended protection. Second, we observe a large number of very lax configurations. For example, 34.7% of the domains allow emails to be sent from over 100,000 IP addresses. We explore the reasons for these loose policies and demonstrate that they facilitate email forgery. As a remedy, we derive recommendations for an adequate configuration and notify all operators of domains with misconfigured SPF records.
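As an illustration of the kind of record the study parses, here is a minimal sketch (assuming dnspython; not the authors' measurement code) that fetches a domain's published SPF policy from its TXT records.

```python
import dns.resolver  # assumes dnspython is installed

def spf_record(domain: str):
    """Return the SPF policy published in a domain's TXT records, if any."""
    try:
        answers = dns.resolver.resolve(domain, "TXT")
    except (dns.resolver.NXDOMAIN, dns.resolver.NoAnswer):
        return None
    for rdata in answers:
        txt = b"".join(rdata.strings).decode("utf-8", "replace")
        if txt.lower().startswith("v=spf1"):
            return txt
    return None

# A policy ending in "+all" or listing very large ip4:/ip6: ranges is the kind
# of overly permissive configuration the paper flags.
```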
- Nurullah Demir (Institute for Internet Security, Karlsruhe Institute of Technology (KIT)), Tobias Urban (Institute for Internet Security, Westphalian University of Applied Sciences), Jan Hörnemann (Institute for Internet Security and AWARE7 GmbH), Matteo Grosse-Kampmann (AWARE7 GmbH), Thorsten Holz (CISPA Helmholtz Center for Information Security), Norbert Pohlmann (Institute for Internet Security, Westphalian University of Applied Sciences), Christian Wressnegger (Karlsruhe Institute of Technology (KIT)). Abstract: Measurement studies are essential for research and industry alike to better understand the Web's inner workings and help quantify specific phenomena. Performing such studies is demanding due to the dynamic nature and size of the Web. An experiment's careful design and setup are complex, and many factors might affect the results. However, while several works have independently observed differences in the outcome of an experiment (e.g., the number of observed trackers) based on the measurement setup, it is unclear what causes such deviations. This work investigates the reasons for these differences by visiting 1.7M webpages with five different measurement setups. Based on this, we build 'dependency trees' for each page and cross-compare the nodes in the trees. The results show that the measured trees differ considerably, that the cause of differences can be attributed to specific nodes, and that even identical measurement setups can produce different results.
- Salim Chouaki (LIX, CNRS, Ecole Polytechnique, Institut Polytechnique de Paris), Oana Goga (LIX, CNRS, Ecole Polytechnique, Institut Polytechnique de Paris), Hamed Haddadi (Imperial College London, Brave Software), Peter Snyder (Brave Software). Abstract: We present the first extensive measurement of the privacy properties of the advertising systems used by privacy-focused search engines. We propose an automated methodology to study the impact of clicking on search ads on three popular private search engines which have advertising-based business models: StartPage, Qwant, and DuckDuckGo, and we compare them to two dominant data-harvesting ones: Google and Bing. We investigate the possibility of third parties tracking users when clicking on ads by analyzing first-party storage, redirection domain paths, and requests sent before, during, and after the clicks. Our results show that privacy-focused search engines fail to protect users' privacy when clicking ads. Users' requests are sent through redirectors on 4% of ad clicks on Bing, 86% of ad clicks on Qwant, and 100% of ad clicks on Google, DuckDuckGo, and StartPage. Even worse, advertising systems collude with advertisers across all search engines by passing unique IDs to advertisers in most ad clicks. These IDs allow redirectors to aggregate users' activity on ads' destination websites in addition to the activity they record when users are redirected through them. Overall, we observe that both privacy-focused and traditional search engines engage in privacy-harming behaviors allowing cross-site tracking, even in privacy-enhanced browsers.
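The redirector analysis can be approximated at the HTTP level with a few lines of Python. The sketch below (assuming the requests library, and ignoring JavaScript-based redirections that require a full browser) records the chain of hostnames a click would traverse; it is an illustration of the idea, not the paper's methodology.

```python
from urllib.parse import urlparse

import requests  # assumes the requests library is available

def redirect_chain(url: str) -> list[str]:
    """Return the sequence of hostnames traversed when following HTTP redirects."""
    resp = requests.get(url, allow_redirects=True, timeout=10)
    hops = [r.url for r in resp.history] + [resp.url]
    return [urlparse(h).hostname for h in hops]

# Hostnames appearing between the search-results page and the advertiser's
# landing page correspond to the "redirectors" discussed in the abstract.
```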
- Stephen McQuistin (University of Glasgow), Pete Snyder (Brave Software), Colin Perkins (University of Glasgow), Hamed Haddadi (Brave Software, Imperial College London), Gareth Tyson (Hong Kong University of Science & Technology). Abstract: The public suffix list is a community-maintained list of rules that can be applied to domain names to determine how they should be grouped into logical organizations or companies. We present the first large-scale measurement study of how the public suffix list is used by open-source software on the Web, and the privacy harm resulting from projects using outdated versions of the list. Specifically, we measure how often developers include out-of-date versions of the public suffix list in their projects, how old the included lists are, and estimate the real-world privacy harm with a model based on a large-scale crawl of the Web. Our findings include that use of an incorrect public suffix list is a frequent problem in open-source software, and that at least 44 open-source projects use hard-coded, outdated versions of the public suffix list, including popular, security-focused projects, such as password managers and digital forensics tools. We also estimate that, because of these out-of-date lists, these projects make incorrect privacy decisions for 1,313 effective top-level domains (eTLDs), affecting 50,750 domains, by extrapolating from data gathered by the HTTP Archive project.
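To make the privacy impact concrete, the sketch below groups hostnames by their registrable domain (eTLD+1) using a public-suffix-list library; tldextract here is an assumed stand-in for illustration, not a library the paper evaluates.

```python
import tldextract  # assumes the tldextract package, which bundles a public suffix list

def etld_plus_one(hostname: str) -> str:
    """Group a hostname by its registrable domain (eTLD+1) using the public suffix list."""
    parts = tldextract.extract(hostname)
    return f"{parts.domain}.{parts.suffix}"

# etld_plus_one("forums.example.co.uk") -> "example.co.uk"
# With an outdated suffix list that lacks the "co.uk" rule, the same hostname
# would be grouped as "co.uk", lumping it together with unrelated sites:
# exactly the kind of incorrect privacy decision the paper measures.
```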
- 17:00 - 17:30 - Session 4: Distributed Protocols
- Distributed Protocols (Session Chair: Steve Uhlig)
- Leonhard Balduf (Technical University of Darmstadt), Maciej Korczyński (University of Grenoble Alps, Grenoble Informatics Laboratory), Onur Ascigil (Lancaster University), Navin Keizer (University College London), George Pavlou (UCL), Björn Scheuermann (Technical University of Darmstadt), Michał Król (City, University of London). Abstract: Interplanetary Filesystem (IPFS) is one of the largest peer-to-peer filesystems in operation. The network is the default storage layer for Web3 and is being presented as a solution to the centralization of the web. In this paper, we present a large-scale, multi-modal measurement study of the IPFS network. We analyze the topology, the traffic, the content providers and the entry points from the classical Internet. Our measurements show significant centralization in the IPFS network and a high share of nodes hosted in the cloud. We also shed light on the main stakeholders in the ecosystem. We discuss key challenges that might disrupt continuing efforts to decentralize the Web and highlight multiple properties that are creating pressures toward centralization.
- Lioba Heimbach (ETH Zurich), Christof Ferreira Torres (ETH Zurich), Lucianna Kiffer (ETH Zurich), Roger Wattenhofer (ETH Zurich). Abstract: With Ethereum's transition from Proof-of-Work to Proof-of-Stake in September 2022 came another paradigm shift, the Proposer-Builder Separation (PBS) scheme. PBS was introduced to decouple the roles of selecting and ordering transactions in a block (i.e., the builder), from those validating its contents and proposing the block to the network as the new head of the blockchain (i.e., the proposer). In this landscape, proposers are the validators in the Proof-of-Stake consensus protocol who validate and secure the network, while now relying on specialized block builders for creating blocks with the most value (e.g., transaction fees) for the proposer. Additionally, relays play a crucial new role in this ecosystem, acting as mediators between builders and proposers, being entrusted with the responsibility of transmitting the most lucrative blocks from the builders to the proposers. PBS is currently an opt-in protocol (i.e., a proposer can still opt out and build their own blocks). In this work, we study its adoption and show that the current PBS landscape exhibits significant centralization amongst the builders and relays. We further explore whether PBS effectively achieves its intended objectives of enabling hobbyist validators to maximize block profitability and preventing censorship. Our findings reveal that although PBS grants all validators the same opportunity to access optimized and competitive blocks, it tends to stimulate censorship rather than reduce it. Additionally, our analysis demonstrates that relays do not consistently uphold their commitments and may prove unreliable. Specifically, there are instances where proposers do not receive the complete value as initially promised, and the censorship or filtering capabilities pledged by the relays exhibit significant gaps.
- 18:15 - 21:30 - Conference Banquet at InterContinental Montréal
Thursday, 26th October 2023
- 08:00 - 08:45 - Breakfast
- 08:45 - 09:00 - IMC 2024 announcement
- 09:00 - 10:00 - Session 1: IoT
- IoT (Session Chair: Alexander Gamero)
- Tianrui Hu (Northeastern University), Daniel J. Dubois (Northeastern University), David Choffnes (Northeastern University). Abstract: Smart home IoT platforms are typically closed systems, meaning that there is poor visibility into device behavior. Understanding device behavior is important not only for understanding whether devices are functioning as expected, but also can reveal implications for privacy (e.g., surreptitious audio/video recording), security (e.g., device compromise), and safety (e.g., denial of service on a baby monitor). While there has been some work on identifying devices and a handful of activities, an open question is to what extent we can automatically model the entire behavior of an IoT deployment, and how it changes over time, without any privileged access to IoT devices or platform messages. In this work, we demonstrate that the vast majority of IoT behavior can indeed be modeled, using a novel multi-dimensional approach that relies only on the (often encrypted) network traffic exchanged by IoT devices. Our key insight is that IoT behavior (including cross-device interactions) can often be captured using relatively simple models such as timers (for periodic behavior) and probabilistic state machines (for user-initiated behavior and device interactions) during a limited observation phase. We then propose deviation metrics that can identify when the behavior of an IoT device or an IoT system changes over time. Our models and metrics successfully identify several notable changes in our IoT deployment, including a camera that changed locations, network outages that impact connectivity, and device malfunctions.
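As a toy illustration of the timer models mentioned in the abstract (not the paper's actual algorithm), periodic device behavior can be flagged from packet timestamps alone; the tolerance below is an arbitrary assumption.

```python
import numpy as np

def looks_periodic(timestamps, tolerance=0.1):
    """Heuristically decide whether a device's traffic events fire on a timer.

    timestamps: sorted event times (seconds) for one device/flow key.
    Returns (is_periodic, estimated_period_in_seconds).
    """
    gaps = np.diff(np.asarray(timestamps, dtype=float))
    if len(gaps) < 5:
        return False, None
    period = float(np.median(gaps))
    # Timer-driven behavior has inter-event gaps tightly clustered around the period.
    jitter = float(np.median(np.abs(gaps - period)))
    return jitter < tolerance * period, period
```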
- Aniketh Girish (IMDEA Networks / Universidad Carlos III de Madrid), Tianrui Hu (Northeastern University), Vijay Prakash (New York University), Daniel J. Dubois (Northeastern University), Srdjan Matic (IMDEA Software Institute), Danny Yuxing Huang (New York University), Serge Egelman (UC Berkeley / ICSI), Joel Reardon (University of Calgary), Juan Tapiador (Universidad Carlos III de Madrid), David Choffnes (Northeastern University), Narseo Vallina-Rodriguez (IMDEA Networks/AppCensus). Abstract: The network communication between Internet of Things (IoT) devices on the same local network has significant implications for security, privacy, and correctness. Yet, local network traffic has been largely ignored by prior literature, which typically focuses on studying the communication between devices and wide-area endpoints or detecting vulnerable IoT devices exposed to the Internet. In this paper, we present a comprehensive measurement study to shed light on the local communication within a smart home deployment and its associated threats. We use a unique combination of passive network traffic captures, honeypot interactions, and crowdsourced data from participants to identify a wide range of device activities on the local network. We then analyze these diverse datasets to characterize local network protocols, security and privacy threats associated with them, and real examples of information exposure due to local IoT traffic. Our analysis reveals vulnerable devices and insecure network protocols, how sensitive network and device data is exposed in the local network, and how this is abused by malicious actors and even exfiltrated to remote servers, potentially for tracking purposes. We will make our datasets and analysis publicly available to support further research in this area.
- Behind the Scenes: Uncovering TLS and Server Certificate Practice of IoT Device Vendors in the Wild (Long). Hongying Dong (University of Virginia), Hao Shu (New York University), Vijay Prakash (New York University), Yizhe Zhang (University of Virginia), Muhammad Talha Paracha (Northeastern University), David Choffnes (Northeastern University), Santiago Torres-Arias (Purdue University), Danny Yuxing Huang (New York University), Yixin Sun (University of Virginia). Abstract: IoT devices are increasingly used in consumer homes. Despite recent works in characterizing IoT TLS usage for a limited number of in-lab devices, there exists a gap in quantitatively understanding TLS behaviors from devices in the wild and server-side certificate management. To bridge this knowledge gap, we conduct a new measurement study by focusing on the practice of *device vendors*, through a crowdsourced dataset of network traffic from 2,014 real-world IoT devices across 721 global users. Through a new approach of identifying the sharing of TLS fingerprints across vendors and across devices, we uncover the prevalent use of customized TLS libraries (i.e., not matched to any known TLS libraries) and potential security concerns resulting from co-located TLS stacks of different services. Furthermore, we present the first known study on server-side certificate management for servers contacted by IoT devices. Our study highlights potential concerns in the TLS/PKI practice by IoT device vendors. We aim to raise visibility for these issues and motivate vendors to improve security practice.
- Armin Sarabi (University of Michigan), Tongxin Yin (University of Michigan), Mingyan Liu (University of Michigan). Abstract: In this paper we propose the use of large language models (LLMs) for characterizing, clustering, and fingerprinting raw text data generated by network measurements. To this end, we train a transformer-based masked language model, namely RoBERTa, on a dataset containing hundreds of millions of banners obtained from Internet-wide scans to learn their underlying structure. We further fine-tune this model using a custom loss function (driven by domain knowledge) to produce temporally stable numerical representations (embeddings) that can be used out-of-the-box for downstream learning tasks. Our generated embeddings are robust, resilient to small random changes in the content of a banner, and maintain proximity between embeddings of similar hardware/software products, or hosts with similar configuration. We further cluster embeddings of HTTP banners using a density-based approach (HDBSCAN), and examine the obtained clusters to generate text-based fingerprints for the purpose of labeling raw scan data. We compare our fingerprints to Recog, an existing database of manually curated fingerprints, and show that we can identify new IoT devices and server products that were not previously captured by Recog. We believe that our proposed methodology poses an important direction for future research by utilizing state-of-the-art language models to automatically analyze, interpret, and label the large amounts of data generated by active Internet scans.
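A rough sketch of the embed-then-cluster idea follows, using an off-the-shelf roberta-base checkpoint with mean pooling and HDBSCAN. The model name, pooling choice, and min_cluster_size are placeholder assumptions; the paper fine-tunes its own RoBERTa with a custom loss rather than this generic setup.

```python
import hdbscan                                       # assumes the hdbscan package
import torch
from transformers import AutoModel, AutoTokenizer    # assumes Hugging Face transformers

MODEL = "roberta-base"  # placeholder; not the paper's fine-tuned model

def embed(banners: list[str]) -> torch.Tensor:
    """Mean-pooled token embeddings as fixed-size banner representations."""
    tok = AutoTokenizer.from_pretrained(MODEL)
    model = AutoModel.from_pretrained(MODEL)
    enc = tok(banners, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state       # (batch, tokens, dim)
    mask = enc["attention_mask"].unsqueeze(-1)
    return (hidden * mask).sum(1) / mask.sum(1)       # (batch, dim)

def cluster(embeddings: torch.Tensor):
    """Density-based clustering of banner embeddings, in the spirit of the HDBSCAN step."""
    return hdbscan.HDBSCAN(min_cluster_size=10).fit_predict(embeddings.numpy())
```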
- 10:00 - 10:30 - Break
- 10:30 - 11:45 - Session 2: Transport
- Transport (Session Chair: Marie-José Montpetit)
- Taveesh Sharma (University of Chicago), Tarun Mangla (University of Chicago), Arpit Gupta (UCSB), Junchen Jiang (University of Chicago), Nick Feamster (University of Chicago)Abstract: The increased use of video conferencing applications (VCAs) has made it critical to understand and support end-user quality of experience (QoE) by all stakeholders in the VCA ecosystem, especially network operators, who typically do not have direct access to client software. Existing VCA QoE estimation methods use passive measurements of application-level Real-time Transport Protocol (RTP) headers. However, a network operator does not always have access to RTP headers, particularly when VCAs use custom RTP protocols (e.g., Zoom) or due to system constraints (e.g., legacy measurement systems). Given this challenge, this paper considers the use of more standard features in the network traffic, namely IP and UDP headers, to provide per-second estimates of key VCA QoE metrics such as frame rate and video resolution. We develop a method that uses machine learning with a combination of flow statistics (e.g., throughput) and features derived from the mechanisms the VCAs use to fragment video frames into packets. We evaluate our method for three prevalent VCAs running over WebRTC: Google Meet, Microsoft Teams, and Cisco Webex. Our evaluation consists of 54,696 seconds of VCA data collected from both (1) controlled in-lab network conditions and (2) real-world networks in 15 households. We show that the ML-based approach yields accuracy similar to the RTP-based methods, despite using only IP/UDP data. For instance, we can estimate FPS within 2 FPS for up to 83.05% of one-second intervals in the real-world data, which is only 1.76% lower than using the application-level RTP headers.
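As an illustration of the kind of estimator described above, the sketch below derives simple per-second features from IP/UDP packet metadata and feeds them to a random-forest regressor. The feature set and the small-packet frame-boundary heuristic are illustrative assumptions, not the authors' feature engineering.

```python
# Hedged sketch: per-second IP/UDP features -> frame-rate regression.
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

def per_second_features(pkts: pd.DataFrame) -> pd.DataFrame:
    """pkts: one row per downstream UDP packet, with 'ts' (seconds) and 'size' (bytes)."""
    pkts = pkts.copy()
    pkts["second"] = pkts["ts"].astype(int)
    # Assumed heuristic: an unusually small packet often closes a fragmented video frame.
    pkts["frame_end"] = pkts["size"] < 0.5 * pkts["size"].rolling(20, min_periods=1).median()
    return pkts.groupby("second").agg(
        throughput=("size", "sum"),
        pkt_count=("size", "count"),
        mean_size=("size", "mean"),
        est_frames=("frame_end", "sum"),
    )

# Training requires ground-truth FPS labels (e.g., from in-lab application logs):
# X, y = per_second_features(packets), fps_per_second
# model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
# fps_estimate = model.predict(per_second_features(new_packets))
```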
- Zeya Umayya (IIIT Delhi), Dhruv Malik (IIIT Delhi), Devashish Gosain (KU Leuven), Piyush Kumar Sharma (KU Leuven)Abstract: Tor, one of the most popular censorship circumvention systems, faces regular blocking attempts by censors. To facilitate access, it relies on "pluggable transports" (PTs) that disguise Tor's traffic and make it hard for the adversary to block Tor. However, these PTs have not yet been well studied and compared in terms of the performance they provide to users. We therefore conduct the first comparative performance evaluation of 12 PTs: the ones currently supported by the Tor project and those that could be integrated in the future. Our results reveal multiple facets of the PT ecosystem. (1) PTs' download times vary significantly even under similar network conditions. (2) Not all PTs are equally reliable; thus, clients who regularly suffer censorship may falsely believe that such PTs are blocked. (3) PT performance depends on the underlying communication primitive. (4) PT performance depends significantly on the website access method (browser or command-line). Surprisingly, some PTs perform even better than vanilla Tor. Based on our findings from more than 1.25M measurements, we provide recommendations about selecting PTs and believe that our study can help the Tor community facilitate access for users who face censorship.
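For readers who want to reproduce the flavor of such measurements, the sketch below times a fixed fetch through a pluggable transport's local SOCKS interface (how Tor PTs are typically exposed, e.g., by running one Tor instance per PT). The ports and URL are placeholders, and the paper's harness covers far more (12 PTs, browser and command-line access, over 1.25M measurements).

```python
# Hedged sketch: requires `pip install requests[socks]` and a locally running Tor
# instance configured with the PT under test, exposing a SOCKS port.
import time
import requests

def timed_fetch(url: str, socks_port: int, timeout: int = 120):
    """Return (download_time_seconds, bytes_received) for one fetch via the PT."""
    proxy = f"socks5h://127.0.0.1:{socks_port}"
    start = time.monotonic()
    resp = requests.get(url, proxies={"http": proxy, "https": proxy}, timeout=timeout)
    return time.monotonic() - start, len(resp.content)

# Example comparison, assuming one Tor instance per PT on distinct SOCKS ports:
# for pt, port in {"obfs4": 9050, "snowflake": 9052, "webtunnel": 9054}.items():
#     print(pt, timed_fetch("https://example.com/", port))
```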
- Abstract: Since its introduction in 2015, QUIC has seen rapid adoption and is set to be the default transport stack for HTTP/3. Given that developers can now easily implement and deploy their own congestion control algorithms in user space, there is an imminent risk of a proliferation of QUIC congestion control implementations that no longer resemble their corresponding standard kernel implementations. In this paper, we present the results of a comprehensive measurement study of the congestion control algorithm (CCA) implementations in 11 popular open-source QUIC stacks. We propose a new metric called Conformance-T that helps us identify implementations with large deviations more accurately and also provides hints on how they can be modified to be more conformant to reference kernel implementations. Our results show that while most QUIC CCA implementations are conformant in shallow buffers, they become less conformant in deep buffers. In the process, we also identified five new QUIC implementations with low conformance and demonstrated how low-conformance implementations can cause unfairness and subvert our expectations of how different CCAs should interact. With the hints obtained from our new metric, we were able to identify the implementation-level differences that led to the low conformance and derive the modifications required to improve conformance for three of them.
- Constantin Sander (RWTH Aachen University), Ike Kunze (RWTH Aachen University), Leo Blöcher (RWTH Aachen University), Mike Kosek (Technical University of Munich), Klaus Wehrle (RWTH Aachen University)Abstract: TCP and QUIC can both leverage ECN to avoid congestion loss and its retransmission overhead. However, both protocols require support from their remote endpoints, and it took two decades from the initial standardization of ECN for TCP to reach 80% or more ECN support in the wild. In contrast, the QUIC standard mandates ECN support, but there are notable ambiguities that make it unclear if and how ECN can actually be used with QUIC on the Internet. Hence, in this paper, we analyze ECN support with QUIC in the wild: we conduct repeated measurements on more than 180M domains to identify HTTP/3 websites and analyze the underlying QUIC connections w.r.t. ECN support. We find that only 20% of QUIC hosts, providing 6% of HTTP/3 websites, mirror client ECN codepoints. Yet, mirroring ECN is only half of what is required for ECN with QUIC, as QUIC validates mirrored ECN codepoints to detect network impairments: we observe that less than 2% of QUIC hosts, providing less than 0.3% of HTTP/3 websites, pass this validation. We identify possible root causes in content providers not enabling QUIC ECN support and in network impairments hindering ECN. We thus also characterize ECN with QUIC from distributed vantage points to traverse other paths, and discuss our results w.r.t. QUIC and ECN innovations beyond QUIC.
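The sketch below tallies the IP-level ECN codepoints seen on UDP/443 traffic in a packet capture, which shows what codepoints endpoints set on the wire. It is only a starting point: checking whether a QUIC server actually mirrors client codepoints in ACK frames, as the study above does, requires an instrumented QUIC client, since those counts are carried inside the encrypted transport.

```python
# Hedged sketch using scapy; counts ECN codepoints on UDP/443 packets in a pcap.
from collections import Counter
from scapy.all import rdpcap, IP, UDP

ECN_NAMES = {0: "Not-ECT", 1: "ECT(1)", 2: "ECT(0)", 3: "CE"}

def ecn_codepoints(pcap_path: str) -> Counter:
    counts = Counter()
    for pkt in rdpcap(pcap_path):
        if IP in pkt and UDP in pkt and 443 in (pkt[UDP].sport, pkt[UDP].dport):
            counts[ECN_NAMES[pkt[IP].tos & 0x3]] += 1   # ECN = low 2 bits of the TOS byte
    return counts

# e.g. ecn_codepoints("quic_trace.pcap") -> Counter({'Not-ECT': 1200, 'ECT(0)': 340, ...})
```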
- Ike Kunze (RWTH Aachen University), Constantin Sander (RWTH Aachen University), Klaus Wehrle (RWTH Aachen University)Abstract: Encrypted QUIC traffic complicates network management, as traditional transport-layer semantics can no longer be used for RTT or packet loss measurements. Addressing this challenge, QUIC includes an optional, carefully designed mechanism: the spin bit. While its capabilities have already been studied in test settings, its real-world usefulness and adoption are unknown. In this paper, we thus investigate the deployment and utility of the spin bit on the web. Analyzing our long-term measurements of more than 200M domains, we find that the spin bit is enabled on ~10% of those with QUIC support and for 50% / 60% of the underlying IPv4 / IPv6 hosts. The support is mainly driven by medium-sized cloud providers, while most hyperscalers do not implement it. Assessing the utility of spin bit RTT measurements, we find that the theoretical issue of reordering does not significantly manifest in our study and that the spin bit provides accurate estimates for around 30.5% of connections, but drastically overestimates the RTT for another 51.7%. Overall, we conclude that the spin bit, even though it is an optional feature, indeed sees use in the wild and is able to provide reasonable RTT estimates for a solid share of QUIC connections, but requires solutions for making its measurements more robust.
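Because the spin bit flips once per end-to-end RTT, an on-path observer can estimate RTT as the time between consecutive spin-value transitions seen in one direction of a connection. A minimal sketch of that edge-to-edge computation, assuming the spin bit has already been extracted from the short-header packets of a single connection:

```python
def spin_bit_rtts(samples):
    """samples: list of (timestamp_seconds, spin_bit) for one direction, in capture order.
    Returns one RTT estimate per observed spin transition after the first."""
    rtts, last_edge, last_spin = [], None, None
    for ts, spin in samples:
        if last_spin is not None and spin != last_spin:
            if last_edge is not None:
                rtts.append(ts - last_edge)   # two consecutive edges are ~one RTT apart
            last_edge = ts
        last_spin = spin
    return rtts

# spin_bit_rtts([(0.000, 0), (0.031, 1), (0.063, 0), (0.095, 1)]) -> approx. [0.032, 0.032]
```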
- 12:00 - 14:00 - Lunch
- 14:00 - 15:00 - Session 3: Tagging
- Tagging (Session Chair: John Heidemann)
- Hazem Ibrahim (New York University Abu Dhabi), Rohail Asim (New York University Abu Dhabi (NYUAD)), Matteo Varvello (Nokia), Yasir Zaki (New York University Abu Dhabi (NYUAD))Abstract: Location tags enable tracking of personal belongings. This is achieved locally, e.g., via Bluetooth with a paired phone, and remotely, by piggybacking on the location reported by location-reporting devices that come into proximity of a tag. There has been anecdotal evidence that location tags are also misused to stalk people. This paper studies the performance of the two most popular location tags (Apple's AirTag and Samsung's SmartTag) through controlled experiments -- with a known large distribution of location-reporting devices -- as well as in-the-wild experiments -- with no control over the number and kind of reporting devices encountered, thus emulating real-life use cases. We find that both tags achieve similar performance, e.g., they are located 55% of the time within a 100 m radius in about 10 minutes. It follows that real-time stalking via location tags is impractical, even when both tags are deployed concurrently, which achieves comparable accuracy in half the time. Nevertheless, half of a victim's movements can be backtracked accurately (10 m error) with just a one-hour delay.
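A small worked example of the kind of accuracy metric reported above: given a ground-truth position per experiment and the positions reported through the tag's network, compute the fraction of experiments located within 100 m within a 10-minute window. The input format, radius, and window are stated assumptions.

```python
from math import radians, sin, cos, asin, sqrt

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371000 * asin(sqrt(a))

def located_fraction(truth, reports, radius_m=100, window_s=600):
    """truth: one (ts, lat, lon) per experiment; reports: (ts, lat, lon) from the tag network."""
    hits = sum(
        any(abs(rts - ts) <= window_s and haversine_m(lat, lon, rlat, rlon) <= radius_m
            for rts, rlat, rlon in reports)
        for ts, lat, lon in truth
    )
    return hits / len(truth)
```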
- Umar Iqbal (University of Washington), Pouneh Nikkhah Bahrami (University of California, Davis), Rahmadi Trimananda (University of California, Irvine), Hao Cui (University of California, Irvine), Alexander Gamero-Garrido (Northeastern University), Daniel Dubois (Northeastern University), David Choffnes (Northeastern University), Athina Markopoulou (University of California, Irvine), Franziska Roesner (University of Washington), Zubair Shafiq (University of California, Davis)Abstract: Smart speakers pose unique privacy concerns due to their always-on microphone, their limited user interface, their collection of potentially sensitive data, and their integration with third-party skills. Given these concerns, there is a need for greater transparency and control over data collection, usage, and sharing by smart speaker platforms as well as the third-party skills supported on them. To bridge this gap, we build a framework to measure data collection, usage, and sharing by smart speaker platforms. We apply our framework to the Amazon smart speaker ecosystem. Our results show that Amazon and third parties, including advertising and tracking services that are unique to the smart speaker ecosystem, collect smart speaker interaction data. We also find that Amazon processes smart speaker interaction data to infer user interests and uses those inferences to serve targeted ads to users. Smart speaker interaction also leads to ad targeting and as much as 30X higher bids in ad auctions from third-party advertisers. Finally, we find that the data practices of Amazon and third-party skills are often not clearly disclosed in their policy documents.
- Taha Albakour (TU Berlin), Oliver Gasser (Max Planck Institute for Informatics), Georgios Smaragdakis (Delft University of Technology)Abstract: In this paper, we show that utilizing multiple protocols offers a unique opportunity to substantially improve IP alias resolution and dual-stack inference. Our key observation is that prevalent protocols, e.g., SSH and BGP, reply to unsolicited requests with a set of values that can be combined to form a unique device identifier. More importantly, this is possible by just completing the TCP handshake. Our empirical study shows that utilizing readily available scans and our active measurements can double the discovered IPv4 alias sets and increase the discovered dual-stack sets by more than 30x compared to state-of-the-art techniques. We provide insights into our method's accuracy and performance compared to popular techniques.
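The core grouping step lends itself to a compact sketch: addresses that return the same combination of protocol-level identifiers (for instance an SSH host key and a BGP router ID, as the abstract suggests) are merged into one alias set, and a set containing both IPv4 and IPv6 addresses becomes a dual-stack candidate. The input schema below is an assumption for illustration.

```python
from collections import defaultdict

def alias_sets(scan_records):
    """scan_records: dicts like {"ip": "203.0.113.7", "ssh_hostkey": "...", "bgp_router_id": "..."}."""
    groups = defaultdict(set)
    for rec in scan_records:
        key = (rec.get("ssh_hostkey"), rec.get("bgp_router_id"))
        if any(v is not None for v in key):          # ignore hosts with no identifier at all
            groups[key].add(rec["ip"])
    return [ips for ips in groups.values() if len(ips) > 1]

def dual_stack_sets(sets_of_ips):
    """Alias sets mixing IPv4 and IPv6 addresses are dual-stack candidates."""
    return [s for s in sets_of_ips if any(":" in ip for ip in s) and any("." in ip for ip in s)]
```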
- 15:00 - 16:00 - Session 4: Latency
- Latency (Session Chair: Cecilia Testart)
- Zeinab Shmeis (EPFL), Mohammad Abdullah (EPFL), Pavlos Nikolopoulos (EPFL), Katerina Argyraki (EPFL), David Choffnes (Northeastern University), Phillipa Gill (Google)Abstract: Network neutrality is important for users, content providers, policymakers, and regulators interested in understanding how network providers differentiate performance. When determining whether a network differentiates against certain traffic, it is important to have strong evidence, especially given that traffic differentiation is illegal in certain countries. Prior work (Wehe) detects differentiation via end-to-end throughput measurements between a client and server, but does not isolate the network responsible for it. Differentiation can occur anywhere on the network path between endpoints; thus, further evidence is needed to assign blame to a network. We present a system, built atop Wehe, that can localize traffic differentiation, i.e., obtain concrete evidence that the differentiation happened within the edge ISP. Our system builds on ideas from network performance tomography; the challenge we solve is that TCP congestion control creates an adversarial environment for performance tomography (because it can significantly reduce the performance correlation on which tomography fundamentally relies). We evaluate our system via measurements “in the wild”, as well as in a set of emulated scenarios with a wide-area testbed; we further explore its limits via simulations and show that it accurately localizes traffic differentiation across a wide range of network conditions.
- Abstract: Keeping track of Internet latency is a classic measurement problem. Open measurement platforms like RIPE Atlas are a great solution, but they also face challenges: preventing network overload that may result from uncontrolled active measurements, and maintaining the involved devices, which are typically contributed by volunteers and non-profit organizations, and tend to lag behind the state of the art in terms of features and performance. We explore gaming footage as a new source of real-time, publicly available, passive latency measurements, which have the potential to complement open measurement platforms. We show that it is feasible to mine this source of information by presenting Tero, a system that continuously downloads gaming footage from the Twitch streaming platform, extracts latency measurements from it, and converts them to latency distributions per geographical region.
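One intuition behind mining latency from gaming footage is that many game HUDs display the player's ping on screen, so frames can be cropped and OCR'd. The sketch below, using OpenCV and pytesseract, illustrates that single step; the crop rectangle and the "NN ms" pattern are assumptions, and Tero's actual pipeline (stream download, per-game extraction, aggregation into regional distributions) is considerably more involved.

```python
import re
import cv2
import pytesseract

PING_RE = re.compile(r"(\d{1,4})\s*ms", re.IGNORECASE)

def pings_from_video(path, box=(20, 20, 200, 60), every_n=30):
    """Yield (frame_index, ping_ms) parsed from a cropped HUD region of the video."""
    x, y, w, h = box
    cap = cv2.VideoCapture(path)
    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % every_n == 0:
            crop = cv2.cvtColor(frame[y:y + h, x:x + w], cv2.COLOR_BGR2GRAY)
            match = PING_RE.search(pytesseract.image_to_string(crop))
            if match:
                yield idx, int(match.group(1))
        idx += 1
    cap.release()
```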
- Xiao Song (University of Southern California), Guillermo Baltra (University of Southern California / Information Sciences Institute), John Heidemann (University of Southern California / Information Sciences Institute)Abstract: Network traffic is often diurnal, with some networks peaking during the workday and many homes during evening streaming hours. Monitoring systems consider diurnal trends for capacity planning and anomaly detection. In this paper, we reverse this inference and use diurnal network trends and their absence to infer human activity. We draw on existing and new ICMP echo-request scans of more than 5M /24 IPv4 networks to identify diurnal trends in IP address responsiveness. Some of these networks are change-sensitive, with diurnal patterns correlating with human activity. We develop algorithms to clean this data, extract underlying trends from diurnal and weekly fluctuation, and detect changes in that behavior. Although firewalls hide many networks, and Network Address Translation often hides human trends, we show that about 170k to 420k (4-8% of the 5M) /24 IPv4 networks are change-sensitive. These blocks are spread globally, representing some of the most active 58% of 2°×2° geographic grids, regions that include 98% of ping-responsive blocks. Finally, we detect interesting changes in human activity. Reusing existing data allows our new algorithm to identify changes, such as Work-from-Home, due to the global reaction to the emergence of Covid-19 in 2020. We also see other changes in human activity, such as national holidays and government-mandated curfews. This ability to detect trends in human behavior from Internet data provides a new ability to understand our world, complementing other public information sources such as news reports and wastewater virus observation.
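A toy version of the "change-sensitive" test described above: a /24 block with a strong 24-hour cycle in its count of ping-responsive addresses is flagged as diurnal. The autocorrelation test and the 0.5 threshold are illustrative assumptions; the paper's cleaning, trend-extraction, and change-detection steps are not reproduced here.

```python
import numpy as np

def is_change_sensitive(hourly_responsive, lag=24, threshold=0.5):
    """hourly_responsive: responsive-address count per hour for one /24 block."""
    x = np.asarray(hourly_responsive, dtype=float)
    x = x - x.mean()
    if len(x) <= lag or x.std() == 0:
        return False
    # Autocorrelation at a 24-hour lag: near 1 indicates a strong diurnal cycle.
    return np.corrcoef(x[:-lag], x[lag:])[0, 1] >= threshold

# Two weeks of synthetic diurnal data should be flagged:
hours = np.arange(24 * 14)
demo = 100 + 40 * np.sin(2 * np.pi * hours / 24) + np.random.normal(0, 5, hours.size)
print(is_change_sensitive(demo))   # True
```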
- 16:00 - 16:30 - Break
- 16:30 - 17:30 - Session 5: Cellular and mobile networks
- Cellular and mobile networks (Session Chair: Steve Uhlig)
- Stefanos Bakirtzis (University of Cambridge, Ranplan Wireless), André Felipe Zanella (IMDEA Networks, Universidad Carlos III de Madrid (UC3M)), Stefania Rubrichi (Orange), Cezary Ziemlicki (Orange), Zbigniew Smoreda (Orange), Ian Wassell (University of Cambridge), Jie Zhang (University of Sheffield, Ranplan Wireless), Marco Fiore (IMDEA Networks)Abstract: Indoor cellular networks (ICNs) are anticipated to become a principal component of 5G and beyond systems. ICNs aim at extending network coverage and enhancing users' quality of service and experience, and will consequently produce a substantial volume of traffic in the coming years. Despite the increasing importance that ICNs will have in cellular deployments, there is currently little understanding of the type of traffic demands that they serve. Our work contributes to closing that gap by providing a first characterization of the usage of mobile services across more than 4,500 cellular antennas deployed at over 1,000 indoor locations across a whole country. Our analysis reveals that ICNs inherently manifest a limited set of mobile application utilization profiles, which are not present in conventional outdoor macro base stations (BSs). We interpret the indoor traffic profiles via explainable machine learning techniques, and show how they are correlated with the indoor environment. Our findings show how indoor cellular demands are strongly dependent on the nature of the deployment location, which allows anticipating the type of demands that indoor 5G networks will have to serve and paves the way for their efficient planning and dimensioning.
- Jiayi Meng (Purdue University), Jingqi Huang (Purdue University), Y. Charlie Hu (Purdue University), Yaron Koral (AT&T Labs), Xiaojun Lin (Purdue University), Muhammad Shahbaz (Purdue University), Abhigyan Sharma (AT&T Labs)Abstract: With 5G deployment gaining momentum, the control-plane traffic volume of cellular networks is escalating. Such rapid traffic growth motivates the need to study mobile core network (MCN) control-plane design and performance optimization. Doing so requires realistic, large control-plane traffic traces in order to profile and debug mobile network performance under real workloads. However, large-scale control-plane traffic traces are not made available to the public by mobile operators due to business and privacy concerns. As such, it is critically important to develop accurate, scalable, versatile, and open-to-innovation control traffic generators, which in turn critically rely on an accurate traffic model for the control plane. Developing an accurate model of control-plane traffic faces several challenges: (1) how to capture the dependence among the control events generated by each User Equipment (UE), (2) how to model the inter-arrival time and sojourn time of control events of individual UEs, and (3) how to capture the diversity of control-plane traffic across UEs. We present a novel two-level hierarchical state-machine-based control-plane traffic model. We further show how our model can be easily adjusted from LTE to NextG networks (e.g., 5G) to support modeling future control-plane traffic. We experimentally validate that the proposed model can generate large, realistic control-plane traffic traces. Our traffic generator will be released as open source to stimulate MCN research.
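To convey the shape of a state-machine-based control-traffic generator, the sketch below walks each UE through a Markov chain over control procedures with exponential sojourn times. The states, transition probabilities, and rates are toy placeholders, not the two-level model or the parameters fitted in the paper.

```python
import random

TRANSITIONS = {   # P(next procedure | current procedure); rows sum to 1
    "ATTACH":          {"SERVICE_REQUEST": 0.8, "TAU": 0.2},
    "SERVICE_REQUEST": {"SERVICE_REQUEST": 0.5, "HANDOVER": 0.3, "DETACH": 0.2},
    "HANDOVER":        {"SERVICE_REQUEST": 0.7, "TAU": 0.3},
    "TAU":             {"SERVICE_REQUEST": 0.6, "DETACH": 0.4},
    "DETACH":          {"ATTACH": 1.0},
}
MEAN_SOJOURN_S = {"ATTACH": 5, "SERVICE_REQUEST": 30, "HANDOVER": 10, "TAU": 60, "DETACH": 300}

def ue_trace(duration_s, seed=None):
    """Generate (timestamp, control_event) pairs for one UE."""
    rng = random.Random(seed)
    t, state, events = 0.0, "ATTACH", []
    while t < duration_s:
        events.append((round(t, 3), state))
        t += rng.expovariate(1.0 / MEAN_SOJOURN_S[state])
        nxt = TRANSITIONS[state]
        state = rng.choices(list(nxt), weights=list(nxt.values()))[0]
    return events

# A core-facing workload is then just many independent UEs merged and sorted by time:
trace = sorted(ev for ue in range(1000) for ev in ue_trace(3600, seed=ue))
```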
- Moinak Ghoshal (Northeastern University), Imran Khan (Northeastern University), Z. Jonny Kong (Purdue University), Phuc Dinh (Northeastern University), Jiayi Meng (Purdue University), Y. Charlie Hu (Purdue University), Dimitrios Koutsonikolas (Northeastern University)Abstract: After 4 years of rapid deployment in the US, 5G is expected to have significantly improved the performance and overall user experience of mobile networks. However, recent measurement studies have focused either on static performance or on a single aspect (e.g., handovers) under driving conditions, and do not provide a complete picture of cellular network performance today under driving conditions – a major use case of mobile networks. Through a cross-continental US driving trip (from LA to Boston, over 5700 km), we conduct an in-depth measurement study of user-perceived experience (network coverage/performance and QoE of a set of major latency-critical 5G “killer” apps) under all three major US carriers, while collecting low-level 5G statistics and signaling messages. Our study shows disappointingly low coverage of 5G networks today under driving and highly fragmented coverage across cellular technologies (LTE, LTE-A, 5G low-band, mid-band, and mmWave). More importantly, network and application performance is often poor under driving even in areas with full 5G coverage. Further, our study sheds light on the key question of whether 5G today can enable latency-critical apps, and on a possible “digital divide”, by comparing the 5G deployment strategies of commercial cellular networks and correlating technology-wise coverage and performance with geo-location and population density.
- André Felipe Zanella (IMDEA Networks, Universidad Carlos III de Madrid (UC3M)), Antonio Bazco-Nogueras (IMDEA Networks Institute), Cezary Ziemlicki (Orange Labs), Marco Fiore (IMDEA Networks Institute)Abstract: We analyze 4G and 5G transport-layer sessions generated by a wide range of mobile services at over 282,000 base stations (BSs) of an operational mobile network, and carry out a statistical characterization of their demand rates, associated traffic volume, and temporal duration. Our measurement-based study unveils previously unobserved session-level behaviors that are specific to individual mobile applications and persistent across space, time, and radio access technology. Based on these insights, we model the arrival process of sessions at heterogeneously loaded BSs, the distribution of the session-level load, and its relationship with the session duration, using simple yet effective mathematical approaches. Our models are fine-tuned to a variety of services, and complement existing tools that mimic packet-level statistics or aggregated spatiotemporal traffic demands at mobile network BSs. They thus offer an original angle on mobile traffic data generation, and support a more credible performance evaluation of solutions for network planning and management, including more dependable training and testing of data-driven tools. We assess the utility of the models in a practical application use case, demonstrating how their realism enables a more trustworthy evaluation of energy-efficient allocation of compute resources in virtualized radio access networks.
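As a flavor of session-level traffic generation, the sketch below draws Poisson session arrivals for one BS and couples each session's volume to its duration through lognormal draws. The distribution families and parameters are placeholders, not the per-service fits derived in the paper.

```python
import numpy as np

def synth_sessions(rate_per_s, horizon_s, seed=0):
    """Return (arrival_times_s, durations_s, volumes_bytes) for one simulated BS."""
    rng = np.random.default_rng(seed)
    n = rng.poisson(rate_per_s * horizon_s)
    arrivals = np.sort(rng.uniform(0, horizon_s, n))
    durations = rng.lognormal(mean=3.0, sigma=1.0, size=n)              # seconds
    volumes = durations * rng.lognormal(mean=10.0, sigma=0.8, size=n)   # bytes; longer sessions carry more
    return arrivals, durations, volumes

arrivals, durations, volumes = synth_sessions(rate_per_s=0.5, horizon_s=3600)
```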
