Internet-wide measurement infrastructures have been used for a wide range of experiments to create “big (Internet) data”. These large datasets have rich semantic content on the structure, dynamics, and usage of today’s Internet at all levels, from the physical layer to the application and service layers. Yet Internet research remains hampered by long-recognized issues, ranging from the limited geographic and network diversity captured to the tension between privacy, measurement visibility, and experimental control.

A large set of widely deployed vantage points, especially inside edge networks, that can be programmed to drive general experimentation is essential to address these issues. Research efforts have expanded the human-centric focus of traditional crowdsourcing (such as Amazon Mechanical Turk, Ushahidi, and others) by enlisting a large and diverse set of users and devices and turning them into vantage points. This trend raises new and interesting methodological challenges: How should we enlist vantage points in the right locations? How do these platforms differ from or extend human-centered crowdsourcing systems? What kinds of experiments are technically and ethically viable? What is the right programming interface for the experimenter? Given the limited control we have over these platforms, what is the right experimental model? Could we build a federation of platforms, and how would that work?

After obtaining such crowdsourced big Internet data, we will need innovative approaches for curating and storing it to the benefit of the wider community. For example, it should be easy to search or transfer the data, to remove sensitive information without affecting its applicability to problems of practical interest, and to share it with others who respect the constrained use of such “synthesized” data.

We believe that making progress on these and related problems requires the collective expertise and input of the larger network community. To facilitate this process, the workshop organizers invite short submissions of (1) papers describing original, early-stage research on topics relevant to the workshop or (2) position papers raising new issues or describing new or existing platforms/systems for crowdsourcing Internet measurements. Papers should illustrate what role platforms play, or could play, in building the envisioned community-driven meta-platform for crowdsourcing and crowd-sharing Internet data.

Topics of particular interest include, but are not limited to:

  • Vantage point selection, biases and measurement needs
  • Recruitment models, from altruism to monetary incentives
  • Techniques for transforming collected data into “synthesized” data with practical value for third parties
  • Incorporating mobile hosts
  • Experiences with existing platforms
  • Designing extensible and programmable software agents
  • Platform security and end-user privacy
  • Ethical use of platforms by third parties
  • Vetting of sourced and shared data

