research-article

Long-term Measurement and Analysis of the Free Proxy Ecosystem

Published: 26 November 2019

Abstract

Free web proxies promise anonymity and censorship circumvention at no cost. Several websites publish lists of free proxies organized by country, anonymity level, and performance. These lists index hundreds of thousands of hosts discovered via automated tools and crowd-sourcing. A complex free proxy ecosystem has been forming over the years, of which very little is known. In this article, we shed light on this ecosystem via a distributed measurement platform that leverages both active and passive measurements. Active measurements are carried out by an infrastructure we name ProxyTorrent, which discovers free proxies, assesses their performance, and detects potential malicious activities. Passive measurements focus on proxy performance and usage in the wild, and are accomplished by means of a Chrome extension named Ciao. ProxyTorrent has been running since January 2017, monitoring up to 230K free proxies. Ciao was launched in March 2017 and has thus far served roughly 9.7K users and generated 14 TB of traffic. Our analysis shows that less than 2% of the proxies announced on the Web actually proxy traffic on behalf of users; further, only half of these proxies have decent performance and can be used reliably. Every day, around 5% to 10% of the active proxies exhibit malicious behaviors, e.g., advertisement injection, TLS interception, and cryptojacking, and these proxies are also the ones providing the best performance. Through the analysis of more than 14 TB of proxied traffic, we show that web browsing is the primary user activity. Geo-blocking avoidance, allegedly a popular use case for free web proxies, accounts for 30% or less of the traffic, and it mostly involves countries hosting popular geo-blocked content.

        • Published in

          ACM Transactions on the Web, Volume 13, Issue 4
          November 2019
          139 pages
          ISSN:1559-1131
          EISSN:1559-114X
          DOI:10.1145/3372405

          Copyright © 2019 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 26 November 2019
          • Accepted: 1 August 2019
          • Revised: 1 June 2019
          • Received: 1 April 2018

          Qualifiers

          • research-article
          • Research
          • Refereed
