DOI: 10.1145/3618257.3624841
short-paper
Open Access

The Prevalence of Single Sign-On on the Web: Towards the Next Generation of Web Content Measurement

Published: 24 October 2023

ABSTRACT

Much of the content and structure of the Web remains inaccessible to evaluate at scale because it is gated by user authentication. This limitation restricts researchers to examining only a superficial layer of a website: the landing page or public, search-indexable pages. Since it is infeasible to create individual accounts across thousands of websites, we examine the prevalence of Single Sign-On (SSO) on the web to explore the feasibility of using a few accounts to authenticate to many sites. We find that 58% of the top 10K websites with logins are accessible with popular third-party SSO providers, such as Google, Facebook, and Apple, indicating that leveraging SSO offers a scalable solution to access a large volume of user-gated content.
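
The prevalence figure in the abstract is a ratio: sites with a login that offer at least one popular SSO provider, divided by all sites with a login. The following is a minimal sketch of that calculation, assuming SSO options can be spotted from button text such as "Sign in with Google". The phrase list, the function names `detect_sso` and `sso_prevalence`, and the example hostnames are illustrative assumptions; this is not the detection method used in the paper.

```python
import re

# Providers named in the abstract.
PROVIDERS = ("Google", "Facebook", "Apple")

# Assumed button phrasings; real login pages vary widely, so this
# keyword heuristic is only a stand-in for a real detector.
_VERBS = r"(?:sign\s?in|log\s?in|continue|sign\s?up)"


def detect_sso(login_page_text: str) -> set[str]:
    """Return the providers whose SSO button text appears on the page."""
    found = set()
    for provider in PROVIDERS:
        pattern = rf"\b{_VERBS}\s+with\s+{provider}\b"
        if re.search(pattern, login_page_text, flags=re.IGNORECASE):
            found.add(provider)
    return found


def sso_prevalence(pages: dict[str, str]) -> float:
    """Fraction of login pages offering at least one known SSO provider."""
    if not pages:
        return 0.0
    hits = sum(1 for text in pages.values() if detect_sso(text))
    return hits / len(pages)


# Hypothetical login-page text keyed by site.
example_pages = {
    "a.example": "Sign in with Google or Log in with Facebook",
    "b.example": "Enter your email and password",
    "c.example": "Continue with Apple",
    "d.example": "Username",
}
share = sso_prevalence(example_pages)
```

Here `share` is the fraction of the four hypothetical sites that expose an SSO option. Text matching misses logo-only buttons, so a practical measurement would likely need an additional visual signal.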


Published in

IMC '23: Proceedings of the 2023 ACM on Internet Measurement Conference
October 2023
746 pages
ISBN: 9798400703829
DOI: 10.1145/3618257

Copyright © 2023 ACM


Publisher: Association for Computing Machinery, New York, NY, United States


Overall Acceptance Rate: 277 of 1,083 submissions, 26%
