ABSTRACT
Many datasets contain seemingly disparate entries that actually refer to the same entity. Reconciling these entries, or "matching," is challenging, especially when the data contain errors. In some contexts the problem is harder still: an active adversary may have a vested interest in making the matching process fail. Leveraging eight years of data, we investigate one such adversarial context: matching different online anonymous marketplace vendor handles to unique sellers. Using a combination of random forest classifiers and hierarchical clustering on a set of features that would be hard for an adversary to forge or mimic, we obtain reasonable performance (over 75% precision and recall on labels generated using heuristics), despite generally lacking any ground truth for training. Our algorithm performs particularly well for the top 30% of accounts by sales volume, and suggests that 22,163 accounts with at least one confirmed sale map to 15,652 distinct sellers, of which 12,155 operate only one account and the remainder operate between 2 and 11 different accounts. Case study analysis further confirms that our algorithm identifies non-trivial matches, as well as impersonation attempts.
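As a rough illustration of the clustering stage only, the sketch below groups accounts into candidate sellers from pairwise match probabilities. It is not the authors' implementation: it assumes that some classifier (for example, a random forest) has already scored each account pair, it uses average linkage and a 0.5 cutoff as arbitrary choices that the abstract does not specify, and the account handles and the names `average_linkage_clusters`, `match_prob`, and `threshold` are hypothetical.

```python
from itertools import combinations


def average_linkage_clusters(accounts, match_prob, threshold=0.5):
    """Greedy agglomerative clustering with average linkage.

    accounts:   list of account handles (hypothetical names).
    match_prob: dict mapping frozenset({a, b}) to the assumed probability
                that accounts a and b belong to the same seller;
                missing pairs are treated as 0.0.
    threshold:  assumed cutoff; two clusters merge only while their
                average pairwise match probability is at least this value.
    """
    clusters = [[a] for a in accounts]  # start with one cluster per account

    def avg_prob(c1, c2):
        # Average match probability over all cross-cluster account pairs.
        total = sum(match_prob.get(frozenset((a, b)), 0.0)
                    for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Find the pair of clusters with the highest average match probability.
        i, j = max(combinations(range(len(clusters)), 2),
                   key=lambda ij: avg_prob(clusters[ij[0]], clusters[ij[1]]))
        if avg_prob(clusters[i], clusters[j]) < threshold:
            break  # no remaining pair is similar enough to merge
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)

    return sorted(sorted(c) for c in clusters)


# Toy input: made-up handles and made-up classifier scores.
accounts = ["alice", "al1ce", "bob", "carol"]
match_prob = {
    frozenset(("alice", "al1ce")): 0.9,
    frozenset(("bob", "carol")): 0.2,
    frozenset(("alice", "bob")): 0.1,
}
sellers = average_linkage_clusters(accounts, match_prob)
print(sellers)  # [['al1ce', 'alice'], ['bob'], ['carol']]
```

Each returned group is one inferred seller. Raising `threshold` trades recall for precision, since fewer account pairs are merged.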
Index Terms
- Adversarial Matching of Dark Net Market Vendor Accounts