ABSTRACT
Many tools for analyzing distributed systems propagate contexts along the execution paths of requests, tasks, and jobs, in order to correlate events across process, component and machine boundaries. There is a wide range of existing and proposed uses for these tools, which we call cross-cutting tools, such as tracing, debugging, taint propagation, provenance, auditing, and resource management, but few of them get deployed pervasively in large systems. When they do, they are brittle, hard to evolve, and cannot coexist with each other. While they use very different context metadata, the way they propagate the information alongside execution is the same. Nevertheless, in existing tools, these aspects are deeply intertwined, causing most of these problems.
In this paper, we propose a layered architecture for cross-cutting tools that separates concerns of system developers and tool developers, enabling independent instrumentation of systems, and the deployment and evolution of multiple such tools. At the heart of this layering is a general underlying format, baggage contexts, that enables the complete decoupling of system instrumentation for context propagation from tool logic. Baggage contexts make propagation opaque and general, while still maintaining correctness of the metadata under arbitrary concurrency and different data types. We demonstrate the practicality of the architecture with implementations in Java and Go, porting of several existing cross-cutting tools, and instrumenting existing distributed systems with all of them.
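The separation described above can be illustrated with a minimal sketch. It is not the paper's actual baggage context format or API: the names `Baggage`, `branch`, `join`, `tag`, and `read` are hypothetical, and set union is assumed as the merge rule only because it gives the same result regardless of the order in which concurrent branches rejoin. The system-instrumentation layer calls only `branch` and `join` and never inspects the contents; the tool layer reads and writes its own entry.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet


@dataclass(frozen=True)
class Baggage:
    """Hypothetical opaque context: tool name -> set of values."""
    entries: Dict[str, FrozenSet[str]] = field(default_factory=dict)

    def branch(self) -> "Baggage":
        # Instrumentation layer: called when execution forks
        # (for example a new thread or an outgoing RPC).
        return Baggage(dict(self.entries))

    def join(self, other: "Baggage") -> "Baggage":
        # Instrumentation layer: called when two branches merge.
        # Assumed merge rule: per-tool set union, which is
        # commutative, so the join order does not matter.
        merged = dict(self.entries)
        for tool, values in other.entries.items():
            merged[tool] = merged.get(tool, frozenset()) | values
        return Baggage(merged)


def tag(ctx: Baggage, tool: str, value: str) -> Baggage:
    # Tool layer: add a value under this tool's own key.
    updated = dict(ctx.entries)
    updated[tool] = updated.get(tool, frozenset()) | {value}
    return Baggage(updated)


def read(ctx: Baggage, tool: str) -> FrozenSet[str]:
    # Tool layer: read back only this tool's values.
    return ctx.entries.get(tool, frozenset())


# Two tools share one context without knowing about each other.
root = tag(Baggage(), "tracing", "trace-42")
left = tag(root.branch(), "taint", "user-input")
right = tag(root.branch(), "tracing", "span-7")
merged = left.join(right)
```

In this sketch, adding a third tool requires no change to `branch` or `join`, which is the decoupling the layered architecture aims for.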
- Universal context propagation for distributed system instrumentation