Computer Networks

Volume 50, Issue 14, 5 October 2006, Pages 2462-2474

Lightweight thread tunnelling in network applications

https://doi.org/10.1016/j.comnet.2006.04.021

Abstract

Active Network nodes allow for non-trivial processing of data streams. These complex network applications typically benefit from protection between their components for fault-tolerance or security. However, fine-grained memory protection introduces bottlenecks in communication among components. This paper describes memory protection in Expert, an OS for programmable network elements which re-examines thread tunnelling as a way of allowing these complex applications to be split over multiple protection domains. We argue that previous problems with tunnelling are symptoms of overly general designs, and we demonstrate a minimal domain-crossing primitive which nevertheless achieves the majority of benefits possible from tunnelling.

Introduction

Modern network elements have many software components – for instance to support user-programmability. Software systems in such contexts must address a fundamental concern, that of multiplexing (sharing) the network element amongst many users. Generally, this involves some form of sandboxing to allow code to be executed on behalf of untrusted users [1], [2]. Consequently, the multiplexing scheme must trade off performance against security, as well as other factors.

Sandboxing can be performed either at the language level (by using safe languages such as Java or ML), or at the machine level (by appropriate memory protection and CPU features). There has been much prior work on language-level sandboxing [3], [4], [5], and most current EEs (Execution Environments) rely on these techniques [6], [7]. However, language-level sandboxing lacks flexibility: invocations between components written in different languages must negotiate some common data marshalling format. Language-level sandboxing also reduces the utility of large bodies of pre-existing code by making it harder to re-use them.

Using hardware facilities to control access to memory and schedule the node’s resources is desirable because it allows any language to be used, permitting legacy code re-use. Marshalling can be efficient since native machine formats for data can be used. The main drawback of using hardware to protect the EEs is that communication between protection domains is expensive due to context switches and cache invalidations. Furthermore, these penalties are increasing: as CPU speeds rise, more and more of their performance comes from effective caching of memory contents, branch predictions, and speculative execution. Frequent switching makes these caches ineffective.

Overall, these costs dissuade application designers from placing their modules in separate protection domains, especially if there is a continual stream of data passing through the application.

Thompson wrote: “The UNIX kernel is an I/O multiplexer more than a complete operating system. This is as it should be.” [8]. This vision of a simple I/O multiplexer is one to which we find ourselves drawn once again, this time in the context of Active Network nodes. We introduce Expert, an OS designed specifically for network elements, filling this I/O multiplexer niche [9]. Sample applications include transcoders, protocol boosters [10], and user-supplied routing/forwarding protocols (e.g. multicast variants). In this paper, we describe how Expert’s memory protection architecture and its lightweight thread tunnelling directly support modular hardware-protected applications without suffering an undue performance penalty.

A good multiplexer will schedule the resource it manages. To this end, Expert uses the concept of a path (first introduced in the Scout OS [11]) to represent a flow of packets and their associated processing resources. The alternative of using multiple processes chained into a pipeline causes several problems: (1) extra context switching adds overhead; (2) since each process has its own scheduling parameters, it takes only one under-provisioned process to rate-limit the entire pipeline; (3) per-flow resource reclamation is complex, needing all processes to participate, and atomic revocation may be impossible; and (4) if multiple flows with different service characteristics are to be processed by the pipeline, each process must sub-schedule them internally.
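As a rough illustration of this contrast, and not Expert's actual interface, the sketch below invents a struct path, a stage_fn callback type, and a path_run helper to show a flow carrying its own scheduling parameters and an ordered chain of processing stages, all executed by one schedulable thread.

```c
/* Hypothetical sketch of a per-flow "path": one set of scheduling
 * parameters covers every processing stage, instead of one process
 * (and one scheduler entry) per stage as in a pipeline.            */
#include <stddef.h>
#include <stdint.h>

struct packet;                               /* opaque network packet  */

typedef struct packet *(*stage_fn)(struct packet *pkt, void *state);

struct path {
    /* Resources are accounted and scheduled once, for the whole flow. */
    uint64_t cpu_share_us;                   /* CPU budget per period  */
    uint64_t period_us;                      /* scheduling period      */

    /* Ordered processing stages the path's thread runs through;
     * in Expert these may live in different protection domains.      */
    stage_fn  stage[8];
    void     *stage_state[8];
    size_t    nr_stages;
};

/* Run one packet down the path within a single schedulable thread. */
static void path_run(struct path *p, struct packet *pkt)
{
    for (size_t i = 0; i < p->nr_stages && pkt != NULL; i++)
        pkt = p->stage[i](pkt, p->stage_state[i]);
}
```

Because the whole chain runs under one set of scheduling parameters, there is no per-stage context switch, no under-provisioned middle stage to rate-limit the flow, and the flow's resources can be revoked in one place.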

Unlike Scout, Expert paths can seamlessly cross protection domains. As an example, Fig. 1 shows the execution trace of a path which has tunnelled from module A to execute privileged code in module B, which in turn tunnelled into module C before returning to module B and thence to A. Modules A, B, and C are all in separate protection domains, allowing B and C to implement trusted functionality securely.
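Informally, the control flow of Fig. 1 is that of nested cross-domain calls. The fragment below is only a sketch with invented names (pod_tunnel, pod_B, pod_C, and the module entry functions); it is not Expert's syntax, but it shows the call/return nesting the figure traces.

```c
/* Illustrative only (invented names): the call nesting of Fig. 1.
 * One thread starts in module A, tunnels into pod B, which tunnels
 * into pod C; control then unwinds back through B to A.            */
struct pod;                                              /* opaque pod handle */
extern long pod_tunnel(struct pod *target, long arg);    /* hypothetical      */
extern struct pod pod_B, pod_C;

long module_A_work(long arg)
{
    /* Same thread throughout: no scheduler pass, only a change of
     * protection domain at each crossing.                           */
    return pod_tunnel(&pod_B, arg);
}

long module_B_entry(long arg)
{
    long partial = pod_tunnel(&pod_C, arg);   /* nested tunnel into C   */
    return partial + 1;                       /* back in B, return to A */
}

long module_C_entry(long arg)
{
    return arg * 2;                           /* privileged work in C   */
}
```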

Section 2 discusses existing thread tunnelling schemes, and their shortcomings. Section 3 describes Expert’s lightweight thread tunnelling primitive, and how it allows protected modules to be entered. A transcoder used as an example application is described in Section 4, and results quantifying the performance of the tunnelling system and the transcoder built over it are presented in Section 5. Section 6 concludes this paper and suggests areas for further work.

Section snippets

Background

Thread tunnelling was originally proposed as a solution to the performance problems observed in micro-kernel systems, where much inter-component communication takes place. In this guise, tunnelling is usually integrated into the IPC mechanism [12], [13], rather than using a message-passing approach.

Thread tunnelling designs all need to perform a number of core functions. A thread tunnelling primitive takes a thread and changes its runtime environment without passing through the scheduler: how

Thread tunnelling in Expert

Expert starts from the premise that the thread tunnelling primitive should be as simple as possible while still being flexible enough to support more complex schemes. The tunnelling primitive only changes memory access rights and forces the program counter to the module’s entry point – state switching and other environmental modifications are delegated to the called code. We now describe Expert’s thread tunnelling architecture in detail.
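A minimal sketch of such a primitive is given below, assuming a hypothetical pod descriptor and an mmu_switch_rights kernel helper; none of these names are Expert's. The point is what the crossing does and, just as importantly, what it does not do: it swaps memory access rights and transfers control to the pod's entry point, leaving all other state handling to the called code.

```c
/* Minimal sketch (hypothetical names, not Expert's API) of a
 * tunnelling primitive: switch the current thread's memory access
 * rights to the target pod and force the PC to that pod's single
 * entry point. Nothing else is saved or switched here; the called
 * code is responsible for any register or state management it needs. */

struct pod {
    void  *protection_ctx;          /* page-table / permission state */
    long (*entry)(long arg);        /* the pod's sole entry point    */
};

/* Assumed kernel helper: swap the address-space permissions of the
 * current thread, returning the previous context so it can be
 * restored on the way back out.                                      */
extern void *mmu_switch_rights(void *new_ctx);

long pod_tunnel(struct pod *target, long arg)
{
    void *caller_ctx = mmu_switch_rights(target->protection_ctx);

    /* Same thread, no scheduler pass: just jump to the entry point. */
    long ret = target->entry(arg);

    mmu_switch_rights(caller_ctx);
    return ret;
}
```

Keeping the primitive this small keeps the common crossing cheap; argument marshalling, stack switching or saving additional registers can be layered on top by the called code when it actually needs them.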

An audio transcoder

In this section we describe an example application which benefits from being decomposed into separate modules. We assume an Internet radio station which produces its output as a 44.1 kHz stereo 192 kbit/s stream of MPEG-1 Layer III audio (MP3). The illustrated application transcodes this source stream into three tiers: “gold” (the premium stream), “silver” (44.1 kHz stereo, 128 kbit/s, at a reduced price) and “bronze” (11 kHz stereo, 32 kbit/s, available for free). Fig. 5 shows how the transcoders
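For concreteness, the three tiers can be written down as a small table; the figures are those quoted above, while the C structure itself is merely illustrative and not part of the application's code.

```c
/* Illustrative tier table for the transcoder example; the numbers
 * come from the text, the structure is invented here.              */
struct tier {
    const char *name;
    unsigned    sample_rate_hz;
    unsigned    bitrate_kbit_s;
};

static const struct tier tiers[] = {
    { "gold",   44100, 192 },   /* premium: the source stream as-is */
    { "silver", 44100, 128 },   /* re-encoded at a lower bit-rate   */
    { "bronze", 11000,  32 },   /* downsampled (11 kHz), free tier  */
};
```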

Results

The test platform in all these experiments is an Intel Pentium Pro system running at 200 MHz, with 32 MB RAM, 256 kB L2 and a split L1 cache: 8 kB I/8 kB D. The test machine runs the Expert OS, and the transcoder application in either the path-based or all-in-one variant. It is easy to overload this modest machine: looking at systems when they are overloaded is instructive because this is where differences in architecture matter – if a system cannot shed load in a controlled fashion then it cannot

Conclusion

We described how Expert binds protection domains to code modules and lets threads tunnel into these “pods”, allowing fine-grained memory protection. Expert’s efficient and flexible tunnelling support allows this protection to be used, even in the kinds of I/O-intensive applications typical of Active Network nodes. Expert’s use of paths is well suited to scheduling the resource consumption of such I/O-driven applications.

In an example application, memory protection added a cost of between 2% and

Acknowledgements

I would like to thank Jonathan Smith at the University of Pennsylvania, whose encouragement helped make this paper happen. I would also like to thank Tim Harris, Keir Fraser, and the anonymous reviewers for their helpful feedback.

Austin Donnelly is a research software developer at Microsoft Research, Cambridge, UK. He obtained his Ph.D. from the University of Cambridge in 2002. His current interests are in distributed network management and performance monitoring. In the past he has worked on Ethernet topology discovery, operating systems and network stack design, and realtime audio/video systems.

References (24)

  • D.L. Tennenhouse et al., Towards an active network architecture, ACM Computer Communications Review (CCR), 1996.
  • L.L. Peterson, S.C. Karlin, K. Li, OS support for general-purpose routers, in: Proceedings of the 7th Workshop on Hot...
  • B. Bershad, S. Savage, P. Pardyak, E.G. Sirer, D. Becker, M. Fiuczynski, C. Chambers, S. Eggers, Extensibility, safety...
  • G.C. Necula, Proof-carrying code, in: Proceedings of the 24th ACM SIGPLAN-SIGACT Symposium on Principles of Programming...
  • M.E. Fiuczynski, R.P. Martin, T. Owa, B.N. Bershad, SPINE: a safe programmable and integrated network environment, in:...
  • D.J. Wetherall, J.V. Guttag, D.L. Tennenhouse, ANTS: a toolkit for building and dynamically deploying network...
  • J.T. Moore, M. Hicks, S. Nettles, Practical programmable packets, in: Proceedings of the 20th Annual Joint Conference...
  • K. Thompson, UNIX implementation, Bell System Technical Journal, 1978.
  • A. Donnelly, Resource control in network elements, Ph.D. thesis, Cambridge University Computer Laboratory, January...
  • D.C. Feldmeier, A.J. McAuley, J.M. Smith, D.S. Bakin, W.S. Marcus, T.M. Raleigh, Protocol boosters, IEEE Journal on...
  • D. Mosberger, L.L. Peterson, Making paths explicit in the Scout operating system, in: Proceedings of the 2nd Symposium...
  • B.N. Bershad et al., Lightweight remote procedure call, ACM Transactions on Computer Systems, 1990.