7.1 Interception practices

Intercepting calls to statically bound native functions generally involves patching code at runtime, in much the same way that software debuggers catch control breakpoints.1 Hunt and Scott (1999) suggest moving the first few instructions of an intercepted function to a “trampoline” and replacing them with a jump instruction that transfers execution to the trampoline. The trampoline code can take any action it desires before executing the instructions copied from the original function and transferring execution back to the remainder of that function. Hunt and Scott suggest intercepting the return by overwriting the return address on the call stack and storing the original value in thread-local storage for later reference.
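The patching step can be illustrated on a plain byte buffer rather than on live code. The following is a minimal sketch, not Hunt and Scott's implementation. It assumes 32-bit x86, where a near jump is encoded as the opcode 0xE9 followed by a 32-bit displacement relative to the end of the jump instruction, and it assumes that the first five bytes of the function form whole, position-independent instructions. The names encode_jmp and install_hook, as well as the addresses passed to them, are hypothetical. The interception code that would run at the start of the trampoline is omitted; only the copied instructions and the jump back are modeled.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>
#include <stdexcept>
#include <vector>

constexpr std::uint32_t kJmpSize = 5;  // 0xE9 opcode + 32-bit displacement

// Encode a near jump located at address `from` that lands on address `to`.
// The displacement is relative to the instruction following the jump.
std::array<std::uint8_t, kJmpSize> encode_jmp(std::uint32_t from, std::uint32_t to) {
    const std::uint32_t rel = to - (from + kJmpSize);
    return {{0xE9,
             static_cast<std::uint8_t>(rel),
             static_cast<std::uint8_t>(rel >> 8),
             static_cast<std::uint8_t>(rel >> 16),
             static_cast<std::uint8_t>(rel >> 24)}};
}

// `code` models the bytes of a function loaded at `func_addr`.
// Returns the trampoline bytes, assumed to be placed at `tramp_addr`.
std::vector<std::uint8_t> install_hook(std::vector<std::uint8_t>& code,
                                       std::uint32_t func_addr,
                                       std::uint32_t tramp_addr) {
    if (code.size() < kJmpSize) {
        throw std::invalid_argument("function too short to patch");
    }
    // Save the instructions that are about to be overwritten.
    std::vector<std::uint8_t> trampoline(code.begin(), code.begin() + kJmpSize);
    // After the saved instructions, jump back to the rest of the function.
    const auto back = encode_jmp(tramp_addr + kJmpSize, func_addr + kJmpSize);
    trampoline.insert(trampoline.end(), back.begin(), back.end());
    // Overwrite the start of the function with a jump to the trampoline.
    const auto fwd = encode_jmp(func_addr, tramp_addr);
    std::copy(fwd.begin(), fwd.end(), code.begin());
    return trampoline;
}
```

A real implementation would additionally have to decode instruction boundaries, fix up any copied instruction that uses relative addressing, and make the patched memory writable, none of which is attempted here.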

Due to the characteristics of the component technology studied in this work, interception can be implemented in ways that do not involve patching existing code at runtime. This is especially beneficial for embedded systems that may run code directly from read-only or flash memory, and may thus not allow for the runtime modification of code. Component models that mandate the use of runtime software to invoke operations can implement interception trivially. A CORBA ORB, for instance, allows interceptors to register with it, and ensures that they are invoked when calls are made (Schmidt and Vinoski 2003). Platforms based on virtual machines, such as Java and .NET, can easily implement interception, as the “machines” they target are software constructs (unless ahead-of-time compilation to native code is employed). The Enterprise Edition of Java, though, elects to implement interception by requiring that Enterprise JavaBeans be accessed only through an explicit intermediary that invokes services; clients are thus under no illusion that they are communicating directly with the target object (Szyperski et al. 2002:310).

Component models based on binary standards, such as COM, allow objects to communicate directly without the use of a mediating runtime system, and must therefore use different means to realize interception. Due to the universal use of dynamic dispatch by the component models considered in this thesis, this can be done without patching code at runtime, as dynamic dispatch implies that binding to an implementation is done at runtime. The object that a client binds to may thus not be the true target object, but a wrapper that forwards calls and ensures that the services provided by the component model are invoked before and after the call to the target object is made. (Such a wrapper object is also free not to forward requests, at its own discretion.)
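The wrapper idea can be sketched with ordinary C++ virtual functions, which stand in here for the dynamic dispatch of a component model. The interface IAccount, the classes Account and AccountWrapper, and the logging "service" are hypothetical names chosen for illustration; they are not part of COM or of any system discussed in this section.

```cpp
#include <string>
#include <vector>

// Hypothetical component interface. Clients reach implementations only
// through dynamic dispatch, never by naming a concrete class.
struct IAccount {
    virtual ~IAccount() = default;
    virtual int deposit(int amount) = 0;
};

// The true target object.
struct Account : IAccount {
    int balance = 0;
    int deposit(int amount) override { return balance += amount; }
};

// A wrapper that implements the same interface, runs services before and
// after each call, and may decide not to forward a request at all.
class AccountWrapper : public IAccount {
public:
    AccountWrapper(IAccount& target, std::vector<std::string>& log)
        : target_(target), log_(log) {}

    int deposit(int amount) override {
        log_.push_back("before deposit");   // service invoked before the call
        if (amount < 0) {                   // wrapper declines to forward
            log_.push_back("rejected");
            return -1;
        }
        const int result = target_.deposit(amount);
        log_.push_back("after deposit");    // service invoked after the call
        return result;
    }

private:
    IAccount& target_;
    std::vector<std::string>& log_;
};
```

A client holding an IAccount reference cannot tell whether it is bound to an Account or to an AccountWrapper, which is what allows interception without any modification of code at runtime.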

The simplest way to achieve this scheme is arguably to statically generate wrapper classes, in much the same way that proxy classes are generated at build time in Sony Ericsson’s ECMX system. The mechanism used to instantiate objects needs to be wrapper-savvy and return a wrapper object if one is needed. However, generating wrapper classes at build time adds unnecessary code bloat, which can be avoided by synthesizing wrapper objects at runtime.
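A wrapper-savvy instantiation mechanism might look roughly as follows. This is a sketch under stated assumptions and does not describe ECMX: the class GreeterWrapper stands in for a wrapper class that a build-time generator would emit, and the Factory class, its require_wrapper method, and the class name "Greeter" are all hypothetical.

```cpp
#include <memory>
#include <set>
#include <string>
#include <utility>

struct IGreeter {
    virtual ~IGreeter() = default;
    virtual std::string greet() = 0;
};

struct Greeter : IGreeter {
    std::string greet() override { return "hello"; }
};

// Stands in for a statically generated wrapper class.
struct GreeterWrapper : IGreeter {
    explicit GreeterWrapper(std::unique_ptr<IGreeter> t) : target(std::move(t)) {}
    std::string greet() override {
        ++calls;                 // placeholder for a real service
        return target->greet();
    }
    std::unique_ptr<IGreeter> target;
    int calls = 0;
};

// Instantiation mechanism that knows which classes need a wrapper.
class Factory {
public:
    void require_wrapper(const std::string& class_name) { wrapped_.insert(class_name); }

    std::unique_ptr<IGreeter> create(const std::string& class_name) const {
        if (class_name != "Greeter") {
            return nullptr;      // unknown class in this sketch
        }
        std::unique_ptr<IGreeter> object(new Greeter);
        if (wrapped_.count(class_name) != 0) {
            // Hand out the wrapper instead of the sought object.
            return std::unique_ptr<IGreeter>(new GreeterWrapper(std::move(object)));
        }
        return object;
    }

private:
    std::set<std::string> wrapped_;
};
```

The cost noted above is visible even at this scale: every wrapped interface needs its own wrapper class, whose methods differ only in the operation they forward.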

Brown (1999a, 1999b) does just this by introducing a generic wrapper that can wrap any object, and allows for the execution of arbitrary code before and after the operations of the target object are invoked. This generic wrapper is interesting in that it does not require type information to be available at runtime (which also means that services cannot usefully process arguments given to operations). It manages this feat by using a domain-agnostic dispatch table, whose functions delegate calls to a single generic function that dispatches the call to the original object and allows services to run. The only information needed by this generic function is the offset into the dispatch table, which the functions pointed to by the generic dispatch table helpfully push onto the call stack before invoking the generic function. (By necessity, all this code must be written in assembly language.) The generic wrapper intercepts returns from the target object by overwriting the return address in the stack frame.
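The structure of the generic dispatch table can be modeled in portable C++ if one deliberately gives up the property that makes Brown's wrapper remarkable. The sketch below assumes that every slot shares a single signature, which is exactly the type knowledge the real wrapper does without by using assembly-language thunks. It also models the post-call hook as ordinary code rather than as an overwritten return address. All names (Object, Method, GenericWrapper, thunk, generic_dispatch) are hypothetical.

```cpp
#include <cstddef>

struct Object;
using Method = int (*)(Object* self, int arg);

// Binary layout shared by all objects: the first member points at a
// dispatch table of function pointers.
struct Object {
    const Method* table;
};

// An arbitrary target object with two operations.
struct Counter {
    Object base;  // must be the first member
    int total;
};
int counter_add(Object* self, int arg) { return reinterpret_cast<Counter*>(self)->total += arg; }
int counter_sub(Object* self, int arg) { return reinterpret_cast<Counter*>(self)->total -= arg; }
const Method kCounterTable[] = {counter_add, counter_sub};

// The generic wrapper knows nothing about the interface it wraps.
struct GenericWrapper {
    Object base;            // must be the first member
    Object* target;
    std::size_t calls;      // bookkeeping done by the "service"
    std::size_t last_slot;
};

// The single generic function. The slot index is all it needs in order to
// find the corresponding operation in the target's dispatch table.
int generic_dispatch(std::size_t slot, Object* self, int arg) {
    GenericWrapper* w = reinterpret_cast<GenericWrapper*>(self);
    ++w->calls;             // service run before the call
    w->last_slot = slot;
    const int result = w->target->table[slot](w->target, arg);
    // A service run after the call would go here.
    return result;
}

// One thunk per slot. Each supplies its own index, mirroring the offset
// that the assembly thunks push onto the call stack.
template <std::size_t Slot>
int thunk(Object* self, int arg) { return generic_dispatch(Slot, self, arg); }

// Domain-agnostic table; it must have at least as many slots as any
// wrapped interface (four here, chosen arbitrarily).
const Method kGenericTable[] = {thunk<0>, thunk<1>, thunk<2>, thunk<3>};

GenericWrapper wrap(Object* target) {
    return GenericWrapper{Object{kGenericTable}, target, 0, 0};
}
```

The same kGenericTable serves every wrapped object, whatever its interface, as long as the interface has no more slots than the table.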

The one weakness of the generic wrapper is that interceptors cannot prevent an invocation from propagating to the target object, which is due to the calling convention used by COM on 32-bit systems—stdcall. This calling convention requires the receiver of calls to adjust the stack pointer. As a result, the interceptor must let all invocations propagate so that the receiver can adjust the stack pointer, as it cannot do this on its own due to lack of type information. Incidentally, this is not an issue on 64-bit Windows systems, as the one universal calling convention dictates that the caller adjusts the stack pointer.

Brown’s generic wrapper must be explicitly instantiated at runtime by users. By contrast, wrappers are automatically used by the COM-based interception system devised by Hunt and Scott (1999). This is realized by intercepting all object-instantiation calls to COM’s shared library, and returning wrappers instead of the sought objects. These functions are intercepted by patching parts of the COM runtime system, as described above. Arbitrary services may be implemented on top of this system.

Microsoft Transaction Server (MTS) uses a similar system, but allows only for system-provided services. It ensures that its wrappers are used, not by patching COM’s object-instantiation functions, but by modifying the Registry, which normally associates classes with the files that house them. These modifications ensure that a component provided by MTS is identified as the file housing classes of interest to MTS. MTS maintains an alternative data store containing the real file names, which allows its wrappers, once created, to instantiate the true target objects. This data store also contains information on what services should be provided to objects. Developers must take care not to return a direct reference to an MTS object, as doing so circumvents the interception system; all external invocations must go through wrappers (Pattison 2000).
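The redirection scheme can be modeled with two lookup tables. The sketch below is only a conceptual model: it does not use the Windows Registry API, and the class identifier and file names in the usage example are invented. The Redirector class and its methods are hypothetical.

```cpp
#include <map>
#include <string>
#include <utility>

using Table = std::map<std::string, std::string>;  // class identifier -> file

class Redirector {
public:
    Redirector(Table& registry, std::string wrapper_host)
        : registry_(registry), wrapper_host_(std::move(wrapper_host)) {}

    // Record the real file in the alternative data store, then make the
    // registry entry point at the component that supplies wrappers.
    void take_over(const std::string& class_id) {
        real_files_[class_id] = registry_.at(class_id);
        registry_[class_id] = wrapper_host_;
    }

    // Consulted by a wrapper when it needs to instantiate the true target.
    const std::string& real_file(const std::string& class_id) const {
        return real_files_.at(class_id);
    }

private:
    Table& registry_;           // stands in for the Registry
    Table real_files_;          // stands in for the alternative data store
    std::string wrapper_host_;  // stands in for the MTS-provided component
};
```

For example, after take_over("Account") on a registry that maps "Account" to "bank.dll", ordinary instantiation resolves to the wrapper host, while real_file("Account") still yields "bank.dll".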

With the release of Windows 2000, Microsoft brought a number of improvements to COM in the form of COM+. The most significant change was merging the functionality of MTS with COM. As a result, COM gained an application server as well as enterprise services configured using declarative attributes and realized using interception. Whereas MTS had been layered strictly on top of COM, COM+ integrated the enterprise features directly.

In COM+, all objects reside in contexts, which themselves are part of apartments. Objects that are similarly configured may be part of the same context, in which case they are able to access one another directly. Objects that are part of different contexts access one another through proxies, which act as the wrappers that allow COM+ to intercept invocations. Every object reference is specific to the context in which it was created; sharing an object reference with an object in a different context prevents COM+ from intercepting calls (Box 1999).
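The rule that references are context-specific can be sketched as a function that decides, per caller, whether to hand out the object itself or a proxy. This is a conceptual model only and does not reflect the COM+ API; Context, Calc, CalcProxy, and reference_for are hypothetical names, and the counter merely marks where COM+ services would run.

```cpp
#include <memory>
#include <utility>

struct Context {
    int id;
    int intercepted_calls;  // incremented whenever a proxy intercepts a call
};

struct ICalc {
    virtual ~ICalc() = default;
    virtual int twice(int x) = 0;
};

struct Calc : ICalc {
    explicit Calc(Context& home_context) : home(home_context) {}
    int twice(int x) override { return 2 * x; }
    Context& home;  // the context in which the object resides
};

// Proxy used by callers that reside in a different context.
struct CalcProxy : ICalc {
    explicit CalcProxy(std::shared_ptr<Calc> t) : target(std::move(t)) {}
    int twice(int x) override {
        ++target->home.intercepted_calls;  // services would run here
        return target->twice(x);
    }
    std::shared_ptr<Calc> target;
};

// Produce a reference that is valid for the caller's context: direct access
// within the same context, a proxy across contexts.
std::shared_ptr<ICalc> reference_for(const std::shared_ptr<Calc>& object,
                                     const Context& caller) {
    if (caller.id == object->home.id) {
        return object;
    }
    return std::make_shared<CalcProxy>(object);
}
```

Passing the direct reference obtained for one context to a caller in another context bypasses CalcProxy entirely, which corresponds to the loss of interception described above.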

Like MTS, COM+ maintains a data store separate from the Registry to store the declarative attributes of classes. Unlike MTS, it does not need to modify the Registry to make it point to COM+ wrappers, as the COM+ runtime system is service-savvy. With COM+, interception has been integrated directly with COM, simplifying the technology significantly.

Footnotes

  1. Calls to shared library functions are easier to intercept as they typically go through a jump table which can be patched instead of the code itself. The one caveat is that this only works for implicit runtime linking.