Component technology in an embedded system
Master's thesis in computer science
COM is both an object model and a component model—components encapsulate instantiable classes, which implement interfaces, through which objects communicate. As a binary standard, it standardizes aspects of components and objects that are important to the correct functioning of COM. It is agnostic to the implementation language used to write classes, as long as the binary standard is adhered to.
Through its binary standard, COM standardizes the access mechanism for objects by mandating a specific memory layout, calling convention and type system. As interfaces are the sole means of accessing COM objects, COM is said to be a binary standard for interfaces. The memory layout mandated by COM is notably compatible with pure virtual C++ classes, as produced by Microsoft’s own C++ compiler.1 As a result, COM goes some way toward standardizing a C++ application binary interface (ABI) on the Windows platform, making code produced by different C++ compilers compatible, at least as far as COM is concerned.2 Vendors of compilers for languages other than C++ need to adhere to the standard as well, if their language is to be compatible with COM. Embarcadero’s Delphi integrated development environment, whose Delphi programming language is a variation of Object Pascal, fully adheres to the COM binary standard by producing COM-conformant objects (Calvert 1999:381). COM thus successfully creates a standard that enables disparate object-oriented languages to communicate without losing their object semantics, and does so by only standardizing the absolute minimum required to ensure binary interoperability.
The binary standard of COM is similar, but not identical, to the binary standard developed in Chapter 4. Like the latter, COM does not support implementation inheritance, which can be seen as a feature (as argued in section 4.3.2). A client variable points to a memory area whose first member points to a dispatch table, the fields of which point to the actual implementation (Szyperski et al. 2002:330). Figure 5.1 depicts this layout. The first argument to a function that serves as the implementation of a COM operation must be a this pointer, which again is consistent with Chapter 4. This allows COM to exhibit true object characteristics.3
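The layout described above can be sketched by hand in portable C++, without the virtual keyword. This is an illustration only: the names ICounter, ICounterVtbl, Counter and MakeCounter are hypothetical, and the sketch leaves out the calling convention and the IUnknown operations that real COM requires at the start of every dispatch table.

```cpp
#include <cassert>
#include <cstdint>

struct ICounter;  // what a client variable points to

// The dispatch table: one function pointer per operation. Each operation
// receives the object (the "this" pointer) as an explicit first argument.
struct ICounterVtbl {
    void (*Increment)(ICounter* self);
    std::int32_t (*Get)(ICounter* self);
};

// The memory area seen by the client: its first member points to the
// dispatch table.
struct ICounter {
    const ICounterVtbl* vtbl;
};

// A concrete implementation. The interface part is the first member, so a
// pointer to it is also a valid pointer to the whole object.
struct Counter {
    ICounter iface;
    std::int32_t value;
};

static void Counter_Increment(ICounter* self) {
    reinterpret_cast<Counter*>(self)->value += 1;
}

static std::int32_t Counter_Get(ICounter* self) {
    return reinterpret_cast<Counter*>(self)->value;
}

static const ICounterVtbl kCounterVtbl = {Counter_Increment, Counter_Get};

// Hypothetical helper: initializes the storage and hands out the
// interface pointer.
inline ICounter* MakeCounter(Counter& storage) {
    storage.iface.vtbl = &kCounterVtbl;
    storage.value = 0;
    return &storage.iface;
}
```

A client would call an operation as `p->vtbl->Increment(p)`, which is the double indirection a C++ compiler generates for a call through a pure virtual class.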
A COM class may implement any number of interfaces. All interfaces directly descend from one other interface, except IUnknown, the root of the interface inheritance hierarchy (by convention, all compile-time interface names start with “I”). IUnknown is, for most intents and purposes, identical to the Fundamental interface introduced in Chapter 4. The COM equivalent to the Fundamental::SwitchInterface() operation is IUnknown::QueryInterface(), and Fundamental::AddReference() and Fundamental::RemoveReference() correspond to IUnknown::AddRef() and IUnknown::Release(), respectively.
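The three IUnknown operations can be illustrated with a portable mock-up. The sketch below is not the Windows declaration of IUnknown. It uses an integer stand-in (InterfaceId) where COM uses a 128-bit interface UUID, and the names IUnknownLike, IGreeter and Greeter are assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>

using HResult = std::uint32_t;                  // stand-in; the real HRESULT is a signed 32-bit type
constexpr HResult kOk = 0;                      // value of S_OK
constexpr HResult kNoInterface = 0x80004002u;   // value of E_NOINTERFACE

using InterfaceId = std::uint32_t;              // stand-in for an interface UUID
constexpr InterfaceId kIidUnknown = 0;          // hypothetical identifiers
constexpr InterfaceId kIidGreeter = 1;

struct IUnknownLike {
    virtual HResult QueryInterface(InterfaceId iid, void** out) = 0;
    virtual std::uint32_t AddRef() = 0;
    virtual std::uint32_t Release() = 0;
protected:
    ~IUnknownLike() = default;  // clients destroy objects only through Release()
};

struct IGreeter : IUnknownLike {
    virtual const char* Greet() = 0;
protected:
    ~IGreeter() = default;
};

class Greeter final : public IGreeter {
public:
    // The caller receives the object holding one reference.
    static IGreeter* Create() { return new Greeter; }

    HResult QueryInterface(InterfaceId iid, void** out) override {
        if (iid == kIidUnknown || iid == kIidGreeter) {
            *out = static_cast<IGreeter*>(this);
            AddRef();  // every pointer handed out owns a reference
            return kOk;
        }
        *out = nullptr;
        return kNoInterface;
    }

    std::uint32_t AddRef() override { return ++refs_; }

    std::uint32_t Release() override {
        const std::uint32_t left = --refs_;
        if (left == 0) delete this;  // last reference gone: the object destroys itself
        return left;
    }

    const char* Greet() override { return "hello"; }

private:
    Greeter() = default;
    ~Greeter() = default;
    std::uint32_t refs_ = 1;
};
```

The sketch counts references for the object as a whole and is not thread-safe. A production implementation would typically use atomic increments and decrements.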
COM uses reference counting to manage memory. Objects need not be reference counted in their entirety—each interface implemented by an object can be separately reference counted. This feature is known as tear-off interfaces, and can be used by an object to initialize and destroy resources on a per-interface basis, thus conserving resources (Szyperski et al. 2002:334).
Runtime names need to be assigned to a large number of different COM entities, including classes and interfaces. COM uses Universally Unique Identifiers (UUIDs) for this purpose, also referred to as Globally Unique Identifiers (GUIDs) by Microsoft. A UUID is a 128-bit number which has a very high probability of being globally unique. A textual representation of a UUID can look as follows: “7a3fc5d3-f79a-4de5-827d-d0f5619a4c99.” The Windows Registry serves as the data store that maps UUIDs to components (shared libraries for objects that run in-process and executable files for objects that run out-of-process). (Recent Windows versions support in-process COM components that are not globally accessible, and thus do not need to be stored in the Registry (Templin 2005).)
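To make the 128-bit claim concrete, the following sketch parses the textual form shown above into 16 bytes. The names Uuid, HexValue and ParseUuid are hypothetical. The sketch stores the bytes in the order in which they appear in the text, which is an assumption of the example and need not match the in-memory layout of the Windows GUID structure.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

using Uuid = std::array<std::uint8_t, 16>;  // 16 bytes = 128 bits

// Returns the value of one hexadecimal digit, or -1 if c is not one.
inline int HexValue(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Parses the 8-4-4-4-12 textual form. Returns false on malformed input and
// leaves `out` untouched in that case.
inline bool ParseUuid(const char* text, Uuid& out) {
    if (std::strlen(text) != 36) return false;
    Uuid bytes{};
    std::size_t nibble = 0;
    for (std::size_t i = 0; i < 36; ++i) {
        const bool dash_expected = (i == 8 || i == 13 || i == 18 || i == 23);
        if (dash_expected) {
            if (text[i] != '-') return false;
            continue;
        }
        const int v = HexValue(text[i]);
        if (v < 0) return false;
        // Two hex digits per byte: the first one fills the high nibble.
        if (nibble % 2 == 0) {
            bytes[nibble / 2] = static_cast<std::uint8_t>(v << 4);
        } else {
            bytes[nibble / 2] |= static_cast<std::uint8_t>(v);
        }
        ++nibble;
    }
    out = bytes;
    return true;
}
```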
Operations in COM interfaces are expected to provide error information in the form of 32-bit integer return values known as HRESULTs. True return values are provided as output arguments. An HRESULT value is divided into a number of fields, notably a severity bit that distinguishes success from failure, a facility code, and a status code.
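The convention can be sketched as follows. The bit positions follow the published HRESULT layout as far as this example needs it: the severity in the top bit, an 11-bit facility in bits 16 to 26, and a 16-bit code in the low bits. The reserved bits are ignored. HResultFields, Decode, Succeeded and Divide are hypothetical names, and the unsigned stand-in type differs from the signed type used on Windows.

```cpp
#include <cassert>
#include <cstdint>

using HResult = std::uint32_t;                 // stand-in for HRESULT
constexpr HResult kOk = 0;                     // value of S_OK
constexpr HResult kInvalidArg = 0x80070057u;   // value of E_INVALIDARG

struct HResultFields {
    bool failure;            // bit 31: severity (set means failure)
    std::uint16_t facility;  // bits 16-26: the subsystem that produced the code
    std::uint16_t code;      // bits 0-15: the status code itself
};

inline HResultFields Decode(HResult hr) {
    HResultFields f;
    f.failure = (hr >> 31) != 0;
    f.facility = static_cast<std::uint16_t>((hr >> 16) & 0x7FF);
    f.code = static_cast<std::uint16_t>(hr & 0xFFFF);
    return f;
}

inline bool Succeeded(HResult hr) { return (hr >> 31) == 0; }

// Hypothetical operation in the COM style: the status travels in the return
// value, the true result in an output argument.
inline HResult Divide(std::int32_t a, std::int32_t b, std::int32_t* result) {
    if (result == nullptr || b == 0) return kInvalidArg;
    *result = a / b;
    return kOk;
}
```

Callers are expected to test the returned status before reading the output argument, since the argument is left untouched on failure.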
To the extent that COM supports versioning, it does so through avoidance. An interface UUID identifies not only an interface, but also its version, thus requiring that interfaces, once published, are never changed. A class can easily support multiple versions of an interface by implementing all interfaces corresponding to the different versions. Newer clients use IUnknown::QueryInterface() to query for a newer version, while older clients query for an older version.
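A minimal sketch of this versioning scheme follows, under simplifying assumptions: interface identifiers are plain integers rather than UUIDs, reference counting is omitted, and QueryInterface is folded into the first interface instead of being inherited from IUnknown. The names ILogger, ILogger2 and Logger are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

using InterfaceId = std::uint32_t;      // stand-in for an interface UUID
constexpr InterfaceId kIidLogger = 1;   // version 1: never changed once published
constexpr InterfaceId kIidLogger2 = 2;  // version 2: a new interface with a new identifier

struct ILogger {
    virtual bool QueryInterface(InterfaceId iid, void** out) = 0;
    virtual int Log(const char* message) = 0;
protected:
    ~ILogger() = default;
};

// The newer version extends the older one instead of modifying it.
struct ILogger2 : ILogger {
    virtual int LogWithLevel(int level, const char* message) = 0;
protected:
    ~ILogger2() = default;
};

// One class serves both old and new clients.
class Logger final : public ILogger2 {
public:
    bool QueryInterface(InterfaceId iid, void** out) override {
        if (iid == kIidLogger) {
            *out = static_cast<ILogger*>(this);
            return true;
        }
        if (iid == kIidLogger2) {
            *out = static_cast<ILogger2*>(this);
            return true;
        }
        *out = nullptr;
        return false;
    }

    // The old operation is expressed in terms of the new one.
    int Log(const char* message) override { return LogWithLevel(0, message); }

    // Returns the level that was used, so the example has something to observe.
    int LogWithLevel(int level, const char* message) override {
        (void)message;  // a real logger would write the message somewhere
        return level;
    }
};
```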
A COM component need not run in the caller’s context; it can also run in a different process or (through DCOM) on a different machine altogether. Inter-process and inter-machine communication are facilitated by the client-side and server-side proxies explained in section 2.2.3, which COM refers to as “proxies” and “stubs,” respectively. COM supports both synchronous (blocking) and asynchronous (non-blocking) calls to components running out-of-process (Prosise 2000a).4
COM can handle marshalling automatically, which is called standard marshalling. Advanced users who wish to handle all marshalling aspects themselves may elect to use custom marshalling. The latter may be preferable for performance-critical applications, as it makes it possible to handle certain operations without deferring to a remote server, thereby cutting down on inter-process or inter-machine calls. A custom-written client-side proxy could, for instance, cache data in the client process, and transparently operate on this state instead of consulting the remote object. Many of the benefits of custom marshalling may be reaped using in-process handlers, without the complexity of custom marshalling itself. An in-process handler may elect to handle some operations locally, while delegating others to standard marshalling (Prosise 2000b).
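The caching idea behind a custom client-side proxy can be sketched in plain C++. The example does not use COM's marshalling interfaces. RemoteDocument merely stands in for an object reached through inter-process calls and counts the simulated round trips, and all names (IDocument, RemoteDocument, CachingProxy) are hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical interface shared by the remote object and its proxy.
struct IDocument {
    virtual std::string Title() = 0;
    virtual void SetTitle(const std::string& title) = 0;
    virtual ~IDocument() = default;
};

// Stands in for the remote object: every call counts as one round trip.
class RemoteDocument final : public IDocument {
public:
    std::string Title() override {
        ++round_trips;
        return title_;
    }
    void SetTitle(const std::string& title) override {
        ++round_trips;
        title_ = title;
    }
    int round_trips = 0;

private:
    std::string title_ = "untitled";
};

// Client-side proxy that answers repeated reads from state cached in the
// client process instead of consulting the remote object each time.
class CachingProxy final : public IDocument {
public:
    explicit CachingProxy(IDocument& remote) : remote_(remote) {}

    std::string Title() override {
        if (!cached_) {
            title_ = remote_.Title();  // first read goes to the remote object
            cached_ = true;
        }
        return title_;
    }

    void SetTitle(const std::string& title) override {
        remote_.SetTitle(title);  // writes are forwarded to the remote object
        title_ = title;           // and the local copy is kept current
        cached_ = true;
    }

private:
    IDocument& remote_;
    std::string title_;
    bool cached_ = false;
};
```

The sketch assumes a single client. If other clients may change the remote object, the proxy needs an invalidation scheme, which is part of the complexity attributed to custom marshalling above.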
COM can be used with the interface description language COM IDL, but as a binary standard, using this language is strictly speaking optional. COM IDL is an extended version of DCE’s IDL, notably adding objects to the language (Hludzinski 1998). Microsoft’s IDL compiler can generate client-side and server-side proxies, C/C++ language bindings, as well as type libraries. A type library is a non-textual, efficient representation of a set of IDL files, which may be deployed to end-users’ systems as stand-alone files, or embedded as resources in shared libraries or executable files.5 A type library is essentially a repository of type information available at runtime. The COM runtime system can read type libraries, and make the data therein available through the ITypeInfo interface. Language bindings are typically not generated directly from IDL files, but from type libraries, as type libraries are the entities that are deployed to end-users’ systems.
Factories are used in COM to instantiate classes. For a class to be instantiable, there must be an implementation of IClassFactory available that can instantiate said class. A COM component that runs in-process (and thus is implemented as a shared library) must export a function that returns an object implementing IClassFactory for a given class UUID passed as an argument. A COM component that runs out-of-process on the same machine is implemented as a standard executable file which, when started, registers its class factory with the COM runtime system (Goswell 1995).
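The in-process case can be sketched with a portable mock-up. In real COM the exported function is DllGetClassObject and the factory implements IClassFactory. The sketch below instead uses the hypothetical names GetClassObject, IClassFactoryLike, IWidget and Widget, replaces class UUIDs with integers, and replaces reference counting with a single Destroy() operation for brevity.

```cpp
#include <cassert>
#include <cstdint>

using ClassId = std::uint32_t;         // stand-in for a class UUID
constexpr ClassId kClsidWidget = 100;  // hypothetical class identifier

struct IWidget {
    virtual int Size() = 0;
    virtual void Destroy() = 0;  // simplification: real COM uses Release()
protected:
    ~IWidget() = default;
};

struct IClassFactoryLike {
    virtual bool CreateInstance(void** out) = 0;
protected:
    ~IClassFactoryLike() = default;
};

class Widget final : public IWidget {
public:
    Widget() = default;
    int Size() override { return 3; }
    void Destroy() override { delete this; }

private:
    ~Widget() = default;  // instances are destroyed only through Destroy()
};

// Knows how to instantiate exactly one class.
class WidgetFactory final : public IClassFactoryLike {
public:
    bool CreateInstance(void** out) override {
        *out = static_cast<IWidget*>(new Widget);
        return true;
    }
};

// The single entry point the shared library would export: given a class
// identifier, it returns the matching factory.
inline bool GetClassObject(ClassId clsid, void** factory) {
    static WidgetFactory widget_factory;
    if (clsid == kClsidWidget) {
        *factory = static_cast<IClassFactoryLike*>(&widget_factory);
        return true;
    }
    *factory = nullptr;
    return false;
}
```

The client never names the Widget class. It holds only a class identifier and interface pointers, which is what allows the implementation to live in a separately deployed binary.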
One of the selling points of COM is that it enables what Microsoft calls Automation—the ability for one program, typically a script, to access and control another, which is often written in native code. A script written in Visual Basic can, for example, use the charting engine of Microsoft Excel through Automation. Automation allows applications to make their functionality available as a set of COM objects.
Automation implies that the validity of invocations is verified only at runtime, thus making use of very late binding (see section 4.2). The traditional solution in COM is to require that classes that wish to be accessible through very late binding implement the IDispatch interface, which is analogous to the Scriptable interface presented in Chapter 4. (Classes that are accessible using both late binding and very late binding, and thus implement IDispatch in addition to traditional, domain-specific interfaces, are said to use dual interfaces.) Unlike Scriptable::InvokeOperation(), IDispatch::Invoke() does not take a string representing the name of the operation as an argument. Rather, it takes a dispatch identifier, presumably for reasons of efficiency; the identifier can be retrieved at runtime using IDispatch::GetIDsOfNames(). If a component ships with a type library, the implementation of the IDispatch interface can be fully synthesized at runtime, or written by simply forwarding calls to a system-provided implementation of ITypeInfo.
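The two-step protocol of very late binding, in which a name is first resolved to a dispatch identifier and the operation is then invoked by that identifier, can be sketched as follows. The mock-up is heavily simplified. The real IDispatch::Invoke() passes arguments as variants and takes further parameters, whereas this sketch uses plain doubles, and the names IDispatchLike, GetIdOfName and Calculator are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

using DispatchId = std::int32_t;
constexpr DispatchId kUnknownId = -1;  // same value as COM's DISPID_UNKNOWN

struct IDispatchLike {
    // Resolves an operation name to an identifier at runtime.
    virtual bool GetIdOfName(const char* name, DispatchId* id) = 0;
    // Invokes the operation; arguments are checked only now, at runtime.
    virtual bool Invoke(DispatchId id, const double* args, int arg_count,
                        double* result) = 0;
protected:
    ~IDispatchLike() = default;
};

class Calculator final : public IDispatchLike {
public:
    bool GetIdOfName(const char* name, DispatchId* id) override {
        if (std::strcmp(name, "Add") == 0) {
            *id = kAdd;
            return true;
        }
        if (std::strcmp(name, "Negate") == 0) {
            *id = kNegate;
            return true;
        }
        *id = kUnknownId;
        return false;
    }

    bool Invoke(DispatchId id, const double* args, int arg_count,
                double* result) override {
        switch (id) {
            case kAdd:
                if (arg_count != 2) return false;  // wrong arity, detected at runtime
                *result = args[0] + args[1];
                return true;
            case kNegate:
                if (arg_count != 1) return false;
                *result = -args[0];
                return true;
            default:
                return false;  // unknown identifier
        }
    }

private:
    enum : DispatchId { kAdd = 1, kNegate = 2 };
};
```

A scripting host can look up an identifier once and reuse it for many invocations, which avoids a string comparison on every call and is consistent with the efficiency argument above.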
Footnotes