3.2.1 Introducing late binding

Late binding implies that the implementation of a type must be found at runtime. There are a variety of means to achieve this goal—common to them all is that a data structure needs to be queried at runtime to locate the desired implementation. A reference to this data structure is stored as a (usually) hidden part of the instance data of an object, and is used behind the scenes to locate the implementation.

Listing 3.4 presents a simple Perl script that demonstrates late binding (without object semantics). The script accepts a command-line argument that must be either of the strings “first” or “second,” and picks an implementation of the “Foo” operation based on this argument. It achieves this by using a data structure in the form of a Perl associative array (a hash), which maps keys to values, in this case strings to subroutines. As the data structure is used to dispatch subroutine calls, it is called a dispatch table.

Listing 3.4: dispatch.pl
use strict;

my %first_dispatch_table = (
  foo_sub => sub { print "Foo operation, first dispatch table.\n" },
  bar_sub => sub { print "Bar operation, first dispatch table.\n" }
);

my %second_dispatch_table = (
  foo_sub => sub { print "Foo operation, second dispatch table.\n" },
  bar_sub => sub { print "Bar operation, second dispatch table.\n" }
);

# Retrieve the desired implementation subroutine.
my $implementation_sub;

if (@ARGV == 1) {
  if ($ARGV[0] eq "first") {
    $implementation_sub = $first_dispatch_table{foo_sub};
  } elsif ($ARGV[0] eq "second") {
    $implementation_sub = $second_dispatch_table{foo_sub};
  }
}

die "Cannot find implementation.\n" if !defined($implementation_sub);

# Dispatch the call.
$implementation_sub->();

It would be possible, but inadvisable, to use the same solution in C—that is, a hash table mapping operation names to function addresses. Using the names of operations—strings—is inefficient and often not possible for native code (as the names are typically not available at runtime). The standard solution is to use tables of procedural variables—structures of function pointers, in C parlance—indexed into by indices that are known at compile-time. Calling an operation using such a table is obviously not as efficient as eliminating dynamic dispatch entirely (static dispatch), but more efficient than using more elaborate data structures, such as the hash table used in Listing 3.4. This solution is very common, and is known by many names, including virtual method table (VMT), virtual table (VT, VTBL), vtable, virtual function table and dispatch table (the plethora of names containing the word “virtual” is due to the use of dispatch tables to implement virtual operations in languages such as C++).

Listing 3.5 shows the C header file for the Node interface. The specification for the Node dispatch table appears at the bottom as the Node_DispatchTable_t type, and declares two function pointers consistent with Figure 3.2.1 The Node_t type stipulates that for a pointer to be considered a reference to an object implementing the Node interface, the first member of the memory area it points to must be a pointer to a dispatch table compatible with Node_DispatchTable_t. The memory layout stipulated by these two types is an example of one part of the binary standard for the object model designed in this chapter.2 It is depicted in Figure 3.3. The convenience macros Node_Destroy() and Node_IsConstant() help clients make calls to objects implementing this interface.

Figure "custom-interface-pointer"
Figure 3.3: Memory layout of interfaces of the custom object model

Listing 3.5: Node.h
/**
 * @file
 *
 * This file contains an interface, <code>Node</code>, representing an
 * immutable node in a syntax tree. This tree represents an arithmetic
 * expression as produced by an operator-precedence parser. This
 * interface is not meant to be implemented directly; rather, it
 * stipulates what functionality all nodes in said tree are expected
 * to provide. Descendant interfaces add operations specific to the
 * type of node they are modeling.
 */

#ifndef INCLUSION_GUARD_NODE
#define INCLUSION_GUARD_NODE

#include <stdbool.h>

// Convenience macros for the operations defined in this interface:

/**
 * Destroys this node and its children (if any). This operation always
 * succeeds.
 *
 * @param[in] pNode
 *   the instance implementing this interface. May be
 *   <code>NULL</code>.
 */
#define Node_Destroy(pNode)                                             \
  if ((pNode) != NULL)                                                  \
  {                                                                     \
    (pNode)->pDispatchTable->Destroy(pNode);                            \
  }

/**
 * Returns <code>true</code> if this node is fully constant and
 * <code>false</code> otherwise.
 *
 * @param[in] pNode
 *   the instance implementing this interface. Must not be
 *   <code>NULL</code>.
 * @param[out] pResult
 *   a pointer to the variable which shall hold the result returned by
 *   this operation, that is, whether this node is constant. Must not
 *   be <code>NULL</code>.
 * @return
 *   <code>true</code> if the operation completes successfully,
 *   <code>false</code> otherwise.
 */
#define Node_IsConstant(pNode, pResult)                                 \
  (pNode)->pDispatchTable->IsConstant((pNode), (pResult))

struct Node_DispatchTable_s;

/**
 * Variables of this type may be used to represent objects
 * implementing this interface.
 */
typedef struct
{
  const struct Node_DispatchTable_s* pDispatchTable;
} Node_t;

/**
 * This is the dispatch table, or vtable, for this interface. It
 * represents an indirection that enables the implementation of an
 * interface to be bound to at runtime. Interfaces wishing to extend
 * this interface must include the complete contents of this structure
 * as the first members of their dispatch tables, changing the
 * <code>Node_t</code> type to match their own. (Only single interface
 * inheritance is supported.)
 */
typedef struct Node_DispatchTable_s
{
  // Operations defined in the Node interface:
  void (*Destroy)(Node_t* pNode);
  bool (*IsConstant)(Node_t* pNode, bool* pResult);
} Node_DispatchTable_t;

#endif // INCLUSION_GUARD_NODE

Dispatch tables compatible with the Node interface must not necessarily be of the formal compile-time type Node_DispatchTable_t, and a proper Node object reference must not necessarily be of the type Node_t. They must, however, be binary compatible with these types; in other words, regardless of the formal compile-time types of Node references and Node dispatch tables, it must be possible to treat them as though they were of the types described here. The upshot is that a dispatch table compatible with the Node_DispatchTable_t type may include data following that specified by the aforementioned type, and ditto for object references compatible with the Node_t type.

This aspect makes it straightforward for another interface to inherit from Node—its dispatch table type must simply include all members of the Node_DispatchTable_t type before its own members, changing the Node_t type to match its own type (which is binary compatible with Node_t).3 Listing 3.6 shows the BinaryOperatorNode interface, which extends Node. (It is dependent on the enumerated type BinaryOperator_t, whose members are shown in Figure 3.2.) The BinaryOperatorNode_DispatchTable_t and BinaryOperatorNode_t types are fully binary compatible with the Node_DispatchTable_t and Node_t types defined in Listing 3.5.

Listing 3.6: BinaryOperatorNode.h
/**
 * @file
 *
 * This file contains an interface, <code>BinaryOperatorNode</code>,
 * representing a binary operator node in a syntax tree. This tree
 * represents an arithmetic expression. A binary operator takes two
 * operands (in the expression "3 / x", the division operator operates
 * on the literal operand "3" and the identifier operand "x"). This
 * interface extends <code>Node</code>.
 */

#ifndef INCLUSION_GUARD_BINARY_OPERATOR_NODE
#define INCLUSION_GUARD_BINARY_OPERATOR_NODE

#include <stdbool.h>
#include "Node.h"
#include "BinaryOperator.h"

/* Convenience macros for the operations inherited from the Node
 * interface:
 */
#define BinaryOperatorNode_Destroy(pBinaryOperatorNode)                 \
  if ((pBinaryOperatorNode) != NULL)                                    \
  {                                                                     \
    (pBinaryOperatorNode)->pDispatchTable->Destroy(pBinaryOperatorNode);\
  }
#define BinaryOperatorNode_IsConstant(pBinaryOperatorNode, pResult)     \
  (pBinaryOperatorNode)->pDispatchTable->IsConstant(                    \
    (pBinaryOperatorNode), (pResult))

// Convenience macros for the operations defined in this interface:

/**
 * Returns the binary operator of this node.
 *
 * @param[in] pBinaryOperatorNode
 *   the instance implementing this interface. Must not be
 *   <code>NULL</code>.
 * @param[out] pResult
 *   a pointer to the variable which shall hold the result returned by
 *   this operation, that is, the binary operator of this node.
 * @return
 *   <code>true</code> if the operation completes successfully,
 *   <code>false</code> otherwise.
 */
#define BinaryOperatorNode_Operator(pBinaryOperatorNode, pResult)       \
  (pBinaryOperatorNode)->pDispatchTable->Operator((pBinaryOperatorNode),\
                                                  (pResult))

/**
 * Returns the left operand of this node.
 *
 * @param[in] pBinaryOperatorNode
 *   the instance implementing this interface. Must not be
 *   <code>NULL</code>.
 * @param[out] ppResult
 *   a pointer to the variable which shall hold the result returned by
 *   this operation, that is, the left operand of this node.
 * @return
 *   <code>true</code> if the operation completes successfully,
 *   <code>false</code> otherwise.
 */
#define BinaryOperatorNode_LeftOperand(pBinaryOperatorNode, ppResult)   \
  (pBinaryOperatorNode)->pDispatchTable->LeftOperand(                   \
    (pBinaryOperatorNode),                                              \
    (ppResult))

/**
 * Returns the right operand of this node.
 *
 * @param[in] pBinaryOperatorNode
 *   the instance implementing this interface. Must not be
 *   <code>NULL</code>.
 * @param[out] ppResult
 *   a pointer to the variable which shall hold the result returned by
 *   this operation, that is, the right operand of this node.
 * @return
 *   <code>true</code> if the operation completes successfully,
 *   <code>false</code> otherwise.
 */
#define BinaryOperatorNode_RightOperand(pBinaryOperatorNode, ppResult)  \
  (pBinaryOperatorNode)->pDispatchTable->RightOperand(                  \
    (pBinaryOperatorNode),                                              \
    (ppResult))

struct BinaryOperatorNode_DispatchTable_s;


/**
 * Variables of this type may be used to represent objects
 * implementing this interface.
 */
typedef struct
{
  const struct BinaryOperatorNode_DispatchTable_s* pDispatchTable;
} BinaryOperatorNode_t;

/**
 * This is the dispatch table, or vtable, for this interface. It
 * represents an indirection that enables the implementation of an
 * interface to be bound to at runtime. Interfaces wishing to extend
 * this interface must include the complete contents of this structure
 * as the first members of their dispatch tables, changing the
 * <code>BinaryOperatorNode_t</code> type to match their own. (Only
 * single interface inheritance is supported.)
 */
typedef struct BinaryOperatorNode_DispatchTable_s
{
  // Operations inherited from the Node interface:
  void (*Destroy)(BinaryOperatorNode_t* pBinaryOperatorNode);
  bool (*IsConstant)(BinaryOperatorNode_t* pBinaryOperatorNode,
                     bool* pResult);

  // Operations defined in the BinaryOperatorNode interface:
  bool (*Operator)(BinaryOperatorNode_t* pBinaryOperatorNode,
                   BinaryOperator_t* pResult);
  bool (*LeftOperand)(BinaryOperatorNode_t* pBinaryOperatorNode,
                      Node_t** ppResult);
  bool (*RightOperand)(BinaryOperatorNode_t* pBinaryOperatorNode,
                       Node_t** ppResult);
} BinaryOperatorNode_DispatchTable_t;

#endif // INCLUSION_GUARD_BINARY_OPERATOR_NODE

Writing a class that implements the BinaryOperatorNode interface (and thus also the Node interface) entails complying with the binary standard of this interface, in the form of the BinaryOperatorNode_DispatchTable_t and BinaryOperatorNode_t types. A pointer to an instance of such a class must be binary compatible with the latter type, and its pDispatchTable member must be binary compatible with the former type.

DefaultBinaryOperatorNode is such a class. All its operations are virtual, except its constructor (constructors are always bound to statically, as clients are always aware of the identities of the classes they instantiate). Its header file appears as Listing 3.7 and its implementation as Listing 3.8. Figure 3.4 depicts the instance data of DefaultBinaryOperatorNode objects, which are fully compatible with the binary standard shown in Figure 3.3. The contents of the white areas in the figure are dictated by the binary interface standard—the first member of the memory representing an object must be a pointer to a dispatch table, the layout of which is fixed—whereas the shaded areas may contain any data that is convenient for the implementation. This latter space is used for the instance data.

Figure "defaultbinaryoperatornode-instance"
Figure 3.4: Instance data of DefaultBinaryOperatorNode objects

Listing 3.7: DefaultBinaryOperatorNode.h
/**
 * @file
 *
 * This file contains function prototypes for the class operations
 * (static operations) of the <code>DefaultBinaryOperatorNode</code>
 * class.
 */

#ifndef INCLUSION_GUARD_DEFAULT_BINARY_OPERATOR_NODE
#define INCLUSION_GUARD_DEFAULT_BINARY_OPERATOR_NODE

#include <stdbool.h>
#include "Node.h"
#include "BinaryOperatorNode.h"
#include "BinaryOperator.h"

/**
 * Creates an instance of the <code>DefaultBinaryOperatorNode</code>
 * class.
 *
 * @param[in] operator
 *   the binary operator of this node.
 * @param[in] pLeftOperand
 *   the left operand of this node. Must not be <code>NULL</code>.
 * @param[in] pRightOperand
 *   the right operand of this node. Must not be <code>NULL</code>.
 * @param[out] ppResult
 *   a pointer to the variable which shall hold the instance of this
 *   class. Must not be <code>NULL</code>.
 * @return
 *   <code>true</code> if the operation completes successfully,
 *   <code>false</code> otherwise.
 */
bool DefaultBinaryOperatorNode_Create(BinaryOperator_t operator,
                                      Node_t* pLeftOperand,
                                      Node_t* pRightOperand,
                                      BinaryOperatorNode_t** ppResult);

#endif // INCLUSION_GUARD_DEFAULT_BINARY_OPERATOR_NODE

Listing 3.8: DefaultBinaryOperatorNode.c
/**
 * @file
 *
 * This file contains a class, <code>DefaultBinaryOperatorNode</code>,
 * which is the default implementation of the
 * <code>BinaryOperatorNode</code> interface.
 */

#include <stdbool.h>
#include <stdlib.h>
#include <stdarg.h>
#include "Node.h"
#include "BinaryOperatorNode.h"
#include "BinaryOperator.h"
#include "DefaultBinaryOperatorNode.h"

/**
 * This structure represents the instance data of
 * <code>DefaultBinaryOperatorNode</code> objects.
 */
typedef struct
{
  /**
   * The dispatch table used by this instance.
   */
  const BinaryOperatorNode_DispatchTable_t* pDispatchTable;

  /**
   * The binary operator used by this object.
   */
  BinaryOperator_t operator;

  /**
   * The node representing the left operand. This member is never
   * <code>NULL</code>.
   */
  Node_t* pLeftOperand;

  /**
   * The node representing the right operand. This member is never
   * <code>NULL</code>.
   */
  Node_t* pRightOperand;
} DefaultBinaryOperatorNode_InstanceData_t;

static const BinaryOperatorNode_DispatchTable_t
  gBinaryOperatorNodeDispatchTable;

bool DefaultBinaryOperatorNode_Create(BinaryOperator_t operator,
                                      Node_t* pLeftOperand,
                                      Node_t* pRightOperand,
                                      BinaryOperatorNode_t** ppResult)
{
  DefaultBinaryOperatorNode_InstanceData_t* pInstanceData = NULL;

  bool result =
    (pLeftOperand != NULL) &&
    (pRightOperand != NULL) &&
    (ppResult != NULL);

  if (result)
  {
    pInstanceData =
      malloc(sizeof(DefaultBinaryOperatorNode_InstanceData_t));
    result = pInstanceData != NULL;
  }

  if (result)
  {
    pInstanceData->pDispatchTable = &gBinaryOperatorNodeDispatchTable;
    pInstanceData->operator = operator;
    pInstanceData->pLeftOperand = pLeftOperand;
    pInstanceData->pRightOperand = pRightOperand;
    *ppResult = (BinaryOperatorNode_t*)pInstanceData;
  }
  else if (ppResult != NULL)
  {
    *ppResult = NULL;
  }

  return result;
}

static void DefaultBinaryOperatorNode_Destroy(
  BinaryOperatorNode_t* pBinaryOperatorNode)
{
  if (pBinaryOperatorNode != NULL)
  {
    DefaultBinaryOperatorNode_InstanceData_t* pThis =
      (DefaultBinaryOperatorNode_InstanceData_t*)pBinaryOperatorNode;

    Node_Destroy(pThis->pLeftOperand);
    Node_Destroy(pThis->pRightOperand);

    free(pBinaryOperatorNode);
  }
}

static bool DefaultBinaryOperatorNode_IsConstant(
  BinaryOperatorNode_t* pBinaryOperatorNode, bool* pResult)
{
  bool result = (pBinaryOperatorNode != NULL) && (pResult != NULL);
  DefaultBinaryOperatorNode_InstanceData_t* pThis =
    (DefaultBinaryOperatorNode_InstanceData_t*)pBinaryOperatorNode;
  bool isLeftOperandConstant = false;
  bool isRightOperandConstant = false;

  if (result)
  {
    result = Node_IsConstant(pThis->pLeftOperand,
                             &isLeftOperandConstant);
  }

  if (result)
  {
    result = Node_IsConstant(pThis->pRightOperand,
                             &isRightOperandConstant);
  }

  if (result)
  {
    *pResult = isLeftOperandConstant && isRightOperandConstant;
  }
  else if (pResult != NULL)
  {
    *pResult = false;
  }

  return result;
}

static bool DefaultBinaryOperatorNode_Operator(
  BinaryOperatorNode_t* pBinaryOperatorNode,
  BinaryOperator_t* pResult)
{
  bool result = (pBinaryOperatorNode != NULL) && (pResult != NULL);
  DefaultBinaryOperatorNode_InstanceData_t* pThis =
    (DefaultBinaryOperatorNode_InstanceData_t*)pBinaryOperatorNode;

  if (result)
  {
    *pResult = pThis->operator;
  }
  else if (pResult != NULL)
  {
    *pResult = BinaryOperator_UNDEFINED;
  }

  return result;
}

static bool DefaultBinaryOperatorNode_LeftOperand(
  BinaryOperatorNode_t* pBinaryOperatorNode,
  Node_t** ppResult)
{
  bool result = (pBinaryOperatorNode != NULL) && (ppResult != NULL);
  DefaultBinaryOperatorNode_InstanceData_t* pThis =
    (DefaultBinaryOperatorNode_InstanceData_t*)pBinaryOperatorNode;

  if (result)
  {
    *ppResult = pThis->pLeftOperand;
  }
  else if (ppResult != NULL)
  {
    *ppResult = NULL;
  }

  return result;
}

static bool DefaultBinaryOperatorNode_RightOperand(
  BinaryOperatorNode_t* pBinaryOperatorNode,
  Node_t** ppResult)
{
  bool result = (pBinaryOperatorNode != NULL) && (ppResult != NULL);
  DefaultBinaryOperatorNode_InstanceData_t* pThis =
    (DefaultBinaryOperatorNode_InstanceData_t*)pBinaryOperatorNode;

  if (result)
  {
    *ppResult = pThis->pRightOperand;
  }
  else if (ppResult != NULL)
  {
    *ppResult = NULL;
  }

  return result;
}

/**
 * This is the dispatch table for the <code>BinaryOperatorNode</code>
 * interface implemented by this class.
 */
static const BinaryOperatorNode_DispatchTable_t
  gBinaryOperatorNodeDispatchTable =
{
  DefaultBinaryOperatorNode_Destroy,
  DefaultBinaryOperatorNode_IsConstant,
  DefaultBinaryOperatorNode_Operator,
  DefaultBinaryOperatorNode_LeftOperand,
  DefaultBinaryOperatorNode_RightOperand
};

The DefaultBinaryOperatorNode_InstanceData_t type represents the instance data of DefaultBinaryOperatorNode objects. The first member is a pointer to the dispatch table, in accordance with the BinaryOperatorNode_t type. The members declared after the pointer to the dispatch table are used for the actual instance data.

All instances of a certain class use the same dispatch table, as the only thing that varies between objects of the same class is their state. The behavior of different objects of the same class differs only insofar as they have different state, as they use the same implementation. As such, the dispatch table is statically allocated as the constant structure gBinaryOperatorNodeDispatchTable at the end of Listing 3.8. All instances of a class point to the same global dispatch table. While the formal compile-time type of this variable is BinaryOperatorNode_DispatchTable_t, it is binary compatible with Node_DispatchTable_t as well.

Except for the constructor, all functions are declared static, and as a result their symbol names are not available outside the compilation unit in which they are defined. They are, however, still accessible to other compilation units as their addresses are stored in the dispatch table. Essentially, this approach subverts the standard means that C provides for calling external functions (external linkage), enabling the implementation of dispatch in an object-oriented fashion.

All operations accept a pointer to a variable of the BinaryOperatorNode_t type. This variable needs to be typecast to the DefaultBinaryOperatorNode_InstanceData_t type in order to access the instance data (predictably, the local variable used to access the instance data is called pThis).

This example program essentially implements a complete, if simple, object model through the conventions it adopts. There is room for improvement, though, which is the topic for Chapter 4.

Footnotes

  1. Calls to Node::Destroy(), and other destructors in this thesis, are guaranteed to succeed at all times. Destructors that may fail are difficult to handle well, especially when called from other destructors (if a destructor destroys some of its members before encountering a member destructor that fails, it is difficult to recover).
  2. Having knowledge of the memory layout is necessary, but not sufficient, to make use of objects that conform to this object model. See section 4.4.
  3. This approach only works when extending a single interface, known as single interface inheritance. Extending multiple interfaces is discussed in Chapter 4.