7.2.1 Generating UML sequence diagrams

Several output formats were considered for the tool that generates UML sequence diagrams, including those used by leading CASE tools.¹ In the end, the XML-based Scalable Vector Graphics (SVG) format was selected because of its ubiquity and openness. A cursory search at the end of 2005 turned up no suitable existing sequence diagram generator, so a custom tool had to be written.

The initial plan was to write a Perl script that converted trace files to a standardized XML-based format for expressing program behavior at runtime, and then to convert that data to SVG using an XSLT stylesheet. This approach was abandoned because the only such standardized format found was OMG’s XML Metadata Interchange (XMI), which was deemed too complex. The tool was instead written entirely in Java.

The sequence diagram generator is structured much like a compiler. The front-end parses trace files and returns an intermediate representation, from which a back-end generates an output document. To parse the input data, the front-end uses Java’s built-in support for regular expressions, with an expression similar to the one presented in the footnote in section 7.2.
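The front-end's parsing step can be illustrated with a minimal sketch. The line format used here (`ENTER`, `EXIT`, or `CREATE`, followed by an object identifier and an optional operation name) is purely hypothetical, as are the class and field names. The actual trace format and regular expression are the ones referred to in section 7.2.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch of a regex-based trace line parser; the line format is an assumption. */
final class TraceLineParserSketch {
    enum Kind { ENTER, EXIT, CREATE }

    /** One parsed trace entry (illustrative intermediate representation). */
    static final class Entry {
        final Kind kind;
        final String objectId;
        final String operation; // null when the line names no operation

        Entry(Kind kind, String objectId, String operation) {
            this.kind = kind;
            this.objectId = objectId;
            this.operation = operation;
        }
    }

    // Hypothetical format: "<KIND> <objectId> [<operation>]"
    private static final Pattern LINE =
            Pattern.compile("^(ENTER|EXIT|CREATE)\\s+(\\S+)(?:\\s+(\\S+))?$");

    /** Returns the parsed entry, or an empty Optional for lines that do not match. */
    static Optional<Entry> parse(String line) {
        Matcher m = LINE.matcher(line.trim());
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of(new Entry(Kind.valueOf(m.group(1)), m.group(2), m.group(3)));
    }
}
```

Returning an empty `Optional` for unrecognized lines lets a caller decide whether to skip them or report an error. This is a design choice of the sketch, not a documented behavior of the tool.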

To infer caller–callee relationships, the front-end maintains a call stack: an entry is pushed onto the stack when an operation body is entered and popped off when the body is exited. When the trace file contains an entry signifying that an operation has been entered, the operation at the top of the call stack is assumed to have called it. Likewise, when an object creation entry is encountered, the operation at the top of the call stack is assumed to have created the object. Because not all objects necessarily appear in a trace file, these links are somewhat tenuous and at times indirect. For instance, an operation identified as instantiating a certain object may not have created the object directly; it may instead have called on the services of an object that does not appear in the trace file.
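The call-stack heuristic described above can be sketched as follows. Events are represented as simple `{kind, name}` string pairs, and the `<root>` placeholder stands in for the caller when the stack is empty. Both are assumptions made for illustration and do not reflect the tool's actual data structures.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Sketch of caller/creator inference with a call stack (illustrative only). */
final class CallStackSketch {
    static final String ROOT = "<root>"; // hypothetical stand-in for "no known caller"

    /** Each event is {kind, name}; kind is "ENTER", "EXIT" or "CREATE". */
    static List<String> inferLinks(List<String[]> events) {
        Deque<String> stack = new ArrayDeque<>();
        List<String> links = new ArrayList<>();
        for (String[] ev : events) {
            String kind = ev[0];
            String name = ev[1];
            // The operation currently on top of the stack is the assumed caller/creator.
            String top = stack.isEmpty() ? ROOT : stack.peek();
            switch (kind) {
                case "ENTER":
                    links.add(top + " calls " + name);
                    stack.push(name);
                    break;
                case "EXIT":
                    if (!stack.isEmpty()) {
                        stack.pop();
                    }
                    break;
                case "CREATE":
                    links.add(top + " creates " + name);
                    break;
                default:
                    throw new IllegalArgumentException("unknown event kind: " + kind);
            }
        }
        return links;
    }
}
```

Note that a `creates` link only records which traced operation was active at the time. As discussed above, the object may in fact have been instantiated by an untraced intermediary.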

Two back-ends are available: one prints a text-only version of the trace file to standard output, and the other generates SVG files. The text-only back-end is mostly used for debugging the generator itself. Even so, its output is considerably more readable than a raw trace file, because it uses indentation to convey call depth.
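The indentation scheme of a text-only back-end might look like the following sketch. The event representation (`{kind, name}` pairs), the two-space indent, and the `new` prefix for object creation are all assumptions. The real back-end's output format is not specified here.

```java
import java.util.List;

/** Sketch of a text back-end that indents entries by call depth (illustrative only). */
final class TextBackendSketch {

    /** Each event is {kind, name}; kind is "ENTER", "EXIT" or "CREATE". */
    static String render(List<String[]> events) {
        StringBuilder out = new StringBuilder();
        int depth = 0;
        for (String[] ev : events) {
            switch (ev[0]) {
                case "ENTER":
                    indent(out, depth).append(ev[1]).append('\n');
                    depth++; // nested calls are printed one level deeper
                    break;
                case "EXIT":
                    if (depth > 0) {
                        depth--;
                    }
                    break;
                case "CREATE":
                    indent(out, depth).append("new ").append(ev[1]).append('\n');
                    break;
                default:
                    throw new IllegalArgumentException("unknown event kind: " + ev[0]);
            }
        }
        return out.toString();
    }

    private static StringBuilder indent(StringBuilder out, int depth) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        return out;
    }
}
```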

The SVG back-end uses the open-source Apache Batik library, which allows programs to draw on a standard graphics canvas instead of generating SVG directly—Batik converts the graphics primitives to SVG. The SVG back-end does not attempt to have the Y axis of diagrams reflect real time, as doing so would result in very large diagrams.
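The decision not to map real time onto the Y axis can be illustrated by comparing two layout rules. The sketch below assumes, hypothetically, that each message carries a millisecond timestamp. The margin, row height, and method names are likewise illustrative and are not taken from the tool.

```java
/** Sketch contrasting order-based and time-proportional Y layout (illustrative only). */
final class OrdinalLayoutSketch {
    static final int TOP_MARGIN = 40; // assumed space for the object boxes, in pixels
    static final int ROW_HEIGHT = 20; // assumed vertical distance between messages

    /** Order-based layout: the i-th message is placed on the i-th row. */
    static int[] yByOrder(int messageCount) {
        int[] ys = new int[messageCount];
        for (int i = 0; i < messageCount; i++) {
            ys[i] = TOP_MARGIN + i * ROW_HEIGHT;
        }
        return ys;
    }

    /** Time-proportional layout: the Y offset grows with time elapsed since the first message. */
    static long[] yByTime(long[] timestampsMillis, double pixelsPerMilli) {
        long[] ys = new long[timestampsMillis.length];
        for (int i = 0; i < timestampsMillis.length; i++) {
            long elapsed = timestampsMillis[i] - timestampsMillis[0];
            ys[i] = TOP_MARGIN + Math.round(elapsed * pixelsPerMilli);
        }
        return ys;
    }
}
```

For three messages at 0 ms, 5 ms, and 60,000 ms, the order-based rule yields a diagram 80 pixels tall. The time-proportional rule at one pixel per millisecond would place the last message more than 60,000 pixels down, which shows why a time-faithful axis quickly becomes impractical.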


  1. Computer-aided software engineering (CASE) is an umbrella term for software tools that help organizations develop software. It is most commonly used to refer to tools that help with modeling and design. CASE tools typically help developers model a domain visually using UML notation.