
Developer's Guide to UML 2 — A UML Tutorial


Behavioral Diagrams

Sequence diagrams

As far as object-oriented developers are concerned, the sequence diagram is probably the most important kind of interaction diagram. The sequence diagram has been known in the past as an object interaction diagram, and this name tells us why it's important. Augmenting our use of structure diagrams to plan and depict the types and classes of our system, sequence diagrams allow us to see how the object instances created by, and working within, our structures can actually get the job done.

Interaction diagrams were designed to serve a diverse population of developers and can include a great deal of detail. We will focus on illustrating regular object instances interacting in normal object-oriented ways for typical object-oriented languages. An important function of these diagrams is illuminating how it's all going to work, so one of the things we need to bear in mind is comprehension at a glance. We will want to keep these diagrams easily readable. One of the nice features of a straightforward sequence diagram is that it is pretty intuitive; it almost doesn't need any explanation.

There are two major varieties of interaction diagram: sequence diagrams and communication diagrams, and two minor varieties: interaction overview diagrams and timing diagrams. Communication diagrams were known as collaboration diagrams in UML 1.x. We will be focusing on sequence diagrams which, because of their look, are sometimes called ladder diagrams. Communication diagrams are essentially class diagrams that depict instances rather than classifiers, and that show little message arrows alongside the pathways that the messages are taking; a pathway could be a navigable association, for example. We will return to communication diagrams at the end of this section and give an example.

The instance or lifeline

The verticals of these diagrams represent object instances, living out their lives, receiving messages, roughly in time order from the top of the diagram to the bottom. The box at the top gives the type of the object and the descending line - the lifeline - will receive the messages that arrive. (We see that, unlike most diagrams where up, down, left and right have no meaning, a sequence diagram gives a meaning to vertical ordering.)

The shape of the box at the top of the lifeline corresponds to the shape of its structure diagram definer - its classifier. So for us, it will be a rectangle reminding us of a type or a class.

An interaction can be owned by a classifier, i.e. like a state machine, an interaction can be part of the specification of a class or a type. In this case the label at the head of an instance of the classifier would read self.

Usually the instance is not an instance of some owning classifier, and so the label needs to give the type (type or class) of the instance. As usual we can label the instance box using the identifier: type style label. Notice that the label isn't underlined to indicate that it represents an instance, as it would be in a structure diagram (and as it was in UML 1.x's sequence diagrams).

The nature of the identifier isn't as obvious as it might at first seem. Object instances don't have identifiers. The variables they inhabit or that reference them have identifiers. So, as we are unlikely to be worrying about variable names, what do these identifiers represent? Very often they mean nothing and they are simply omitted, giving what is known as an anonymous object instance. Basic "anonymous" lifeline shows an anonymous instance's lifeline.

The main use of the identifiers is as formal names that serve to relate the appearances of an instance that occurs in more than one place in a diagram. It's entirely likely that an object depicted in a sequence diagram was received by one of the other objects as an argument or return, in which case the formal name serves to link the appearances, as in Formal names.

The lifeline can be dashed, then it looks just like it did in UML 1.x, but it doesn't have to be dashed.


The messages are the horizontals or "rungs" of the ladder diagram. They bear the kind of description that we are used to from the structural diagrams:

name(parameter-list): return-type

where each parameter in the parameter-list is described with:

parameter-name: type

You might recognize that the descriptions have less detail than their structural diagram counterparts would have had. Theoretically one can always look up the class or type to get the full definition of a message. A good CASE tool will usually offer something like pop-up or mouse-over help with all the details.
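To make the mapping concrete, here is a minimal Java sketch of a method whose declaration corresponds to a message label of the form name(parameter-list): return-type. (Account, withdraw and the amounts are invented names, purely for illustration.)

```java
// UML message label: withdraw(amountInCents: long): long
// Account and its starting balance are invented for illustration.
class Account {
    private long balanceInCents = 10_000;

    public long withdraw(long amountInCents) {
        balanceInCents -= amountInCents;
        return balanceInCents;   // the return-type at the end of the label
    }
}
```

The message name, parameter list and return type in the diagram line up directly with the method declaration; the full definition, as noted above, lives with the class.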

It is also possible, as Argument values shows, to give argument values. This is probably of limited value to us. Firstly, there is no easy way to give object argument values; and secondly even with primitive values it's hard to understand what's going on. Even a good CASE tool is going to struggle to show you more information because method overloading, and different literal forms and assumptions, will make it quite tricky to figure out which method is being selected.

Type versus class

Something that would be very useful to be able to show is the type by which the message-receiving object is seen by the message sender. If the head of the lifeline gives us the concrete class of the instance, yet the type of the pointer or reference used to send the message is more abstract, we have no way to show it. Martin Fowler [Fowler 04] makes the eminently sensible suggestion that we show the message going to an interface lollipop protruding from the method box, as shown in Type and class.


You may have wondered, or guessed, at the purpose of the slim rectangle in the last couple of illustrations. It represents the running of a method that is responding to the message impacting it at its top. It is quite likely that a method issues messages of its own, to other object instances. Those messages are shown emerging from the method in question, as shown in Methods and messages.

Already you'll have started to notice one drawback to this otherwise splendidly useful and object-oriented diagram - one needs a piece of paper (or screen) with an aspect ratio of HyperCineramaScope proportions - something like 10:1. This is, again, where one hopes that one's CASE tool has some clever tricks up its sleeve to cope with this.

Returns and other message varieties
Found message

As has already been seen, in Figure C.54 for example, when we have a message whose origins are unknown or unimportant, it is known as a found message and shown as originating from a small black disk.

Lost message

Not quite the opposite of a found message, a lost message - one whose destination is a small, black disk - is a message that doesn't reach its destination. This would not be a regular message in a regular object-oriented programming language.

Messages and returns

The message with the solid little arrowhead, shown in the last few illustrations, is called the synchronous message. For our somewhat more specific purposes, we could call it the "blocking message". It is the regular message provided by object-oriented programming languages like Java and C++. The message sending method loses the thread of control, which then passes to the responding method. When the responding method executes a return, the thread of control is returned to the point from whence the message came and the sending method wakes up, or unblocks. There is normally no need to show the return as a separate message.
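In code, a plain Java method call behaves exactly like this synchronous message; a minimal sketch (Greeter and the strings are invented names for illustration):

```java
// The caller's thread of control passes into greet() and only
// resumes when greet() returns - a synchronous (blocking) message.
class Greeter {
    static String greet(String name) {
        return "Hello, " + name;      // the responding method runs...
    }

    static String caller() {
        String reply = greet("UML");  // caller blocks here until the return
        return reply;                 // ...then the caller wakes up
    }
}
```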

The UML provides a "return" arrow. It is depicted with a dashed line with a stick arrowhead. Although I tend not to use return arrows, finding the implied return entirely adequate, they do bring two gains: through their labels one can omit the return type from the end of a message description, and one can formally name the return as well as type it. See Return arrow.

Asynchronous message

There is another variety of message arrow - the asynchronous message. An asynchronous message is a one-way communication. The sender doesn't block and there can be no reply. If there is a reply, it will be another asynchronous message in the opposite direction. Our example languages don't have such devices built into them, but a message sent over CORBA, the distributed object message broker, for example, has the potential to be asynchronous.
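Although Java has no built-in asynchronous message, one can sketch the idea with a thread: the sender hands the "message" over and carries straight on without waiting for a reply. (The executor and queue below are scaffolding of our own, not anything the UML prescribes; AsyncSender is an invented name.)

```java
import java.util.concurrent.*;

class AsyncSender {
    // The receiver's "inbox"; the sender never blocks on a reply.
    static final BlockingQueue<String> inbox = new LinkedBlockingQueue<>();

    static String send() throws InterruptedException {
        ExecutorService receiver = Executors.newSingleThreadExecutor();
        receiver.execute(() -> inbox.add("handled"));   // fire and forget
        // The sender is free to do other work here; for the sketch we
        // just wait (briefly) to observe that the message arrived.
        String result = inbox.poll(1, TimeUnit.SECONDS);
        receiver.shutdown();
        return result;
    }
}
```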

The original UML 1.x depiction of an asynchronous message was quite intuitive; it was a half stick arrow. Unfortunately UML 2.0 has changed it to an ordinary stick arrowhead on an ordinary line, which is far too easy to draw or read accidentally. If your CASE tool, or whatever, will allow it, it's probably best to turn a blind eye to the suggestion of UML 2.0 and to carry on using the half stick, as shown in Asynchronous message.

Nested messages and methods

We have a little bit of a problem if, during the execution of a method, an object finds itself running another of its methods. This could most obviously be because of a self-message. It could also be because our object messages another object which then messages the first object back again.

We would like to keep time roughly progressing down the page and yet one method is executing within another. We use nested method boxes, as shown in Nested methods.

Older UML documentation used to use the term "recursion" for what I've labeled "self-message" in Nested methods. I would tend to reserve the term recursion for when a method directly messages itself, rather than another method of the object. True recursion gives us even more of a problem in trying to preserve the flow of time. The most obvious depiction would be that at the left of Depicting recursion, but then the portrayal of time is somewhat compromised. An alternative would be that on the right of Depicting recursion, which also introduces the iteration marker.
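The distinction is easy to see in code (Calculator and its methods are invented for illustration):

```java
class Calculator {
    // "Self-message": one method of the object messages another
    // method of the same object - the nested method box case.
    public int doubled(int n) {
        return twice(n);
    }
    private int twice(int n) {
        return n * 2;
    }

    // True recursion: the method messages itself directly.
    public int factorial(int n) {
        return n <= 1 ? 1 : n * factorial(n - 1);
    }
}
```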

Loops and branches

Strictly speaking, UML 2.0 only mentions the iteration marker for communication diagrams, which we will be coming back to in a moment. Iterations (and guards) were available to both forms of interaction diagram in UML 1.x. Perhaps they have been dropped from sequence diagrams because 2.0 sequence diagrams have a new way of showing loops and branches using frames. That would be a pity, as the new way is rather cumbersome to draw, maintain or understand, and I suspect that few sequence diagrams will beneficially use them.

One of our aims for sequence diagrams was to promote their painless and productive use as one of the main ways to force ourselves to understand whether or not our architectures would work, and to ensure that the structure diagrams were not just "cartoons". We must ensure that sequence diagrams don't become lumbering and heavyweight.

Perhaps it was an oversight. Perhaps we can still use a star to indicate iteration. If we do, we can probably also continue to follow it, only when helpful of course, with the iteration condition in square brackets: *[condition]. If the iterations can take place concurrently, we write *//.

Given that one strong and important characteristic of object-orientation is "lists and loops", the iteration marker is going to be quite informative and it doesn't really add much clutter or complication. Branches are another matter however.


Once you start depicting branches in a sequence diagram, you head into realms also covered by activity diagrams - flowchart territory. Just as most use cases tend to focus on a single scenario - the successful scenario or the unsuccessful scenario for example - sequence diagrams (and CRC sessions) should also tend to focus on a single scenario. After all, we are not pretending that our interaction diagrams are a complete picture. Only the structure diagrams can be "completed"; interaction diagrams are illustrations, samples.

Nonetheless, you would probably like to see what the full-blown "frame and fragment" looks like. There will be occasions when a particular critical interaction needs very careful and accurate depiction and then perhaps such a diagram will be worth the time it will take to create, get right and maintain.

We have seen frames already. The main use of a frame is to give a diagram a border and a title. Frames can also be used in sequence diagrams to enclose interaction fragments representing loops and branches. See Frames and fragments.

Each frame has an operator as its "name" in the corner pentagon. The basic ones are:

  • alt

The regions or operands within give mutually exclusive alternatives. Each region has a guard. One region can have a guard of [else].

  • opt

This is like an alt with just one region and no else. (Or an if with no else, if you like.)

  • par

The regions within can be executed in parallel.

  • critical

The frame represents a critical region. Essentially just one thread is allowed in at a time.

  • loop

The operand can contain a boolean expression or upper and lower integral bounds. The region will be executed multiple times.
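If it helps to see where these operators land in ordinary code, here is a hypothetical fragment (FrameDemo and its logic are invented for illustration) showing opt, alt and loop as their control flow equivalents:

```java
class FrameDemo {
    static int run(int n, boolean verbose) {
        int total = 0;
        if (verbose) {                  // opt: one guarded region, no else
            total += 1;
        }
        for (int i = 0; i < n; i++) {   // loop: bounds 0..n
            if (i % 2 == 0) {           // alt: guard [even]...
                total += i;
            } else {                    // ...and its [else] region
                total -= 1;
            }
        }
        return total;
    }
}
```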

Birth and death

The objects arrayed across the top of a sequence diagram are considered to have existed before the sequence diagram was "entered". The lifelines that continue to the bottom of the page indicate that the objects are still alive and well as the sequence diagram is "exited". Sometimes it is useful to show the creation and destruction of objects.


We simply draw a special "message" from the creating object, pointing at the head of the lifeline - the object box - and labeled new. Construction (initialization) arguments could be added. See Object creation and destruction.


With most of our example languages, destruction isn't particularly important. With a huge sigh of relief we note that when an object becomes unreferenced, the system quietly harvests its memory, a process sometimes known as garbage collection. In C++ though, reclaiming object memory is a programmer responsibility (some would say burden), so a C++ sequence diagram might get more interested in showing destruction explicitly. Outside of C++, if it's important to show that an object has reached the end of its useful life and would be ready for garbage collection, then the destruction cross, as shown in Object creation and destruction, might be useful.
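In Java terms, the new message corresponds to constructor invocation, and the destruction cross to roughly the point where the last reference is dropped. (Session and Demo are invented names; the counter is only there so we can observe the construction.)

```java
class Session {
    static int created = 0;

    Session() {            // the "new" message lands in the constructor
        created++;
    }
}

class Demo {
    static int demo() {
        Session s = new Session();  // creation: "new" pointing at the object box
        s = null;                   // last reference dropped: the object is now
                                    // eligible for garbage collection - roughly
                                    // where a destruction cross might appear
        return Session.created;
    }
}
```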

Communication diagrams

A sequence diagram is good at showing message sequencing in an intuitive way - from top to bottom down the page. The forte of the communication diagram, on the other hand, is to show the structure, or architecture, that defines the instances and their relationships; relationships and other structural embellishments like role names are absent from sequence diagrams. Sequence, however, isn't obvious in a communication diagram. Sequence is there, though, because we add nested sequence numbers ("Dewey decimal") to the beginning of each message description. The reason for not just using simple integer numbers is to preserve the cause and effect information that the method boxes give us in a sequence diagram. Many CASE tools don't properly support structured sequence numbers, and not everyone likes them (although I do), so you may well encounter just a "flat" numbering scheme.

In the example of Communication diagram, note that the link is a structural link like an association (or usage). In UML 1.x we were able to give what I thought was very valuable insight into how it was all working by adding a stereotype to make clear exactly what kind of link was enabling the message passing. I would continue to add such information (indeed I sometimes manage to sneak such information into sequence diagrams as well by drawing stereotyped lines between the heads of the lifelines). As they have all been retired from UML 2, I am now free to make up my own preferred list of such stereotypes, as follows:

  • «characterizer» (association, aggregation or composition)
  • «argument»
  • «return»
  • «local»
  • «global»
  • («self»)

In my project standards, «global» wouldn't usually signify actual global visibility like a global variable, because that's going to be vanishingly rare today (isn't it?). It would indicate the (very) occasional singleton, or perhaps a class method or class variable (static). Normally self is geometrically obvious. Unofficial implementation stereotype shows examples.

Interaction overview diagrams

This diagram, like the timing diagram, is new to UML 2.0. Only time will tell how useful both turn out to be. One suspects, and one does not seem to be alone in suspecting, that neither will turn out to be all that important. An interaction overview diagram is a combination of activity diagrams and sequence diagrams.

You will probably not be surprised that my advice is to draw either a sequence diagram (most likely) or an activity diagram (less likely).

My original intention with this appendix was to cover only the half-dozen major diagram formats. I seem to have ended up covering eleven of them in some degree of detail or other. I shall therefore leave the details of interaction overview diagrams to you and the UML documentation (http://www.omg.org). There are no new symbols. You can think of them as sequence diagram fragments connected by diamonds and bars, or you can think of them as activity diagrams in which the activities are sequence diagram fragments.

Timing diagrams

One's doubts here are slightly different. Many disciplines already have perfectly adequate timing diagrams. I can remember being taught motor mechanics in my teens and seeing them there. I leave it to others, more expert than I, to decide if the UML has done an adequate and consistent job of standardizing timing diagrams. Timing diagram and Timing diagram, alternative version do have examples, however.

Time now flows from left to right. Each trace represents an instance and its states. In a sense, timing diagrams are also a little bit of a combination - of state machines and sequence diagrams. There are different ways of portraying a timing diagram. One form emphasizes the existence of the various different states in step form; the other emphasizes timings. You will also note that continuous as well as discrete traces can be depicted.

State machines


A big plus in favor of state machines is that they have a mathematical origin, which means that they are formally sound and well understood. The big minus is that, despite their abstract nature, they are still somewhat low-level, and with no simplifying constraints as to their shape they can become complex and difficult to check for correctness. In a sense, the same arguments that were used against the goto in the 1960s could be leveled at state machines and a simpler, more structured alternative sought.

Many of the approaches and notations of the 1970s and 1980s used state machines in specification and modeling. Two of the main inputs to UML (OMT and Booch) used them, and thus they made their way into the UML, where they were initially known as statecharts.

UML 2.0 has made some small changes to state machines, mostly concerning their consistency and integration with the other diagram types.

On the face of it, the most obvious use for a state machine is to provide an abstract picture of the states of denizens of the model (those things mostly being entities, types and objects as far as we are concerned). In an abstract state model one can talk about the different states in which something of interest could find itself, without actually needing to talk about attribute or variable values.

It's a pity that an adjective from the lore of state machines is often dropped - finite state machines. An important characteristic of these models is that they can, and should, portray each and every one of the finite number of states in which something of interest could find itself. State machines can be completed; they are not samples or snapshots as, for example, sequence diagrams are.

In a state machine the lines (or arcs or edges) are as important as the boxes (or nodes); and rather than focusing on the states themselves, one can use a state machine to focus on the permitted sequences of events (e.g. messages).

Simple state machine models use states, transitions (between states) and events (inputs that cause transitions). Complicated state machines specify activities (outputs). UML 2.0 has started to emphasize this difference by distinguishing the simpler protocol state machines (states, events and transitions) from the more complicated behavioral state machines, which add actions. A protocol state machine is indicated by adding the keyword {protocol} after or below the name of the state machine. (I know, I know: why {protocol} and not «protocol»? I don't have the answer. Sometimes the UML uses keywords in guillemets for content variation - like «utility» versus «implementation class» - and sometimes it seems to use keywords in braces (curly brackets) for content variation - like {abstract} and {protocol}. I wish it were more consistent. It would probably say that {protocol} is a constraint, not a variation; «utility», however, is also a constraint, so I struggle with this.)
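A protocol state machine for a hypothetical file handle - states and permitted event orderings only, no actions - might be sketched in Java like this (FileState and its events are invented for illustration):

```java
// Two states, CLOSED and OPEN; events open and close; a transition is
// only permitted from the appropriate source state, which is exactly
// what a protocol state machine specifies.
enum FileState {
    CLOSED, OPEN;

    FileState open() {
        if (this != CLOSED) throw new IllegalStateException("already open");
        return OPEN;
    }

    FileState close() {
        if (this != OPEN) throw new IllegalStateException("not open");
        return CLOSED;
    }
}
```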

There were several warnings in the main chapters which I will briefly summarize here:

  • There cannot be a single interpretation of the semantics of a state machine. Although UML 2.0 has distinguished protocol state machines from behavioral state machines, which does help, you will still need at least two sets of semantics for the different stages where a state machine might be used - in analysis and in design.
  • Don't imagine that every model creature will require a state machine.
  • Having arranged to have as few state machines as possible, minimize the number of state machines that separate events from actions. Putting it another way, expect to get less trouble from protocol state machines than from behavioral state machines.
States (simple states)

A state is depicted, as shown in Simple state, with a softbox - a box with rounded corners. This represents one of the finite number of abstract states in which the creature modeled by the state machine could find itself. The creature is typically one of some set of objects at some stage of exploration or planning - in this book a subject matter entity, an object type, an interface or a class. The creature in question will always be in a state, and in exactly one state. (As we shall see later, the depiction of the state might involve more than one softbox, but nonetheless, our mental model is that the creature is always in one and only one state.)

Be careful. The UML, for example, says in places that a state machine "describes a class". While a class can have state in most languages (class or static variables), it would be quite rare to be modeling the state of a class. What we usually mean, and should keep clearly in mind, is that a state machine associated with a subject matter entity, an object type or a class, is specifying the possible abstract states of the instances of the entity, type or class.

The word abstract has been used a couple of times above. We say abstract state because we would not be planning variables and values. Indeed, in a protocol state machine we are typically specifying something that will not have variables; the states in a protocol state machine are purely specification devices. (State machine users from other disciplines are used to this. When a state machine is used to describe a modem, for example, it is working at a more abstract level than voltages and currents. In software engineering, the bit patterns in the variables are often referred to as state, but the state in our state machines is more abstract than actual bit patterns.)

The state is given a name. Please make the name meaningful. Computers that use state machines (like compiler compilers) can handle state names like S137, S138, etc., but humans cannot. Think about the state of being of the creature in question. Participles are often good names. In English, participles typically end in -ed (past) or -ing (present).

Transitions and events

Depicted with a line with a stick arrowhead, a straightforward transition represents an allowed change of state - from a source state to a target state. A state can source many different transitions and can be the target of many transitions. The "different" in the last sentence is important. Transitions are labeled with the "events" that provoke them: something happens that the creature is sensitive to; furthermore, the event is one that makes a significant difference to the creature or the progress of a protocol. Transition illustrates a transition on an event. There must not be conflict - no ambiguity as to which transition, if any, will be taken. (We will encounter guards in a little while, and we shall see that they can differentiate transitions that might otherwise have been ambiguous.)

There are laughing quotes around "events" because of the different guises that events will take in different models. In an analysis model, an event will represent a significant occurrence in the context of a subject matter entity. In straightforward design models (the kind I suggest you stick to, unless you are doing hard real-time systems), it will be a message.

If an event did represent a message, then in our conceptual model (common-or-garden object-orientation), we would sometimes expect arguments to accompany the message. You could also provide arguments with an event in an analysis model, but if you decide to follow the significant happening conceptual model suggested here, you wouldn't usually need or want to: firstly, we would not be considering processing, so we wouldn't be interested in some method's input information requirements; and secondly, we would not be thinking of an event as a message, where an argument might contribute to the exact identity of the message (the "signature" of the message/method, as OO programmers often term it).

The depiction of arguments is reasonably obvious and will look pretty familiar to many OO programmers. Arguments illustrates a message "event" with an argument.

State machines are reactive. (The next kind of diagram - an activity diagram - is more "proactive".) It is important, therefore, to understand whether or not there could be such a thing as an event that takes place but that isn't a label of a transition from the current state.

Once again, it is likely that the interpretation differs from one stage of modeling to the next. For a subject matter model in this book, we would say that the subject matter entity simply doesn't sense the event; the occurrence might be significant to some other entity, but it isn't significant to the entity being modeled. For a design model, that won't work. Messages are sent to typical objects. Typical objects can't ignore a message. However, a typical object could respond to a message without going to a different state; and a protocol needn't be advanced by all messages. It is probably best to be explicit about whether some message is accepted without a state change or whether the arrival of the message is a problem in some states.

A transition doesn't have to go to a different state, i.e. it can have the same source and target state. So we can easily model those messages that are accepted but that cause no state change with self transitions, as illustrated in Self transition. And we can therefore say that messages (events) to which a state has no response represent an error.

If (and it's a big and important if) you use behavioral state machines, then a self transition involves some complexities, as well as cluttering up a diagram with uninteresting lines. There is an alternative which is probably a better first consideration: the internal transition. Internal transitions are simply depicted as events listed in a compartment of the state symbol, as shown in Internal transition. The occurrence of such events does not constitute an error and does not cause a state change.

We must still ensure that there's no ambiguity. A particular event cannot be associated with both an external and an internal transition. (Internal transitions can have guards, though.)


Guard shows how a guard condition can be attached to a transition. It's a boolean expression, presented in square brackets, that prevents the transition being taken unless it evaluates to true. When depicting newer examples - protocol state machines usually - the UML puts the square-bracketed guard before the event name, but usually puts it after in other examples; it's not clear whether this is significant or trivial. If you had two transitions that were both candidate transitions - two completion transitions, or two transitions labeled with the same event - then guards with mutually exclusive values would make this otherwise illegal situation legal.
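In implementation terms, a guard is just a boolean test in front of the state change; a hypothetical vending machine sketch (VendingMachine, its states and the price are all invented for illustration):

```java
class VendingMachine {
    enum State { IDLE, VENDING }

    State state = State.IDLE;
    int credit = 0;
    static final int PRICE = 100;

    void insertCoin(int cents) {
        credit += cents;
    }

    // UML: vend [credit >= PRICE] - the transition fires only when
    // the guard holds; otherwise the machine stays in IDLE.
    void vend() {
        if (state == State.IDLE && credit >= PRICE) {
            state = State.VENDING;
        }
    }
}
```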

"Guard" now seems to be an unofficial term and precondition the official term.

There are two schools of thought regarding guards. On the one hand, you can argue that they complicate (along with the many other complications the UML offers) the original "simple" mathematical rigor of state machines. On the other hand, their presence can simplify; you can sometimes say with a guard what might have otherwise required several new states and transitions. So, use them when necessary but no more often than that.

What kind of expression does one use? A guard is an instance of the general UML notion of a constraint . The UML leaves it fairly open as to what kind of language you use. It says that "a constraint is described using a specified language, whose syntax and interpretation is a tool responsibility". A predefined language for writing constraints is OCL. You might decide to use the syntax of a programming language that everyone understands, such as Java. The UML suggests that in some situations natural language may be used, although that should perhaps be kept as a last resort.

What kinds of things would you reference in the guard's expression? The most likely things would be accessible values, such as the values of the attributes of an entity or the values of the message arguments and instance variables of an object. In other words a largely macro-state model might occasionally refer to the micro-state if it made the model simpler.

Composite states and hierarchical state machines

We mentioned that state machines are quite low-level. One important consequence of this is that they have a tendency to "take a lot to say a little" - to be "high bandwidth". Associating separate state machines with individual entities, object types or classes helps a lot. Our other tactic is hierarchical decomposition.

What looks at first glance like a state turns out to actually be a group of states (or composite state). The UML says that in some cases it is convenient to hide the decomposition. It's more than just convenient, though. Practically speaking, "hiding" is more likely to be the norm: we would show the contents of a composite state in another, separate diagram - a child diagram, as it is sometimes called. Otherwise, there's a tendency for composite states to make things even more complicated, rather than less. A typical presentation in a CASE tool would be for a double-click on a composite state symbol to take you to the child diagram.

So while you might occasionally see substates drawn within a composite state, as with Composite state making a diagram more complicated, if you are using composite states for the purposes of organization and presentation, you would more likely see the detail hidden and a little composite icon instead, as with Composite state simplifying diagrams.

A good way to view state machine hierarchy is as disposable. What this means is that the hierarchical structure could be removed - sometimes called flattening - without losing any information. A composite state is not a real state; instead it is a convenience grouping of the real states. A composite state says that the creature in question is in some state or other - in one of the states in the composite state, of course - but in higher level diagrams, you need not worry about exactly which particular state of the group.

We do appear to give composite state many of the same accoutrements as a simple state - such things as entry activities and exit activities, for example - but they are just conveniently holding these things on behalf of all the real states inside the composite state. (We will be detailing entry and exit activities in a moment, but you can easily guess what they are.) If you did carry out the exercise of removing a composite state and replacing it with its component states, you would simply have to make the entry activity of the composite state, for example, an entry activity of each and every component state; this would be a chore but would present no conceptual challenges.
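The flattening exercise can be sketched in a few lines of Python. The data shapes here are invented for illustration; the point is that the composite's entry activity is simply copied onto each and every real substate, and the grouping can then be thrown away.

```python
# Sketch of "flattening" a composite state: the composite's entry activity
# is copied onto every contained (real) state, then the grouping is discarded.
# State and activity names are invented for illustration.

composite = {
    "name": "Processing",
    "entry": "logStart",                 # held on behalf of all substates
    "substates": {
        "Validating": {"entry": []},
        "Executing":  {"entry": ["spinUp"]},
    },
}

def flatten(comp):
    flat = {}
    for name, state in comp["substates"].items():
        # The composite's entry activity becomes an entry activity
        # of each and every component state.
        flat[name] = {"entry": [comp["entry"]] + state["entry"]}
    return flat

flat = flatten(composite)
```

A chore, as the text says, but no conceptual challenge: no information is gained or lost.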

A very important question, however, concerns transitions that appear to be transitions to the composite state - to the group. We will encounter initial states and entry points a little later, but another possibility is a history marker.

History pseudo state

If a transition terminates on the composite state's boundary rather than on an entry point, one way of saying which state in the group is the actual target is a history symbol.

By including a circled "H*" inside a composite state, you are saying that a transition to the composite state's group of substates is a transition to the state the machine was in when the composite state was last exited. "What about the first time?", I hear you say. That's what the transition from the history symbol is indicating: there must be exactly one transition from the history pseudo state, and it marks the initial target state. "What if there were composite states in the composite state?" That's what the asterisk is for in "H*". It says that when you enter such a composite state, you return to the state you last left from in the "flattened" set of substates - the set you get if you expand all the contained composite states recursively. Deep history pseudo state has an example of a history marker, although it doesn't seem right for this example and we must look for something else.
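The deep history rule reduces to "record the flattened leaf state on exit; restore it on re-entry; fall back to the H* transition's target the first time". A sketch, with invented state names:

```python
# Sketch of deep history (H*) re-entry: on leaving the composite we record
# the flattened leaf state we were in; a later transition to the composite's
# boundary resumes there. The first entry uses the transition from the
# history pseudo state ("default"). State names are invented for illustration.

class DeepHistory:
    def __init__(self, default_leaf):
        self.default = default_leaf
        self.last_leaf = None        # nothing recorded yet

    def exit(self, current_leaf):
        self.last_leaf = current_leaf

    def enter(self):
        # "What about the first time?" -> the transition from H* decides.
        return self.last_leaf if self.last_leaf is not None else self.default

h = DeepHistory(default_leaf="Idle")
first = h.enter()                 # first entry: default target
h.exit("Inner.Deeper.Waiting")    # leave from a deeply nested leaf
second = h.enter()                # re-entry resumes the recorded leaf
```

A shallow history (plain "H") would record only the top-level substate, which is exactly where the interpretation gets murky, as discussed next.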

"What about if there were composite states inside a composite state marked with an `H' rather than an `H*'?" I give up. The UML calls it a shallow (rather than deep) history. What does it mean to say you return to a composite state with a shallow history indicator, and the last time you left that composite, you left from a state inside a composite state inside the composite state you've just come back to? (With me so far?) The only interpretation that I can come up with is that you would go to the state marked by the initial pseudo state symbol of the composite within the composite. However, I may be wrong; I have never seen it spelled out such that I can understand it. You can, and I would, avoid the "H" and always go for the "H*".

Initial pseudo state

A composite state might have an initial pseudo state depicted with a solid circular blob. For transitions that terminate on the composite state boundary, the initial pseudo state indicates the target state. Without the initial pseudo state, such a transition would be ambiguous. This question of where a transition, apparently to the composite state, is really going, is typically encountered when a "parent" diagram's composite state has its detail hidden in a "child" diagram.

Notice, in Initial pseudo state, that the blob is marking the initial state; the blob itself isn't a state - hence the "pseudo". (It's really rather confusing to refer to it as any kind of state at all.) As the composite's group of states is entered, there is an inescapable and immediate transition to the target of the transition from the initial pseudo state. If we were to "flatten" the state machine hierarchy, we would see all the transitions that had terminated on the composite state boundary going instead to the initial state. If we are doing a behavioral state machine, there can be an activity associated with this initial transition. There must be exactly one transition from the initial pseudo state.

An outermost state machine might have an initial pseudo state, where it would be marking the initial state of a newly created instance of the creature whose state machine it was.

Entry points and exit points

Transitions to the composite state's group don't have to use the default, initial pseudo state; they can be made more specific. If there is no separate "child" diagram, transitions can pierce the boundary of the composite state and target a particular state of the composite, as shown in Targeting particular substates.

If the parent diagram has hidden the detail of the composite state, then a transition will target one of possibly several named entry points. Entry (and exit) points are new to UML 2, and Entry points shows an example.

Similar considerations apply to transitions from particular substates of a composite state to states outside the composite. In the depiction of the substates, the transition goes to one of possibly several named exit points. In the depiction of a composite state with its detail hidden, the transition would be seen emerging from an exit point bearing the same name.

Local and external transitions

If a transition is shown originating at the boundary of a composite state, i.e. not from an exit point, it means that it is as though each and every state within the composite state has that transition. As Local and external transitions illustrates, the target of such a transition can be another state within the composite's group of states, in which case it is termed a local transition (and does not run the composite's exit activity in a behavioral state machine), or it can be a state outside the composite, in which case it is termed an external transition. (If you cannot see the border of the composite state in a particular diagram, there is an alternative notation where a "wildcard" state, labeled "*", is used in place of the composite's border.)

So we see another use for composite states: they can de-clutter the state machine diagram when large numbers of states are all sensitive to some particular event, and they transition to the same target state on that event. Be sparing with this convention because a) it is not absolutely clear to me that UML 2.0 keeps this convention - it seems to imply it, but doesn't state it explicitly enough, and b) the hierarchy is starting to do more than just organize, it is taking on more of a role and becoming less disposable.

Concurrent regions

The UML defines the inside of a composite state as a region, but the composite states we've looked at so far don't make the term very interesting. There is another kind of composite state, however: a composite state that has more than one region, diagrammatically indicated with dashed lines separating the regions. Notice the use of an alternative labeling style which is particularly appropriate here - the label in a tab above the main softbox - as shown in Concurrent regions and an alternative label style.

When a composite state has more than one region, the regions execute concurrently. Straightaway however, I would once again caution you to employ a mental model where the composite state is disposable (although not so easily disposable as that of a one region composite state). I suggest you think of it this way: the context object - the object whose state machine it is - will still be in exactly one state, but that state is depicted with two or more symbols. Now why would you want to do that? Where is the gain in that? For both analysis and design diagrams, there is one, somewhat mechanical, gain: fewer state symbols are needed. Let me explain what might seem like a paradox, via the idea of combinations.

If, as you should, you make an effort over the names of your states, and you find that the honest and precise names involve, explicitly or implicitly, lots of "and"s, you can sometimes carry out a reduction into concurrent regions. Instead of the "execution" of a state machine involving the placing of a single active-state token, you now place a token on exactly one state in each and every concurrent region.

The final effect is the same, but instead of the "and"s being hard-coded into the names of lots of states, they are effectively in the names deduced from fewer states.

In an analysis entity's state machine, such regions can also convey the nature of the entity more clearly: the writer of Concurrent regions would seem to have fingers and stomach that have little to do with each other. (And if we imagine that rather than 2 × 3 = 6 permutations, it was 7 × 7 = 49 permuted states then we would start to see significant savings in diagram complexity.)
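The combinations arithmetic is easy to check mechanically. A sketch, with invented region and state names: drawn as regions you need one symbol per state per region, while the flattened machine needs one symbol per combination, and the machine is still in exactly one state - a tuple, one token per region.

```python
# Sketch of the "combinations" arithmetic for concurrent regions: one token
# per region, and the single effective state is the tuple of region states.
# Region and state names are invented for illustration.
from itertools import product

regions = {
    "Fingers": ["Typing", "Resting"],             # 2 states
    "Stomach": ["Empty", "Digesting", "Full"],    # 3 states
}

# Drawn with regions: one symbol per state, per region.
symbols_with_regions = sum(len(s) for s in regions.values())       # 2 + 3 = 5

# Flattened: one symbol per combination ("TypingAndEmpty", ...).
flattened = [a + "And" + b for a, b in product(*regions.values())]
symbols_flattened = len(flattened)                                 # 2 * 3 = 6

# The machine is still in exactly one state - one token per region.
current = ("Typing", "Digesting")
```

With 7 x 7 states the contrast is 14 symbols against 49, which is where the savings become significant.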

In design, however, you might instead be seeing a hint that the object in question could be better implemented as a composite object.

Forked transitions

What you gain in the reduction of the number of state symbols by using concurrent regions, you might lose in the complexity of the transitions. If you had a three region composite state, and there was an event and a transition, then the transition would really be to the AingAndBingAndCing state; and now that A , B and C are separate symbols, the transition has to be a forked transition to all three. Basically every transition into a composite with n regions, when detailed, has to fork into n transitions each targeting one state of each region, as shown in Concurrent regions and forked transitions; and every transition from such a composite has to have n transitions, each from one state of each region merging into one.

Final pseudo states

On the face of it, final pseudo states , like that of Final pseudo state, might simply be thought to be the counterparts of initial pseudo states, but care is required for their interpretation. Mercifully their description has been made a little clearer in UML 2.0, but even there we are making use of implications that are not spelled out. In nearly all cases the average developer can, and should, forego final pseudo states. Let's examine them a little more deeply though; then you can decide if you agree or not.

Firstly, let's look at their role in regard to transitions from the boundary of a composite state (like the initial pseudo state's role for transitions to the boundary). When there is a final pseudo state inside a composite state, and there is one unlabeled transition from the boundary of the composite state - a completion transition - then a transition from a state in the composite group to the final pseudo state is a transition to the target state of the transition whose source is the composite's boundary. See Final pseudo state and completion transition.

Secondly, what if there is a final pseudo state and there is no transition from the boundary of a containing composite state? These are currently described by the UML as completing that region of the state machine. And if there is only one region, or if all regions are completed, the UML suggests that the context object is terminated. Now that's clear enough on the face of it, but what does it actually mean ? We have to be careful and separate analysis entity state machines from design object state machines. For an analysis entity it's often reasonable to say that the instance has reached the end of its life. But for the typical object-oriented use of a typical object-oriented programming language, it doesn't actually mean anything to say that an object has terminated.

Remember first of all that, for us, a state machine is associated with an object, and not a diagram or an interaction or a sub-system or a system. A state machine either specifies the protocol through which an object can serve some other object, or it specifies the behavior of an object. Well, an object doesn't specify its own termination. Apart from some rarely encountered and fairly "even gurus tread carefully" suicide syntax in C++, the typical object does not specify its own demise. An object is either deleted by other objects, and its state machines cannot foresee that, or it becomes unreferenced and is "garbage collected".

The best interpretation that I can come up with, for a completion-type final pseudo state, is that the object has become unusable - every subsequent message (event) it senses is an error.

Submachine states

When we discussed entry and exit points, we mentioned that their arrival in UML 2.0 meant that submachine states could be defined independently of their use. If you have defined such submachines and you want to use one of them in another state machine, you put a submachine state symbol into your diagram, with appropriate entry and exit points which you connect up, as shown in Submachine state.

There is very little difference between a submachine state and its use, and a composite state and its use in another diagram. The submachine state's label, however, is in two parts: the first part giving its meaningful name in the containing state machine, and the second part giving the name it was defined with. As is typical with any kind of library, it's a good idea to be able to use the name of the library as part of the name. Like Java's package or C++'s (and C#'s) namespace, the UML has namespaces. Many things in UML are namespaces - like classes, for example - but the thing that is little else other than a namespace is the package, which was described in the structural diagrams section (Structural Diagrams). In Submachine state, that's why the defined name is styled package::state machine.


Activities

Now we get to the main element that behavioral state machines have and protocol state machines don't have. Simple state machines have states, events and transitions. Complicated state machines have states, events, transitions and activities (outputs).

Before we get to the details of activities and behavioral state machines, let's pause and remind ourselves what we can accomplish with protocol state machines. We can describe the life history (the UML says lifecycle) of an entity. We can describe the permissible orderings of messages to an object of a particular type. We can even describe what an instance of a concrete class is going to do, because in ordinary object-oriented programming languages, the message determines the method that an object will run, from its class.

If, despite all those possibilities, you insist on wanting to depict the implementation that an object instance is going to use, and you aren't going to assume that a message will bind to an accessible method with the same signature, then (as well as probably being the kind of person who doesn't like to use a can opener, preferring instead a hammer and a chisel) you can have transition activities, entry activities, exit activities and do activities in a behavioral state machine.

Let's start with transition activities . If we restricted ourselves to these, we'd have the kind of state machine that George Mealy, studying electric circuits, proposed in the 1950s. These are the most flexible state machines since each transition to a state could provoke a different activity. If, however, we found that every transition to some state, or every transition from some state, always provoked the same activity, it would be a chore associating the activity with all those transitions; and, worse, if it really was the state that was significant rather than the particular transition, there would be a risk that we would forget to add the activity to any new transitions put into the model.

Another circuits man - Edward Moore - proposed a kind of state machine where it was the states, rather than the transitions, that determined the activities.

The UML arranges things such that we more or less have both. In addition to transition activities, we have state entry activities and state exit activities . It's not quite as bad as it sounds, though. An entry activity is exactly equivalent to a transition activity on each and every incoming transition, and an exit activity is exactly equivalent to a transition activity on each and every outgoing transition, provided that we remember that self transitions are perfectly regular transitions, and that a self transition triggers the exit activity and then the entry activity.
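The equivalence, and the ordering rule for self transitions, can be sketched directly. All the names here are invented; the trace list stands in for actually performing the activities.

```python
# Sketch of entry/exit activities, including the rule that a self transition
# is a perfectly regular transition: it runs the exit activity, then the
# entry activity, in that order. Names are invented for illustration.

trace = []

states = {
    "Armed": {"entry": lambda: trace.append("entry/arm"),
              "exit":  lambda: trace.append("exit/disarm")},
}

def take_transition(source, target):
    states[source]["exit"]()    # exit activity of the source state
    # (a transition activity, if any, would run here, between the two)
    states[target]["entry"]()   # entry activity of the target state

# A self transition on "Armed": exit first, then entry.
take_transition("Armed", "Armed")
```

This is exactly the "entry activity equals a transition activity on every incoming transition" reading: the activity lives on the state, but fires as if attached to each transition.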

What about notation? A transition activity is shown as part of the label of the transition. The name of the activity is written after the name of the event, and separated from the event by a forward slash, as shown in Transition activity, where I have also added a carriage return to aid readability.

Entry activities and exit activities are listed in a compartment of the state symbol, preceded with the keyword entry or exit and a forward slash, as shown in State activities.

This would probably be a convenient place to summarize the state symbol's compartments.

  • If there are substantive compartments, the name can go in a compartment of its own, although it can also go in a little tab above the softbox.
  • There might be a compartment for activities.
  • There might be a compartment for internal transitions.
  • A composite state might have a compartment for its substates.

The activities we've considered so far - transition, entry and exit - are not interruptible. The main distinguishing feature of the next, and last, kind of activity is that it is interruptible. We reach the most exotic of behavioral state machine frills, flounces and furbelows, the do activity. I'll describe it before I fetch out my soapbox. (I'd like to tell you that you could skip the next few paragraphs, but I can't. You will encounter references to, and uses of, this murkiest corner of the UML and you will need to decide what to do about it.)

When a state is entered, and when the entry activity has completed, which it must, the do activity, if defined, commences. If an event occurs that labels a transition from the doing state, then the do activity is terminated, any exit activity carried out and the transition taken. As of the time of writing (2003, ptc/03-08-02), the UML doesn't say what happens to a do activity when an internal transition's event occurs.

Any do activities are listed in the same compartment as the entry and exit activities, but preceded with the keyword do.

If we were object-oriented designer/programmers, and we were confronted with a do , how would we handle it? (Processing considerations have a limited payoff for analysis entities, as discussed in the main text. I would expect it to be very rare that I used a behavioral state machine for an analysis entity.)

You see, when all this stuff was first put together, in Harel statecharts, neither objects nor threads were mainstream. If today's typical object-oriented programming languages had active objects - each object with its own little CPU - implementing behavioral state machines would be easier. Production object-oriented programming languages like Java, Smalltalk, C++ and C#, however, don't have active objects; they all have active threads instead. A typical object can't do anything until a message brings it a thread of control; the object will then use that thread of control to run one of its methods and then it will wave that thread of control goodbye.

So the simple answer is that I don't know how to handle a do . There is nothing obvious or standard that can be suggested. In the literature of those who have deeply studied state machines and their derivatives, the do activity is inevitably either ignored or condemned.

There is also the curious history of the do activity . In one of the UML 1.1 versions (yes, I'm afraid that there was more than one version 1.1 of the UML as it was being taken over by the OMG), do became the keyword that invoked a submachine. In another version I saw, but confess I can't put my hands on so perhaps I imagined it, I would swear that the do got deprecated.

Completion transitions

We have already met this unlabeled transition - one with no trigger event - emerging from a composite state. It was the transition that would be taken when there was a transition to a final pseudo state within the composite (or when every region of a set of concurrent regions had reached a final pseudo state). And the advice was to be sparing and careful with such things.

Completion transitions are also associated with do activities. When a state is the source of a transition with no event label - a completion transition - then as soon as the entry activity (non-interruptible anyway) and any do activity are complete, the completion transition is taken. UML 2 restricts the use of completion transitions to composite states, which strikes me as a little odd. But as I have vowed to move heaven and earth in order to avoid activities in my state machines anyway, I'm not too worried.


Postconditions

A protocol state machine doesn't (cannot) have activities. Someone somewhere must have felt nervous about that. As well as formalizing the distinction between protocol machines and behavioral machines, UML 2.0 added postconditions to transitions, especially for protocol state machines. We have already met the precondition, or guard; the postcondition is almost exactly the same except for being shown, in square brackets, in the position where the transition activity would have been in a behavioral state machine.

State invariants

Both protocol and behavioral state machine states can have boolean expressions attached. They express state invariants - things that must be true whenever in the state in question. They go under the state name and are enclosed in square brackets, as shown in State invariant.
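A sketch of the idea in Python, with invented state names: the invariant is a boolean expression that must hold whenever the machine is in the state; here it is simply checked on entry.

```python
# Sketch of a state invariant: a boolean expression that must be true
# whenever the machine is in the state in question; checked here on entry.
# State names and the account shape are invented for illustration.

invariants = {
    "Overdrawn": lambda obj: obj["balance"] < 0,
    "InCredit":  lambda obj: obj["balance"] >= 0,
}

def enter_state(obj, state):
    if not invariants[state](obj):
        raise AssertionError("invariant of %s violated" % state)
    obj["state"] = state
    return obj

acct = {"balance": -10}
enter_state(acct, "Overdrawn")       # invariant holds

violated = False
try:
    enter_state(acct, "InCredit")    # invariant does not hold
except AssertionError:
    violated = True
```

In a diagram the expression simply sits in square brackets under the state name; the check itself is an implementation choice.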

Events revisited

Now that we have introduced the distinction between protocol machines and behavioral machines, and have advised that many developers will be able to get by quite happily without behavioral machines, we just need to revisit events.

First of all, the arguments. If you are heading toward a state machine model where you have separated the event from the activity (remember that in typical OO scenarios it's actually much simpler than that: the message is the method), then you will need to ensure several things: that each transition event, including self transition events, brings the arguments that any transition activity might need; that all transitions taking the machine to a state with an entry activity or a do activity bring the arguments those activities expect; and that all transitions taking the machine from a state with an exit activity bring the arguments that activity expects. This is clearly going to be onerous and fragile.

UML 2.0 instituted a generalized notion of a trigger, and events are instances of various kinds of triggers. There are four kinds of triggers:

  • call
  • signal
  • time
  • change

For us object-oriented developers, a call event is, to all intents and purposes, a message. It is a received request that provokes the execution of a method that consumes expected, accompanying arguments. This is the kind of event we should try to limit ourselves to in design models if we want a simple and pleasantly mundane life as far as software development is concerned.
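Reading call events as messages makes a protocol state machine nothing more than a table of permissible messages per state. A sketch, with an invented file-like protocol:

```python
# Sketch of a protocol state machine driven by call events (messages):
# the machine records only which messages are permissible in which state.
# The file-like protocol here is invented for illustration.

protocol = {
    ("Closed", "open"):  "Open",
    ("Open",   "read"):  "Open",
    ("Open",   "close"): "Closed",
}

def send(state, message):
    try:
        return protocol[(state, message)]
    except KeyError:
        raise RuntimeError("protocol violation: %s in state %s"
                           % (message, state))

s = "Closed"
for msg in ["open", "read", "read", "close"]:
    s = send(s, msg)             # a legal ordering of messages

bad = False
try:
    send("Closed", "read")       # read before open: not permitted
except RuntimeError:
    bad = True
```

Nothing here says how a method is implemented - that is the whole point of a protocol machine.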

A signal event represents the receipt of an asynchronous communication. As such, there is no fallback interpretation in the languages of the day. The nature of a signal event will be platform dependent. (Although not defined as such in the UML, a call event, interpreted as a message, isn't going to be asynchronous; it will be a block-and-wait.) A signal event is the kind of event (significant occurrence) that our analysis model used.

A time event is what its name suggests. Time is important to us and to computer systems. We can assume that many subject matter entities and just about all computers can use time - moment of cultural time or expired duration - as an event. If we wish, then, in both our analysis and (real-time and rather exotic) design state machines, we can say that a transition will occur at a particular time or when a certain amount of time since an event has expired. The model repository would store whether or not a time event was relative, and in the label or the specification of the label, the word after would be included for an expired duration.

That leaves us with a change event . I've not encountered anyone who is even reasonably sure what these are for. So be careful. You should probably avoid them. Their definition says that change events are implicit, represented by boolean expressions and occur whenever the expression changes from false to true. So there.

State machine specialization

State machines can be specialized. You will also know that I have grave doubts about the advisability and value of doing it. I don't think that I can explain or simplify in any way. Here is what the standard (ptc/03-08-02) says - warts, typos and all:

"A state machine is generalizable. A specialized state machine is an extension of the general state machine, in that regions, vertices and transitions may be added, regions and states may be redefined (extended: simple states to composite states and composite states by adding states and transitions), and transitions can be redefined.

As part of a classifier generalization, the classifierBehavior state machine [ sic ] of the general classifier and the method state machines of behavioral features of the general classifier can be redefined (by other state machines). These state machines may be specializations (extensions) of the corresponding state machines of the general classifier or of its behavioral features.

A specialized state machine will have all the elements of the general state machine, and it may have additional elements. Regions may be added. Inherited regions may be redefined by extension: States and vertices are inherited, and states and transitions of the regions of the state machine may be redefined.

A simple state can be redefined (extended) to a composite state, by adding one or more regions.

A composite state can be redefined (extended) by either extending inherited regions or by adding regions. A region is extended by adding vertices, states and transitions and by redefining states and transitions.

A submachine state may be redefined. The submachine state machine may be replaced by another submachine state machine, provided that it has the same entry/exit points as the redefined submachine state machine, but it may add entry/exit points.

Transitions can have their content and target state replaced, while the source state and trigger is preserved. In case of multiple general classifiers, extension implies that the extension state machine gets orthogonal regions for each of the state machines of the general classifiers in addition to the one of the specific classifier."

State machines can have names, as we saw in the section on submachine states. An alternative permitted depiction is with a classifier box displaying the keyword «statemachine». Perhaps one could use a generalization arrow between two such boxes.

A specialized state machine has {extended} as part of its name, as shown in Extended state machine.

The constraint {final} can be added to states and transitions to indicate that they cannot be redefined or extended in any specializing state machine.

Inherited states in a specialized state machine can be shown with a dashed outline or with a gray outline.

Deferred events

Straying even further into the minefields of state machine exotica, we next reach deferred events. Once again, I think the best thing to do is to quote what the UML 2 (ptc/03-08-02) specification says:

"A state may specify a set of event types that may be deferred in that state. An event that does not trigger any transitions in the current state, will not be dispatched if its type matches one of the types in the deferred event set of that state. Instead, it remains in the event pool while another non-deferred event is dispatched instead. This situation persists until a state is reached where either the event is no longer deferred or where the event triggers a transition."


"Composite states introduce potential event deferral conflicts. Each of the substates may defer or consume an event, potentially conflicting with the composite state, e.g. a substate defers an event while the composite state consumes it, or vice versa. In case of a composite orthogonal state, substates of orthogonal regions may also introduce deferral conflicts. The conflict resolution follows the triggering priorities, where nested states override enclosing states. In case of a conflict between states in different orthogonal regions, a consumer state overrides a deferring state."

If, for example, you are doing hard real-time, you have specialist languages or language extensions, and you do have a need for such elements, please ensure that your interpretation of them is clearly written up in the project standards.
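For those who do need them, the quoted rule boils down to a small amount of event-pool bookkeeping. A sketch, with an invented two-state machine: an event whose type is in the current state's deferred set stays in the pool, and is dispatched once the machine reaches a state that no longer defers it.

```python
# Sketch of event deferral per the quoted rule: a deferred event stays in
# the event pool while other events are dispatched, until the machine
# reaches a state that no longer defers it. All names are invented.

transitions = {("Busy", "stop"): "Idle", ("Idle", "job"): "Busy"}
deferred = {"Busy": {"job"}, "Idle": set()}

def run(state, pool):
    handled = []
    while pool:
        for ev in list(pool):
            if ev in deferred[state]:
                continue                 # stays in the event pool
            pool.remove(ev)
            state = transitions.get((state, ev), state)
            handled.append(ev)
            break
        else:
            break                        # everything remaining is deferred
    return state, handled

# "job" arrives while Busy (deferred), so it waits until "stop" takes
# the machine to Idle, and only then gets dispatched.
final_state, order = run("Busy", ["job", "stop"])
```

The conflict-resolution rules for composite and orthogonal states quoted above are not modeled here; this is the single-region case only.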

Activity diagrams

Our next behavioral diagram is the activity diagram. Activity diagrams are not really object-oriented.


Activity diagrams have become a lot more sensible in UML 2.0. Any minor complaints I might voice should be seen merely as a little griping icing on a grateful cake.

Anyone who is familiar with flowcharts, data flow diagrams and Petri nets is off to a good start. That might seem like a large list of inputs for activity diagrams. You might be glumly thinking that activity diagrams seem to have as many input cultures as C++ did. You might be worrying that activity diagrams are going to be the Ada or C++ of diagrams - everything including the kitchen sink. All I can say to comfort you is that it used to be worse. UML 1.x attempted to make that unholy mixture appear to be just a special form of state machine.

One of the big drivers of the effort that's gone into activity diagram revisions in UML 2.0 has been to make activity diagrams more consistent and easier to understand by moving from semantics rooted in state machines toward semantics rooted more in Petri nets . Petri nets, like that illustrated in A classic Petri net, were invented by Carl Petri in the 1960s. Their unique selling point was concurrency. They are a formal approach, meaning that we understand their mathematics (automata theory again). They have been found useful, for example, in manufacturing, process control, real-time systems, networks and in workflow modeling. There have been many extensions and variations over the years.

Tokens form the basis of the mental model - the semantics. Straight away you can see that we have stepped back in time, or down a level. Tokens were important to software engineering modeling; they were all you had available if you were trying to understand a pre-structured, pre-object-oriented program: "The CPU is here", you would say, placing your finger or token on the listing. "Or here and here and here", if you were multi-tasking/multi-processing.

(The calls and routines of structured design and the messages, methods and objects of object technology enabled us to work at a higher level and say that, "the bestResponse routine has called the assessRisk routine", or that "the response object has messaged the building object to enumerateExits ".)

So we have, in an activity diagram, a picture of a network . In the network are places - nodes as they are called - where a token can be. You, or a tool, can actually "animate" such a model, and it does help, and it's probably why Petri nets were successful in their field: you draw the network on paper and move "snakes and ladders" or "tiddlywinks" tokens around.

The nodes are connected, and the tokens can travel along the connections. These connections are officially called edges, believe it or not - blame the graph theorists. We, like much of the text of the UML, will typically refer to the connections as flows. (Technically, an activity diagram forms a complete flow graph.)
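The "token game" behind these semantics is easy to play in code as well as on paper. A sketch, with an invented order-shipping net: a transition fires only when every input place holds a token; firing consumes those tokens and produces tokens on the output places.

```python
# Sketch of the Petri-net token game underlying activity diagram semantics:
# a transition is enabled when all its input places hold tokens; firing
# consumes them and produces output tokens. The net itself is invented.

marking = {"orderPlaced": 1, "stockChecked": 1, "shipped": 0}

net = {
    # transition name: (input places, output places)
    "ship": (["orderPlaced", "stockChecked"], ["shipped"]),
}

def fire(transition):
    inputs, outputs = net[transition]
    if not all(marking[p] > 0 for p in inputs):
        return False                 # not enabled: some input lacks a token
    for p in inputs:
        marking[p] -= 1              # consume input tokens
    for p in outputs:
        marking[p] += 1              # produce output tokens
    return True

fired = fire("ship")
fired_again = fire("ship")           # input tokens consumed: not enabled
```

Requiring tokens on all inputs is exactly what gives Petri nets their selling point - concurrency and synchronization fall out of the rule.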

Action nodes

The principal node is an action node - something that is done. Action node shows an action node.

I know, I know. Why, if they were trying to simplify and move away from state machines, have they changed UML 2.0 to use the same symbol as a state in a state machine! The only explanations I've seen so far don't work for me.

Every action will be an instance of one of about 30 predefined action types (action classes in the metamodel); see Action definitions from UML super-structure document (ptc/03-08-02). The label gives your particular instance of the action an informal name that is more descriptive than the defined action type name. (These action definitions appeared first in UML 1.5. One reason they're in the UML is to support general-purpose, visual programming.)

For example, in Action node, create witness statement is the descriptive name of a CreateObjectAction , CreateObjectAction being one of the 30 or so prescribed kinds of actions (see Action definitions from UML super-structure document). As yet there is no standard way to describe precisely how the action is actually carried out. The developer or, more likely, the project must choose.

One important thing to draw your attention to is the difference between actions and behaviors. The actions are predefined and listed in Action definitions from UML super-structure document; we could think of them as the built-in machine code instructions. Two of these actions invoke behaviors: CallBehaviorAction and CallOperationAction. Behaviors are the units of application functionality that the developers define; let's think of them as functions or methods. From the simple viewpoint of this book, the CallBehaviorAction chooses a behavior directly - essentially it's a call of a function - whereas CallOperationAction asks an object to invoke one of its behaviors - essentially it's a message to an object. (You have to remember that the UML (still in 2.0) has the rather unhelpful term operation to stand for the client-side perception of the service rendered by an object. Why one can't use the perfectly sensible and intuitive term service, I don't know. I certainly always think "service" whenever I read "operation".)
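The function-call versus message distinction can be sketched in ordinary code (a loose analogy only, not UML action semantics; all the names here are invented for illustration):

```python
# Loose code analogy for the two invoke actions (illustrative names only).

def assess_risk(building):
    """A behavior: a unit of functionality the developers define."""
    return len(building["exits"]) > 1

class Building:
    def __init__(self, exits):
        self.exits = exits

    def enumerate_exits(self):      # an "operation" - the client-side view
        return list(self.exits)     # of a service this object renders

building = Building(["front", "rear"])

# CallBehaviorAction: choose a behavior directly - essentially a function call.
safe = assess_risk({"exits": building.enumerate_exits()})

# CallOperationAction: ask an object to invoke one of its behaviors -
# essentially a message to an object.
exits = building.enumerate_exits()
```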

So in the UML, in various places, and in various ways, we can use behaviors. We can define behaviors with activity diagrams. An activity (diagram) contains, among other things, actions. Some of those actions could invoke behaviors, and such a behavior might itself be defined as an activity (diagram), containing actions ... .


Flows are shown between actions. Flows can either be control flows or data flows .

As you can see in Flows, a flow can have a name label. This would seem to be purely for human readability.

To help with layout, flows can be drawn discontinuously by putting shared labels in little connector circles. There must be no ambiguity or implied forks: for each source connector, exactly one destination connector must have the same label, as shown in Connectors.

The same stick arrowhead is used for both kinds of flow. To determine whether a flow is control or data, one looks at the nodes at the ends of the flow. If one finds data pins, then the flow is a data flow. "Pins?" Read on.


To make clear exactly what data might be leaving or entering a node via a data flow, a node can exhibit input and output pins where the data flows arrive and depart.

Sometimes a flow will be implemented with a call of a routine, in which case the data pins represent the arguments.

There is only ever one kind of control that arrives at a node - permission to proceed. So control flows have no pins associated with them. Pins illustrates a control flow and a data flow with pins.
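If, as suggested above, a flow is implemented with a call of a routine, the pins map onto the call quite directly (a sketch under that assumption; the names are invented):

```python
# A data flow realized as a routine call: the input pins correspond to the
# arguments, and an output pin corresponds to the return value.

def take_statement(witness_name, incident_id):
    # input "pins": witness_name, incident_id
    statement = {"witness": witness_name, "incident": incident_id}
    return statement                # output "pin": the statement produced

# A control flow carries no data - only permission to proceed, which in
# sequential code is simply "the previous statement has finished".
stmt = take_statement("A. Jones", 42)
```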

Object (or data) nodes

If the input and output pins at the ends of a data flow are of the same type, the pins can be omitted and an object node drawn in the middle of the data flow, as shown in Object node.

Like pins, an object node is just a place in the activity network for a token in transit to rest. Both pins and object nodes have the semantics of "push". Data, or control, is pushed down the flow and the recipient has no choice but to become more ready to react. (We can't just say that the recipient must react. When concurrency is introduced in a moment, we will see that more than one flow might be bringing tokens, and the recipient will react (Petri net people say "fire") when each and every incoming flow's token has arrived.)

There is no difference, graphically, between an object node representing data (passive data) and an object node representing an object (active data).

If an object node represents more than one instance of a type, then its name begins with set of , as illustrated in A set object node.

Data stores

To support "pull" semantics and to support those who want to draw the UML equivalent of the data flow diagrams of old (the mainstay of structured analysis), a data flow can be sent to a data store . In Data store you can see that the symbol is the same shape as an object node, but the «datastore» stereotype is added.

Unlike data or an object arriving at an action node, as an activating token, data or objects sent down a data flow to a data store accumulate in the store. The data flow from a data store to an action indicates that the store is read by the action at the action's own convenience.

The data items or objects in the data store are never duplicated or deleted. An "object" entering the data store would replace any identical "object" already in the store. When an action accesses an "object" from the store it accesses a copy, and the "object" remains in the store.
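Those «datastore» rules - replace on identical arrival, hand out copies on read - can be sketched as follows (assuming, purely for illustration, that "identical" means "same id key"):

```python
import copy

# A sketch of «datastore» semantics: an arriving object replaces any
# "identical" object (here: same id), and a read hands out a copy,
# leaving the stored object untouched.

store = {}

def put(obj):
    store[obj["id"]] = obj                 # replaces, never duplicates

def get(obj_id):
    return copy.deepcopy(store[obj_id])    # reader gets a copy

put({"id": 1, "name": "witness statement"})
put({"id": 1, "name": "revised witness statement"})   # replacement, not a second entry

item = get(1)
item["name"] = "scribbled on"   # mutating the copy leaves the store unchanged
```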

Soapbox interlude

At their simplest, activity diagrams can be used to portray the kinds of things that the ancient flowcharts portrayed. At a more sophisticated level, they can portray the kinds of things that the concurrent descendant of the flowchart - the Petri net - portrayed. More layers of sophistication add the capability to portray the kinds of things that the hierarchical diagrams of the structured era portrayed, such as data flow diagrams. (These layerings are official: the contents of activity diagrams are defined in the UML metamodel in several sub-packages.)

Activity diagrams are, therefore, the least object-oriented of our diagrams. They are for developers who are not using object technology, or for the rare occasions in the lives of experienced object-oriented developers when they are trying to see the processing wood for the object trees.

Some of the literature associated with UML 2.0 talks about the new activity diagrams having support for projects that are incrementally adopting object technology. Now you can certainly re-orient a system to object technology, you can even re-orient a subsystem to object technology, but I would be very worried if anyone thought that the new activity diagrams legitimized any piece of software that was a bit "object-oriented". You do not want to keep all the tricky aspects of, say, structured design, and add the tricky aspects of object-oriented design; you want to replace the tricky aspects of structured design, or whatever, with the tricky aspects of object-oriented design.

Decisions and merges

If one is considering even the simplest flow of control, then one needs branches based on decisions.20 As you can see in Decision node, the symbol is the traditional symbol, although not quite as big as it used to be - a diamond.

There must be one flow entering a decision node and there will typically be several flows exiting it. Each outgoing flow is labeled with a guard. The guards are evaluated; they must be mutually exclusive and one must be true (unless there's an [else]). If no guard is true, or if more than one guard is true, then the behavior is undefined. Guards can be expressed in OCL (the object constraint language). One and only one flow exiting a decision may be labeled with the predefined guard [else]. If there is a flow labeled with [else] then it is taken if none of the guards of the other flows are true.
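In code terms, a decision node with guards behaves like a mutually exclusive if/elif chain with an optional else arm (a sketch; the guard expressions are invented):

```python
# Decision-node guards as an if/elif/else chain (illustrative guards only).

def route(suspect_age):
    if suspect_age < 18:        # guard [age < 18]
        return "juvenile process"
    elif suspect_age >= 65:     # guard [age >= 65]
        return "assess fitness"
    else:                       # the predefined [else] guard: taken only
        return "standard process"   # when no other guard is true
```

Note that the elif chain is stricter than UML requires: UML leaves the behavior undefined if the guards overlap, whereas elif silently picks the first match.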

A merge is effectively the opposite of a decision. It is typically used to emphasize that, irrespective of which of a mutually exclusive set of flows from a decision has been taken, all those possible flows bring one to the same point, as shown in Merge node.

If a merge is followed immediately by a decision then the two diamonds can be combined, as shown in Combined merge and decision node.

Notice that no label is given to the decision node itself (unlike traditional flowcharts). We can sometimes reduce redundancy in the guard expressions, however, by writing the question in a stereotyped note and reducing the guards to the answers. See Decision input.

Forks and joins

Up until now, there has always been a single token traversing our activity diagram network. If we move up a gear and add forks and joins we can move from diagrams like flowcharts up to diagrams like Petri nets. The symbol is the Petri net symbol of old - a bar, as shown in Fork, where it is depicting a fork .

There is one entry flow and several exit flows. However, unlike a decision, the token arriving at a fork is replicated and a copy of the input token flows down each outgoing flow.

The actions that are carried out on the flows exiting a fork can be carried out concurrently. The activity diagram doesn't force things to happen at the same time. An implementation might still use a single thread of control and carry out actions on parallel flows one at a time. What the parallel flows in a UML activity diagram do say is that while there are sequential relationships among the actions on a particular leg of a fork, one can make no assumptions about the sequencing of actions on different legs.

There may come a time when you require that two or more activities that may have executed concurrently are all completed before you proceed to something else. That is what the join is for. (The terms "join" and "merge" are too similar for my liking. I tend to think "rendezvous" rather than "join".) As the tokens arrive at a join they wait until they are all present, and then just one token emerges from the join. See Fork and join.
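One possible realization of fork and join - by no means the only one, as noted above - is with threads; the thread library's own "join" even shares the name (the leg names are invented for illustration):

```python
import threading

# A fork replicates the token onto each outgoing leg; a join waits until a
# token has arrived on every incoming flow. Realized here with threads
# (one possible implementation, not mandated by UML).

results = {}

def search_building():
    results["building"] = "searched"

def interview_witnesses():
    results["witnesses"] = "interviewed"

# fork: one token in, a copy down each leg
legs = [threading.Thread(target=search_building),
        threading.Thread(target=interview_witnesses)]
for t in legs:
    t.start()

# join (rendezvous): proceed only when every leg has delivered its token
for t in legs:
    t.join()

ready = len(results) == 2   # the single outgoing token may now proceed
```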

When the incoming tokens are all control tokens, it is obvious that one control token emerges - all control tokens are the same. When the incoming tokens include data tokens, all the data tokens emerge.

It's possible that the very next thing following a join is another fork. In this case, as with the merge/decision before, the two can be combined, as shown in Combined join and fork node.


At the time of writing (ptc/03-08-02), a loop node is introduced and described. It would seem to be a diagrammatic representation of a C-like (and therefore C++-, Java- and C#-like) for construct. No notation or examples are given, however.

Expansion regions

Also introduced in UML 2.0 is the expansion region. Again its description is incomplete in ptc/03-08-02. It is shown with a dashed softbox, as in Expansion region.

An expansion region is for processing several undistinguished elements of a set. It defines the processing that is carried out on input sets, possibly producing output sets, and whether the processing is carried out concurrently, iteratively or as a stream. The exact difference between iterative and stream is not clear yet.
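The iterative and concurrent readings, at least, have familiar code analogues: the iterative reading is just a map over the input set, while the concurrent reading could hand the elements to a pool (a sketch; the processing function is invented):

```python
from concurrent.futures import ThreadPoolExecutor

# An expansion region applies the same processing to each undistinguished
# element of an input set, producing an output set.

def take_statement(witness):
    return witness.upper() + ": statement taken"

witnesses = ["smith", "patel", "jones"]

# iterative interpretation: process the elements one after another
iterative = [take_statement(w) for w in witnesses]

# concurrent interpretation: the elements may be processed in parallel
# (execution order unspecified; pool.map still gathers results in input order)
with ThreadPoolExecutor() as pool:
    concurrent = list(pool.map(take_statement, witnesses))
```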

Initial nodes and final nodes

An initial node is a small filled-in circle indicating where things begin. You can also think of an initial node as a control token generator. You can have more than one initial node, although it might be clearer to use one initial node with a flow to a fork. Initial nodes illustrates both.

It isn't necessary to have any initial nodes. As we shall see in a minute, actions are packaged up into activities and the activity representation can indicate where it all begins.

There are two kinds of final node: one that terminates a single flow - a flow final node, shown as a circle with a cross - and another that stops the whole activity, i.e. all the flows, wherever they've got to - an activity final node, shown as a bullseye. The cross symbol is fairly easy to remember (for me at least); to remember the bullseye you can think of a target the activity is aiming at, or you can think of a plughole. Final nodes has an example of each.

The flow final is the simpler (and new in UML 2.0). Such a flow simply comes to an end and its token disappears.

An activity might have more than one activity final node. In that case the first one that is reached stops the whole activity. An activity final node can have more than one incoming flow. Again, the first token to arrive stops the activity. An activity can have an output as we shall see in a moment; therefore an activity final node can have an outgoing flow to a data node.


Activities are behaviors. Thus the behaviors invoked by the action CallBehaviorAction or, eventually via an object, by CallOperationAction could be represented as activities. And behaviors can be represented in activity diagrams as activities, which contain actions, which can invoke activities. It is here then, that we find the support for hierarchical organization of activity diagrams. (If your previous experience includes structured programming, you might see this in terms of steps that turn out to be sub-processes. If your previous experience includes data flow diagrams, then you might see this in terms of functions that turn out to be child data flow diagrams.)
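That hierarchy - an activity contains actions, some actions invoke behaviors, and a behavior may itself be defined as an activity - reads naturally as nested calls (a sketch; the names echo the chapter's booking example but are invented here):

```python
# "Activity contains actions; some actions invoke behaviors; a behavior
# may itself be an activity containing actions..." as nested calls.

def find_cell():                    # a leaf-level behavior
    return "cell 4"

def book_suspect(cell):             # another behavior
    return f"suspect booked to {cell}"

def incarcerate_suspect():          # an activity: a behavior whose body is a
    cell = find_cell()              # sequence of actions, some of which
    return book_suspect(cell)       # invoke further behaviors

outcome = incarcerate_suspect()
```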

Part of the greater consistency of UML 2.0 is that the notion of an activity being invoked as a behavior, is a general one. Behaviors aren't just found in activity diagrams. While a behavior might be defined in an activity diagram, it can, for example, also provide a method of a class, or an entry or exit behavior of a state in a state machine. Actions on the other hand are specific to activity diagrams and are not themselves behaviors.


One of the reasons that the activity itself gets a symbol, and isn't just the name of a diagram, is that activities, like all behaviors, can be parameterized.

In An activity, the incarcerate suspect behavior is being documented as an activity, in an activity diagram. The incarcerate suspect behavior needs a suspect; any old suspect - we don't need to depict individual suspects; that is the kind of thing we mean when we say parameterization. The parameter is indicated with an object node breaking the periphery of the activity. The first action, find cell, doesn't use the parameter but the second action does. In action-speak, the book suspect to cell action can't fire immediately, even though its data token is ready; it is waiting for the control token that will flow from find cell when find cell has completed.

Behaviors, and therefore activities, can result in outputs - something would be returned if an activity were implemented as a function or method. Outputs are depicted as output parameters: an object node is shown on the periphery, as illustrated in An activity with an output, but in contrast to an input parameter, an output parameter object node has an input and no outputs (read that a couple of times and hopefully it will make sense).

Streaming parameters

With a stream, the upstream activity is a producer and the downstream activity is a consumer. What distinguishes a stream from a regular data flow is that data tokens will be appearing at the producer's output and the consumer's input while the producer continues producing. Some subtle concurrency is going on here. I have a nagging suspicion I can hear a can of worms being opened. For example, we have to have these rules, and I quote:

  • "All non-stream inputs must arrive for the behavior to be invoked. If there are only stream inputs, then at least one must arrive for the behavior to be invoked.
  • All inputs must arrive for the behavior to finish, that is, for all inputs must arrive [ sic ] for non-stream outputs to be posted and control edges traversed out of the invocation of the behavior.
  • Either all non-stream outputs must be posted when an activity is finished, or one of the exception outputs must be."

For some reason, there is a plethora of ways to indicate a stream: a keyword can be used, the pin can be filled in, or the arrowhead can be filled in. It's probably best to avoid the last one unless you have unqualified faith in your typesetting and rendering processes. Streaming shows a couple of the possibilities.
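One way to picture the producer/consumer behavior of a stream - tokens arriving at the consumer while the producer is still producing - is a generator pipeline (an analogy only, not UML's streaming rules; the names are invented):

```python
# Streaming sketch: the consumer reacts to each token as it arrives,
# while the producer continues producing.

def produce_readings():             # upstream activity: the producer
    for n in range(3):
        yield f"reading {n}"        # stream tokens appear one at a time

def consume(stream):                # downstream activity: the consumer
    processed = []
    for token in stream:            # starts on the first token, doesn't
        processed.append(token + " logged")   # wait for the producer to finish
    return processed

log = consume(produce_readings())
```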

Exception output parameters

If an activity posts data to an exception parameter then the activity is aborted - all flows are terminated. Apart from any data tokens that have already gone to streaming output parameters, any data tokens posted to non-stream outputs never leave the activity.

A little, point-up triangle is placed at the source pin of an exception, as in Exceptions.

Programmers (e.g. Java and C++ programmers) should be careful. These exceptions bear little resemblance to their languages' exception mechanisms. As usual, we remind ourselves that these diagrams are more for process modeling than for the majority of the intended audience of this book.

Local preconditions and postconditions

Comment symbols and stereotype labels can associate preconditions and postconditions with actions. Preconditions and postconditions have been mentioned before. They're a useful contribution to declarative specifications, increasing knowledge about, in this case, some processing, without actually nailing down how to do the processing. See Local preconditions and postconditions.


Known previously as "swimlanes", activity partitions are a kind of activity group - a group whose constituents share some common characteristic. For example, in a workflow activity diagram, different activity partitions could represent different departments. See Activity partitions.

Although an activity diagram tends to suppress the objects, allowing one to focus on the processing irrespective of who's doing it, if an activity diagram were being used to document something object-oriented, activity partitions could be used to indicate the actions being carried out by the different objects.


There is more. UML 2's activity diagrams are the most complex of all the diagrams, but I've certainly covered enough to support the needs of this book.

Use cases

In this book, our main interest in use cases is in their forming part of the requirements. They are not, therefore, a primary subject for this book but they do form an input to the main topics of this book, and developers need to know what to expect of use cases. Also, developers sometimes have to wade in and help with the creation or refinement of use cases. We are interested, then, in use cases that explain how users, devices and other systems will interact with the system-to-be, the system under development.

The most important thing about a use case is not presented graphically in diagrams. Use case diagrams might tell us about the existence of use cases and their names; they might tell us about relationships between use cases and the users of the use cases; and they might tell us about relationships between the use cases. However, use case diagrams don't actually give us the behavior of the use case itself - the "script" or "story" - what actually happens during the use case.

This is often quite adequately described, informally, in text. In my opinion, this is one of those times when a picture isn't usually worth a thousand words. I tend to favor text. Use cases can also be described with preconditions and postconditions, or via interaction diagrams, activity diagrams or state diagrams.

Use case diagrams start to become useful when there are significant numbers of use cases, and those use cases require organizing.

System boundary

Although it usually doesn't advance our understanding by much, a rectangle can be used to separate the use cases from the elements of the outside world. Use case has an example. If a strong picture has already formed as to subsystem packaging of the system-to-be, then named boundaries start to become a little more interesting.

Use cases

The use case itself is represented by an ellipse bearing the name of the use case, as shown in Use case. If the use case is describing an interaction with the system-to-be and a system boundary is showing, the use case is shown inside the system boundary.


A use case will involve elements of the context of the system-to-be, elements outside of the system-to-be. They are interactors or actors and are represented by stick figures. Although it's a small point, it's quite astonishing how many people slip into assuming that interactors are always human. That's why I prefer the term interactor and dislike the stick figure symbol. I prefer (more boring I'll grant you) the box with a stereotype. Interactors shows both possibilities.

As you can also see in Interactors, the actors are connected to the use cases they interact with.

There is a much more important concern regarding actors. It is important to remember that use cases are part of the requirements. They are not part of the analysis or design. Please do not fall into the trap of thinking that actors are entities or objects. If they are, it will be coincidence. There will be many actors that would make very bad objects and there will be many excellent objects that were never actors. Actors are there to give a point of reference for readers whose experience is with the subject matter rather than with software engineering.

And please don't fall into the trap of thinking that the use cases have any obvious or one-to-one correspondence with anything in the design.


Actors can be related with generalizations and so can use cases. It is not at all clear what this means exactly, so it is probably best thought of as an informal classification scheme. It is shown with the usual symbol. Generalization relationship has examples.


It is easy to confuse generalization and extension . If you know C++ or Java, you can think of generalization as being closer to "is a kind of" relationships - like interfaces extending interfaces - and extension as being closer to templates or generics.

Unlike a general use case, an extendable use case knows that it is incomplete and that extending use cases might be along to complete it. An extendable use case defines extension points that extending use cases fill in.

The extension points can be listed in a compartment of the use case ellipse. The scripts of the use cases would locate the extension points and the extensions.

The representation is the dependency - the dashed line with stick arrowhead - with a "stereotype" keyword «extend» , as shown in Extend relationship.


Again, a programming analogy might help understand this relationship between use cases. This is like the subroutine call. One use case is proceeding and then at a defined point it "runs" or includes another use case. Again the dependency arrow is used, this time with a "stereotype" keyword of «include» , as shown in Include relationship.

Well that's it. I hope you found the guide useful. As I said at the outset, I'm currently updating the book appendix from whence it came to reflect the latest (and final) version of UML 2. As to how and when it will appear on line, well that's partly in the hands of my publisher, and whether they elect to do a reprint with an unusually large number of changes for a reprint, or whether they elect to go for a new edition (or do nothing). I will make it clear near the start of this document if it represents an update, so you might check back here in a few months' time.

About The Author

Currently I write about, and deliver training courses on, analysis and design, and in object technology, mainly to financial and physics organizations, including over ten years' work at CERN, the high-energy physics institute and the creator of the web, as well as companies such as banks and telecommunication providers.

I guess it's just the whims of business as to why I've ended up with such an apparently polar client base. However, after finishing a science degree I did work for a year or so in accountancy before moving on to software development, and eventually returning to science via teaching software engineering and OOA / OOD to high-energy physics establishments like CERN, Rutherford-Appleton and DESY.

In the 1980s, I spent a couple of years at a London polytechnic teaching analysis and design, programming and artificial intelligence.

The software development I have done ranges over application areas from TV to double-entry bookkeeping, over jobs from programming to project management, and over architectural levels from boot code to frameworks.

(And - odd coincidence department should you know your rock bands - I was involved in the music industry for a brief while and do play bass:-)


John Deacon


End notes

12. These rectangles - methods or member functions to you and me - were known as activations in UML 1.x. UML 2 has made a determined effort to make sequence diagrams much more complicated, proffering lots of rope, some slightly obscure noose-tying instructions and nearby branches [pun intended - see later]. Firstly, be careful - make your sequence diagrams as simple as you can, but no more simple than that. Getting to the point though, methods are now sometimes referred to by UML 2 as method activations in the explanatory text and as execution occurrences in the formal sections.

13. Prior to the object-oriented era, state machines could have been used to express the state of a system or of a subsystem. With the object-oriented approaches we are usually associating state machines with objects in one way or another. This has helped tame state machines a little.

14. In UML 1.x, state machines (or statecharts as they were then known) were often referred to as dynamic rather than static, a classification which has now been dropped. One of the ways in which this used to confuse things, was that dynamic often carried a connotation of uncompleteable. An important quality of state machines, however, is that they are complete.

15. We mentioned that in mathematics we talk of finite state machines or finite state automata. Well, to be even more precise, we are talking here of deterministic finite state automata.

16. UML 1.x continued to show transitions to a state within a composite by piercing the boundary, even when the states within the composite were not being shown. Such transitions terminated on a little "stub" symbol. Changing over to using named entry (and exit) points means that it is easy to encapsulate and create reusable submachine states. We will return to these a little later.

17. Thus removing a composite state changes the model. Given a local transition, flattening out the composite would cause the exit actions of the states within to start running. Keep composites simple and disposable .

18. UML 1.x referred to transition , entry and exit actions , and to do activities . This was one of the most confusing and underspecified parts of UML 1.x. UML 2.0 provides much needed clarification.

19. UML 1.x said that transition, entry and exit activities took no noticeable time, but UML 2.0's phrasing is probably better.

20. Decision, merge, fork and join nodes, along with various start and stop nodes are all classified as control nodes .