FSM Error Handling

An FSM needs flexible error handling because many errors are not severe enough to prevent the FSM from proceeding normally. In theory, states within the FSM should handle all errors, rather than propagating error status codes back up the call stack. Although this always works, strict adherence to error states can be too restrictive, so implementations based on the FSM library can use a variety of error-handling procedures. The real error-handling dilemma concerns non-catastrophic errors. For example, suppose the FSM finds one event-state-transition match, but the guardian fails, so the transition fails. Is this an error? Probably not, although good programming practice would ensure that some event-condition pair matches, for completeness. Alternatively, several event-state-transition matches could all have guardians that succeed. This is most likely an error. Thus, depending on the circumstances, there are shades of interpretation of what constitutes an error.
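The shades of interpretation described above can be made concrete with a small classifier. The following is only a sketch in standard C++, not code from the FSM library: the Transition struct, the MatchResult enumeration, and the classify function are hypothetical names introduced for illustration, and a transition without a guardian is assumed to be always enabled.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a library transition: an event, a source state,
// and an optional guardian condition.
struct Transition {
    std::string event;
    std::string fromState;
    std::function<bool()> guard;  // empty means "no guardian" (assumed enabled)
};

// The four outcomes discussed in the text.
enum class MatchResult { NoMatch, GuardRejected, Unique, Ambiguous };

// Classify what happens when `event` arrives while the FSM is in `state`.
MatchResult classify(const std::vector<Transition>& table,
                     const std::string& state, const std::string& event)
{
    int candidates = 0;  // transitions matching the event/state pair
    int enabled = 0;     // candidates whose guardian also succeeds
    for (const Transition& t : table) {
        if (t.fromState == state && t.event == event) {
            ++candidates;
            if (!t.guard || t.guard()) ++enabled;
        }
    }
    if (candidates == 0) return MatchResult::NoMatch;
    if (enabled == 0)    return MatchResult::GuardRejected;  // probably not an error
    if (enabled == 1)    return MatchResult::Unique;         // the normal case
    return MatchResult::Ambiguous;                           // most likely an error
}
```

Whether GuardRejected or NoMatch is treated as an error is then an application decision rather than something the classifier decides.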

The FSM library offers several facilities for handling errors.

Error Policy

Errors range in severity. Some are catastrophic, while others are benign, so that a simple warning, or in many cases ignoring the error altogether, suffices. The FSM library provides a policy mechanism, defined by CFSMPolicy, that lets you handle or ignore specific errors and gives you control over how individual error-handling situations are treated.

See FSMPolicy for more details on detecting and handling errors through the policy mechanism.
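To illustrate the general idea of a per-error policy, the sketch below maps kinds of errors to reactions. It does not reproduce the CFSMPolicy interface, which is documented separately: ErrorPolicy, ErrorKind, and Reaction are hypothetical names, and the three error kinds are assumptions drawn from the situations discussed in the introduction.

```cpp
#include <map>

// Hypothetical error situations (assumed, not the library's actual list).
enum class ErrorKind { NoTransition, GuardRejected, AmbiguousTransition };

// Hypothetical reactions, from most lenient to most severe.
enum class Reaction { Ignore, Warn, Fail };

// A minimal policy table: each error kind maps to a reaction, and any kind
// that was never configured falls back to a default reaction.
class ErrorPolicy {
public:
    explicit ErrorPolicy(Reaction fallback = Reaction::Fail)
        : fallback_(fallback) {}

    void set(ErrorKind kind, Reaction reaction) { reactions_[kind] = reaction; }

    Reaction reactionFor(ErrorKind kind) const {
        auto it = reactions_.find(kind);
        return it == reactions_.end() ? fallback_ : it->second;
    }

private:
    Reaction fallback_;
    std::map<ErrorKind, Reaction> reactions_;
};
```

Defaulting to the most severe reaction means that an error kind nobody thought about is never silently ignored.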

Illegal Event Detection

If the FSM execution model is asynchronous, the FSM library currently offers only one level of event-mismatch error detection. Exposing all the events as class methods provides a correctness mechanism that prevents the FSM from receiving events it never supports. This does not guarantee, however, that the FSM will never receive an event that has no corresponding transition in a given state. Strict event-state transition mismatch is easily handled only when events are not queued. Event queuing introduces a delay before an event is serviced, so the direct correspondence between events and states is lost. Further, an observer pattern is required to notify the client of an event-state transition mismatch, since the mismatch is detected at some arbitrary later time, not at the time of the event method call.
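The effect of queuing on mismatch detection can be sketched as follows. This is an illustrative model in standard C++, not the library's implementation: QueuedFsm, post, update, and onMismatch are hypothetical names. The point it shows is that the posting call cannot report a mismatch, so a registered observer is notified later, when the queue is drained.

```cpp
#include <deque>
#include <functional>
#include <map>
#include <string>
#include <utility>

class QueuedFsm {
public:
    using MismatchObserver =
        std::function<void(const std::string& state, const std::string& event)>;

    explicit QueuedFsm(std::string initial) : state_(std::move(initial)) {}

    void addTransition(const std::string& from, const std::string& event,
                       const std::string& to) {
        table_[std::make_pair(from, event)] = to;
    }

    // Register the observer that is told about event/state mismatches.
    void onMismatch(MismatchObserver observer) { observer_ = std::move(observer); }

    // Called by clients. The event is only queued, so nothing can be
    // validated against the state the FSM will be in when it is serviced.
    void post(const std::string& event) { queue_.push_back(event); }

    // Called later by the update loop; mismatches surface only here.
    void update() {
        while (!queue_.empty()) {
            std::string event = queue_.front();
            queue_.pop_front();
            auto it = table_.find(std::make_pair(state_, event));
            if (it != table_.end()) {
                state_ = it->second;
            } else if (observer_) {
                observer_(state_, event);
            }
        }
    }

    const std::string& state() const { return state_; }

private:
    std::string state_;
    std::map<std::pair<std::string, std::string>, std::string> table_;
    std::deque<std::string> queue_;
    MismatchObserver observer_;
};
```

Posting the same event twice shows the problem: the first occurrence may change the state, so the second is rejected in a state the client could not have known about when it made the call.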

 

If the FSM execution model is synchronous, or all events are processed using processEvent, clients can query the FSM to determine whether an event is accepted in the current state, that is, whether a transition matching the event exists in that state. The method

virtual boolean  canHandleEvent(String event);

determines whether an event is valid in the current state. More generally, the method

virtual boolean  canHandleEvent(String stateName, String event);

determines whether an event is valid for the given state.
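A query of this kind amounts to a lookup in the transition table. The sketch below shows one way the two overloads could relate to each other. It is an assumption-laden illustration, not the library's code: TransitionTable, add, and setCurrent are hypothetical names, std::string and bool stand in for the library's String and boolean types, and guardians are ignored.

```cpp
#include <map>
#include <set>
#include <string>

class TransitionTable {
public:
    // Record that `state` has a transition triggered by `event`.
    void add(const std::string& state, const std::string& event) {
        events_[state].insert(event);
    }

    void setCurrent(const std::string& state) { current_ = state; }

    // General form: is `event` valid for the named state?
    bool canHandleEvent(const std::string& stateName,
                        const std::string& event) const {
        auto it = events_.find(stateName);
        return it != events_.end() && it->second.count(event) > 0;
    }

    // Convenience form: the same question for the current state.
    bool canHandleEvent(const std::string& event) const {
        return canHandleEvent(current_, event);
    }

private:
    std::string current_;
    std::map<std::string, std::set<std::string>> events_;
};
```

A synchronous client can call the one-argument form immediately before dispatching an event, since no queued events can change the state in between.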

Failure Return Status

During the FSM update, errors are reported as HRESULT status codes that are funneled back to the client that updates the FSM. As another, “cleaner” error-handling mechanism, C++ exceptions could be thrown within the actions in lieu of returning status codes. Fortunately, error handling with the FSM library can be customized. For example, a derived FSM can use the basic CFiniteStateMachine update mechanism and then detect failures, either to apply default error handling or to trap exceptions in lieu of HRESULT status codes. This allows customization based on the severity of the errors, so that different error-handling procedures can be employed depending on the application, from ignoring the error to aborting the program.
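As an illustration of using exceptions in lieu of status codes, a derived class can turn a failing status from the base update into a thrown exception. The sketch below is portable C++ with stand-in types so that it compiles without the Windows headers: Status stands in for HRESULT, failed for the FAILED macro (which likewise treats negative values as failures), and BaseFsm for CFiniteStateMachine. FsmError, ThrowingFsm, kOk, kFail, and nextStatus are hypothetical names, and kFail is a placeholder rather than a real HRESULT value.

```cpp
#include <stdexcept>
#include <string>

using Status = long;            // stand-in for HRESULT
constexpr Status kOk = 0;       // plays the role of S_OK
constexpr Status kFail = -1;    // placeholder failure code (not a real HRESULT)

// Same sign convention as the FAILED macro: negative means failure.
inline bool failed(Status s) { return s < 0; }

// Exception that carries the original status code.
class FsmError : public std::runtime_error {
public:
    explicit FsmError(Status s)
        : std::runtime_error("FSM update failed, status " + std::to_string(s)),
          status_(s) {}
    Status status() const { return status_; }

private:
    Status status_;
};

// Stand-in for the base class: updateFSM reports a status code.
struct BaseFsm {
    virtual ~BaseFsm() = default;
    virtual Status updateFSM() { return nextStatus; }
    Status nextStatus = kOk;    // lets the example simulate a failing update
};

// Derived FSM that converts failing status codes into exceptions.
struct ThrowingFsm : BaseFsm {
    Status updateFSM() override {
        Status s = BaseFsm::updateFSM();
        if (failed(s)) throw FsmError(s);
        return s;               // success codes pass through unchanged
    }
};
```

Keeping the status inside the exception lets a handler still distinguish benign failures from severe ones.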

The basic Microsoft error status codes potentially used within the FSM library are:

Value            Meaning
S_OK             Function succeeded. Also used for functions that semantically return a Boolean TRUE result to indicate that the function succeeded.
S_FALSE          Used for functions that semantically return a Boolean FALSE result to indicate that the function succeeded.
E_NOINTERFACE    QueryInterface did not recognize the requested interface.
E_NOTIMPL        Member function contains no implementation.
E_FAIL           Unspecified failure.
E_OUTOFMEMORY    Function failed to allocate necessary memory.

The error status codes defined by the FSM library are:

Value                             Meaning
E_STATE_INVALID_TRANSITION        Used where an invalid event occurs that is never supported by any state in the FSM.
E_INTERNAL_PROBLEM_TRANSITION     Used where an event triggers duplicate matches in two or more state transitions.
E_INVALID_EVENT                   Used for the case of a non-supported event in the current state.
E_INVALID_FSM_OPERATIONAL_MODE    FSM library does not support this operational mode. (Should never occur.)

The simplest way to customize error handling is to override the updateFSM method. The override monitors the return value of the base-class update and takes the appropriate diagnostic measure. The policy for this error-handling mechanism would be to ignore all errors and let them propagate back to the outer, overriding updateFSM.

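class MyFSM : public CFiniteStateMachine
{
public:
  virtual HRESULT updateFSM()
  {
       // Run the base-class update and inspect its status code
       HRESULT hr = CFiniteStateMachine::updateFSM();
       if(FAILED(hr))
       {
            // . . . handle error as desired . . .
       }
       return hr;
  }
};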

The CFiniteStateMachine class offers the ErrorMessage(HRESULT error) method, which translates an error code into message text held in an FSM String; the String can be converted to any string type (ASCII or wide) through a conversion operator. For Windows error codes, the method relies on the Win32 FormatMessage function.

       HRESULT hr;
       if(FAILED(hr=CFiniteStateMachine::updateFSM()))
       {
               // %S prints a wide string in the Microsoft C runtime
               fprintf(stderr, "My FSM encountered error: %S\n", (BSTR) ErrorMessage(hr));
               // continue, abort, ...
       }

CFSMFailedEvent wrapping for a Failed Event

The FSM library provides the CFSMFailedEvent class, a specialization of the Event class that defines a failed event tagged with an error message. When a failure has occurred, a transition to the failed state is required. The CFSMFailedEvent class wraps the error message in an event so that it can be passed along in a generic fashion. CFSMFailedEvent handles errors from which recovery is impossible. Other mechanisms could be developed to perform error recovery, or to continue or abort when an error occurs.

As an example of using the CFSMFailedEvent class, assume one is processing an action during an internal or external transition within a given state. The action has detected an error. The action creates a new CFSMFailedEvent that wraps the error message within the constructor and uses the handleEvent method to queue the event.

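HRESULT eventHandler(CFiniteStateMachine * fsm, CFSMEvent * event )
{
       // . . .  detected error
       // Wrap the error message in a failed event and queue it
       handleEvent(new CFSMFailedEvent("fail", "Failed calculation"));
       return S_OK;  // assumed: the failure itself travels in the queued event
}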

A separate generic “fail” action is required as part of the transition to the failed state. It accepts the failed event, verifies that it is of type CFSMFailedEvent, and if so extracts the text, which can then be displayed as a message, logged, “core dumped”, or passed to whatever failure procedure is necessary. In this example, the error message is displayed in a Windows message box, and the done flag is set to an error condition (-1).

HRESULT fail(CFiniteStateMachine * fsm, CFSMEvent * event)
{
       // Declared outside the branches so it is still in scope for the message box
       String errmessage;
       if(event->IsKindOf( RUNTIME_CLASS( CFSMFailedEvent ) ) )
       {
              //if IsKindOf is true, then cast is all right
              CFSMFailedEvent * failedevent= (CFSMFailedEvent *) event;
              errmessage= failedevent->errmessage;
       }
       else {
              errmessage=L"Fail: Non Specific Reason";
       }
       AfxMessageBox((char *) errmessage);
       m_doneFlag=-1;
       return S_OK;
}

See CFSMFailedEvent for more details.