Zeta Workflow 02: Failure Handling

Here is what you need to know about workflow failure and exception handling:

  • If you resume the workflow execution after the failure, it will resume with the failing node.
  • Zeta Workflow has no provision for skipping a failing node and resuming execution at the next node. If you are stuck, you are stuck, and that is that.
  • When the workflow fails or throws an exception, state should be saved up to the point of the failure.
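To make the first and third points concrete, here is a toy model in Python. It is only a sketch of the behavior described above, not Zeta Workflow code: the `ResumableExecution` class, the node names, and the simulated gateway failure are all made up for illustration.

```python
class ResumableExecution:
    """Hypothetical stand-in for a workflow execution; not the Zeta API."""

    def __init__(self, nodes):
        self.nodes = nodes      # ordered list of (name, callable) pairs
        self.position = 0       # index of the next node to run
        self.completed = []     # names of nodes that finished (the saved state)

    def run(self):
        # Run from the saved position. A node that raises leaves `position`
        # pointing at itself, so the next run() retries that same node.
        while self.position < len(self.nodes):
            name, action = self.nodes[self.position]
            action()
            self.completed.append(name)
            self.position += 1


attempts = {"charge_card": 0}


def charge_card():
    # Simulated node that fails on its first attempt only.
    attempts["charge_card"] += 1
    if attempts["charge_card"] == 1:
        raise RuntimeError("gateway timeout")


execution = ResumableExecution([
    ("load_cart", lambda: None),
    ("charge_card", charge_card),
    ("send_receipt", lambda: None),
])

try:
    execution.run()
except RuntimeError:
    pass  # state up to the point of failure is still held in `execution`

failed_at = execution.nodes[execution.position][0]
completed_before_resume = list(execution.completed)

execution.run()  # resumes with the failing node; there is no way to skip it
```

Note that the toy offers no method for moving `position` past the failing node, which mirrors the second point: if the node keeps failing, the execution stays stuck there.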

So, how do we deal with failure processing? I think we need to deal with it outside the workflow. There is no such thing as try/catch processing inside the workflow. There is no such concept as an error handler or signal handler.

After each action step, you could in theory examine the result and branch based on error status or some such. But since every path in the workflow is hard wired, you’d quickly build up a monstrous wiring diagram to handle every possible error scenario, every scenario within those scenarios, and so on.

I think the most logical way to deal with failure processing is to cancel the current workflow and fire up the appropriate failure workflow. This would also be how we jump from one workflow to another.
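As a rough sketch of that idea, the calling code could look something like the following. Everything here is an assumption for illustration: workflows are modeled as plain Python callables, and choosing the failure workflow by exception type is just one possible convention, not something Zeta Workflow prescribes.

```python
def run_with_failure_workflow(main, failure_workflows, default):
    """Run `main`; if it fails, abandon it and run a failure workflow instead."""
    try:
        return main()
    except Exception as error:
        # Pick the failure workflow by exception type (an assumed convention).
        handler = failure_workflows.get(type(error), default)
        return handler(error)


def signup_workflow():
    # Hypothetical workflow that fails partway through.
    raise TimeoutError("mail server unreachable")


result = run_with_failure_workflow(
    signup_workflow,
    {TimeoutError: lambda e: f"retry-later workflow: {e}"},
    default=lambda e: f"generic failure workflow: {e}",
)
```

The point of the sketch is that the decision lives entirely outside the workflow, so the workflow's own wiring stays free of error branches.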

Meanwhile, the workflow itself may need to communicate what is supposed to happen next. For example, we may decide that it’s time to terminate this workflow and proceed with “offer group six.”

That’s where the plugins could come in. We could use a plugin to check node execution status, or any other state we care about, and tell the outside world what needs to happen next.

To be sure, a Service Object can communicate with the outside world. It could, for example, update the User State Object. However, a Service Object is called at a specific point in the flow. If you’re not at that point, you don’t hit the Service Object.

A plugin, on the other hand, can be called after every node is executed. The plugin can suspend, end, or cancel the execution. The plugin can examine whatever state needs to be examined and update the User State Object accordingly.

For that matter, the plugin can simply communicate with the outside world via its own internal state. Remember that we created the plugin as an object to begin with, and then attached that object to the workflow execution. Each time the workflow returns control to us (via suspend, end, cancel, or a thrown exception), we can query the plugin as to what to do next.
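Here is a small Python sketch of that arrangement. It is a toy model rather than the Zeta Workflow plugin interface: the class names, the `after_node_executed` hook name, and the `offer_declined` variable are all assumptions made up for the example.

```python
class NextStepPlugin:
    """Hypothetical plugin that records what should happen next."""

    def __init__(self):
        self.next_workflow = None  # internal state the caller queries later

    def after_node_executed(self, execution, node_name):
        # Called after every node, so it sees state a Service Object at one
        # fixed point in the flow would miss.
        if execution.variables.get("offer_declined"):
            self.next_workflow = "offer group six"
            execution.cancel()


class PluggableExecution:
    """Toy execution that calls its plugin after each node."""

    def __init__(self, nodes, plugin):
        self.nodes = nodes
        self.plugin = plugin
        self.variables = {}
        self.cancelled = False
        self.executed = []

    def cancel(self):
        self.cancelled = True

    def run(self):
        for name, action in self.nodes:
            action(self.variables)
            self.executed.append(name)
            self.plugin.after_node_executed(self, name)
            if self.cancelled:
                break  # control returns to the caller


def record_decline(variables):
    variables["offer_declined"] = True


plugin = NextStepPlugin()  # we keep our own reference to the plugin object
execution = PluggableExecution(
    [
        ("show_offer", lambda v: None),
        ("record_answer", record_decline),
        ("thank_you", lambda v: None),
    ],
    plugin,
)
execution.run()

# Control is back with us, so we ask the plugin what to do next.
next_step = plugin.next_workflow
```

Because we hold the plugin object ourselves, no User State Object update is needed for this particular hand-off; the plugin's own attribute carries the answer.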

We might decide that the correct way for the plugin to communicate back is to throw an exception. We catch the PluginException, query the plugin, and proceed accordingly.
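A minimal sketch of the exception variant, again in Python and again as a toy rather than real Zeta Workflow code. Only the `PluginException` name comes from the discussion above; the plugin class, the status strings, and the `run` driver are hypothetical.

```python
class PluginException(Exception):
    """Raised by the plugin to hand control back to the caller."""


class SignallingPlugin:
    """Hypothetical plugin that signals via an exception."""

    def __init__(self):
        self.next_action = None

    def after_node_executed(self, node_name, status):
        if status == "error":
            # Record the decision first, then interrupt the workflow.
            self.next_action = f"run failure workflow for {node_name}"
            raise PluginException(node_name)


def run(nodes, plugin):
    # Toy driver: execute each node, then let the plugin inspect the result.
    for name, action in nodes:
        status = action()
        plugin.after_node_executed(name, status)
    return "finished"


plugin = SignallingPlugin()
nodes = [
    ("validate", lambda: "ok"),
    ("bill", lambda: "error"),
    ("ship", lambda: "ok"),
]

try:
    outcome = run(nodes, plugin)
except PluginException:
    # Catch, query the plugin, and proceed accordingly.
    outcome = plugin.next_action
```

The exception itself carries very little here. It only tells us that the plugin wants attention, and the plugin's internal state says what to do about it.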

Before we come up with a design for failure handling, we would do well to understand the nuances of how failure handling actually works. The next post is a code walkthrough: http://otscripts.com/zeta-workflow-03-failure-handling-walkthrough/.