Workflow Design Documentation
Original text by Lars Pind and Khy Huang.
This document describes the software design decisions behind the ]project-open[ workflow.
Related documents include:
II. Introduction
This package is intended to:
It's not going to be possible to build all of these applications just by doing some UI tweaking on top of the workflow package alone. Rather, these packages will be made much skinnier, and development much faster and less risky, by outsourcing some of the more intricate parts to the workflow package.
There's one part of the workflow package for each of the goals above. The engine is implemented as a data model and a PL/SQL API, with a thin wrapper for the API in Tcl. The user-interfaces are implemented as AOLserver Tcl pages.
The scope of the workflow package is currently under discussion. We currently fulfill all the requirements of the Content Management System, which only uses the engine data model and API. The user-interface part is not yet done, and we need to polish the requirements and design for that.
III. Historical Considerations
One of the classic claims of the OpenACS is that it is "data models and workflow". That's not coincidental. For collaborative software, there's going to be a lot of coordination of people working towards the same goal. So workflow is a key part of the problem space that OpenACS is designed to solve.
Historically, every module has implemented its own mini-version of workflow enactment code. Ticket-tracker is a prime example, but other examples include order fulfillment and customer service in the ecommerce module, the recruiting pipeline in the intranet module, and user approval in the OpenACS core.
This has the benefit of making it possible to completely tailor the workflow to the needs of the application. But there are many disadvantages: the user has to visit several different URLs to figure out what's on his plate; the user experience will be different for each module, since each module developer will have done things slightly differently; you can't alter the process without hacking the module code directly; and the module developer needs to write custom code to provide administrative features such as checking on the general performance of a workflow process. More generally, the time saved by speeding up development of workflow-centered applications can be spent on providing even better end-user and administrative tools that will benefit users of any workflow-based application.
IV. Competitive Analysis
The only open-source competitor I've been able to locate is wftk by Vivtek. They have a fairly nice interface for defining workflow processes that we could probably learn from. However, they've chosen a different model with explicit routing constructs, such as "sequential", "parallel", etc. This does limit the expressive power, and it requires special constructs, such as raising a "situation" and having a handler for that situation.
We've also looked at, and strive to adhere to, the standards set forth by the Workflow Management Coalition.
There's also a host of commercial workflow software implementations. We haven't looked at them, but we'd very much like to. I'm not sure how easy it'd be, given that they're commercial, but at least we should have a chance to look at Oracle Workflow.
V. Design Tradeoffs
There are many design tradeoffs in the workflow package. We want it to be easy, providing as much out-of-the-box functionality for the end user as possible, and at the same time powerful enough to handle the workflows that other application developers need to model. Here are the most important tradeoffs made:
For the underlying conceptual model we considered one based on finite state machines (FSMs) and one based on Petri Nets. We decided against finite state machines because FSMs don't support the routing constructs that we needed (parallel, implicit and explicit conditional routing). We also considered some extension of FSMs, but couldn't find any well-defined ones, and we didn't want to invent some model whose semantics might not have been well-considered enough. A workflow is really a distributed protocol, and as any computer scientist knows, these can be really hard to formalize in a way that ensures there are no deadlocks, dead states, etc. Petri Nets provide a formal mathematical model with an abundance of proven analysis techniques that can be used to ensure both static and dynamic properties of the workflow. Besides, W.M.P. van der Aalst made such a good argument in The Application of Petri Nets to Workflow Management that we felt confident it was the right decision. Part of this argument is that there are standard extensions to Petri Nets that are very useful for workflows, such as color, time and inheritance.
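To make the distinction concrete, here is a minimal, purely illustrative sketch of the Petri Net firing rule (the real engine is a PL/SQL data model, not this Python; the place and transition names are invented). Note how a parallel split and an AND-join fall out naturally, whereas a plain FSM would have to enumerate every combined state:

```python
from collections import Counter

class PetriNet:
    """Toy Petri net: places hold tokens; a transition consumes a token
    from each input place and produces one on each output place."""

    def __init__(self, inputs, outputs):
        # inputs/outputs: transition name -> list of place names
        self.inputs = inputs
        self.outputs = outputs
        self.marking = Counter()  # place -> token count

    def enabled(self, transition):
        # Enabled when every input place holds enough tokens.
        need = Counter(self.inputs[transition])
        return all(self.marking[p] >= n for p, n in need.items())

    def fire(self, transition):
        if not self.enabled(transition):
            raise ValueError(f"{transition} is not enabled")
        for p in self.inputs[transition]:
            self.marking[p] -= 1   # consume input tokens
        for p in self.outputs[transition]:
            self.marking[p] += 1   # produce output tokens

# Hypothetical review workflow: "split" routes work to two parallel
# tasks; "join" is an AND-join that waits for both branches.
net = PetriNet(
    inputs={"split": ["start"], "edit": ["p1"],
            "review": ["p2"], "join": ["done1", "done2"]},
    outputs={"split": ["p1", "p2"], "edit": ["done1"],
             "review": ["done2"], "join": ["end"]},
)
net.marking["start"] = 1
net.fire("split")
net.fire("edit")
assert not net.enabled("join")   # still waiting for the parallel branch
net.fire("review")
net.fire("join")
```

The AND-join falling out of the plain firing rule, with no special "parallel" construct, is exactly the expressive-power advantage over the explicit routing constructs mentioned above.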
One downside of the Petri Net approach is that the model is more difficult to grasp than the finite state machine. We've tried to accommodate that problem by differentiating between a "simple" and a "non-simple" (or complex) workflow process. "Simple" basically means we can provide an intuitive, easy-to-understand UI for defining workflows that doesn't require the user to invest any time in learning how the workflow package operates. The idea is that if you only want to do simple, straightforward things, we can abstract away the complexity. However, if you want to do more complex things, then there is no way to avoid knowing what you're doing. Not because our model is hard, but because formalizing a workflow so it accurately models what you want, without introducing problems such as deadlocks or dead states, is inherently hard.
Since the engine will need to be tightly integrated with and customized to suit a specific package, we need some callback mechanism that'll allow custom code to get executed by the engine. The code had to be executed in the database, since we wanted to allow for non-AOLserver/Tcl clients to use it. We considered three general approaches to that: (1) take an arbitrary chunk of PL/SQL code as a varchar and pass it to
We have the concept of workflow attributes, which directs the execution of the workflow. For example, if one of the tasks is to approve or reject something, we'll typically have an
At some point, we implemented manual, per-case assignments by setting and querying a workflow attribute. Conceptually, this is very nice, but with the current implementation of workflow attributes, it doesn't allow multiple assignments to the same task, so we had to scrap that. Since then, we have introduced the notion of roles to provide more flexibility in assigning users to tasks. Roles are assigned to parties, and they are associated with a case and/or task. This adds the ability to group a set of tasks together for assignment (i.e., assign a role to a set of tasks).
We've decided to make the workflow engine have a PL/SQL-only API, with the Tcl API being simple, small wrappers around the PL/SQL API. This has the benefit of being easy to port to an ACS/Java version. The downside is that it's going to be harder to port to PostgreSQL. On the other hand, there is one Tcl-callback, but this is one that's used purely for UI purposes, so that seemed acceptable.
We've decided to keep all historical information about the execution of a workflow, i.e. enabled transitions, tokens, workflow attributes, and tie all that into the journal. This obviously consumes more space and takes more querying time when there's non-current information in the database, but it adds tremendously to our ability to troubleshoot, backtrack and analyze the performance of the workflow.
Some things, such as assignments, may be defined cascadingly at several levels: statically for all workflows, manually for each individual case, or programmatically, by executing a callback that determines who to assign based on roles. For example, a task such as "send out check" would always be assigned to the billing department. A task such as "author article" would be assigned manually by the editor, based on his knowledge of the skills of the individual authors in the "author" role. In the ticket-tracker, the task of fixing a problem would typically be assigned to a party based on who owns the corresponding project or feature area, and would be implemented by a callback from the workflow package into the ticket-tracker package. The callback would include the "module owner" role as its parameter.
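The cascade can be sketched as a most-specific-first lookup. This is an illustrative Python sketch only (the real logic lives in PL/SQL functions, and all names and data here are invented):

```python
def resolve_assignees(task, case_assignments, callbacks, static_assignments):
    """Resolve a task's assignees, most specific source first:
    manual per-case assignment, then a programmatic callback,
    then the static default defined for all workflows."""
    if task in case_assignments:              # manual, per-case
        return case_assignments[task]
    if task in callbacks:                     # programmatic, via callback
        return callbacks[task]()
    return static_assignments.get(task, [])   # static default

# Hypothetical data mirroring the examples above.
static = {"send out check": ["billing department"]}
callbacks = {"fix problem": lambda: ["module owner: jane"]}
per_case = {"author article": ["author: kim"]}

resolve_assignees("send out check", per_case, callbacks, static)
resolve_assignees("fix problem", per_case, callbacks, static)
resolve_assignees("author article", per_case, callbacks, static)
```

The point of the ordering is that a manual per-case decision always wins over automation, and automation wins over the static default.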
VI. API
The API comes in two parts. There's one for manipulating workflow process definitions (the
The workflow package currently implements only the most basic operations: create/drop a workflow, create/drop an attribute of a workflow, determine whether a workflow is simple or not, and delete all cases of a workflow (useful for dropping workflows).
To modify other static properties of workflows, such as the places, transitions and arcs, etc., you'll need to access the tables directly. There might not be much gained by providing a procedural API for this, since the basic design is considered very stable.
Much more thorough is the API for dealing with workflow cases. This provides the handles to interact with the execution of a workflow, which consists of: Altering its top-level state, i.e. start, suspend, resume or cancel a whole workflow case at a time; interact with tasks, i.e. start, cancel, finish or comment on a task; and finally, inform the workflow package that some external action that the workflow package is waiting for has occurred.
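The legal top-level case operations can be pictured as a small transition table. This is an illustrative sketch only; the state names are invented for the example and are not the actual column values used by the package:

```python
# Hypothetical top-level case states and the operations moving between
# them: start, suspend, resume, cancel, as described above.
CASE_TRANSITIONS = {
    ("created",   "start"):   "active",
    ("active",    "suspend"): "suspended",
    ("suspended", "resume"):  "active",
    ("active",    "cancel"):  "canceled",
    ("suspended", "cancel"):  "canceled",
}

def apply(state, operation):
    """Return the new case state, or reject an illegal operation."""
    try:
        return CASE_TRANSITIONS[(state, operation)]
    except KeyError:
        raise ValueError(f"cannot {operation} a case in state {state!r}")

state = "created"
state = apply(state, "start")
state = apply(state, "suspend")
state = apply(state, "resume")
```

Task-level interaction (start, cancel, finish, comment) follows the same pattern one level down, scoped to an individual task rather than the whole case.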
Some notes on the internals are in order: the key word is encapsulation of logic.
All the execution of callbacks, which involve use of
For the cascading features, such as task assignments and task deadlines (static per-context setting, manual per-case setting, programmatic setting), the logic is encapsulated in their own PL/SQL functions,
The token manipulation procedures (produce, lock, release, consume tokens) encapsulate the actions on tokens and ensure the information we keep for the history of a token is consistent.
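A sketch of what such consistency means: a token carries a state and an append-only history, and each operation is only legal from certain states. This is an illustrative Python model, not the PL/SQL procedures themselves; the state names are assumptions:

```python
import datetime

class Token:
    """Toy token whose full history is retained: instead of deleting a
    consumed token, we record each state change with a timestamp."""

    def __init__(self, place):
        self.place = place
        self.state = "free"
        self.history = [("produced", self._now())]

    def _now(self):
        return datetime.datetime.now(datetime.timezone.utc)

    def _move(self, allowed_from, new_state, event):
        if self.state not in allowed_from:
            raise ValueError(f"cannot {event} a {self.state} token")
        self.state = new_state
        self.history.append((event, self._now()))

    def lock(self):                       # reserved by a started task
        self._move({"free"}, "locked", "locked")

    def release(self):                    # task canceled, token freed again
        self._move({"locked"}, "free", "released")

    def consume(self):                    # a transition fired
        self._move({"free", "locked"}, "consumed", "consumed")

t = Token("p1")
t.lock()
t.release()
t.consume()
# Nothing is deleted; the whole lifecycle remains queryable afterwards.
```

Keeping the history append-only is what ties the token tables into the journal described above.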
For workflow to be really functional, we rely on the OpenACS kernel at some point implementing a system for automatically generating forms based on metadata. This is used for setting workflow attributes as part of performing a task. Currently, we do it ourselves in a very ad-hoc fashion. Speaking of attributes, we store a
VII. Data Model Discussion
The knowledge-level data model is quite straightforward given the underlying Petri Net model. There's a table for each of the elements, i.e. places, transitions and arcs. There's also a mapping table between transitions and workflow attributes, saying which attributes should be the output of which transitions. And there's the now-obsolete
The context-level is less obvious. We've decided to store all the callback information at the context level, assuming that they might need to change depending on the context. This is not necessarily the right decision, so we might have to move them to the workflow level later. The code review also raised the issue of whether these should be normalized into a generic
The operational level datamodel also follows the Petri Net pretty closely. There's a table,
Finally, there are a number of views intended to abstract away from whether something is defined at the workflow or context level, to query the actual workflow state (used by
There's a specific process for defining workflows, i.e. the order in which inserts should happen. This is described in the "Defining Your Workflow" section of the Developer's Guide.
Data Model Changes as per Nov 8, 2000 (Planned, but not implemented)
For permissions, the right solution seems to be to make at least
If we also make transitions, places and arcs ACS Objects, we'll have a more consistent data model, and the callback signatures will be easier and more streamlined.
In order to store the attributes and take advantage of the kernel services (generic/specific storage handling, auditing, automatic form generation, etc.), and since the attributes differ from process to process, we'll have to make the process an object type, in addition to being an object. Should we make every version of a process its own object type, or should we have only one object type per process? This is a trade-off. If we want to be able to alter/delete attributes, we'll have to make each version its own object type. If we're more concerned with quickly gathering up all the cases of a particular process, regardless of version, one object type per process is more relevant. But that can be solved with a foreign key. In any event, we'll have to be clever to come up with a good UI and code for summarizing all cases of a process, regardless of version. Finally, there's the ugliness of creating all those object types. I guess the conclusion is: we'll make each version an object type; it's cleaner.
General class hierarchy: Workflow is the abstract ancestor object type of all workflows. Each process, e.g. "
In other words, we have this class hierarchy:
workflow => expenses_wf => expenses_wf_123
Why? We need the version to be an object type if we want to use the kernel's generic attribute storage, or even the automatic form-generation facilities to come. We need the process itself to be an object type if we want a quick way to get all the cases of a specific process, across all versions. Finally, the master generic object type is a good match for the top level.
Contexts are replaced with package instances. Currently, a context is nothing but a token that allows us to exchange some parts of the workflow definition, such as assignments, and make them different for different contexts. The idea is that the same process can be used with different static assignments in two different departments of the same company, or within different companies on the same website. The way this would work is that you install multiple instances of the target application (e.g. ticket-tracker), one per department or company (or, in general, per subsite).
In addition, we'll allow you to install multiple instances of the workflow package. Each instance will only be able to see the workflows created in that specific instance. It would then make sense to split the workflow package into a worklist package and a workflow package: the basic idea of the worklist is that it always shows all the items on your personal worklist, regardless of what process (or whatever else) they belong to. This would make even more sense because there's no reason that there has to be an associated workflow process just to add a task to your worklist. So the worklist package would be more generic. That split takes a bit more thought, though, as we'd have to develop the worklist as a standalone application and figure out the exact interaction between that and the workflow package.
We want to clean up the current callback mechanism in a few ways.
Here are the different types of callbacks (i.e., different signatures), and the concrete callbacks of these types.
/* This is the callback repository */
create table wf_callbacks (
    callback_id     integer,
    pretty_name     varchar,
    description     varchar,
    callback_proc   varchar,
    type            enum(task_date, task_notify, task_assign, guard)
);

/* This is the list of custom arguments for these callbacks,
   i.e., arguments not required by the signature. */
create table wf_callback_args (
    arg_id          integer,
    callback_id     integer,
    pretty_name     varchar,
    description     varchar,
    position        integer,
    type            enum(integer, number, string, attribute, transition)
);

/* The argument type is mainly a UI thing. If you say 'integer', 'number'
   or 'string', we ask the user for such a value, and validate that it is
   in fact a valid integer/number/string. If you say attribute, we present
   the user with a box where he can select one of the existing attributes
   of this workflow. If you say transition, we ask the user to select one
   of the existing transitions, etc. We can easily add more types, e.g.
   places, arcs, maybe an enumeration (a list of options), etc. This is
   all just so that we can create a nice UI for the callbacks. */

/* This is a generic invocation of a callback, which basically just means
   that we've filled in the custom arguments with static values */
create table wf_callback_invocation (
    invocation_id   integer,
    callback_id     integer,
    compiled_call   varchar
);

/* Compiled call is the string to be passed to execute immediate in
   Oracle. It'll contain bind variables for the arguments covered by the
   signature, and the static values for the custom arguments will be put
   directly into the string (don't forget to quote and escape properly). */

/* This table holds the static values for the custom arguments. We don't
   need to query this once the 'compiled_call' column has been set. */
create table wf_callback_invocation_args (
    invocation_id   integer,
    arg_id          integer,
    value           varchar
);

/* This is how we apply the callback to a transition */
create table wf_transition_invocation_map (
    transition_id   integer,
    type            enum(enable, fire, assignment, time, deadline, hold_timeout),
    sort_order      integer,  /* the order, if there is more than one per transition */
    invocation_id   integer
);

/* Here's the guard */
create table wf_arcs (
    ...
    guard_id        integer   /* points to a row in wf_callback_invocation */
);
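To illustrate what building the compiled_call string involves, here is a sketch in Python (purely illustrative; in the actual design this would happen in PL/SQL, and the procedure and argument names below are invented). Signature arguments stay as bind variables, while static custom-argument values are inlined as quoted, escaped literals:

```python
def compile_call(proc, signature_args, custom_values):
    """Build a string suitable for Oracle's EXECUTE IMMEDIATE:
    signature arguments remain bind variables (:name); static custom
    values are inlined, with embedded single quotes doubled."""
    def quote(v):
        if isinstance(v, (int, float)):
            return str(v)
        # Standard SQL escaping: double any embedded single quote.
        return "'" + str(v).replace("'", "''") + "'"

    binds = [f":{name}" for name in signature_args]
    literals = [quote(v) for v in custom_values]
    return f"begin {proc}({', '.join(binds + literals)}); end;"

# Hypothetical callback: two signature args, two static custom args.
call = compile_call("wf_ticket.assign_owner",
                    ["case_id", "transition_key"],
                    ["feature area", 2])
# -> "begin wf_ticket.assign_owner(:case_id, :transition_key, 'feature area', 2); end;"
```

This shows why the wf_callback_invocation_args rows need not be queried at run time: once the literals are baked into compiled_call, only the bind variables remain to be supplied per invocation.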
Finally, the table
Note: The changes described above, as of Nov 8, are not implemented. These design changes will be factored into the next release of workflow. For the next release, the most significant change is reimplementing the engine in Java. It will require changes to the way guards, callbacks, attributes, and object transitions are handled. Don't worry: as part of the next release we'll hopefully include a straightforward upgrade path.
VIII. User Interface
The user interface falls into these separate parts:
IX. Configuration/Parameters
There are two optional parameters under the APM, graphviz_dot_path and tmp_path. These parameters are required for the Graphviz program used to generate the diagrams. Please read the installation guide for more details.
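For readers unfamiliar with how such diagrams are produced: the package renders the workflow net as Graphviz "dot" source and hands it to the dot binary found via graphviz_dot_path. The following is only an illustrative sketch of what generating the dot source might look like (the function and its conventions are invented, not the package's actual code):

```python
def workflow_to_dot(places, transitions, arcs):
    """Emit Graphviz dot source for a workflow net, drawing places as
    circles and transitions as boxes, per the usual Petri Net style."""
    lines = ["digraph workflow {"]
    for p in places:
        lines.append(f'  "{p}" [shape=circle];')
    for t in transitions:
        lines.append(f'  "{t}" [shape=box];')
    for src, dst in arcs:
        lines.append(f'  "{src}" -> "{dst}";')
    lines.append("}")
    return "\n".join(lines)

dot = workflow_to_dot(
    places=["start", "end"],
    transitions=["approve"],
    arcs=[("start", "approve"), ("approve", "end")],
)
# The resulting text would be written under tmp_path and passed to the
# dot binary configured by graphviz_dot_path to produce an image.
```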
X. Future Improvements/Areas of Likely Change
Two major things: as said before, we'll need to work permissions into this system. And we'll need to amend the core data model with a concept of "revisions" of workflow process definitions, so it's possible to revise a workflow without deleting all the old cases that are still using earlier revisions.
For more far-fetched future plans, refer to Future Plans.