Workflow engines

From Logic Wiki


Evaluating Elsa vs Workflow Core for PCLS Automation

Introduction

The purpose of this case study is to evaluate two workflow engines, Elsa and Workflow Core, to determine the best fit for integrating workflow automation into PCLS.

The primary challenge was identifying a solution that was feature-rich, easy to use, and well-documented to support rapid implementation.

This spike aimed to reduce uncertainty by testing both tools and assessing their suitability for the project.

Background

The project requires a robust workflow engine to handle complex business processes and checks within Benefits PCLS. Elsa and Workflow Core were shortlisted due to their popularity as open-source .NET-based workflow engines.

While both tools appeared promising at first glance, their usability, documentation, and ease of integration needed further investigation.

Objectives

The spike aimed to:

Compare Elsa and Workflow Core based on key criteria such as documentation, ease of use, features, performance and reliability.

Build working demos with each tool to assess their practicality in real-world scenarios.

Identify risks or challenges associated with each tool.

Recommend the most suitable option for the project.

Methodology

  • Research: Reviewed official documentation, tutorials, and community feedback for both tools.
  • Prototyping: Attempted to build simple workflow demos using Elsa (versions 2 and 3) and Workflow Core.
  • Evaluation Criteria:
    • Quality of documentation
    • Ease of setup and implementation
    • Available features to support manual bypass and resumption of failed workflows, and handling of external events
    • Integration capabilities
    • Community support
    • Performance and scalability

Tools used: JetBrains Rider, .NET Framework, and sample workflow process flows.

Comparison

Criteria | Elsa | Workflow Core
Ease of Use | Difficult due to lack of clear documentation for versions 2 and 3 | Straightforward setup with clear examples provided
Documentation | Insufficient; struggled to find working examples or guidance | Well-documented with comprehensive guides and examples
Features | Rich set of features but hard to explore due to poor documentation | Adequate features that are easy to implement and extend
Integration | Potentially powerful, but unclear due to limited resources | Seamless integration with existing .NET applications
Performance | Unable to test thoroughly due to demo issues | Performed well in testing scenarios
Community Support | Limited support; few helpful resources online | Active community with helpful discussions on GitHub and forums
Scalability | Promising but untested due to demo challenges | Scalable; tested successfully in sample workflow

Findings

Challenges with Elsa:

  • Struggled to create a working and complete demo using both versions 2 and 3 due to insufficient documentation.
  • Lack of clear examples or tutorials made it difficult to explore the tool's potential.
  • Community resources were limited, which slowed down troubleshooting.

Strengths of Workflow Core:

  • Easy-to-follow documentation enabled quick setup and prototyping.
  • Provided sufficient features for the project’s needs without unnecessary complexity.
  • Active community support helped resolve minor issues during testing.
  • Demonstrated reliable performance in sample workflows.

Recommendations

  • Workflow Core is the recommended choice for this project due to its strong documentation, ease of use, and active community support.
  • While Elsa has potential, its lack of accessible resources makes it unsuitable for projects requiring rapid implementation or minimal ramp-up time.
  • If Elsa improves its documentation in the future, it may be worth revisiting for projects requiring advanced features and the designer dashboard offerings.

Implementation Plan

  1. Conduct a Snyk vulnerability check on the Workflow Core NuGet package
  2. Begin integrating Workflow Core into the Benefits PCLS Project
    1. Set up initial workflows based on tested prototypes or leaner samples
    2. Train developers on extending Workflow Core functionality as needed
    3. Establish unit testing patterns better suited to the project, for easier maintainability as the project scales
    4. Monitor performance during implementation and adjust workflows as necessary
    5. Document lessons learned during integration for future reference.

Conclusion

While Elsa showed promise with its feature set, its poor documentation created significant barriers during testing. Workflow Core’s strong documentation, ease of use, and reliable performance make it a better fit for this project's requirements.

By choosing Workflow Core, we reduce implementation risk and ensure smoother integration into Benefits PCLS, with simple approaches to troubleshooting and debugging due to its verbose API.

Appendix A: Quick links

Appendix B: Evaluation of Additional Workflow Core Features

  1. Horizontal Scalability
    1. Workflow Core supports horizontal scalability by enabling multi-node clusters. This is achieved through the configuration of queuing mechanisms and distributed lock managers to coordinate workflow execution across multiple nodes. Redis is a recommended provider for both queues and distributed locking due to its reliability, performance, and platform-agnostic support.
    2. By default, Workflow Core uses a built-in single-node queue provider; however, it also supports Azure Storage Queues, Redis, RabbitMQ, and AWS SQS for multi-node cluster configurations.
    3. Distributed locking ensures that only one node processes a specific workflow at a time. This prevents race conditions or duplication in multi-node environments. Redis’s low-latency performance makes it an ideal choice for high-throughput applications.
    4. Observation: Workflow Core’s ability to leverage Redis as both a queue provider and a distributed lock manager ensures robust horizontal scalability. This configuration makes Workflow Core suitable for high-load environments or cloud-based deployments.
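
The Redis-backed configuration described under Horizontal Scalability could be wired up roughly as follows. This is a minimal sketch, assuming the WorkflowCore and WorkflowCore.Providers.Redis NuGet packages; the connection string and the "pcls-workflows" prefix are placeholder values, not settings taken from the spike. A shared persistence provider is included because the nodes of a cluster also need a common store for workflow state.

```csharp
using Microsoft.Extensions.DependencyInjection;

// Placeholder connection string (assumption): point this at the real Redis instance.
const string redis = "localhost:6379";

var services = new ServiceCollection();
services.AddLogging();
services.AddWorkflow(cfg =>
{
    // Shared workflow state, so any node can load and continue an instance.
    cfg.UseRedisPersistence(redis, "pcls-workflows");
    // Queue provider: nodes pick up runnable workflows from shared Redis queues.
    cfg.UseRedisQueues(redis, "pcls-workflows");
    // Distributed lock manager: only one node processes a given workflow at a time.
    cfg.UseRedisLocking(redis);
});
```

Every node in the cluster would register the same providers with the same connection string and prefix.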
  2. Fault Tolerance
    1. Error handling at Step level: Each workflow step can be configured with specific error-handling behaviors, such as retrying, suspending or terminating the workflow. This allows workflows to automatically retry failed steps after a specified interval or take alternative actions.
    2. Global error handling: The WorkflowHost service provides an OnStepError event that can intercept exceptions globally, enabling centralized error management across all workflows.
    3. Persistence providers for High Availability: Queue and lock providers enhance fault tolerance by supporting automatic failover in multi-node setups. E.g. if a node fails, Redis ensures that another node in the cluster can seamlessly take over processing.
    4. Continuation from recovery: Workflow Core
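
The step-level and global error handling described under Fault Tolerance might look like the sketch below. The RunCheck step, the CheckWorkflow class, the ErrorLogging helper and the 30-second interval are hypothetical names and values chosen for illustration; OnError, WorkflowErrorHandling and OnStepError are the Workflow Core members referred to above.

```csharp
using System;
using WorkflowCore.Interface;
using WorkflowCore.Models;

// Hypothetical step standing in for an automated PCLS check.
public class RunCheck : StepBody
{
    public override ExecutionResult Run(IStepExecutionContext context)
    {
        Console.WriteLine("Running check");
        return ExecutionResult.Next();
    }
}

public class CheckWorkflow : IWorkflow
{
    public string Id => "CheckWorkflow";
    public int Version => 1;

    public void Build(IWorkflowBuilder<object> builder)
    {
        builder
            .StartWith<RunCheck>()
                // Step-level policy: retry this step 30 seconds after a failure.
                .OnError(WorkflowErrorHandling.Retry, TimeSpan.FromSeconds(30))
            .Then<RunCheck>()
                // Step-level policy: suspend the workflow if this step fails.
                .OnError(WorkflowErrorHandling.Suspend);
    }
}

public static class ErrorLogging
{
    // Global hook: invoked for exceptions thrown by any step run on this host.
    public static void Attach(IWorkflowHost host)
    {
        host.OnStepError += (workflow, step, exception) =>
            Console.WriteLine($"Workflow {workflow.Id} failed at step '{step.Name}': {exception.Message}");
    }
}
```

ErrorLogging.Attach(host) would typically be called once, before host.Start().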
  3. Ability to Wait for external Input
    1. Workflow Core supports waiting for external input through its .WaitFor() method. This allows workflows to pause execution until an external event is triggered.
    2. Events can be published using the PublishEvent method, enabling real-time interaction with external systems.
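
A minimal sketch of waiting for external input, assuming a hypothetical ApprovalData class and an event named "ApprovalReceived" (both invented for this example). The workflow pauses at WaitFor until an event with a matching name and key is published; here the key is the workflow instance ID.

```csharp
using System;
using System.Threading.Tasks;
using WorkflowCore.Interface;
using WorkflowCore.Models;

// Hypothetical data class; the property name is an assumption for this sketch.
public class ApprovalData
{
    public string Decision { get; set; }
}

public class ApprovalWorkflow : IWorkflow<ApprovalData>
{
    public string Id => "ApprovalWorkflow";
    public int Version => 1;

    public void Build(IWorkflowBuilder<ApprovalData> builder)
    {
        builder
            .StartWith(context => ExecutionResult.Next())
            // Pause until "ApprovalReceived" is published with this instance's ID as the key.
            .WaitFor("ApprovalReceived", (data, context) => context.Workflow.Id, data => DateTime.Now)
                // Copy the event payload into the workflow data.
                .Output(data => data.Decision, step => step.EventData)
            .Then(context =>
            {
                Console.WriteLine("Resumed after external input");
                return ExecutionResult.Next();
            });
    }
}

public static class ApprovalEvents
{
    // Called by the external system (for example an API endpoint) to resume the workflow.
    public static Task Approve(IWorkflowHost host, string workflowId) =>
        host.PublishEvent("ApprovalReceived", workflowId, "Approved");
}
```

The workflowId passed to Approve is the value returned by host.StartWorkflow when the instance was created.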
  4. Ability to bypass/Re-trigger Failed Automated Actions
    1. Workflow Core does not natively support manual bypassing or re-triggering of failed steps. However, this functionality can be implemented by adding logic within the workflow steps.
    2. Your approach:
    3. Observation: While manual intervention is not directly supported out-of-the-box, Workflow Core’s flexibility allows custom implementations to meet this requirement.
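
One possible shape for such a custom implementation is sketched below. It is not a built-in feature and not necessarily the approach used in the spike: every class, property and event name here (CheckData, AutomatedAction, BypassableWorkflow, "OperatorDecision") is hypothetical. The idea is that the step records failure instead of throwing, and the workflow then loops, waiting for an operator to publish either "Retry" or "Bypass".

```csharp
using System;
using WorkflowCore.Interface;
using WorkflowCore.Models;

public class CheckData
{
    public bool Done { get; set; }        // true once the automated action has succeeded
    public string Decision { get; set; }  // last operator decision: "Retry" or "Bypass"
}

public class AutomatedAction : StepBody
{
    public bool Succeeded { get; set; }

    public override ExecutionResult Run(IStepExecutionContext context)
    {
        try
        {
            // The real automated action would be called here.
            Succeeded = true;
        }
        catch (Exception)
        {
            // Record the failure instead of throwing, so an operator can decide what happens next.
            Succeeded = false;
        }
        return ExecutionResult.Next();
    }
}

public class BypassableWorkflow : IWorkflow<CheckData>
{
    public string Id => "BypassableWorkflow";
    public int Version => 1;

    public void Build(IWorkflowBuilder<CheckData> builder)
    {
        builder
            .StartWith<AutomatedAction>()
                .Output(data => data.Done, step => step.Succeeded)
            // Loop until the action succeeds or an operator chooses to bypass it.
            .While(data => !data.Done && data.Decision != "Bypass")
                .Do(loop => loop
                    .StartWith(context => ExecutionResult.Next())
                    .WaitFor("OperatorDecision", (data, context) => context.Workflow.Id, data => DateTime.Now)
                        .Output(data => data.Decision, step => step.EventData)
                    .If(data => data.Decision == "Retry")
                        .Do(retry => retry
                            .StartWith<AutomatedAction>()
                                .Output(data => data.Done, step => step.Succeeded)))
            .Then(context => ExecutionResult.Next());
    }
}
```

An operator tool would then call host.PublishEvent("OperatorDecision", workflowId, "Retry") or pass "Bypass" to move the instance on.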
  5. Behavior When Workflow Definition Changes
    1. Current Step Handling:
      1. If a node is stopped or restarted while a workflow is running without a graceful shutdown, the current step remains in a pending state. When the node restarts (or another worker picks it up), execution resumes from the pending step without issues.
      2. If a node is stopped or restarted gracefully, the workflow host stops picking up newly triggered workflows, the application waits for the current step to complete before shutting down, and the status of the step is recorded in the workflow state.
    2. Impact of Restructuring Steps:
      1. Restructuring (e.g. adding/removing/moving steps) does not automatically break running workflows. The workflow continues from the current step as determined by its saved state in the persistence layer.
      2. The current step is not bound to the step's definition but rather to its position in the execution order. E.g. when a workflow is interrupted while executing the second step, and that step is replaced by a control structure, an inline step or a custom step, the new definition will be executed and the execution pointer will continue with the rest of the steps from there.
    3. Versioning Workflows:
      1. To ensure existing workflows are not disturbed by changes to the definition:
        1. Create a new workflow class with the same ID but an incremented version, e.g. MyWorkflowV2.
        2. New workflows will use the updated definition (the new version), while existing workflows will continue to use their original definitions.
    4. Observation:
      1. Workflow Core handles changes gracefully as long as workflow versions are managed properly. This ensures that workflows are not disrupted while allowing new workflows to adopt updated logic.
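
The versioning approach above could be sketched as two classes that share an ID and differ in Version, both registered with the host. The step contents are placeholders and the Registration helper is a hypothetical name; only the Id/Version pairing and the registration calls are the point of the example.

```csharp
using WorkflowCore.Interface;
using WorkflowCore.Models;

// Original definition: instances already running on version 1 keep using it.
public class MyWorkflow : IWorkflow
{
    public string Id => "MyWorkflow";
    public int Version => 1;

    public void Build(IWorkflowBuilder<object> builder) =>
        builder.StartWith(context => ExecutionResult.Next());
}

// Updated definition: same ID, incremented version, restructured steps.
public class MyWorkflowV2 : IWorkflow
{
    public string Id => "MyWorkflow";
    public int Version => 2;

    public void Build(IWorkflowBuilder<object> builder) =>
        builder
            .StartWith(context => ExecutionResult.Next())
            .Then(context => ExecutionResult.Next());
}

public static class Registration
{
    // Register both versions so old instances can still resolve their definition.
    public static void RegisterAll(IWorkflowHost host)
    {
        host.RegisterWorkflow<MyWorkflow>();
        host.RegisterWorkflow<MyWorkflowV2>();
    }
}
```

New instances can then be started against the new definition explicitly, for example with host.StartWorkflow("MyWorkflow", 2).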

Summary of Findings

Feature | Support in Workflow Core | Notes
Horizontal Scalability | Fully supported with external queue providers | Recommended configuration: Redis for queues and distributed locking
Fault Tolerance | Supported with recovery mechanisms and configurable error handling | Ensures reliability in case of server or step failures
Waiting for External Input | Supported using the .WaitFor() method and event publishing | Ideal for long-running workflows requiring external triggers
Manual Bypass/Re-triggering | Not natively supported; requires custom step logic | Custom implementation needed to handle bypass or re-trigger scenarios effectively
Handling workflow definition changes | Supported with versioning of workflow definitions | Running workflows continue unaffected; new versions apply only to new instances