Algorithm for Automatic Arrangement
of BPMN Elements
"Beat the Human" Project Team
Teodor Petrov, Noah Agostinis,
Mike Zwicky, Ömer Aktas, Visar Bektesi
This paper details the design and implementation of an algorithm for the automatic arrangement of Business Process Model and Notation (BPMN 2.0) diagrams. Initially developed to solve layout challenges at Mettex, the solution addresses the inefficiency of manual diagram arrangement by providing a high-performance automated layout algorithm.
The arrangement of process diagrams fundamentally requires solving multiple interconnected optimization problems. Mathematically, these problems belong to a class of high computational complexity, where finding a perfect solution is practically infeasible. To address this complexity within practical time limits, the solution uses heuristic methods based on the Sugiyama layered framework. This standard approach was significantly extended with custom algorithms tailored specifically to BPMN, allowing the system to optimize visual properties while strictly enforcing semantic constraints.
The implementation is realized as a modular Java pipeline utilizing the Eclipse Layout Kernel (ELK) and Flowable libraries. Performance benchmarks demonstrate an average execution time of under 20\,ms per diagram, showing that the system meets the performance and quality requirements for practical integration.
1 Introduction
Business Process Model and Notation (BPMN 2.0) provides a standardized graphical language for describing business processes in a way that is both precise and easy to understand. The clarity of a BPMN diagram depends on the correctness of its logic and the quality of its visual arrangement. When diagrams contain too many line crossings, irregular edge routing, and other visually confusing structures, they quickly become difficult to read and interpret.
In many industrial contexts, BPMN diagrams are often arranged largely by hand. Although some modeling tools include an automatic layout function, their results often require extensive manual adjustments before the diagrams reach an acceptable level of visual arrangement quality. This manual work consumes time and reduces efficiency in process modeling. The aim of this research is to present an improved automatic arrangement algorithm that can produce BPMN diagrams of comparable or better quality than those arranged manually, while maintaining practical performance.
The purpose of this paper is to describe the problem in detail, define the specific constraints and optimization objectives required for high-quality BPMN visualization, and provide a comprehensive specification of the proposed algorithmic pipeline and its underlying strategies.
1.1 Problem Definition
A BPMN diagram can be represented as a directed graph \(G = (V, E)\), where the nodes \(V\) correspond to BPMN elements such as activities, events, and gateways, and the edges \(E\) represent the sequence and message flows between them. The goal is to determine positions \((x_i, y_i)\) for all nodes and locations of bends for the edges such that the resulting layout is visually and structurally "equivalent to or better" than an arrangement created manually by humans in "at least \(95\%\) of cases". Additionally, the algorithm must be compliant with BPMN constraints and run in the order of milliseconds for average-sized BPMN diagrams.
One of the central objectives is the minimization of edge crossings. Every pair of intersecting edges makes the diagram harder to follow, so reducing crossings is the primary arrangement quality target. Formally, if \(C(G)\) denotes the number of crossings in a BPMN diagram \(G\), the main optimization goal can be expressed as \[\min_{D \in \mathcal{D}(G)} C(D)\] Here, \(D \in \mathcal{D}(G)\) denotes a possible drawing (arrangement) of the graph \(G\). Minimizing crossings alone may not always yield the most readable or compact diagram. Other layout characteristics, such as process flow direction and structure, edge bend minimization, or maintaining compactness, can influence visual arrangement quality as well. These aspects are also explored during implementation and evaluation. For now, as an introduction, the layout problem can be understood as a multi-objective optimization where crossing minimization is heavily emphasized.
1.2 Constraints and BPMN-Specific Considerations
The layout must also respect the semantic and structural rules of BPMN (e.g., elements must remain inside their assigned pools and lanes). Furthermore, the diagram should maintain a clear structure and directional orientation so that the process can be followed naturally, similar to human-arranged BPMN diagrams.
The requirements define a set of hard constraints that cannot be violated, and also soft objectives such as minimizing crossings or edge bends, which guide visual optimization. Together they form a constrained optimization problem: the algorithm must always produce a valid BPMN diagram while improving its arrangement quality as far as possible within those limits.
This framework establishes the foundation for the development of the algorithm. The next section discusses the computational complexity of the problem, why exact solutions are practically impossible, and the motivation for using heuristic, layered approaches that are better suited for real-world BPMN diagrams.
1.3 Algorithm Inputs and Outputs
The automatic arrangement algorithm operates on a graph representation of a BPMN process model.
Input \[G = (V, E)\]
The input is a directed graph \(G\) derived from a BPMN diagram, where:
\(V\) is the set of BPMN elements (nodes), including events, activities, and gateways. Each node carries attributes such as its BPMN type and container information (e.g., lane).
\(E\) is the set of directed edges representing control or communication flows. Each edge corresponds to either a sequence flow (within a process) or a message flow (between pools).
All input data is assumed to comply with BPMN 2.0 and to represent a semantically valid process model.
Output \[G' = (V', E')\]
where \(G'\) is identical to \(G\) in structure and semantics, but includes additional visual attributes:
\[V' = \{ (v_i, \text{data}(v_i), (x_i, y_i)) \mid v_i \in V \},\] \[E' = \{ (e_j, \text{data}(e_j), \pi_j) \mid e_j \in E \}.\]
Here:
\(\text{data}(v_i)\) and \(\text{data}(e_j)\) denote the original BPMN information (identifiers, types, labels, etc.).
\((x_i, y_i)\) are the computed coordinates of each BPMN element in the layout space.
\(\pi_j\) defines the routed path of each flow, including any bend points and attachment positions on connected elements.
The resulting graph \(G'\) preserves all original BPMN semantics while adding layout data that can be exported to BPMN XML.
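The input/output contract above can be illustrated with plain data shapes. The record and class names below are assumptions of this sketch, not classes from the actual implementation:

```java
import java.util.List;

// Illustrative data shapes for the output G' = (V', E').
public class LayoutTypes {
    // A point in the layout space.
    public record Point(double x, double y) {}

    // A node keeps its BPMN data (id, type) plus computed coordinates (x_i, y_i).
    public record LaidOutNode(String id, String bpmnType, Point position) {}

    // An edge keeps its BPMN data plus the routed path pi_j as a waypoint list.
    public record LaidOutEdge(String id, String sourceId, String targetId,
                              List<Point> waypoints) {}
}
```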
2 Computational Complexity and Motivation for Heuristic Approaches
Designing an algorithm that automatically produces well-arranged BPMN diagrams is a challenging task: it involves solving several interconnected optimization problems that are too complex to be solved exactly in a reasonable amount of time. Understanding this fundamental difficulty explains why practical solutions must rely on heuristic methods.
2.1 The Graph Crossing Number Problem
To illustrate this complexity, we consider the central subproblem: edge crossing minimization. It is closely related to the well-known Graph Crossing Number Problem:
Given a graph \(G = (V, E)\), find a drawing in the plane that achieves the minimum crossing number \(\min_{D \in \mathcal{D}(G)} C(D)\).
Determining a drawing that minimizes the number of edge crossings is NP-hard, and even deciding whether there exists a drawing with at most \(k\) crossings is NP-complete.
This means that even for moderate-sized BPMN diagrams, finding a globally optimal layout through exact computation is impractical. For instance, computing a provably optimal arrangement for a 100-element BPMN diagram would be far beyond what a modern computer can do in any acceptable time.
2.2 Other NP-Hard Subproblems in Layout Generation
The crossing minimization challenge is not the only source of computational hardness. Several additional subproblems commonly appear in layout pipelines and also apply to BPMN diagrams. These include the feedback arc set problem (cycle removal), two-layer crossing minimization in layered layouts, and orthogonal edge routing with bend minimization.
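For a fixed ordering of two adjacent layers, counting crossings is straightforward: two edges cross exactly when their endpoints appear in opposite relative orders on the two layers. A minimal sketch of this count (the quantity the heuristics discussed later try to minimize), with each edge given as a pair of vertical slot indices:

```java
import java.util.List;

// Counts crossings between two adjacent layers of a layered drawing.
// An edge is a pair {topPos, bottomPos} of vertical slot indices in the
// upper and lower layer. Two edges cross iff their endpoint orders differ.
public class CrossingCounter {
    public static int countCrossings(List<int[]> edges) {
        int crossings = 0;
        for (int i = 0; i < edges.size(); i++) {
            for (int j = i + 1; j < edges.size(); j++) {
                int[] a = edges.get(i), b = edges.get(j);
                // Opposite relative order on the two layers => an 'X' crossing
                if ((a[0] - b[0]) * (a[1] - b[1]) < 0) crossings++;
            }
        }
        return crossings;
    }
}
```

Even this restricted two-layer setting is NP-hard to optimize when one layer may be permuted freely, which motivates the heuristics described below.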
2.3 The Sugiyama-Style Layered Framework
Among established layout paradigms, the Sugiyama framework (also known as the layered or hierarchical graph layout) provides an especially suitable foundation for BPMN diagrams. Its fundamental design philosophy is built around flow directionality. Since BPMN is a language designed to visualize the temporal sequence of a business process (typically flowing from left to right), a layout algorithm that strictly enforces this hierarchy is essential for readability.
This framework decomposes the complex layout problem into a sequence of manageable steps:
Cycle removal to ensure acyclicity
Layer assignment to distribute nodes according to the process flow
Edge subdivision to ensure that each edge connects only nodes in adjacent layers
Crossing minimization by reordering nodes within layers
Coordinate assignment to compute actual positions
Many of these steps use heuristic or approximate methods, producing a good overall result efficiently. This layered approach aligns naturally with BPMN's left-to-right or top-to-bottom flow and can be extended with additional rules for pools, lanes, gateways, and message flows. A later section on the core algorithm describes how this general framework is adapted to the specific needs of BPMN. The deterministic and highly structured nature of the layered approach allows for essential customizations: it supports the injection of hard constraints, such as keeping elements within specific lanes or pools, without breaking the fundamental logic of the algorithm.
3 Algorithm and Implementation
The implementation of the automatic arrangement algorithm is realized as a modular Java application. The solution integrates established open-source libraries that handle data parsing, object modeling, and implement specific graph algorithms.
The core dependencies and their specific roles in the architecture are as follows:
Eclipse Layout Kernel (ELK): ELK provides the underlying graph layout infrastructure. The implementation specifically uses the Layered algorithm (based on the Sugiyama framework discussed in Section 2).
Flowable: This library provides a lightweight, standards-compliant Java API for creating and modifying BPMN 2.0 models. Flowable is used to bridge the gap between the algorithm and the standard input/output format. It manages the BPMN Diagram Interchange (DI) data, allowing the algorithm to inject computed coordinates and edge waypoints directly.
All selected external libraries are distributed under permissive open-source licenses, specifically the Apache License 2.0 (Flowable) and the Eclipse Public License 2.0 (ELK). These licenses explicitly permit the use, modification, and redistribution of the software, ensuring that the solution is fully compliant for broad academic and commercial integration.
3.1 System Architecture and Pipeline Overview
The system is designed as a linear transformation pipeline that
decouples the semantic definition of a process from its visual
representation. The central data structure used throughout this pipeline
is the Flowable BpmnModel, which provides an
object-oriented representation of BPMN 2.0 elements.
The system consists of three distinct stages:
Data Ingestion and Model Instantiation
The pipeline begins by loading the process definition. The system parses a standard BPMN XML file and maps the structures to a valid BpmnModel instance. At this stage, the model contains all semantic information (activities, gateways, flows, etc.) but lacks visual layout data.

Layout Computation and DI Injection

The core Layouter class processes the BpmnModel. It extracts the graph structure and delegates specific sub-problems (layering, crossing minimization, etc.) to the algorithmic components described in the following sections. The algorithm then populates the BPMN Diagram Interchange (DI) layer within the Flowable model. For nodes (activities, events, gateways), it assigns GraphicInfo containing absolute \(x, y\) coordinates, width, and height. For edges (sequence flows), it generates a list of waypoints representing the routed path.
Serialization
Once the model is enriched with DI information, the BpmnXMLConverter serializes the object graph back into standard BPMN 2.0 XML. The result is a fully compliant file that includes both the process logic and the generated visual arrangement.
3.2 Layouter
The Layouter serves as the central orchestrator of the
entire arrangement pipeline. It manages the complete lifecycle of the
layout process and executes the algorithmic steps in a strict dependency
order.
Each step relies on the structural data computed in the previous steps:
Graph Abstraction: The process iterates over the input model. It converts Flowable BPMN objects (Activities, Gateways, Events) into generic ElkNodes and sequence flows into ElkEdges. This step strips away BPMN-specific complexity to create a pure topological graph suitable for analysis.

Global Layering: The ELK engine is configured and performs the initial global Sugiyama run. The result is used to determine the horizontal node layer index.

Lane-Aware Crossing Minimization: The LaneCrossMinimizer is invoked. It takes the layered graph and permutes the vertical order of nodes to minimize edge crossings, respecting BPMN container constraints.

Grid Compression: The Compressor works on a grid of nodes. It iteratively applies rules to create a more compact diagram.

Coordinate Application: The abstract grid positions are translated into absolute pixel coordinates, accounting for element-specific geometry and spacing rules. The calculated positions (\(x\), \(y\), width, height) are written into the Flowable model, completing the transformation from a purely semantic definition to a visual diagram.

Edge Routing: With all nodes fixed in space, the Router calculates the orthogonal paths (waypoints) for all sequence and message flows.

Label Positioning: Finally, the Labeler finds optimal positions for text labels.
The layout strategy is built upon the Sugiyama framework (layered graph drawing). The
fundamental principle of this framework is that nodes are arranged in
distinct vertical layers. In this context, a "layer" refers to a
vertical column of elements sharing the same horizontal rank. Since the
BPMN standard dictates a left-to-right flow progression, the primary
responsibility of the Layouter is to assign every node to a
specific layer (\(x\)-coordinate
index).
To achieve this, the algorithm utilizes the Eclipse Layout Kernel (ELK) for an initial global analysis. This first pass ignores "swimlanes" (pools and lanes) entirely. By treating the process as a free-form graph, the algorithm can determine the optimal logical depth of every element without being constrained by vertical lane boundaries.
3.2.1 Cycle Removal: The Depth-First Search (DFS) Strategy
The Sugiyama framework requires a Directed Acyclic Graph (DAG) to function. However, business processes inherently contain cycles (loops) representing rework or repeating steps. Therefore, the first step is to temporarily reverse specific edges to break these cycles.
The algorithm employs a Depth-First Search (DFS) strategy for cycle breaking. This method is well suited to process modeling because it mimics the "narrative flow" of a human-designed process.
To describe the DFS algorithm, we use the classic three-color scheme. DFS traverses the graph by marking nodes with three states: White (unvisited), Gray (active on the current recursion stack), and Black (finished).
The algorithm dives as deep as possible into the graph hierarchy.
If it encounters an edge pointing to a Gray node, it has detected a loop (a path returning to an active ancestor). This specific edge is identified as a "back-edge" and reversed.
If it encounters an edge pointing to a Black node, it treats it as a standard forward edge or shortcut.
DFS is ideal for BPMN because it finds the chain of events that represents the primary process flow (often called the "happy path"). Consider a process with a main sequence \(A \rightarrow B \rightarrow C \rightarrow D\) and a shortcut \(A \rightarrow D\). A DFS approach acts like a storyteller: it follows the main thread (\(A\) to \(B\) to \(C\)) to its conclusion before backtracking to consider alternatives. By holding the node \(A\) "open" in the recursion stack until \(D\) is reached via the long path, DFS ensures that \(D\) is placed in a layer after \(C\). This guarantees that the main sequence is drawn linearly from left to right, while the shortcut is rendered correctly as a "jump" ahead. Without this deep traversal, elements on the main path could be pulled prematurely to earlier layers, causing the primary flow to visually reverse direction.
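The three-color traversal above can be sketched as follows. The adjacency-list representation and method names are illustrative, not the actual implementation:

```java
import java.util.*;

// Three-color DFS that collects "back-edges" (edges into a Gray ancestor).
// Reversing exactly these edges makes the graph acyclic.
public class CycleBreaker {
    private enum Color { WHITE, GRAY, BLACK }

    // Nodes are 0..n-1; edges are {source, target} pairs.
    public static List<int[]> findBackEdges(int n, List<int[]> edges) {
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) adj.add(new ArrayList<>());
        for (int[] e : edges) adj.get(e[0]).add(e[1]);

        Color[] color = new Color[n];
        Arrays.fill(color, Color.WHITE);
        List<int[]> backEdges = new ArrayList<>();
        for (int v = 0; v < n; v++)
            if (color[v] == Color.WHITE) dfs(v, adj, color, backEdges);
        return backEdges;
    }

    private static void dfs(int v, List<List<Integer>> adj,
                            Color[] color, List<int[]> back) {
        color[v] = Color.GRAY;                       // active on the recursion stack
        for (int w : adj.get(v)) {
            if (color[w] == Color.GRAY)
                back.add(new int[]{v, w});           // loop detected: a back-edge
            else if (color[w] == Color.WHITE)
                dfs(w, adj, color, back);
            // BLACK neighbors are forward edges or shortcuts: nothing to do
        }
        color[v] = Color.BLACK;                      // finished
    }
}
```

For the example process \(A \rightarrow B \rightarrow C \rightarrow A\) with a shortcut \(A \rightarrow D\), only the returning edge \(C \rightarrow A\) is classified as a back-edge.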
3.2.2 Layering Strategy: Network Simplex
Once the graph is acyclic, the algorithm must assign every node to a
discrete integer rank (layer). In a left-to-right BPMN layout, this rank
determines the horizontal layer in which a node resides. The
Layouter utilizes the Network Simplex layering algorithm, which is widely
regarded as the gold standard for hierarchical graph drawing. The
Network Simplex algorithm formulates the layering as a linear
optimization problem. Its objective is to minimize the total weighted
length of all edges in the graph while preserving the hierarchy. This
approach results in a graph drawing that looks compact and balanced.
Formally, the problem is defined as: \[\text{Minimize } \sum_{(u,v) \in E} \omega(u,v) \cdot (\text{rank}(v) - \text{rank}(u))\] Subject to the constraint that for every edge \((u,v)\), the process flow direction is respected: \[\text{rank}(v) \geq \text{rank}(u) + \delta(u,v)\] where \(\delta(u,v)\) is the minimum required length of the edge \((u,v)\), typically \(1\).
We can think of every edge as a rubber band that wants to contract, pulling the connected nodes as close together as possible. The Network Simplex algorithm iteratively adjusts the positions of the nodes until the tension in this system is minimized. By minimizing total edge length, the algorithm naturally reduces whitespace. It prevents the "spaghetti effect" where edges span across multiple empty layers unnecessarily, resulting in a diagram that is concise and easier to read. Although cycles were broken in the previous step, the layout must eventually draw the return edges (back-loops). By keeping the forward graph compact, the algorithm implicitly ensures that these return edges are short, making iteration loops easy to follow.
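For illustration, the layering constraint \(\text{rank}(v) \geq \text{rank}(u) + 1\) can be satisfied with a much simpler longest-path layering. This sketch is not the Network Simplex algorithm itself and does not minimize total edge length, but it shows the rank structure being computed:

```java
import java.util.*;

// Longest-path layering over an acyclic graph (cycles were removed earlier):
// each node's rank is the length of the longest path reaching it, which
// trivially satisfies rank(v) >= rank(u) + 1 for every edge (u, v).
public class LongestPathLayering {
    public static int[] assignRanks(int n, List<int[]> edges) {
        int[] indeg = new int[n];
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) adj.add(new ArrayList<>());
        for (int[] e : edges) { adj.get(e[0]).add(e[1]); indeg[e[1]]++; }

        // Kahn-style topological sweep, relaxing ranks along the way.
        Deque<Integer> queue = new ArrayDeque<>();
        for (int v = 0; v < n; v++) if (indeg[v] == 0) queue.add(v);
        int[] rank = new int[n];
        while (!queue.isEmpty()) {
            int u = queue.poll();
            for (int w : adj.get(u)) {
                rank[w] = Math.max(rank[w], rank[u] + 1);
                if (--indeg[w] == 0) queue.add(w);
            }
        }
        return rank;
    }
}
```

For the earlier example (\(A \rightarrow B \rightarrow C \rightarrow D\) with shortcut \(A \rightarrow D\)), \(D\) correctly lands in the layer after \(C\) rather than directly after \(A\).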
With the initial global layering complete, every element has a
defined horizontal layer/rank. However, the vertical ordering within
these layers is undefined. The next step must strictly enforce BPMN lane
constraints while optimizing to reduce edge crossings, which is
performed by the LaneCrossMinimizer.
3.3 LaneCrossMinimizer
Standard implementations of the Sugiyama framework typically treat the graph as a single, uniform space. However, BPMN diagrams are strictly partitioned into horizontal containers (Pools and Lanes). A standard crossing minimization pass would freely mix nodes from different lanes to reduce crossings, destroying the semantic meaning of the diagram.
The LaneCrossMinimizer solves this by implementing a
Constrained Whole-Layer Sweeping algorithm. The goal is to minimize edge
crossings globally while strictly adhering to two hard constraints:
A node assigned to Lane \(A\) must never be moved to Lane \(B\).
The vertical order of the lanes and pools themselves (e.g., Lane \(A\) above Lane \(B\)) is fixed and must be preserved.
The LaneCrossMinimizer only ever changes the relative
vertical order (the \(y\) position) of
nodes within their assigned layers.
3.3.1 Edge Subdivision and Dummy Nodes
Before optimization begins, the graph is normalized. In a layered graph, edges that span across multiple non-adjacent layers (long edges) must be subdivided into chains of segments. This is achieved by inserting "Dummy Nodes" at every intermediate layer.
In this lane-aware implementation, dummy nodes are not placed arbitrarily. They are assigned to the source lane of the edge. Additionally, a heuristic is applied to the placement of these dummies:
If an edge flows downward (to a lane with a higher index), dummies are appended naturally.
If an edge flows upward (to a lane with a lower index), dummies are flagged to "prefer the top."
3.3.2 Constrained Whole-Layer Barycenter Heuristic
The core of the crossing minimization is the Barycenter Heuristic. The algorithm performs iterative "sweeps" (alternating left-to-right and right-to-left) across the layers. The total number of sweeps is fixed to an odd number, which ensures that the optimization always concludes with a forward sweep (left-to-right), prioritizing the natural reading direction of the diagram and aligning nodes according to their upstream dependencies, which is visually more intuitive for process flows.

In a standard approach, nodes are sorted purely by the average position of their neighbors; here, the calculation is made lane-aware. We define a Global Vertical Index for every position in a layer, formed by concatenating the available slots of all lanes in their fixed order. When calculating the barycenter for a node \(v\), the algorithm considers the positions of neighbors in all lanes. This allows a node in Lane A to "see" that it is connected to a node in Lane B and try to align itself vertically with it.
Formally, the barycenter \(b_i(v)\) is calculated as: \[b_i(v) = \frac{1}{|\Gamma_{i-1}(v)|} \sum_{u \in \Gamma_{i-1}(v)} p_{i-1}(u)\] Where:
\(v \in L_i\) is a node in the current layer.
\(\Gamma_{i-1}(v)\) is the set of neighbors in the adjacent layer (regardless of lane).
\(p_{i-1}(u)\) is the global index of the neighbor \(u\).
While the calculation is global, the reordering is locally constrained. A node with a calculated barycenter pointing to a different lane is "capped" at the boundary of its own lane. It moves as close as possible to its desired global position without violating the hard lane constraint.
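A minimal sketch of this lane-capped barycenter step, assuming positions are Global Vertical Indices and laneMin/laneMax denote the lane's slot range (the names are illustrative):

```java
import java.util.List;

// Barycenter of a node over its neighbors in the fixed adjacent layer,
// then "capped" to the node's own lane so the hard lane constraint holds.
public class LaneBarycenter {
    // b_i(v): average Global Vertical Index of the neighbors.
    public static double barycenter(List<Integer> neighborGlobalIndices) {
        return neighborGlobalIndices.stream()
                .mapToInt(Integer::intValue).average().orElse(0.0);
    }

    // Round to the nearest slot, then clamp at the lane boundary so the
    // node never leaves its assigned lane.
    public static int cappedSlot(double barycenter, int laneMin, int laneMax) {
        int slot = (int) Math.round(barycenter);
        return Math.max(laneMin, Math.min(laneMax, slot));
    }
}
```

A node whose barycenter points into a neighboring lane is thus placed at its own lane's boundary, as close to the desired global position as the constraint allows.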
3.3.3 Vertical Alignment Strategy: The Run and Anchor Method
A common issue with standard barycenter sorting in swimlane layouts is that nodes tend to "float" to the top or center of the lane, creating "wobbly" lines for straight sequences. To resolve this, the algorithm employs a custom alignment logic rather than simple sorting.
Since the height of a lane is determined by its most populated layer, simpler layers have excess empty slots (\(H - n\) empty spaces, where \(H\) is the lane height and \(n\) the number of nodes in that layer). The algorithm utilizes this slack to align nodes perfectly.
Desired Index: The calculated barycenter is rounded to the nearest integer to obtain a target slot index.
Run Detection: The algorithm identifies "Runs". A Run is a sequence of nodes in the sorted list that compete for the exact same desired integer index.
Anchor Selection: For each run, an "Anchor" is identified. This is the node within the run whose exact floating-point barycenter is closest to the integer target.
Shift Down: The algorithm iterates from the top of the lane. If a run’s Anchor is currently positioned higher (lower index) than its desired target, the entire run, and all nodes below it, are shifted down.
This logic ensures that if a straight chain of tasks exists (Sequence \(A \rightarrow B \rightarrow C\)), they will yield identical integer barycenters. The shifting mechanism will then push them into exact horizontal alignment, utilizing the empty vertical space in the lane to create straight, readable process lines.
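A simplified sketch of the shift-down mechanics, assuming nodes arrive in their sorted order with a rounded desired slot each. The full implementation additionally selects an anchor per run; that bookkeeping is omitted here:

```java
// Each node is pushed down toward its desired slot but never placed above
// the node preceding it, so nodes competing for one slot form consecutive
// runs while already-aligned nodes stay on their target index.
public class ShiftDown {
    public static int[] assignSlots(int[] desired) {
        int[] slot = new int[desired.length];
        int nextFree = 0;
        for (int i = 0; i < desired.length; i++) {
            slot[i] = Math.max(desired[i], nextFree);  // shift down if occupied
            nextFree = slot[i] + 1;                    // keep slots distinct
        }
        return slot;
    }
}
```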
3.3.4 Output Representation and Graph Invariants
The result of the crossing minimization step is stored in the
LanesInfo data structure. This class maps each Lane ID to a
list of LaneNodes, lightweight wrappers that store the
element’s ID, its integer layer index, and its relative integer rank
within that lane.
Embedded within this structure are the AdjacencyMaps. It
is crucial to note that these maps abstract away the semantic direction
of the BPMN flows. They classify connections purely based on proximity:
a "Previous Layer Neighbor" is any connected node residing in layer
\(x-1\), and a "Next Layer Neighbor" is
any node in layer \(x+1\).
Consequently, in the context of the layout algorithms, a "parent" simply
refers to a neighbor to the left, regardless of whether the actual BPMN
edge represents a forward sequence flow (\(A
\rightarrow B\)) or a backward loop (\(B \rightarrow A\)).
At this stage, the graph has been transformed into a grid. The horizontal axis represents the global layers, and the vertical axis is a composite of all stacked lanes. It is important to note that the dimensions of this grid are rigid. The height of a specific lane \(L\) is fixed across the entire diagram, determined by the maximum number of nodes that occupy that lane in any single layer (\(H_{max}\)). This means that even in sparse areas of the process, the lane reserves \(H_{max}\) integer slots, effectively creating a sparse matrix representation of the diagram.
The resulting graph provides a strict set of invariants:
Every edge connects a node at layer \(i\) exactly to a node at layer \(i+1\). There are no edges spanning multiple layers. All long jumps have been resolved into chains of dummy nodes.
Every node (real or dummy) resides strictly within the integer vertical bounds of its assigned lane.
Each intersection of a layer and a vertical slot contains at most one node.
This highly structured grid provides a predictable foundation for the
Compressor.
3.4 Compressor
Following the execution of the LaneCrossMinimizer, the
graph exists in a valid, lane-compliant state where edge crossings are
minimized. However, the resulting grid is typically sparse, using more
space than necessary.
The Compressor is responsible for compacting this
diagram. The primary operation of this step is the manipulation of the
horizontal coordinates of nodes. It applies a "Left-Shift" strategy,
iteratively moving nodes to earlier layers (\(x - 1\)) to reduce the total width of the
diagram and shorten edge lengths.
Where structurally possible, the Compressor attempts to
move a child node into the same horizontal layer as its parent
(resulting in \(x_{parent} =
x_{child}\)). This transformation converts diagonal connections
between adjacent layers into direct vertical adjacencies within a single
column. This stacking approach is considered visually cleaner, as it
groups related elements tightly (e.g., placing a parent directly above
its child) and significantly reduces the horizontal footprint of the
diagram.
The Compressor operates on a grid structure and strictly
enforces that no structural constraints are violated and no new edge
crossings are introduced during shifts.
3.4.1 Global Grid Construction
To perform compression efficiently, the Layouter first
transforms the LanesInfo structure into a unified
Grid. While the previous step treated lanes as separate
vertical entities, the Compressor flattens this structure
into a single global coordinate system.
The algorithm calculates a vertical offset for each lane based on the accumulated height of all preceding lanes. \[y_{global} = \text{offset}(L) + y_{local}\]
This transformation produces the Grid, a data structure
implemented as a two-dimensional matrix of GridNode objects
with dimensions \([\text{Layers} \times \text{GlobalHeight}]\). It is important to emphasize that this grid uses
a discrete, abstract coordinate system. The coordinates \((x, y)\) represent logical integer slots,
not pixel positions. Each GridNode populates a specific
slot in this grid and caches topological data (references to parent and
child IDs) required for rapid validation. This abstraction allows the
algorithm to check adjacency and occupancy, such as verifying if slot
\((x-1, y)\) is null, in constant time
\(O(1)\).
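The lane-offset transformation described above can be sketched as:

```java
// A lane's vertical offset is the accumulated height of all preceding
// lanes; the global coordinate is y_global = offset(L) + y_local.
public class LaneOffsets {
    public static int[] offsets(int[] laneHeights) {
        int[] off = new int[laneHeights.length];
        for (int i = 1; i < laneHeights.length; i++)
            off[i] = off[i - 1] + laneHeights[i - 1];
        return off;
    }

    public static int globalY(int[] offsets, int laneIndex, int yLocal) {
        return offsets[laneIndex] + yLocal;
    }
}
```

For lane heights \([3, 2, 4]\), the offsets are \([0, 3, 5]\), so local slot \(1\) in the third lane maps to global index \(6\).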
3.4.2 Horizontal Compression Strategy
The core compression logic is an iterative algorithm that scans the grid from left to right (Layer \(1\) to \(N\)) and top to bottom. It attempts to move every node \(v\) at position \((x, y)\) to the slot \((x-1, y)\).
The algorithm runs in a loop until the grid stabilizes (no nodes are moved in a full pass). It executes two distinct phases:
Strict Phase: Moves are only permitted if there are strictly no crossings.
Relaxed Phase: Moves are permitted even if there are existing crossings, provided no new crossings are created.
For a node to be successfully shifted to the left, it must pass a rigorous set of five validation rules derived from the graph topology:
Immutability of Dummies: Dummy nodes (segments of long edges) are not moved themselves, but they can be absorbed or deleted when a real node moves into their position.
Same-Layer Dependency: If a node has a parent residing in the same layer (as a result of a previous left move), it cannot move left, as this would visually reverse the edge (\(x_{child} < x_{parent}\)).
Target Slot Occupancy and Absorption: The target slot \((x-1, y)\) must be empty. However, a special exception exists: Dummy Absorption. If the target slot contains a dummy node, and that dummy is the sole parent of the current node, the node can "absorb" the dummy. This effectively removes an unnecessary edge segment and places the real node in the dummy’s former position.
Vertical Corridor Constraint: If the node has a parent \(p\) in layer \(x-1\), the algorithm scans the vertical column at \(x-1\) between \(y_{p}\) and \(y_{node}\). If any real node exists in these slots, the move is rejected. This prevents nodes from visually passing "through" unrelated solid blocks.
Crossing Preservation
This step performs a rigorous check to ensure that moving a node \(v\) to a new position \((x_v-1, y_v)\) does not introduce edge crossings with existing nodes in that layer. The algorithm treats the edges as straight line segments and verifies that the new edges originating from \(v\) do not intersect the existing edges of any neighbor node \(u\) residing in the same layer (\(x-1\)). Let the layer at \((x - 1)\) be the target layer and \(v\) be the candidate node attempting to move to coordinates \((x_{target}, y_v)\). Let \(U\) be the set of all existing nodes currently residing in layer \(x_{target}\). For every neighbor \(u \in U\) and for every pair of children \((c_v, c_u)\) belonging to \(v\) and \(u\) respectively, the algorithm validates three specific constraints:
The Vertical Barrier Constraint (Same-Layer Edges)
A distinct case arises if a neighbor \(u\) has a child \(w\) residing in the same layer (\(x_u = x_w\)). This edge forms a vertical line segment (a "wall") within the layer. The candidate node \(v\) cannot be placed inside this segment, as it would sit directly on top of the edge connecting \(u\) and \(w\). The move is rejected if \(y_v\) lies strictly between the vertical positions of the neighbor and its child: \[\min(y_u, y_w) < y_v < \max(y_u, y_w)\]
Standard Edge Order Preservation
In the general case where children reside in different rows (\(y_{c_v} \neq y_{c_u}\)), the algorithm enforces that the vertical order of the parents must match the vertical order of the children. If the candidate node \(v\) is "above" the neighbor \(u\), its child \(c_v\) must also be "above" the neighbor's child \(c_u\) to avoid an intersection (an "X" shape). Let the function \(pos(n)\) denote the vertical \(y\)-coordinate of node \(n\). The move is valid only if the relative ordering is preserved: \[(pos(v) < pos(u)) \iff (pos(c_v) < pos(c_u))\] If \(v\) is above \(u\) but connects to a child below \(c_u\), the edges inevitably cross.
Co-linear Target Depth Constraint
A subtle edge case occurs when the children \(c_v\) and \(c_u\) share the same vertical row (\(y_{c_v} = y_{c_u}\)), usually meaning they are in the same visual "slot" but at different horizontal depths (\(x\)-layers). Let \(depth(n)\) denote the horizontal layer index (\(x\)) of node \(n\).
Case A: Children are vertically above the parents.
To avoid crossing, the "higher" parent must connect to the "further" (deeper) child. The edge from the top parent must reach over the edge of the bottom parent. \[\text{If } pos(v) < pos(u) \implies \text{Require } depth(c_v) > depth(c_u)\]
Case B: Children are vertically below the parents.
Conversely, if the edges go downwards, the geometry inverts. The "higher" parent must connect to the "closer" (shallower) child to avoid cutting through the lower parent’s edge. \[\text{If } pos(v) < pos(u) \implies \text{Require } depth(c_v) < depth(c_u)\]
If any of these checks fail for any neighbor \(u\) in the target layer, the proposed move for node \(v\) is rejected.
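The three constraints above can be sketched as pure predicates. This is a minimal, hypothetical illustration of the logic: the class and method names, as well as the flat integer-coordinate representation, are simplifications and not the actual implementation's API.

```java
// Hypothetical sketch of the three crossing checks; coordinates are
// plain integers (y = vertical row, depth = horizontal layer index).
final class MoveValidator {

    /** Vertical Barrier: the candidate v may not land strictly between
     *  a neighbor u and its same-layer child w, whose edge forms a
     *  vertical "wall" inside the layer. */
    static boolean violatesBarrier(int yV, int yU, int yW) {
        return Math.min(yU, yW) < yV && yV < Math.max(yU, yW);
    }

    /** Edge Order Preservation: parents and children must have the same
     *  relative vertical order, otherwise the edges form an 'X'. */
    static boolean preservesOrder(int posV, int posU, int posCv, int posCu) {
        return (posV < posU) == (posCv < posCu);
    }

    /** Co-linear Target Depth: children share a row. If they lie above
     *  the parents, the higher parent needs the deeper child (Case A);
     *  if below, it needs the shallower child (Case B). */
    static boolean validColinear(int posV, int posU,
                                 int depthCv, int depthCu,
                                 boolean childrenAbove) {
        if (posV < posU) {
            return childrenAbove ? depthCv > depthCu : depthCv < depthCu;
        }
        // Symmetric case: u is the higher parent.
        return childrenAbove ? depthCu > depthCv : depthCu < depthCv;
    }
}
```

A move is accepted only if, for every neighbor and child pair, no barrier is violated and the applicable ordering predicate holds.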
3.4.3 Grid Cleanup and Post-Processing
Once horizontal compression stabilizes, the grid enters a cleanup phase. First, any remaining dummy nodes are removed, and the graph edges are rewired to connect real nodes directly. Following this, the algorithm applies BPMN-specific heuristics to improve readability:
Start Events typically have no parents and exactly one child. If a Start Event is positioned far to the left of its child, the algorithm pulls it to the layer immediately preceding the child (\(x_{child} - 1\)). This keeps the start of the process tight against the first activity.
If a single parent has many children (\(>3\)) and some are cramped into the parent layer, the algorithm attempts to push these children to the right (\(x+1\)) if space permits. This rule ensures that, in the common case of a node with many children, all children end up in the next layer, which makes the structure visually easy to follow.
Finally, the grid is re-indexed. Rows and columns that become empty during the compression (due to nodes moving) are removed, resulting in a compact grid.
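The re-indexing step can be sketched as a rank mapping over the occupied coordinates, applied once per axis. This is an illustrative simplification with hypothetical names, not the implementation's actual grid API.

```java
import java.util.*;

// Hypothetical sketch of grid re-indexing: each occupied row/column
// coordinate is mapped to its rank among occupied coordinates, so rows
// and columns emptied during compression disappear from the grid.
final class GridReindexer {

    /** Returns a mapping old-coordinate -> compacted coordinate. */
    static Map<Integer, Integer> compact(Collection<Integer> occupied) {
        Map<Integer, Integer> mapping = new LinkedHashMap<>();
        int next = 0;
        // TreeSet sorts the coordinates and removes duplicates.
        for (int coord : new TreeSet<>(occupied)) {
            mapping.put(coord, next++);
        }
        return mapping;
    }
}
```

For example, if only columns 0, 2, and 5 remain occupied after compression, they are remapped to 0, 1, and 2, yielding a gap-free grid.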
3.4.4 Vertical Local Sweeps
The final step of the Compressor optimizes the vertical alignment within the compressed layers. While the LaneCrossMinimizer determined the relative order, there might still be small misalignments (e.g., "staircase" effects on edges).
The algorithm performs three local sweeps over the grid. An odd number is chosen again to ensure the final pass is a forward sweep. For each node, it calculates a Desired Vertical Position based on the median position of its connected neighbors (parents in the forward sweep, children in the backward sweep). This step never changes the relative order of nodes within a lane. A node is only moved to its desired position if the slot is empty and doing so does not "jump" over a neighbor in the same lane. This results in straighter edges and a visually cleaner diagram before the final coordinates are calculated.
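The core of each sweep is the median computation over neighbor positions. The following is a minimal, hypothetical sketch of that step only; the real implementation additionally checks lane membership and slot occupancy before moving a node, and the names here are illustrative.

```java
import java.util.*;

// Hypothetical sketch of the Desired Vertical Position of a node during
// a local sweep: the median y-coordinate of its connected neighbors
// (parents in a forward sweep, children in a backward sweep).
final class MedianSweep {

    static int desiredY(List<Integer> neighborYs) {
        List<Integer> sorted = new ArrayList<>(neighborYs);
        Collections.sort(sorted);
        // Upper median for an even number of neighbors.
        return sorted.get(sorted.size() / 2);
    }
}
```

Using the median rather than the mean keeps the target position robust against a single far-away neighbor, which helps straighten the majority of incident edges.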
The completion of the compression step marks the end of the structural optimization process. The algorithm has transformed the input graph into a lane-compliant grid where every node is assigned a final, immutable integer tuple \((x_{grid}, y_{grid})\). These logical coordinates are the definitive blueprint of the diagram. They capture the layer assignment and vertical ordering required to satisfy all constraints.
4 Performance and Evaluation
The results generated from the provided examples serve as a comprehensive Proof of Concept (PoC). They validate that the theoretical framework of a constrained Sugiyama approach is not only mathematically sound but also viable for practical application.
4.1 Runtime Performance
One of the primary non-functional requirements of the algorithm was the ability to arrange diagrams in the order of milliseconds. To evaluate this, the algorithm was tested against a dataset of over 50 real-world BPMN process definitions. These samples ranged from simple linear sequences to complex diagrams with looping structures.
Performance benchmarks were conducted on a standard personal development laptop within the Integrated Development Environment (IDE). The results demonstrate highly efficient scaling:
The algorithm achieves an average computation time of less than \(20\,\text{ms}\) per diagram.
The pipeline is suitable for real-time applications or batch processing of large diagram repositories without introducing disruptive latency for the end user.
To complement these empirical measurements, the algorithmic complexity analysis confirms that the system scales efficiently. Let \(N\) be the number of nodes and \(E\) the number of edges. The worst-case time complexity is estimated at: \[\mathcal{O}(E \cdot N^2 \log N)\] While this upper bound accounts for highly dense and pathologically complex graphs, execution times observed in practice are significantly lower. This gap arises because real-world BPMN diagrams typically exhibit sparse topologies, where the number of edges grows linearly with the number of nodes (\(E \in \mathcal{O}(N)\)). Furthermore, BPMN models generally have a clear directional flow with localized complexity, which allows the heuristics to converge much faster than the theoretical limit.
Detailed profiling confirms that the edge routing step often dominates the execution time, primarily due to the computational cost of orthogonal pathfinding with obstacle and crossing avoidance.
The performance characteristics of the algorithm confirm that the decision to use heuristic approaches successfully balances optimization quality with computational speed.
4.2 Quality and Scalability
The generated diagrams were evaluated against the structural and visual quality targets set forth in the problem definition.
The algorithm strictly meets the requirement that it "must always produce a valid BPMN diagram." By taking advantage of the highly customizable Sugiyama framework, implementing custom BPMN-tailored algorithm components, and using the Flowable model API, the system enforces hard constraints that guarantee strict BPMN compliance.
The layout quality confirms the advantages of the Sugiyama framework. The algorithm produces diagrams that expose the logical hierarchy of the business process. The visual flow matches the flow of the process, which is the essential factor for human readability. The algorithm scales effectively to large, complex diagrams, maintaining clarity even when the process logic becomes convoluted.
5 Conclusion and Remarks
The automatic arrangement of diagrammatic structures is a problem that often appears deceptively simple to the observer but represents a formidable challenge in computer science. As outlined in this paper, the task touches upon fundamental open problems in mathematics, specifically within the domain of NP-hard optimization.
Consequently, this project required extensive theoretical research and significant custom development. It was not sufficient to simply apply a standard graph layout library. The generic Sugiyama framework had to be deconstructed and rebuilt with specific heuristics (such as the LaneCrossMinimizer and Compressor) to accommodate the semantic rigidity of business processes.
During the research phase, it was noted that while robust commercial off-the-shelf solutions capable of handling specific BPMN constraints exist, they often come with significant barriers to entry. These include restrictive commercial licensing models, substantial financial costs, and monolithic architectures that can be difficult to decouple. By developing a custom, modular solution, this research provides a cost-effective, maintainable, and adaptable architecture specifically for BPMN semantics.
The initial project goals aimed for a layout quality "equivalent to or better than human arrangement in at least \(95\%\) of cases." While strictly quantifying this percentage is infeasible due to the subjective nature of "human quality," the qualitative evaluation has been positive. Domain experts reviewed the outputs generated from the sample dataset, confirming that the algorithm consistently produces standardized, professional visualizations that meet business needs.
In conclusion, this paper demonstrates that BPMN layout generation can be solved efficiently. The resulting algorithm bridges the gap between abstract graph theory and practical business application, creating a tool that significantly enhances process modeling efficiency.
References
N. S. Nikolov, “Sugiyama Algorithm,” in Encyclopedia of Algorithms, M.-Y. Kao, Ed. New York, NY: Springer, 2016, pp. 2162–2166. https://doi.org/10.1007/978-1-4939-2864-4_649
Object Management Group, “Business Process Model and Notation (BPMN), Version 2.0.2,” Dec. 2013. [Online]. Available: https://www.omg.org/spec/BPMN. [Accessed: Jan. 2026].
Flowable, “Flowable Open Source Documentation.” [Online]. Available: https://www.flowable.com/open-source. [Accessed: Jan. 2026].
The Apache Software Foundation, “Apache License, Version 2.0,” Jan. 2004. [Online]. Available: https://www.apache.org/licenses/LICENSE-2.0. [Accessed: Jan. 2026].
The Eclipse Foundation, “Eclipse Public License - v 2.0,” Aug. 2017. [Online]. Available: https://www.eclipse.org/legal/epl-2.0/. [Accessed: Jan. 2026].
The Eclipse Foundation, “Eclipse Layout Kernel (ELK).” [Online]. Available: https://eclipse.dev/elk/. [Accessed: Jan. 2026].
The Eclipse Foundation, “ELK Layered Algorithm Strategy.” [Online]. Available: https://eclipse.dev/elk/reference/algorithms/org-eclipse-elk-layered.html. [Accessed: Jan. 2026].
The Eclipse Foundation, “ELK Layered: Cycle Breaking Strategy.” [Online]. Available: https://eclipse.dev/elk/reference/options/org-eclipse-elk-layered-cycleBreaking-strategy.html. [Accessed: Jan. 2026].
The Eclipse Foundation, “ELK Layered: Layering Strategy.” [Online]. Available: https://eclipse.dev/elk/reference/options/org-eclipse-elk-layered-layering-strategy.html. [Accessed: Jan. 2026].
yWorks GmbH, “yFiles - The Software Development Kit for Diagramming Applications.” [Online]. Available: https://www.yfiles.com/. [Accessed: Jan. 2026].
yWorks GmbH, “yFiles for HTML - Business Process Management.” [Online]. Available: https://www.yfiles.com/solutions/use-cases/business-process-management. [Accessed: Jan. 2026].