Custom Dataflow Pass-through Workers
All workers have the inherent ability to contain or use child workers, including dataflow child workers, to implement their desired functionality, either ad-hoc, or in a custom worker.
If you can use existing workers to implement your functionality (rather than coding it from scratch), code size and implementation effort are often much smaller (see e.g. Slowly Changing Dimension Example), so it's well worth considering.
Pass-through workers extend the above functionality by also allowing child dataflow workers to directly send and receive rows on the parent dataflow worker's ports, i.e. the child (dataflow) workers provide the implementation for the parent worker dataflow ports.
E.g., this parent source worker uses a control flow child and a dataflow child to provide its implementation, plus a PortPassThroughTarget<TInputOutput> child to pass the child output rows to its own output port (see code example):
And this parent target worker uses a PortPassThroughSource<TOutput> child to pass the parent's incoming rows on to the child's downstream workers, which perform the required processing (see code example):
Finally, you can combine the source and target approaches above to create a custom transform (see further details):
Note
- A source parent worker port requires a PortPassThroughTarget<TInputOutput> child worker.
- A target parent worker port requires a PortPassThroughSource<TOutput> child worker.
- A transform parent worker requires both of the above child workers.
Pass-through provides excellent encapsulation, and improves reusability, since an arbitrarily complex set of regular and dataflow (child) workers can be presented to the user as a single dataflow (parent) worker. Depending on which ports the parent worker has, it becomes either a source, transform, or target worker.
The parent worker can also have its own implementation (in RunAsync() or in callbacks), in addition to the child workers, e.g. to perform pre- and/or post-processing, and to generate and/or monitor child workers.
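The encapsulation idea above can be illustrated with a small conceptual model. This is only a sketch under stated assumptions: it does not use the actionETL API, and `Port<T>`, `UpperCaseNonEmptyTransform`, `Send`, `ReceiveAll`, and `Run` are hypothetical names invented for illustration. It models a parent transform whose input and output "ports" are served entirely by internal child steps, so the caller sees a single worker.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Conceptual model only: none of these types are part of actionETL.
// A "port" is modeled here as a simple in-memory queue of rows.
public sealed class Port<T>
{
    private readonly Queue<T> _rows = new Queue<T>();

    public void Send(T row) => _rows.Enqueue(row);

    // Drains and returns the rows currently buffered in the port.
    public IEnumerable<T> ReceiveAll()
    {
        while (_rows.Count > 0)
            yield return _rows.Dequeue();
    }
}

// Hypothetical parent transform: callers only see Input and Output.
public sealed class UpperCaseNonEmptyTransform
{
    public Port<string> Input { get; } = new Port<string>();
    public Port<string> Output { get; } = new Port<string>();

    public void Run()
    {
        // Stand-in for a pass-through source child:
        // it reads the rows arriving on the parent's input port.
        IEnumerable<string> fromParentInput = Input.ReceiveAll();

        // Stand-ins for ordinary child dataflow workers that do the real work.
        IEnumerable<string> nonEmpty =
            fromParentInput.Where(row => !string.IsNullOrWhiteSpace(row));
        IEnumerable<string> upperCased =
            nonEmpty.Select(row => row.Trim().ToUpperInvariant());

        // Stand-in for a pass-through target child:
        // it writes the child output rows to the parent's output port.
        foreach (string row in upperCased)
            Output.Send(row);
    }
}

public static class Program
{
    public static void Main()
    {
        var transform = new UpperCaseNonEmptyTransform();
        foreach (string row in new[] { " alpha ", "", "beta" })
            transform.Input.Send(row);

        transform.Run();

        // Prints: ALPHA,BETA
        Console.WriteLine(string.Join(",", transform.Output.ReceiveAll()));
    }
}
```

In this model, dropping the input port would leave a source, and dropping the output port would leave a target, which mirrors how the set of parent ports determines the kind of worker. Real pass-through workers stream rows asynchronously rather than draining a queue in one call.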
Pass-through vs. Alternatives
Using pass-through has many benefits, and one drawback:
- Pros:
  - Creates a reusable worker
  - Can add members (i.e. properties and methods) to the worker
  - Encapsulates and hides the implementation
  - Reuses the implementation of other workers
  - Supports an arbitrary number of input, output, and error ports
- Cons:
  - If the worker is only used once, pass-through requires somewhat more code than using the same child workers directly without encapsulation
The main alternatives are to use the same child workers directly, without encapsulation, or to implement the functionality from scratch, without reusing existing workers.
Note
- Pass-through is a Pro feature; see Licensing for details.
- The parent worker and child worker ports are not 'linked' per se, and their workers do not have the same parent worker, which linking would otherwise require. Instead, the parent and child dataflows interact as if the other were an (extremely fast) external data source.
Examples
Please see the following code examples for more details:
- Compose Source with Pass-through
- Compose Transform with Pass-through
- Compose Target with Pass-through