Transform Workers
A transform worker has both input and data output ports. It consumes and processes the input data rows, then outputs the same rows it received (modified or unmodified), generates new rows to output, or both. The output ports can optionally have a different schema than the input ports.
Note
- The library user instantiates transforms via factory methods; see Worker Instantiation for details
- The type parameters describe what ports the transform has, e.g. *Error* means it has an error output port
Also see Source Workers and Target Workers, as well as Non-Dataflow Workers.
ActionTransform, ActionTwoInputTransform
Execute an Action or Func asynchronous callback once. The callback consumes rows from the upstream worker and sends rows to the downstream worker. The workers have one or two input ports, respectively.
- Factories:
- Workers:
- Example: Custom Dataflow Transform
AggregateTransform
Aggregate and optionally group input rows, and output at most one row for all input rows (or one per grouping), calculating column aggregations (Sum, Average, Max etc.), row aggregations (First, Last, Single etc.), and custom aggregations.
- Factory: AggregateTransformFactory
- Workers:
- Examples:
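The grouping and aggregation behavior described above can be illustrated with a minimal Python sketch. This is not the library's API: the `aggregate` function, the `amount` column, and the output field names are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def aggregate(rows, key=None):
    """Emit one summary row per group, or one for all rows when no key is given.

    Hypothetical helper: yields nothing for empty input, which matches
    "at most one row" when there is no grouping.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[key(row) if key else None].append(row)
    for group_key, members in groups.items():
        amounts = [r["amount"] for r in members]
        yield {
            "group": group_key,
            "sum": sum(amounts),                      # column aggregation
            "average": sum(amounts) / len(amounts),   # column aggregation
            "max": max(amounts),                      # column aggregation
            "first": members[0],                      # row aggregation
            "last": members[-1],                      # row aggregation
        }

rows = [
    {"region": "north", "amount": 10},
    {"region": "south", "amount": 5},
    {"region": "north", "amount": 30},
]
per_region = list(aggregate(rows, key=lambda r: r["region"]))
overall = list(aggregate(rows))
```

With grouping, the sketch yields one row per region; without it, a single row summarizes all input rows.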
CrossJoinTransform
Perform a CROSS JOIN on two inputs and send to the downstream worker.
- Factory: CrossJoinTransformFactory
- Worker: CrossJoinTransform<TLeftInput, TRightInput, TOutput>
- Example: See the similar Dataflow Column Mapping using INNER JOIN
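As a rough illustration of CROSS JOIN semantics (not the library's API), the following Python sketch pairs every left row with every right row. The `cross_join` function and its `combine` parameter are hypothetical names.

```python
from itertools import product

def cross_join(left_rows, right_rows, combine):
    """Return one combined row for every (left, right) pair."""
    return [combine(left, right) for left, right in product(left_rows, right_rows)]

sizes = ["S", "M"]
colors = ["red", "blue", "green"]
variants = cross_join(sizes, colors, lambda size, color: (size, color))
```

The output holds len(left) * len(right) rows, so both inputs should be kept small enough for that product to be manageable.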
DictionaryLookupTransform
Look up a key in a dictionary (either provided or loaded from a DictionaryInput) for each row, with full or partial caching. Modify and redirect the row to the Output or ErrorOutput.
- Factory: DictionaryLookupTransformFactory
- Workers:
- Example: Dataflow Lookups
DictionaryLookupSplitTransform
Look up a key in a dictionary (either provided or loaded from a DictionaryInput) for each row, with full or partial caching. Modify and redirect the row to the FoundOutput, NotFoundOutput, or ErrorOutput.
- Factory: DictionaryLookupSplitTransformFactory
- Workers:
- Example: Dataflow Lookups
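The three-way routing can be sketched in Python as follows. This is an illustration only, not the library's API: `lookup_split`, `key_of`, and `apply` are hypothetical names, and sending a row to the error list when the callback raises is an assumption made for the sketch.

```python
def lookup_split(rows, dictionary, key_of, apply):
    """Route each row to found, not_found, or errors based on a dictionary lookup."""
    found, not_found, errors = [], [], []
    for row in rows:
        try:
            key = key_of(row)
            if key in dictionary:
                # Key exists: let the caller modify the row with the looked-up value.
                found.append(apply(row, dictionary[key]))
            else:
                not_found.append(row)
        except Exception as exc:
            # Assumption: a failing callback sends the row to the error output.
            errors.append((row, exc))
    return found, not_found, errors

countries = {"SE": "Sweden", "DE": "Germany"}
rows = [{"code": "SE"}, {"code": "FR"}, {"name": "no code"}]
found, not_found, errors = lookup_split(
    rows,
    countries,
    key_of=lambda r: r["code"],
    apply=lambda r, value: {**r, "country": value},
)
```

Here the first row is enriched and lands in `found`, the second has no dictionary entry and lands in `not_found`, and the third lacks the key column, so the sketch records it in `errors`.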
FullJoinMergeSortedTransform
Perform a FULL JOIN on two presorted inputs and send to the downstream worker.
- Factory: FullJoinMergeSortedTransformFactory
- Worker: FullJoinMergeSortedTransform<TLeftInput, TRightInput, TOutput>
- Example: See the similar Dataflow Column Mapping using INNER JOIN
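Because both inputs are presorted, a merge join can walk them in a single pass. The Python sketch below illustrates a FULL JOIN under the simplifying assumption that keys are unique within each input. `full_join_sorted` and its `key` parameter are hypothetical names, not the library's API.

```python
def full_join_sorted(left, right, key):
    """Full outer join of two lists sorted ascending by key (keys unique per input)."""
    joined = []
    i = j = 0
    while i < len(left) and j < len(right):
        left_key, right_key = key(left[i]), key(right[j])
        if left_key == right_key:
            joined.append((left[i], right[j]))  # match on both sides
            i += 1
            j += 1
        elif left_key < right_key:
            joined.append((left[i], None))      # left row without a match
            i += 1
        else:
            joined.append((None, right[j]))     # right row without a match
            j += 1
    # Whatever remains on either side has no partner.
    joined.extend((row, None) for row in left[i:])
    joined.extend((None, row) for row in right[j:])
    return joined

left = [(1, "a"), (2, "b"), (4, "d")]
right = [(2, "x"), (3, "y")]
pairs = full_join_sorted(left, right, key=lambda row: row[0])
```

Keeping only pairs where both sides are present gives INNER JOIN results, while keeping every pair with a left (or right) side present gives LEFT (or RIGHT) JOIN results.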
InnerJoinMergeSortedTransform
Perform an INNER JOIN on two presorted inputs and send to the downstream worker.
- Factory: InnerJoinMergeSortedTransformFactory
- Worker: InnerJoinMergeSortedTransform<TLeftInput, TRightInput, TOutput>
- Example: Dataflow Column Mapping
LeftJoinMergeSortedTransform
Perform a LEFT JOIN on two presorted inputs and send to the downstream worker.
- Factory: LeftJoinMergeSortedTransformFactory
- Worker: LeftJoinMergeSortedTransform<TLeftInput, TRightInput, TOutput>
- Example: See the similar Dataflow Column Mapping using INNER JOIN
MergeSortedTransform
Merge multiple presorted inputs of the same type into a single sorted output. Any duplicates are preserved.
- Factory: MergeSortedTransformFactory
- Worker: MergeSortedTransform<TInputOutput>
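The merge semantics can be illustrated with Python's standard `heapq.merge`, which combines already sorted inputs into one sorted stream without dropping duplicates. This is a language-neutral illustration, not the library's API, and the input lists are made up.

```python
import heapq

first = [1, 4, 7]
second = [2, 4, 9]
third = [3]

# Each input must already be sorted; the result is sorted and keeps duplicates.
merged = list(heapq.merge(first, second, third))
```

The value 4 appears in two inputs and therefore appears twice in `merged`.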
MulticastTransform
Clone input rows to one or more outputs, all of the same type.
- Factory: MulticastTransformFactory
- Worker: MulticastTransform<TInputOutput>
- Examples:
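A minimal Python sketch of multicasting follows. It is not the library's API: `multicast` is a hypothetical name, and using a deep copy for each output is an assumption about what cloning means here.

```python
import copy

def multicast(rows, output_count):
    """Send an independent copy of every input row to each output."""
    outputs = [[] for _ in range(output_count)]
    for row in rows:
        for output in outputs:
            output.append(copy.deepcopy(row))
    return outputs

rows = [{"id": 1, "tags": ["a"]}]
first, second = multicast(rows, 2)

# Changing a row on one output leaves the other output untouched.
first[0]["tags"].append("b")
```

Copying rows keeps downstream workers from interfering with each other when one of them modifies a row.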
RightJoinMergeSortedTransform
Perform a RIGHT JOIN on two presorted inputs and send to the downstream worker.
- Factory: RightJoinMergeSortedTransformFactory
- Worker: RightJoinMergeSortedTransform<TLeftInput, TRightInput, TOutput>
- Example: See the similar Dataflow Column Mapping using INNER JOIN
RowActionTransform
Execute an Action or Func callback for each input row before passing it to the downstream worker.
- Factory: RowActionTransformFactory
- Workers:
- Examples:
- Custom Dataflow Transform
- Dataflow
- Three examples in Dataflow Row Errors
- Custom Dataflow Target
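The per-row callback pattern can be sketched in Python as a generator. This is only an illustration of the idea, not the library's API: `row_action`, `add_total`, and the column names are hypothetical.

```python
def row_action(rows, action):
    """Run the callback on each row, then pass the row downstream."""
    for row in rows:
        action(row)
        yield row

def add_total(row):
    # Example callback: derive a new column from two existing ones.
    row["total"] = row["price"] * row["quantity"]

orders = [{"price": 2.5, "quantity": 4}, {"price": 10.0, "quantity": 1}]
processed = list(row_action(orders, add_total))
```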
RowsActionTransform
Repeatedly execute an Action or Func callback when there are both rows to consume from the upstream worker and demand available from the downstream worker.
- Factory: RowsActionTransformFactory
- Workers:
- Example: Custom Dataflow Transform
RowsTransformBase
Execute a virtual method when there are input rows and output demand available. Must be inherited.
- Workers:
- Example: Custom Dataflow Transform
RowTransformBase
Execute a virtual method for each input row before passing it to the downstream worker. Must be inherited.
- Workers:
- Examples:
SortTransform
Sort the incoming rows and pass them to the downstream worker, optionally removing duplicates.
- Factory: SortTransformFactory
- Worker: SortTransform<TInputOutput>
- Examples:
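Sorting with optional duplicate removal can be sketched in Python as below. This is not the library's API: `sort_rows` is a hypothetical name, and treating rows with equal sort keys as duplicates is an assumption made for the sketch.

```python
def sort_rows(rows, key, remove_duplicates=False):
    """Sort rows by key; optionally keep only the first row for each key value."""
    ordered = sorted(rows, key=key)
    if not remove_duplicates:
        return ordered
    unique = []
    for row in ordered:
        # After sorting, duplicates are adjacent, so compare with the last kept row.
        if not unique or key(unique[-1]) != key(row):
            unique.append(row)
    return unique

values = [3, 1, 3, 2]
with_duplicates = sort_rows(values, key=lambda v: v)
without_duplicates = sort_rows(values, key=lambda v: v, remove_duplicates=True)
```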
SplitTransform
A worker with one input and one or more outputs, all of the same type, that sends each input row to a specific output, or discards the row, based on a supplied function.
- Factory: SplitTransformFactory
- Worker: SplitTransform<TInputOutputError>
- Examples:
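The routing described above can be sketched in Python. This is an illustration, not the library's API: `split` and `choose_output` are hypothetical names, and returning `None` to discard a row is a convention chosen for the sketch.

```python
def split(rows, output_count, choose_output):
    """Send each row to the output index chosen by the function, or discard it."""
    outputs = [[] for _ in range(output_count)]
    for row in rows:
        index = choose_output(row)
        if index is not None:  # None means "discard this row"
            outputs[index].append(row)
    return outputs

def route(number):
    if number < 0:
        return None         # discard negatives
    return number % 2       # evens to output 0, odds to output 1

evens, odds = split([4, -1, 7, 10, 3], 2, route)
```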
TransformBase, TwoInputTransformBase
Workers with one or two inputs, and one output, for deriving dataflow transforms. Must be inherited.
- Workers:
- Example: Custom Dataflow Transform
UnionAllTransform
Combine multiple inputs of the same type into a single output. Any duplicates are preserved.
- Factory: UnionAllTransformFactory
- Worker: UnionAllTransform<TInputOutput>
- Examples:
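As a rough Python illustration of UNION ALL semantics (not the library's API), the sketch below combines several inputs of the same type and keeps duplicates. `union_all` is a hypothetical name, and the sketch simply concatenates the inputs in order; it makes no claim about how a real worker interleaves rows from its inputs.

```python
from itertools import chain

def union_all(*inputs):
    """Combine all inputs into one output, preserving duplicates."""
    return list(chain(*inputs))

combined = union_all([1, 2], [2, 3], [1])
```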