Class SortTransform<TInputOutput>
A dataflow worker with one Input port and one Output port that sorts the incoming rows, optionally removing duplicates, before passing them to the downstream worker.
Note: Use the factory methods in SortTransformFactory to create instances of this class.
SortTransform uses an in-memory sort. This is a fully blocking transform, i.e. it must receive and buffer all rows before outputting any rows, which can consume large amounts of memory.
For large datasets that risk exhausting available memory or causing heavy paging to disk, consider performing the sort in a database before bringing the data into, or back into, the dataflow. Please see Buffering and Memory Consumption for more details.
The sort algorithm is unstable, i.e. two rows that compare as equal are not guaranteed to keep their relative ordering.
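Because equal rows may come out in any relative order, one common way to get a repeatable result from an unstable sort is to add a tie-breaker to the comparison so that no two distinct rows compare as equal. The following is a minimal sketch of that idea only. It does not use the actionETL API: `OrderRow`, its `Sequence` and `Customer` properties, and `CompareRows` are hypothetical names introduced for illustration, and `List<T>.Sort` (itself an unstable sort) stands in for the transform.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical row type, used only for this illustration.
public sealed class OrderRow
{
    public int Sequence { get; set; }          // Assumed to be unique per row
    public string Customer { get; set; } = "";
}

public static class TieBreakerExample
{
    // Orders by Customer first. Rows with the same Customer are then ordered
    // by Sequence, so two distinct rows never compare as equal.
    public static int CompareRows(OrderRow x, OrderRow y)
    {
        int byCustomer = string.CompareOrdinal(x.Customer, y.Customer);
        return byCustomer != 0 ? byCustomer : x.Sequence.CompareTo(y.Sequence);
    }

    public static void Main()
    {
        var rows = new List<OrderRow>
        {
            new OrderRow { Sequence = 1, Customer = "B" },
            new OrderRow { Sequence = 2, Customer = "A" },
            new OrderRow { Sequence = 3, Customer = "B" },
            new OrderRow { Sequence = 4, Customer = "A" },
        };

        // List<T>.Sort is an unstable sort; the tie-breaker makes the
        // resulting order fully determined by the comparison.
        rows.Sort(CompareRows);

        foreach (var row in rows)
        {
            Console.WriteLine($"{row.Customer} {row.Sequence}");
        }
    }
}
```

Note that a tie-breaker changes which rows compare as equal, so it may also affect which rows are treated as duplicates when RemoveDuplicates is enabled. How duplicates are determined is not stated in this section, so verify that behavior before combining the two.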
Implements
Inherited Members
Namespace: actionETL
Assembly: actionETL.dll
Syntax
public class SortTransform<TInputOutput> : WorkerBase<SortTransform<TInputOutput>>, IDisposeOnFinished where TInputOutput : class
Type Parameters
Name | Description |
---|---|
TInputOutput | The type of each |
Properties
Input
Gets the input port for consuming rows from the upstream worker.
Declaration
public InputPort<TInputOutput> Input { get; }
Property Value
Type | Description |
---|---|
InputPort<TInputOutput> |
Output
Gets the output port for sending rows to the downstream worker.
Declaration
public OutputPort<TInputOutput> Output { get; }
Property Value
Type | Description |
---|---|
OutputPort<TInputOutput> |
RemoveDuplicates
Gets or sets a value indicating whether to remove duplicates. Cannot be set after the worker has started running.
Note: This property is thread-safe.
Declaration
public bool RemoveDuplicates { get; set; }
Property Value
Type | Description |
---|---|
Boolean |
Exceptions
Type | Condition |
---|---|
InvalidOperationException | Cannot set the value after the worker has started running. |
Methods
RunAsync()
This method can be overridden to add custom functionality to the derived worker that runs before and after the row processing. In this case, the base class method base.RunAsync() must be called for the worker to function correctly.
Typically, this worker is used without overriding this method.
Declaration
protected override async Task<OutcomeStatus> RunAsync()
Returns
Type | Description |
---|---|
Task<OutcomeStatus> | A |